evident in prosodic synthesis. Though much of this work is motivated,
in the first instance, by an interest in the physical aspects of
speech production, it is clearly also leading toward an increased
understanding of phonetic capacity.
In the future, it is desirable, first of all, that synthesis by rule
become a tool for the study of phonological capacity and competence.
In practice, this would mean the development, according to the
principles of modern generative grammar, of phonological descriptions
of languages, the outputs of which would be converted to speech by a
synthesis system. Such enterprises would be valuable for two reasons.
First, they would tend to correct the rather off-hand conceptions of
phonology now entertained by many of those who are doing synthesis by
rule. Second, they would compel the
phonologist to relate his descriptive rules to some clearly-defined
concept of phonetic capacity, and permit him to test utterances
produced by his phonological rules against the intuitions of the
native speaker. When the generative grammarian is doing syntax, it
is quite natural for him to offer examples in a form such that any
native speaker can determine their grammaticality; the grammarian
should be able to operate on the same basis when he is doing phonology,
and he can, if he uses synthesis by rule. It would also seem desirable
to increase considerably the number of dialects and languages for which
rules for synthesis have been written. Most of the work thus far has
been done in English and Japanese; many other languages should be
synthesized as part of a general effort to explore the different
versions of phonological competence.
The use of synthesis by rule to study phonological competence and
capacity will, of course, compel attention to the central problem
of phonetic capacity, that of enumerating and defining the universal
set of phonetic features, and in the process giving increased
psychological meaning to the notion 'feature'. This means, in practice,
the development of systems in which phonetic feature matrices are the
input to the part of the program which simulates phonetic capacity.
Though various feature-like entities have played a part in several of
the systems we have discussed, none of these systems really represents
a consistent attempt to synthesize speech using what the phonologist
would regard as phonetic features. Many problems must still be worked
out. For example, a feature involving a particular articulator can be
equated with a gesture of the articulator toward a particular target,
but the synthesis by rule system must somehow define just what it is
that accounts for the psychological unity of this gesture, regardless
of the original position of the articulator. For manner features, the
problem is still more acute: the system must explain how features such
as 'stop-continuant' can be given a plausible unitary definition in
terms of phonetic capacity, even though physically quite different
articulations may be used for the production of the various stops and
continuants. If the feature is defined in part by feedback of some
kind, that feedback must itself be part of the synthesis system.
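As a purely illustrative sketch (it corresponds to none of the systems
discussed), the idea of a feature as a gesture toward a target can be
written out as follows. The articulator names, the numerical targets, the
step count, and the exponential approach rule are all assumptions made for
the example. The point is only that the same feature value yields the same
target whatever the starting position of the articulator.

```python
# Hypothetical inventory: each (articulator, feature value) pair names a
# target position in arbitrary units. None of these values is measured.
TARGETS = {
    ("tongue_tip", "+"): 1.0,
    ("tongue_tip", "-"): 0.0,
    ("lips", "+"): 1.0,
    ("lips", "-"): 0.0,
}


def gesture(start, target, steps=3, rate=0.5):
    """Return the positions visited while moving from start toward target.

    Each step closes a fixed fraction (rate) of the remaining distance,
    an assumed stand-in for real articulator dynamics.
    """
    position = start
    path = [position]
    for _ in range(steps):
        position += rate * (target - position)
        path.append(position)
    return path


def synthesize(feature_matrix, initial):
    """Turn a discrete feature matrix into continuous articulator tracks.

    feature_matrix: list of columns, one per segment; each column maps an
    articulator name to "+" or "-".
    initial: dict giving each articulator's starting position.
    An articulator that a column does not mention holds its position.
    """
    state = dict(initial)
    tracks = {name: [value] for name, value in state.items()}
    for column in feature_matrix:
        for name in state:
            if name in column:
                target = TARGETS[(name, column[name])]
            else:
                target = state[name]
            path = gesture(state[name], target)
            tracks[name].extend(path[1:])
            state[name] = path[-1]
    return tracks


if __name__ == "__main__":
    matrix = [{"tongue_tip": "+"}, {"tongue_tip": "-", "lips": "+"}]
    print(synthesize(matrix, {"tongue_tip": 0.0, "lips": 0.0}))
```

Calling gesture from two different starting positions with the same target
gives two different paths that converge on the same point, which is the
sense in which the gesture has a unity independent of where the articulator
began. Feedback and the manner features are not modeled here.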
Nor is the matter of the translation from the discrete to the continuous
as yet