NMAH | Smithsonian Speech Synthesis History Project (im

evident in prosodic synthesis. Though much of this work is motivated, in the first instance, by an interest in the physical aspects of speech production, it is clearly also leading toward an increased understanding of phonetic capacity.

In the future, it is desirable, first of all, that synthesis by rule become a tool for the study of phonological capacity and competence. In practice, this would mean the development, according to the principles of modern generative grammar, of phonological descriptions of languages, the outputs of which would be converted to speech by a synthesis system. Such enterprises would be valuable for two reasons. First, they would tend to correct the present rather off-hand conceptions of phonology entertained by many of those who are now doing synthesis by rule. Second, and conversely, they would compel the phonologist to relate his descriptive rules to some clearly defined concept of phonetic capacity, and permit him to test utterances produced by his phonological rules against the intuitions of the native speaker. When the generative grammarian is doing syntax, it is quite natural for him to offer examples in a form such that any native speaker can determine their grammaticality; the grammarian should be able to operate on the same basis when he is doing phonology, and he can, if he uses synthesis by rule. It would also seem desirable to increase considerably the number of dialects and languages for which rules for synthesis have been written. Most of the work thus far has been done in English and Japanese; many other languages should be synthesized as part of a general effort to explore the different versions of phonological competence.

The use of synthesis by rule to study phonological competence and capacity will, of course, compel attention to the central problem of phonetic capacity, that of enumerating and defining the universal set of phonetic features, and in the process giving increased psychological meaning to the notion 'feature'. This means, in practice, the development of systems in which phonetic feature matrices are the input to the part of the program which simulates phonetic capacity. Though various feature-like entities have played a part in several of the systems we have discussed, none of these systems really represents a consistent attempt to synthesize speech using what the phonologist would regard as phonetic features. Many problems must still be worked out. For example, a feature involving a particular articulator can be equated with a gesture of the articulator toward a particular target, but the synthesis by rule system must somehow define just what it is that accounts for the psychological unity of this gesture, regardless of the original position of the articulator. For manner features, the problem is still more acute: the system must explain how features such as 'stop-continuant' can be given a plausible unitary definition in terms of phonetic capacity, even though physically quite different articulations may be used for the production of the various stops and continuants. If the feature is defined in part by feedback of some kind, this must be part of the synthesis system.
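The idea of a feature matrix driving an articulatory gesture toward a target can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from any of the systems discussed: the feature names, the segment inventory, the lip-aperture values (arbitrary units from 0 to 1), and the simple rule that moves the articulator a fixed fraction of the remaining distance at each step. The only point it is meant to show is that a gesture can be specified by its target alone, so that the same feature values yield the same end state from different starting positions.

```python
# Hypothetical binary feature matrix: one row per segment, one column per feature.
# The feature names and values are illustrative, not a proposed universal set.
FEATURES = ("labial", "continuant", "voiced")
SEGMENTS = {
    "p": (1, 0, 0),
    "b": (1, 0, 1),
    "f": (1, 1, 0),
}

# Hypothetical lip-aperture targets in arbitrary units (0.0 = full closure).
# Keys are (labial, continuant) feature values; the numbers are invented.
LIP_APERTURE_TARGET = {
    (1, 0): 0.0,   # labial stop: complete closure
    (1, 1): 0.15,  # labial continuant: narrow constriction
}


def lip_target(segment):
    """Look up the lip-aperture target implied by a segment's features."""
    labial, continuant, _voiced = SEGMENTS[segment]
    return LIP_APERTURE_TARGET[(labial, continuant)]


def gesture(start, target, steps=8, rate=0.5):
    """Return the positions of an articulator moving toward a target.

    Assumed smoothing rule: at each step the articulator covers a fixed
    fraction (rate) of the distance that still separates it from the target.
    """
    position = start
    track = [position]
    for _ in range(steps):
        position += rate * (target - position)
        track.append(position)
    return track


if __name__ == "__main__":
    target = lip_target("p")
    # The same discrete feature values, approached from two starting positions.
    for start in (1.0, 0.3):
        track = gesture(start, target)
        print(start, [round(x, 3) for x in track])
```

Both trajectories converge on the same target even though their paths differ, which is one simple way to give the gesture a unitary definition. The sketch says nothing about manner features such as 'stop-continuant' that are realized by physically different articulations, nor about feedback; those would need a richer model than a single target per articulator.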

Nor is the matter of the translation from the discrete to the continuous as yet

National Museum of American History | Archives Center