NMAH | Smithsonian Speech Synthesis History Project (im

handled really adequately by the systems we have discussed. Having recognized that the targets for each articulator must be separately described, we must now try to account in some principled way for the coordination of the movements of the various articulators towards their targets. Present systems, in which each phone is dealt with in turn, and is affected by the preceding and following phone but no others (with the exception of arrangements in some systems to nasalize several preceding phones), are too restrictive to account for the fact that coarticulation may extend over several phones (Kozhevnikov and Chistovich 1965). The assumption of these systems is that the changing of targets for the various articulators is synchronized phone by phone -- an assumption which works empirically after a fashion, but masks the real problem of how and to what extent the movements of articulators are synchronized, and how much account phonetic capacity must take of the synchronization process. It has frequently been suggested that the syllable, which certainly seems to have psychological reality -- and therefore some role in phonetic capacity -- is the unit of coarticulation. Clearly there is a need for a synthesis by rule system which explores this possibility.

Another area which needs a great deal of further attention is the nature of the demarcation between phonetic capacity and phonological competence. We want to reflect this separation as clearly as possible in a synthesis by rule system; unfortunately, it is not always possible to distinguish in particular cases between 'intrinsic' allophones (belonging to phonetic capacity) and 'extrinsic' allophones (belonging to phonological competence) (Wang and Fillmore 1961). Moreover, Tatham (1969b) has argued that since such an 'intrinsic' difference as front v. back [k] can be distinctive in some languages, we have to provide for countermanding, in special cases, of a normal rule of phonetic capacity by phonological competence. This is actually, as Tatham points out, a problem relating to 'markedness', an issue involving the relationship between the two components which has concerned phonologists from Troubetskoy (1939: 79) and the Prague School to Chomsky and Halle (1968: 402ff.).

Finally, we can also look forward to increasing our understanding of the elusive matter of phonetic skill through synthesis by rule. The systems we have discussed all assume an ideal or at least typical speaker with a consistent style; questions of phonetic skill are avoided. But given some reasonably satisfactory representation of other components, we can begin to derive auxiliary sets of rules representing phonetic skill, and consisting of a series of modifications to the parts of the system representing phonological competence and phonetic capacity. Suppose, for instance, that we wish to investigate the productive and perceptual factors of speaker variation. These are in part matters of the physical characteristics of the speaker (and so will involve adjustments of the synthesizer itself) but also of phonetic skill. At present the preferred methods of study are subjective ratings of speakers and examination of spectrograms. But it should be possible, using synthesis by rule, to try to mimic speakers and to study listeners' perceptions of such mimicry under
