KLATT 1987, p. 766
shown at the bottom in Fig. 28. They were quite successful for this vowel, which is fortunate, because the general technique is essentially identical to the synthesis strategies used in Klattalk and other phonemic rule programs to generate consonant-vowel-consonant formant patterns. These data should be augmented for other vowels, but the analysis task is formidable, so only partial data are available on some vowels in symmetrical CVC contexts (Stevens and House, 1963). Similarly, the same kinds of studies should be performed for other speakers, and at several syllable durations and degrees of stress.

Of particular interest are rules that modify vowels in sentence contexts depending on consonantal context, stress, and duration. From what little is known, it appears that the vowel space shrinks when one goes from words spoken in isolation to sentences (Fant et al., 1974; Shearme and Holmes, 1961), but it is not clear whether vowels tend to neutralize toward schwa, or simply accede to articulatory demands of adjacent consonants (Lindblom, 1963). It is possible that some of the subjective impression of unnaturalness and "foreign dialect" of synthesis-by-rule systems can be attributed to insufficient attention to details of this sort, both known and those yet to be discovered.

The formant transitions for a CV syllable depend to some extent on the nature of the phonetic segment that precedes the consonant. Öhman (1966) has published data on formant motions for [b,d,g] in different VCV environments that demonstrate significant interactions (Fig. 29), and Martin and Bunnell (1982) have shown that listeners expect these coarticulatory shifts -- subjects show reaction time deficits when the formant shifts are not present.
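
The two accounts of vowel-space shrinkage mentioned above (neutralization toward schwa versus yielding to the articulatory demands of adjacent consonants) can be contrasted in a small numerical sketch. Everything in it is an illustrative assumption rather than data or rules from Klattalk or the cited studies: the formant targets, the schwa values, the consonant F2 locus, the reduction weight, and the exponential time constant are all hypothetical. `centralize` moves both formants a fixed fraction of the way toward schwa; `undershoot` pulls only F2 toward an assumed consonant locus, by an amount that grows as the vowel gets shorter.

```python
import math

# Hypothetical (F1, F2) vowel targets in Hz, chosen only for illustration.
TARGETS = {"i": (280.0, 2250.0), "a": (700.0, 1200.0), "u": (310.0, 900.0)}
SCHWA = (500.0, 1500.0)   # assumed neutral-vowel formants
F2_LOCUS = 1800.0         # assumed F2 locus of an adjacent consonant


def centralize(target, weight):
    """Move (F1, F2) a fraction `weight` (0..1) of the way toward schwa."""
    return tuple(t + weight * (s - t) for t, s in zip(target, SCHWA))


def undershoot(target, duration_ms, tau_ms=60.0):
    """Pull F2 toward the consonant locus; shorter vowels fall further short.

    The exponential form and tau_ms are assumptions, not fitted values.
    """
    f1, f2 = target
    k = math.exp(-duration_ms / tau_ms)  # 1 at zero duration, near 0 for long vowels
    return (f1, f2 + k * (F2_LOCUS - f2))


def f2_range(vowels):
    """Spread of F2 across a collection of (F1, F2) pairs, in Hz."""
    f2s = [f2 for _, f2 in vowels]
    return max(f2s) - min(f2s)


# Compare the size of the F2 space under each account.
isolated = list(TARGETS.values())
reduced = [centralize(v, 0.3) for v in isolated]
coarticulated = [undershoot(v, 80.0) for v in isolated]
print(f2_range(isolated), f2_range(reduced), f2_range(coarticulated))
```

In this sketch both mechanisms shrink the F2 range, which mirrors why the observed shrinkage alone does not decide between them. They differ in detail: `centralize` also shifts F1 and ignores the consonant, whereas `undershoot` as written leaves F1 untouched and depends on vowel duration and on the assumed locus.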
Text-to-speech systems have only begun to simulate the details of the
phenomena noted in this section (Coker et al., 1973; Klatt, 1976b).
Klattalk now contains a separate subroutine for allophonic substitution
rules, as well as many detailed parameter adjustment rules in the part
of the program concerned with drawing parameter values for phonetic
sequences. Taken in total, these rules characterize all of the
allophonic variants and word-boundary cues described here, although
the rules are simplified and generalized beyond the available data
in such a way that they probably do not adequately represent the
environments where the rule should be