NMAH | Smithsonian Speech Synthesis History Project (im

Flanagan, however, has made impressive progress with his computer simulations of vocal tract excitation (Flanagan and Landgraf 1968; Flanagan and Cherry 1969). Voicing is represented by the output of a system consisting of two masses, corresponding to the vocal cords, oscillating so as to vary the cross-sectional area of the passage between them, corresponding to the glottis. At one end of the passage is a source of air varying in pressure, representing the lungs; at the other end is a vocal tract analog. In response to the subglottal air pressure, the displacement of each mass increases, as does the air flow through the glottis. But this increase in air flow results in an increase in negative Bernoulli pressure between the two masses, reducing the displacement, so that oscillation occurs. The frequency of the oscillation varies with the subglottal pressure, with the size of the two masses and their stiffness (vocal-cord tension), and with the acoustic impedance of the vocal tract analog, which depends on the phone being synthesized. Thus the model allows simulation of the separate roles of lung pressure and cord tension in determining F0 and takes account of interaction with the supraglottal tract. The same model serves to simulate frication, which occurs at a constriction in the supraglottal tract when the constriction is sufficiently narrow and the pressure behind it sufficiently great. When both glottal and fricative excitation are present, as in a voiced fricative, the pattern of pitch-synchronous bursts in the noise is simulated by the model.
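Two of the dependencies described above can be made concrete with a short Python sketch. It is not Flanagan's simulation: it ignores the coupling between the masses, the damping, and the vocal tract load, and treats each relationship in isolation using textbook formulas. The function names, the air density, and the numerical values are illustrative assumptions and are not taken from the papers cited.

```python
import math

# Assumed round value for the density of air; not taken from the source.
AIR_DENSITY_KG_M3 = 1.2


def natural_frequency_hz(mass_kg: float, stiffness_n_per_m: float) -> float:
    """Undamped natural frequency of a single mass on a single spring.

    f = sqrt(k / m) / (2 * pi). This stands in, very roughly, for the way
    oscillation frequency rises with cord stiffness and falls with cord mass.
    """
    if mass_kg <= 0 or stiffness_n_per_m <= 0:
        raise ValueError("mass and stiffness must be positive")
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2.0 * math.pi)


def bernoulli_pressure_drop_pa(
    flow_m3_per_s: float,
    area_m2: float,
    density_kg_m3: float = AIR_DENSITY_KG_M3,
) -> float:
    """Pressure drop of steady, lossless flow through a passage.

    drop = 0.5 * rho * v**2, with particle velocity v = U / A. A larger flow
    or a narrower passage gives a larger drop, which is the effect that pulls
    the masses back together in the description above.
    """
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    velocity_m_per_s = flow_m3_per_s / area_m2
    return 0.5 * density_kg_m3 * velocity_m_per_s ** 2


if __name__ == "__main__":
    # Hypothetical values: a 0.125 g mass on an 80 N/m spring.
    print(round(natural_frequency_hz(1.25e-4, 80.0), 1))
    # Hypothetical values: 200 cm^3/s of flow through a 0.1 cm^2 opening.
    print(round(bernoulli_pressure_drop_pa(2e-4, 1e-5), 1))
```

With these illustrative values, doubling the stiffness raises the frequency by a factor of the square root of two, and halving the glottal area at constant flow quadruples the pressure drop. In the full model these effects interact through the air flow, which is why the source describes the frequency as depending jointly on subglottal pressure, mass, stiffness, and vocal tract impedance.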

4.6 Synthesis Using Phonological Rules

As we have just seen, a substantial amount of research effort in speech synthesis by rule has been concerned with what we have called phonetic capacity. Other components -- phonetic skill, phonological competence and capacity -- have received relatively little attention. There are various reasons for this. The quality of speech synthesized by rule has only quite recently been good enough to serve as a vehicle for research in these other components; moreover, many of those engaged in synthesis by rule have been content to operate with a fairly rough and ready view of phonology, because they are more interested in the physical aspects of speech, either acoustic or articulatory, than in phonetic capacity as such, or its relationship to other components. A few years ago this might have mattered much less; but recent impressive developments in generative phonology make it important that synthesis by rule display greater sophistication in this area if linguists are to take it seriously.

A few scattered efforts have been made. In our own work (Mattingly 1968a), we have drawn a distinction between the synthesis by rule program with the associated hardware, representing the universal aspects of speech (phonological and phonetic capacity) and the rules of a particular language or dialect (phonological competence) which were an input to the program. In practice this distinction is not made consistently: certain matters are handled in the rules which more properly belong in the program, and conversely.
