Flanagan, however, has made impressive progress with his computer
simulations of vocal tract excitation (Flanagan and Landgraf 1968;
Flanagan and Cherry 1969). Voicing is represented by the output of
a system consisting of two masses, corresponding to the vocal cords,
oscillating so as to vary the cross-sectional area of the passage
between them, corresponding to the glottis. At one end of the passage
is a source of air varying in pressure, representing the lungs; at
the other end is a vocal tract analog. In response to the subglottal
air pressure, the displacement of each mass increases, as does the air
flow through the glottis. But this increase in air flow results in an
increase in negative Bernoulli pressure between the two masses,
reducing the displacement, so that oscillation occurs. The frequency
of the oscillation varies with the subglottal pressure, with the size
of the two masses and their stiffness (vocal-cord tension), and with
the acoustic impedance of the vocal tract analog, which depends on the
phone being synthesized. Thus the model allows simulation of the
separate roles of lung pressure and cord tension in determining F0
and takes account of interaction with the supraglottal tract. The
same model serves to simulate frication, which occurs at a constriction
in the supraglottal tract when the constriction is sufficiently narrow
and the pressure behind sufficiently great. When both glottal and
fricative excitation are present, as in a voiced fricative, the pattern
of pitch-synchronous bursts in the noise is simulated
by the model.
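
Two of the dependencies named above can be illustrated with a
first-order sketch in Python. It is not Flanagan's model: it treats a
cord as an undamped mass on a spring, so that the resonant frequency
depends only on mass and stiffness, and it computes the Bernoulli
pressure term for a given flow and passage area. The function names,
the air-density constant, and the sample values are assumptions
introduced here for illustration; subglottal pressure and vocal tract
loading, which the model does take into account, are ignored.

```python
import math

# Approximate density of air in kg/m^3 (assumed value for illustration).
AIR_DENSITY = 1.2


def natural_frequency_hz(mass_kg, stiffness_n_per_m):
    """Resonant frequency of an undamped mass on a spring.

    A heavier mass lowers the frequency; a stiffer spring (greater
    cord tension) raises it.
    """
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2.0 * math.pi)


def bernoulli_pressure_pa(flow_m3_per_s, area_m2, density=AIR_DENSITY):
    """Bernoulli pressure term for air moving through a passage.

    The particle velocity is flow / area; the pressure falls below the
    stagnation pressure by 0.5 * density * velocity**2, so the value
    returned is negative and grows in magnitude as flow increases or
    the passage narrows.
    """
    velocity = flow_m3_per_s / area_m2
    return -0.5 * density * velocity ** 2


if __name__ == "__main__":
    # Placeholder values, chosen only to exercise the functions.
    print(natural_frequency_hz(mass_kg=1e-4, stiffness_n_per_m=80.0))
    print(bernoulli_pressure_pa(flow_m3_per_s=2e-4, area_m2=1e-5))
```

In the full model these two effects are coupled: the Bernoulli term
acts on the masses, whose displacement in turn sets the area, and it
is this feedback that sustains oscillation.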

4.6 Synthesis Using Phonological Rules

As we have just seen, a substantial amount of research effort in
speech synthesis by rule has been concerned with what we have called
phonetic capacity. Other components -- phonetic skill, phonological
competence and capacity -- have received relatively little attention.
There are various reasons for this. The quality of speech synthesized
by rule has only quite recently been good enough to serve as a vehicle
for research in these other components; moreover, many of those
engaged in synthesis by rule have been content to operate with a
fairly rough and ready view of phonology, because they are more
interested in the physical aspects of speech, either acoustic or
articulatory, than in phonetic capacity as such, or its relationship
to other components. A few years ago this might have mattered much
less; but recent impressive developments in generative phonology make
it important that synthesis by rule display greater sophistication in
this area if linguists are to take it seriously.
A few scattered efforts have been made. In our own work (Mattingly
1968a), we have drawn a distinction between the synthesis-by-rule
program with its associated hardware, representing the universal
aspects of speech (phonological and phonetic capacity), and the rules
of a particular language or dialect (phonological competence), which
were an input to the program. In practice this distinction is not
made consistently: certain matters are handled in the rules which
more properly belong in the program, and conversely.
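
The division described above, between a language-independent program
and language-specific rules supplied to it as input, can be sketched
as follows. This is a hypothetical illustration in Python, not the
organization of the actual system: the names PhoneRule, DEMO_RULES,
and schedule_targets, the choice of two formant targets and a duration
per phone, and all numeric values are placeholders assumed here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PhoneRule:
    """Language-specific targets for one phone (hypothetical fields)."""
    f1_hz: float
    f2_hz: float
    duration_ms: float


# Rules for one "language": pure data, passed to the program as input.
# All values are made-up placeholders, not measurements.
DEMO_RULES: Dict[str, PhoneRule] = {
    "a": PhoneRule(f1_hz=700.0, f2_hz=1200.0, duration_ms=100.0),
    "i": PhoneRule(f1_hz=300.0, f2_hz=2300.0, duration_ms=80.0),
}


def schedule_targets(
    phones: List[str], rules: Dict[str, PhoneRule]
) -> List[Tuple[float, float, float]]:
    """Language-independent driver.

    Looks up each phone in whatever rule set it is given and returns
    one (onset_ms, f1_hz, f2_hz) tuple per phone. Nothing in this
    function refers to a particular language.
    """
    schedule = []
    onset_ms = 0.0
    for phone in phones:
        rule = rules[phone]  # raises KeyError if the rule set lacks the phone
        schedule.append((onset_ms, rule.f1_hz, rule.f2_hz))
        onset_ms += rule.duration_ms
    return schedule
```

Supplying a different rule table changes the language without any
change to schedule_targets. A matter handled in the wrong place would
show up here as a language-particular constant written into the
function, or as a universal constraint repeated in every table.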