of the stop-fricative attribute; F2 and F3 loci and the weighting
factor, on the place-of-articulation attribute. Given the boundary
value, steady-state values and transition times, transitions are
calculated as by Holmes et al.
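The calculation just described can be illustrated with a minimal sketch. It assumes simple linear interpolation between each steady-state value and the boundary value; the text says only that the transitions follow Holmes et al., so the interpolation shape, the function name `transition_track`, and the 10 msec. sampling step are all illustrative assumptions rather than the method of that system.

```python
def transition_track(steady_a, steady_b, boundary, t_a, t_b, step=10):
    """Return (time_ms, value) samples for one formant; time 0 is the segment boundary.

    Assumption: straight-line movement on both sides of the boundary.
    """
    track = []
    # Move from the first steady-state value to the boundary value over t_a msec.
    for t in range(-t_a, 0, step):
        frac = (t + t_a) / t_a
        track.append((t, steady_a + frac * (boundary - steady_a)))
    # Move from the boundary value to the second steady-state value over t_b msec.
    for t in range(0, t_b + 1, step):
        frac = t / t_b
        track.append((t, boundary + frac * (steady_b - boundary)))
    return track


# Hypothetical values: steady states of 700 and 300 Hz, boundary value 500 Hz.
track = transition_track(700, 300, 500, t_a=40, t_b=20)
```

The track starts at the first steady-state value, passes through the boundary value at time 0, and ends at the second steady-state value, so the two transition times may differ on either side of the boundary.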
Rao and Thosar resort to stored values for vowel spectra and vowel
durations; Kim (1966), however, proposes that even these matters
can be systematically treated. For example, his translation from
distinctive feature values to formant frequencies is made by defining
the features in terms of 'degrees' of difference from the frequencies
of a reference vowel. From the value assigned to one degree, and the
reference vowel's frequencies, the frequencies of the other vowels are
calculated by means
of such rules as 'if High, -2d'. The formant frequency values
determined in this way agree well with the data in the literature.
However, since the degree values are not predicted on any principled
basis, but are arrived at inductively by an averaging procedure
applied to this same data, the agreement is hardly surprising and
does not represent any interesting advance over stored values.
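A rule scheme of this kind can be sketched as follows. Only the form of the rule 'if High, -2d' comes from the text; the reference frequencies, the size of one degree, the choice of F1 as the formant that High lowers, and every other rule in the table are hypothetical values chosen for illustration, not Kim's.

```python
# Hypothetical reference formant frequencies (Hz) from which degrees are counted.
REFERENCE = {"F1": 500.0, "F2": 1500.0}
# Hypothetical size of one degree (Hz) for each formant.
DEGREE = {"F1": 100.0, "F2": 250.0}

# Each feature shifts a formant by a whole number of degrees.
# "High: -2d" echoes the rule quoted in the text; applying it to F1,
# and all the other entries, are assumptions.
RULES = {
    "High": {"F1": -2},
    "Low": {"F1": +2},
    "Front": {"F2": +2},
    "Back": {"F2": -2},
}


def formants(features):
    """Translate a list of feature names into formant frequencies."""
    values = dict(REFERENCE)
    for feature in features:
        for formant, degrees in RULES[feature].items():
            values[formant] += degrees * DEGREE[formant]
    return values


high_front = formants(["High", "Front"])
```

With these invented numbers a High Front vowel comes out at F1 = 300 Hz and F2 = 2000 Hz. If the degree sizes are themselves obtained by averaging measured formant data, the table amounts to a compact re-encoding of stored values, which is the objection raised above.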
Several of these systems have been empirically successful in that
they have proved capable of consistently producing intelligible
speech. They also have enough theoretical plausibility to be used
in investigations of other components. One could, for example, use
them to test phonological rules proposed for a language (Mattingly
1971). But they are still inadequate because their working assumption
is that phonetic capacity can be adequately described at the acoustic
level. If this were so, a simple and consistent correspondence would
hold between phonetic features and acoustic events. But in fact the
correspondence is only partial. On the one hand, certain regularities
are observable, which can be exploited in a synthesis-by-rule system,
as Liberman et al. (1959) pointed out: F1 and F2 transitions and the
type of acoustic activity during stop closure provide a basis for a
purely acoustic classification of labial, dental and velar voiced
stops, voiceless stops and nasals. On the other hand, the cues for
a particular feature, regarded simply from an acoustic standpoint,
are a rather arbitrary collection of events. There seems no special
reason why a fall in F1, a 60-150 msec. gap, a burst, and a rise of F1
should all be cues for a stop consonant, and no obvious connection
between the locus frequency and the burst frequency of a stop at the
same place of articulation. These cues only make sense in articulatory
terms. Still, the apparent arbitrariness of the cues should not in
itself discourage the formulation of acoustic rules for features. A
more serious difficulty is that in many cases features cannot be
independently defined at the acoustic level. Thus the voiced-voiceless
distinction is cued in one way for stops and in another for fricatives.
The frequencies at which noise is found in a fricative do not
correspond to the frequencies of either the locus or the burst of a
stop at a similar point of articulation. The frequencies of the
first and second formants are sufficient to distinguish the non-retroflex
vowels, but the range of F1 variation seems to be influenced by the
F2 value: the vowels are not distributed regularly in F1/F2 space.
Because of these difficulties, most of the acoustic synthesis-by-rule
systems provide only for a regular relationship between