several muscles exerting parallel forces are grouped under one
parameter. Though the neuromotor commands for e.g. lip closure are
similar for the different manner classes of labial sounds (Harris
et al. 1965), the relationship between this gesture and the
neuromotor commands which produce it is not a simple one. This suggests
that the connection between the phonetic feature corresponding to lip
closure and the neuromotor commands may not be simple, either; perhaps
the realization of some value of a phonetic feature as a unitary
psychological gesture may actually involve a complex neuromotor
program. This view is reinforced by the recent finding of MacNeilage
and DeClerk (1969) that coarticulation appears even in
electromyographic data.
4.5 Synthesis of Excitational and Prosodic Features
Our discussion so far has been concerned with the synthesis of
segmental phones and with supraglottal articulation and its acoustic
consequences. A synthesis-by-rule scheme also has to take into account
excitational, prosodic and demarcative features, the associated
glottal and subglottal events, and the acoustic correlates of these
events.
Both resonance and vocal tract analog synthesizers provide periodic
and noisy excitation sources, periodic excitation being used for
vowels, sonorants and voiced stops; noisy excitation for [h],
aspiration, and frication. In resonance synthesizers, separate circuits
(either fixed filters or variable frequency resonators) are
ordinarily provided for shaping high-frequency frication; in vocal
tract analog synthesizers, noise is inserted at various segments in
the tract, depending on the place of articulation of the fricative.
With such facilities the different kinds of excitation are readily
simulated; the only problem is to write rules for the changes from
one excitation source to another. This aspect of synthesis by rule
has not been taken very seriously; usually the duration of the
excitation appropriate for a phone is identical with the nominal
duration of the phone itself. In the case of voiceless stops, however,
this approach requires including part of the transition to the
following vowel in the stop, as was done by Holmes et al.
(1964). Another solution is to specify, as a characteristic of the
voiceless consonant, the appropriate amount of devoicing of the
following phone, as we have done (Mattingly 1968a). What is really
required, however, is a rule specifying voice-onset time negatively
or positively relative to the instant of release, as the work of
Lisker and Abramson (1967) suggests. For medial and final voiced
consonants and consonant clusters, increased duration of the preceding
vowel is well known to be an important cue (Kenyon 1950:63; Denes 1955),
and some systems have taken account of it, e.g. Mattingly (1968a),
Rabiner (1969).
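A rule of the kind called for above, in which voice-onset time is stated
negatively or positively relative to the instant of release, might be
sketched as follows. The sketch is illustrative only: the names
(Stop, excitation_schedule, vowel_duration), the three voice-onset
categories, and every numerical value are assumptions introduced here,
not measurements and not the rules of any system cited in the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Placeholder voice-onset times in ms relative to stop release.
# Negative = voicing starts before release, positive = after release.
# These numbers are assumed for illustration, not taken from the text.
VOT_MS = {"voiced_lead": -80.0, "short_lag": 10.0, "long_lag": 60.0}

# Assumed lengthening factor for a vowel before a voiced consonant.
VOWEL_LENGTHENING = 1.5


@dataclass
class Stop:
    closure_ms: float   # duration of the closure before release
    vot_category: str   # key into VOT_MS


def excitation_schedule(stop: Stop, vowel_ms: float) -> List[Tuple[str, float, float]]:
    """Return (source, start_ms, end_ms) intervals for a stop + vowel.

    Time zero is the onset of the stop closure; the release falls at
    stop.closure_ms and the vowel ends vowel_ms after the release.
    """
    release = stop.closure_ms
    end = release + vowel_ms
    # Voicing onset is placed relative to the release, then clamped to
    # the span of the stop + vowel sequence.
    voicing_on = min(max(0.0, release + VOT_MS[stop.vot_category]), end)

    intervals: List[Tuple[str, float, float]] = []
    if voicing_on > 0.0:
        # No excitation during the unvoiced part of the closure.
        intervals.append(("silence", 0.0, min(voicing_on, release)))
    if voicing_on > release:
        # Noisy excitation (aspiration) between release and voicing onset.
        intervals.append(("noise", release, voicing_on))
    # Periodic excitation from voicing onset to the end of the vowel.
    intervals.append(("periodic", voicing_on, end))
    return intervals


def vowel_duration(nominal_ms: float, before_voiced_consonant: bool) -> float:
    """Lengthen a vowel that precedes a voiced consonant or cluster."""
    if before_voiced_consonant:
        return nominal_ms * VOWEL_LENGTHENING
    return nominal_ms
```

With these placeholder values, a stop with a 100 ms closure in the
"long_lag" category followed by a 200 ms vowel is scheduled as silence
during the closure, noise for the 60 ms after release, and periodic
excitation thereafter. The point of the design is that the change of
excitation source is tied to the release rather than to the nominal
boundary between phones.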
Rather more attention has been given to prosodic and demarcative
features such as stress, accent, intonation, juncture and pause,
which interact with inherent properties