devices suggests that either something is still missing from the
voicing source models, or that we do not yet know how to control
them properly.
A number of recent glottal waveform models produce source spectra
that include zeros (see Fujisaki, 1986 for a review). Flanagan (1972,
pp. 232-245) describes the expected locations of voicing source
spectral zeros as a function of various assumptions about the nature
of the glottal volume velocity waveform. Many different types of
waveshapes imply the existence of zeros; the only requirement is that
there be well-defined opening and closing times. If a source spectral
zero is near in frequency to a formant, the formant will be reduced
in amplitude or even completely obliterated. Source spectral zeros
are present in the glottal waveform models of Fant et al. (1985) and
in Klattalk, but the depth of the spectral notches is only a few
decibels. Flanagan shows that the frequency locations and depths of
spectral notches induced by source zeros are sensitive to relatively
small changes in critical aspects of the source waveform, such as symmetry.
It may be that the dull, lifeless quality of synthetic voices is
due in part to the absence of small period-to-period changes to
the zero pattern. Holmes (1973) was able to synthesize a nearly
perfect imitation of a male voice without resorting to this level
of detail in modeling the source, but he may have mimicked the most
important effects of source changes by ensuring that the amplitudes
of individual formant spectral peaks followed changes observed in the
natural utterance.
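
The sensitivity of source zeros to waveform symmetry can be illustrated with a minimal numerical sketch. It assumes an idealized triangular glottal pulse with a linear opening phase and a linear closing phase; this is not the waveform of Fant et al. (1985) or of Klattalk. The function names, the 10-kHz sampling rate, and the pulse durations are hypothetical choices made only for this illustration.

```python
import cmath
import math


def triangular_pulse(open_ms, close_ms, fs=10000):
    """One idealized glottal pulse: linear rise over open_ms, linear fall over close_ms."""
    n_open = round(open_ms * fs / 1000)
    n_close = round(close_ms * fs / 1000)
    rise = [i / n_open for i in range(n_open)]
    fall = [1 - i / n_close for i in range(n_close)]
    return rise + fall


def magnitude_db(pulse, freq_hz, fs=10000):
    """Magnitude of the pulse's Fourier transform at freq_hz, in dB relative to DC."""
    w = 2 * math.pi * freq_hz / fs
    x = sum(s * cmath.exp(-1j * w * n) for n, s in enumerate(pulse))
    dc = sum(pulse)
    # Clamp to avoid log10(0) at an exact spectral zero.
    return 20 * math.log10(max(abs(x), 1e-12) / dc)


def notch_frequencies(pulse, fs=10000, step_hz=10, fmax=4000):
    """Local minima of the magnitude spectrum, as (frequency in Hz, level in dB) pairs."""
    freqs = list(range(step_hz, fmax + 1, step_hz))
    mags = [magnitude_db(pulse, f, fs) for f in freqs]
    return [
        (freqs[i], mags[i])
        for i in range(1, len(mags) - 1)
        if mags[i] < mags[i - 1] and mags[i] <= mags[i + 1]
    ]


if __name__ == "__main__":
    # Two pulses with the same 4-ms open phase but different symmetry (assumed values).
    symmetric = triangular_pulse(2.0, 2.0)
    asymmetric = triangular_pulse(3.0, 1.0)
    for name, pulse in (("symmetric", symmetric), ("asymmetric", asymmetric)):
        print(name, [(f, round(db, 1)) for f, db in notch_frequencies(pulse)])
```

Under these assumptions, the symmetric pulse has a deep notch at 500 Hz that the asymmetric pulse of the same total duration lacks. A change in symmetry alone is therefore enough to move or remove a zero that might otherwise lie near a formant.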
Naturalness is a particular problem when trying to
synthesize a convincing imitation of a female voice (Carrell,
1984). Simple scaling procedures [formants multiplied by a
factor of 1.15 (Peterson and Barney, 1952), fundamental
frequency by a factor of 1.7, glottal open quotient slightly
greater than for a male voice] do not result in a particularly
female voice quality (example 9 of the Appendix). The glottal source
model is not quite right; nonuniform formant scaling appears to
be required (Fant, 1975), and it may also be