- Copying the same sentence using the second generation of Gunnar
Fant's OVE cascade formant synthesizer, 1962.
Gunnar Fant attempted to match a natural recording using OVE II (Fant
and Martony, 1962). Demonstrated at the 1962 Stockholm Speech
Communication Conference. Compare with the PAT version of the same
utterance, above. (text)
- Comparison of synthesis and a natural sentence, using OVE II, by
John Holmes, 1961.
Holmes (1961) of the Joint Speech Research Unit of the British Post Office used
the OVE II synthesizer to generate a close copy of a natural
sentence. (text)
- Comparison of synthesis and a natural sentence, John Holmes using
his parallel formant synthesizer, 1973.
Holmes did essentially the same thing in 1973, using a more complex
parallel formant synthesizer of his own design (Holmes, 1973).
Demonstrated at the 1972 IEEE Conference on Speech Communication and
Processing, Boston. (text)
- Attempt to scale the DECtalk male voice to make it sound
female.
The DECtalk "Perfect Paul" male voice has been modified by scaling f0
by a factor of 1.7 (ap = 204, pr = 170), by scaling all formant
frequencies by a factor of 0.85 (hs = 85) and removing the fifth formant
(f5 = 2500, b5 = 2048), by increasing the open quotient of the glottal
waveform using the "richness" variable (ri = 0), and by decreasing the
output level slightly to avoid overloads (lo = 81). These
manipulations are not sufficient to turn Paul into a convincing female
speaker. (text)
- Comparison of synthesis and a natural sentence, female voice,
Dennis Klatt, 1986b.
A synthetic copy of a female speaker producing (1) a sentence and (2)
an utterance in which each syllable of "Steve eats candy cane" is
replaced by a single repeated syllable is compared with the original
recording (Klatt, 1986b).
(text)
- The DAVO articulatory synthesizer developed by George Rosen at
M.I.T., 1958.
The DAVO ("Dynamic Analog of the VOcal tract") circuit designed by
Rosen (1958) at M.I.T., augmented by a nasal tract designed by Hecker
(1962), was controlled by a tape recording of control signals created
by hand by Kenneth Stevens and Arthur House. The demonstration occurred
at the fall meeting of the Acoustical Society of America in 1961.
(text)
- Sentences produced by an articulatory model, James Flanagan and
Kenzo Ishizaka, 1976.
Flanagan and Ishizaka (1976) of the AT&T Bell Telephone Laboratories
used an articulatory synthesizer to generate two sentences, using
control data derived from the Coker et al. (1973)
text-to-speech system.
A two-mass model of the vocal cords was employed, and turbulence noise
was injected automatically whenever the Reynolds number became large
at the larynx, or at a constricted section of the vocal tract.
(text)
- Linear-prediction analysis and resynthesis of speech at a low bit
rate in the Texas Instruments Speak-'n-Spell toy, Richard Wiggins,
1980.
Wiggins (1980) designed a low-cost linear-prediction synthesis chip to
take advantage of the ability of linear prediction to represent
critical spectral and temporal aspects of speech waveforms
efficiently. (text)
- Comparison of synthesis and a natural recording, automatic
analysis-resynthesis using multipulse linear prediction, Bishnu
Atal, 1982.
Atal of the AT&T Bell Laboratories demonstrated a new formulation of
linear prediction, known as multipulse LPC (Atal and Remde, 1982), at
the 1982 Paris ICASSP. (text)
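The two linear-prediction entries above rest on the same idea: each
speech sample is approximated as a weighted sum of the few samples
before it, so a short frame can be stored as a handful of predictor
weights plus an excitation signal. The sketch below illustrates that
idea with the textbook autocorrelation method and Levinson-Durbin
recursion. It is not the algorithm of the Texas Instruments chip or of
multipulse LPC; the function names, the second-order test filter, and
the single-impulse excitation are assumptions made for illustration.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation of one frame for lags 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def lpc_coefficients(frame, order):
    """Levinson-Durbin recursion (autocorrelation method).

    Returns (coeffs, residual_energy), where coeffs[k-1] weights
    frame[n-k] in the prediction of frame[n].
    Assumes the frame is not all zeros.
    """
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this stage.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:], err


def synthesize(excitation, coeffs):
    """Drive the all-pole filter defined by coeffs with an excitation."""
    out = [0.0] * len(excitation)
    for n in range(len(excitation)):
        past = sum(coeffs[k] * out[n - 1 - k]
                   for k in range(min(len(coeffs), n)))
        out[n] = excitation[n] + past
    return out


# Hypothetical test case: a stable two-pole filter hit by one impulse.
true_coeffs = [1.3, -0.8]
impulse = [1.0] + [0.0] * 399
signal = synthesize(impulse, true_coeffs)

# Analysis recovers the weights; resynthesis reproduces the signal.
recovered, residual_energy = lpc_coefficients(signal, order=2)
resynthesized = synthesize(impulse, recovered)
```

In this idealized case the analysis recovers the filter weights almost
exactly, because the test signal really was produced by an all-pole
filter. Natural speech only approximately fits that model, which is one
reason the choice of excitation matters for perceived quality.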
Part B: Segmental synthesis by rule
The first synthesis-by-rule programs concentrated on the
development of rules for phonemic synthesis, and did not
include rules for the automatic specification of phoneme
durations and fundamental frequency. Since prosody was
specified by hand to match a natural recording, these
demonstrations sound significantly better than they would
if all information had been derived by rule.
- Creation of a sentence from rules in the head of Pierre Delattre,
using the Haskins Pattern Playback, 1959.
A stylized spectrogram of the desired sentence was painted on a
transparent plastic plate by Pierre Delattre, and then played by the
Haskins Pattern Playback. (text)
- Output from the first computer-based phonemic synthesis-by-rule
program, created by John Kelly and Louis Gerstman, 1961.
Kelly and Gerstman (1961, 1962) of the AT&T Bell Laboratories
demonstrated the first phonemic synthesis-by-rule program in 1961 at
a meeting of the Acoustical Society of America.
(text)
- Elegant rule program for British English by John Holmes, Ignatius
Mattingly, and John Shearme, 1964.
Holmes et al. (1964) of the Joint Speech Research Unit in England
demonstrated an impressive phonemic synthesis-by-rule program for
British English at the fall meeting of the Acoustical Society of
America in Ann Arbor, 1963. (text)
- Formant synthesis using diphone concatenation, by Rex Dixon and
David Maxey, 1968.
Dixon and Maxey (1968) of IBM at Research Triangle Park demonstrated
a diphone concatenation method for construction of control parameter
time functions for a formant synthesizer at the 1967 M.I.T. Conference
on Speech Communication and Processing.
(text)
- Rules to control a low-dimensionality articulatory model, by Cecil
Coker, 1968.
Coker (1968) of AT&T Bell Laboratories created a method of generating
speech from an articulatory model. The system was demonstrated at the
1967 M.I.T. Conference on Speech Communication and Processing.
(text)
Part C: Synthesis by rule of segments and sentence prosody
The next synthesis-by-rule programs include a complete set of rules
for going from phonemes, stress marks, and some syntactic information
to an output speech waveform.