BELL TELEPHONE LABORATORIES (BTL) - continued
1. DEVELOPMENT OF TECHNIQUES
CONTENTS:
The Vocoder, The Voder
Sound Spectrograph, Visible-Speech Synthesis
Formant Synthesis, Synthesis by Segment Assembly
Vocal-Tract Models, Vocal-Cord Models
Linear Predictive Coding (LPC)
-------------------------------------------------------------
PROJECT: THE VOCODER
The first vocoder. Ten-filter analysis of the 250-3000 Hz speech
band plus a fundamental-frequency tracker. Reconstruction of
speech from hiss and buzz sources.
1939 Dudley, H., "The Vocoder", Bell Labs. Rec., 17, 122-126
(1939). (I,K)
SSSHP 120.4 Tape: "University of Michigan Examples," 12/61.
("Mary had ..., Silly Willy", various settings)
7" reel, fair quality, need copy of master
(Maxey Tape T61.2) Another copy on SSSHP 119.3.
1940 Dudley, H., "The carrier nature of speech", Bell System
Tech. J., 19, 495-515 (1940). Photographs of Vocoder and
Voder, additional information. (B)
-------------------------------------------------------------
PROJECT: THE VODER
Keyboard-operated 10-filter Vocoder. Demonstrated at the 1939 New
York World's Fair and the 1940 San Francisco World's Fair. US Patent
No. 2,121,142. (SSSHP 53-57 Photos. Five 8x10 photographs of
the Voder exhibits at the 1939 and 1940 World's Fairs.)
Artifact: one of the Voders demonstrated in October 1961, at
retirement of H.W. Dudley. (SPEECH ANALYSIS SYNTHESIS AND
PERCEPTION, J.L. Flanagan, Springer-Verlag, NY, 1965, p.211)
[Ed: Voders could not be found by Lucent Technologies or
AT&T historians in 2006.]
1939 Dudley, H., R.R. Riesz, and S.S.A. Watkins, "A synthetic
speaker", J. Franklin Inst., Philadelphia, 227, 739-764
(1939). (B,I,K)
SSSHP 51 Tape: "AT&T BELL LABS VODER FROM WORLD'S FAIR
EXHIBITS (New York, San Francisco) 1939-40 era"
Unknown narrator introduces Mr. Garrett, to explain the
Voder, and Miss Helen Harper, to operate it.
(syn, 4 inflections: "She saw me."; syn, 3 voices:
"Greetings everybody."; syn, 2 voices: "Mary had a
little lamb. Its fleece was white as snow. And
everywhere that Mary went, the lamb was sure to go.")
(syn: "ha, ha, ha"; syn, creaky voice: "Yes, I feel
very old."; syn, monotone and vibrato: "a"; syn,
musical scales: "a")
(syn, song Auld Lang Syne: "Should auld acquaintance be
forgot, and ... days of auld lang syne.")
Cassette, good quality, copy of BTL tape
SSSHP 119.2 Tape: "Lawrence - PAT / U of M, 12-15-61"
(syn: "She likes..."; "Don't tell the boys."; "The
Judge paid the check.")
7" reel, good quality (Maxey Tape T61.1)
SSSHP 120.3 Tape: "University of Michigan Examples," 12/61.
("The judge paid the check.")
7" reel, fair quality, need copy of master
(Maxey Tape T61.2)
SSSHP 32.1 Tape: Demo for "Review of text-to-speech
conversion for English," D. H. Klatt, JASA 82.3, 9/87.
(syn: "Good evening radio audience. Good afternoon
radio audience.")
Cassette, Klatt MIT A/D and D/A
-------------------------------------------------------------
PROJECT: SOUND SPECTROGRAPH
Invention of sound spectrograph during World War II by R.K.
Potter. Studies of correspondence between speech sounds and
events in the acoustic spectrum, noise bursts, formant movements.
Basis for spectral synthesis, conversion of spectral patterns to
sound (R.K. Potter). Also see Pattern Playback, by F.S. Cooper
(Haskins Laboratories).
1946 Koenig, W.H., H.K. Dunn, and L.Y. Lacy, "The sound
spectrograph", J. Acoust. Soc. Amer. 18, 19-49 (1946). (I,K)
1947 Potter, R.K., G.A. Kopp, and H.C. Green, VISIBLE SPEECH, D. Van
Nostrand Co., New York (1947). (K)
1948 Young, R.W., "Review of U.S. Patent 2,432,123, Translation
of visual symbols, R.K. Potter, assignor (9 December 1947)",
J. Acoust. Soc. Amer., 20, 888-889 (1948). (I,K)
1952 Peterson, G.E., and H.L. Barney, "Control methods used in a
study of the vowels", J. Acoust. Soc. Amer., 24, 175-184
(1952). (I,K)
-------------------------------------------------------------
PROJECT: VISIBLE-SPEECH SYNTHESIZER
Speech synthesis from manually created patterns similar to those
of the sound spectrograph. See Haskins Laboratories for later
machine.
1948 Schott, L. O., "A playback for visible speech," Bell Lab.
Record, 26, 333-339, August 1948.
Tape?
1951 Demonstration at centennial ceremonies of A.G. Bell's birth.
Tape?
-------------------------------------------------------------
PROJECT: FORMANT SYNTHESIS
Manual Control. RLC manually-tuned circuits for producing
vowels, diphthongs, and liquids. "Additional arrangements" for
circuit adjustments in rapid succession for words.
1922 Stewart, J.Q., "An electrical analogue of the vocal organs",
Nature, 110, 311-312 (1922). Words "mama, Anna, wow-wow,
yi-yi". (B,K)
Wire recording? (of reconstruction?)
1924 Fletcher, H., Demonstration for the New York Electrical
Society, February, 1924. Vowels, words "mamma, papa".
Wire recording?
Parallel formant synthesizer.
1956 Borst, J.M., "Use of spectrograms for speech analysis and
synthesis", J. Audio Engr. Soc., 4, 13-23 (1956). Parallel
formant arrangement. (I)
Tape ?
1957 Flanagan, J.L., "Note on the design of 'terminal-analog'
speech synthesizers", J. Acoust. Soc. Amer., 29, 306-310
(1957). Mathematics of formant synthesizer design. (B,I,K)
Synthesis by Rule. First phonemic synthesis-by-rule program.
Series 3-formant synthesizer excited by either impulse train or
noise. Input was string of phonetic-state specifications. (K)
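The series arrangement described above can be sketched in a few lines.
This is a minimal illustration only, not the Kelly-Gerstman program
(which is unpublished): it assumes a standard two-pole digital
resonator for each formant, a 10 kHz sampling rate, and hypothetical
formant frequencies and bandwidths for a single steady vowel. All
function names are invented for the example.

```python
import math
import random

def resonator_coeffs(freq_hz, bw_hz, fs):
    """Coefficients of a two-pole resonator with unity gain at 0 Hz."""
    t = 1.0 / fs
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c

def resonate(x, freq_hz, bw_hz, fs):
    """Filter x through one formant: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs)
    y1 = y2 = 0.0
    out = []
    for s in x:
        y = a * s + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def excitation(n, fs, voiced, f0_hz=100.0, seed=0):
    """Impulse train ("buzz") when voiced, uniform noise ("hiss") otherwise."""
    if voiced:
        period = int(round(fs / f0_hz))
        return [1.0 if i % period == 0 else 0.0 for i in range(n)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

def synthesize(formants, n, fs=10000, voiced=True):
    """Pass the excitation through the formant resonators in series."""
    signal = excitation(n, fs, voiced)
    for freq_hz, bw_hz in formants:
        signal = resonate(signal, freq_hz, bw_hz, fs)
    return signal

# Hypothetical (frequency, bandwidth) pairs in Hz for three formants.
vowel = synthesize([(700, 60), (1200, 90), (2600, 120)], n=1000)
```

A rule program of the kind described would update the formant values
over time from the phonetic-state input; the sketch holds them fixed.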
1961 Kelly, J.L., and L.J. Gerstman, "An artificial talker driven
from a phonetic input", J. Acoust. Soc. Amer., 33, Suppl.
1, S35 (A), (1961). Unpublished program. See Gerstman's work
at Haskins Laboratories (MATTINGLY 1968, pp.40-42). (I,K)
Copy of program ?
Record: "Synthesized Speech", Bell Telephone Labs, 1961.
33 1/3 rpm flexible diskette (In SSSHP USA BTL file
"Phonograph Records")
SSSHP 123 Tape: "BTL Demo Record/Hamlet, Daisy," 4" reel,
7.5 ips. (Maxey Tape T61.5)
("He saw the cat., Hamlet... thanks for listening.")
copy of above "Synthesized Speech", stylus noise
**** Need copy of master tape. ****
SSSHP 32.16 Tape: Demo to accompany "Review of Text-to-speech
conversion for English," D.H. Klatt, JASA 82.3, 9/87.
("To be, or not to be, ... by opposing, end them.")
Cassette, Klatt MIT A/D and D/A
Kelly, J., and L. Gerstman, "Synthesis of speech from code
signals," U.S. Patent 3,158,685 (1964). (K)
1963 Flanagan, J.L., C.H. Coker, and C.M. Bird, "Digital computer
simulation of a formant-vocoder speech synthesizer", 15th
Ann. Meeting Audio Engr. Soc., Preprint 307 (1963). Input
was specification of synthesizer control values at about 40
or 50 times per second. Control values came from manual
analysis of sound spectrograms. Purpose was to determine
synthesizer design most applicable to automatic formant
analysis. (B)
Tape? ("Men strive but seldom become rich." and other
samples)
1965 Denes, P.B., "'On-line' computing in speech research", J.
Acoust. Soc. Amer., 38, 934 (A), (1965). (I)
Coker, C.H., and P. Cummiskey, "On-line computer control of
a formant synthesizer", J. Acoust. Soc. Amer., 38, 940 (A),
(1965). Hardware synthesizer controlled from DDP-24 via D/A
converters. Operational amplifier implementation of formant
circuits. Synched chopper for gain control. Controlled
three formants and bandwidths. (I)
Tape ?
1968 Gold, B., and L.R. Rabiner, "Analysis of digital and analog
formant synthesizers," IEEE Trans. Audio Electroacoust.,
AU-16, No. 1, March, 1968, pp 81-94. Important for its
comparison of higher-pole correction in analog and simulated
series formant synthesizers.
Tape ?
1968 Rabiner, L.R., "Digital-formant synthesizer for
speech-synthesis studies", J. Acoust. Soc. Amer., 43,
822-828 (1968). FORTRAN IV simulation of serial formant
synthesizer. (B, I)
1968 Rabiner, L.R., "Speech synthesis by rule: an acoustic domain
approach", (1) Ph.D. thesis, MIT, Cambridge MA, June 1967.
(2) 1967 Conf. on Speech Communication and Processing, MIT,
Cambridge, MA, Nov 6-8, 1967. (3) Bell Syst. Tech J. 47,
17-38 (1968). Critically-damped 2nd order smoothing filters
of formant transitions, variable time constants. (K) Simulated
synthesizer using BLODI compiler on IBM 7094.
SSSHP 81.7a Tape: "SPEECH ANALYSIS/SYNTHESIS DEMONSTRATION,
COPY NO. 2-9, T 67.2".
(9 sen: "The fourth chapter swings...3 fresh perch.";
8-sen story: "This is the story about a man... Bob enjoys
his life.")
7" reel, high quality, IBM's copy of master
**** use for master ****
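The critically-damped second-order smoothing mentioned in the 1968
entry can be illustrated as follows. This is a sketch under stated
assumptions, not Rabiner's implementation: it realizes a critically
damped response as two identical one-pole smoothers in cascade (a
double real pole), and the 5 ms frame interval, the 20 ms time
constant, and the formant targets are hypothetical.

```python
import math

def smooth_track(targets, tau_s, frame_s=0.005):
    """Smooth a stepwise formant target track with a double real pole.

    tau_s is the time constant; a different value can be passed for each
    transition, which is one way to get variable time constants.
    """
    pole = math.exp(-frame_s / tau_s)
    gain = 1.0 - pole
    s1 = s2 = targets[0]  # start settled at the first target
    out = []
    for target in targets:
        s1 = gain * target + pole * s1  # first one-pole stage
        s2 = gain * s1 + pole * s2      # second, identical stage
        out.append(s2)
    return out

# Hypothetical first-formant targets: 300 Hz, then a step to 700 Hz.
targets = [300.0] * 10 + [700.0] * 90
track = smooth_track(targets, tau_s=0.02)
```

Because both poles are real and equal, the smoothed track approaches
each new target without overshoot, which is the property that makes
critical damping attractive for formant transitions.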
1969 Rabiner, L.R., "A model for synthesizing speech by rule",
IEEE Trans. Audio Electroacoust., AU-17, 7-13 (1969). (B,I)
Tape ? ("What does Bob do?", VCV intelligibility test
tapes, sentences for intelligibility tests)
1971 Rabiner, L.R., L.B. Jackson, R.W. Schafer, and C.H. Coker,
"A hardware realization of a digital formant speech
synthesizer", IEEE Trans. Commun. Tech., COM-19, 1016-1020
(1971). TTL circuit implementation of digital series formant
synthesizer. Up to 12.8 K samples/sec, 24-bit processing. (B)
No tape recording.
-------------------------------------------------------------
PROJECT: SPEECH SYNTHESIS BY SEGMENT ASSEMBLY (1950 - 1953)
Speech synthesis by assembling segments of human speech.
Techniques included the use of a special tape editing machine
that could electronically copy a specified segment of speech.
1953 Harris, C.M., "A study of the building blocks in speech", J.
Acoust. Soc. Amer., 25, 962-969 (1953). Refers to test tapes
prepared for listening panels. (B,I,K)
Tape?
-------------------------------------------------------------
PROJECT: VOCAL TRACT MODELS
Static Model. Transmission line model of vocal tract based on
measurements of x-ray photographs of the human vocal tract and
implemented as series of L/C tuned circuits. Vowel sounds only.
1950 Dunn, H.K., "The calculation of vowel resonances, and an
electrical vocal tract", J. Acoust. Soc. Amer., 22, 740-753
(1950). Photograph, spectrograms of 11 vowels. (B,I,K)
Tape ?
Dynamic Model. First simulated analog vocal tract synthesizer.
Synthesis by rule from stored tables of area functions for each
phonetic segment and a linear interpolation scheme. (K)
Copy of program? (6/89 Flanagan says it is not available)
1962 Kelly, J.L., Jr., and C.C. Lochbaum, "Speech synthesis",
Proc. Fourth Intern. Congr. Acoust., Paper G42,1-4 (1962)
(B,K)
Proc. Speech Comm. Seminar, Paper F7, Speech Transm. Lab.,
Royal Inst. of Tech., Stockholm (1962). (I)
Record: "Synthesized Speech", Bell Telephone Labs, 1961.
33 1/3 rpm flexible diskette (In SSSHP USA BTL file
"Phonograph Records")
SSSHP 123 Tape: "BTL Demo Record/Hamlet, Daisy," 4" reel,
7.5 ips. (Maxey Tape T61.5)
("Bicycle Built For Two" song with synthetic piano by
Max V. Mathews)
copy of above "Synthesized Speech", stylus noise
**** Need copy of master tape. ****
Tape ? (Dennis Klatt contacted Lochbaum in 1987 but no tape
could be found. 6/89 Flanagan says tape cannot be
located.)
-------------------------------------------------------------
PROJECT: VOCAL CORD MODELS
Simulation model of the vibration of the vocal cords.
1968 Flanagan, J.L., and L.L. Landgraf, "Self-oscillating source
for vocal-tract synthesizers", IEEE Trans. Audio
Electroacoust., AU-16, 57-64 (1968). Single-mass model. (B,I,K)
Tape ? (/ia/)
1969 Flanagan, J.L., and L. Cherry, "Excitation of vocal-tract
synthesizers", J. Acoust. Soc. Amer., 45, 764-769 (1969). (I)
Tape ?
1972 Ishizaka, K., and J.L. Flanagan, "Synthesis of voiced sounds
from a two-mass model of the vocal cords," Bell Sys. Tech.
J. 51, 1233-1268 (1972). (B,K)
Tape ? (/i,e,a,o/)
1975 Flanagan, J.L., K. Ishizaka, and K.L. Shipley, "Synthesis of
speech from a dynamic model of the vocal cords and vocal
tract," Bell Syst. Tech. J. 54, 485-506 (1975). Use of model
with Coker/Umeda articulatory synthesizer. (K) Photocopy in
SSSHP USA BTL file.
16-mm movie of model operation (copy?)
1976 Flanagan, J.L., and K. Ishizaka, "Automatic generation of
voiceless excitation to a vocal cord - vocal tract speech
synthesizer", IEEE Trans. Acoust., Speech and Signal Proc.
ASSP-24, 163-170 (1976). (K)
SSSHP 50 Tape: "J.L. Flanagan and K.I. Ishizaka, Vocal Cord-Vocal
Tract Synthesizer, ASA Meeting, Austin, TX,
4-10-75". Some print-through echoes.
(syn, syllables x3: "uhpa', a'buh, ha, ha'ha")
(syn, x2, all noise gen active: "She saw the house.")
(syn, x2, no glottal noise gen: "She saw the house.")
(syn, x2, no constr. noise gen.: "She saw the house.")
(syn, x2, no noise gen: "She saw the house.")
(syn, x2, all noise gen: "This is a tes(t).")
(syn, x2, /t/ aspirated: "This is a test.")
(syn, x2: "We were away a year ago.")
Cassette, good quality, copy of Flanagan tape
**** use for master ****
SSSHP 91.17 Tape: "MIT - DEMO TAPE 1, 10/90"
(syn, x3: VCV's; syn, x2: "She saw the house. This is
a test. We were away a year ago. I am a computer.")
7" reel, 7.5 ips, good quality, copy of Klatt tape
SSSHP 32.12 Tape: Demo to accompany "Review of Text-to-speech
conversion for English," D.H. Klatt, JASA 82.3, 9/87.
("She saw the house. This is the test.")
Cassette, Klatt MIT A/D and D/A of SSSHP 91.17
1978 Flanagan, J.L., and K. Ishizaka, "Computer model to
characterize the air volume displaced by the vibrating vocal
cords," J. Acoust. Soc. Amer. 63, 1558-1563 (1978). (K)
Tape ?
1980 Flanagan, J.L., K. Ishizaka and K.L. Shipley, "Signal models
for low bit-rate coding of speech," J. Acoust. Soc. Amer. 68(3),
780-791 (Sept. 1980). Preliminary study of a combined model
of vocal source and vocal tract. (Photocopy in SSSHP USA BTL
file.)
-------------------------------------------------------------
PROJECT: LINEAR PREDICTIVE CODING OF SPEECH
1971 Atal, B.S., and S.L. Hanauer, "Speech analysis and synthesis
by linear prediction of the speech wave", J. Acoust. Soc.
Amer., 50, 637-655 (1971). Demonstration diskette. (B)
(Reprint and diskette in "Phonograph Records", SSSHP USA
BTL file.)
Tape ? ("May we all learn a yellow lion roar. It's time we
rounded up that herd of Asian cattle. We were away a
year ago. Why do I owe you a letter.", various
parameter variations)
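For readers unfamiliar with linear prediction, the core idea is to
estimate each speech sample as a weighted sum of the preceding
samples. The sketch below uses the autocorrelation method with the
Levinson-Durbin recursion, a common textbook formulation that is
assumed here for illustration and is not necessarily the procedure of
the paper above. The function names and the two-pole test signal are
hypothetical.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] and the residual energy.

    The predictor estimates x[n] as sum(a[k] * x[n - k] for k = 1..order).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

# Impulse response of a known two-pole filter:
# x[n] = 1.3*x[n-1] - 0.6*x[n-2] + impulse at n = 0.
x = []
for n in range(400):
    prev1 = x[n - 1] if n >= 1 else 0.0
    prev2 = x[n - 2] if n >= 2 else 0.0
    x.append(1.3 * prev1 - 0.6 * prev2 + (1.0 if n == 0 else 0.0))

# The recovered coefficients should come out close to 1.3 and -0.6.
coeffs, residual = levinson_durbin(autocorr(x, 2), 2)
```

In a speech coder the same analysis is repeated frame by frame,
typically with a higher predictor order, and the synthesizer drives
the resulting all-pole filter with a pulse or noise excitation.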
1975 Atal, B.S. and M.R. Schroeder, "Recent advances in
predictive coding: applications to speech synthesis," in
SPEECH COMMUNICATION, ed by G. Fant, Almqvist and Wiksell,
Uppsala Sweden, Vol. I, 27-31 (1975). Errors in formant
frequency upon resynthesis. (K)
1982 Atal, B.S., and J.R. Remde, "A new model of LPC excitation
for producing natural-sounding speech at low bit rates",
Proc. ICASSP-82, 614-617 (1982). Multipulse LPC. (K) Two
speakers.
SSSHP 32.14 Tape: Demo to accompany "Review of Text-to-speech
conversion for English," D.H. Klatt, JASA 82.3, 9/87.
(syn/natural: "Where is Dennis sitting? This field of
beets is ripe and ready.")
Cassette, Klatt MIT A/D and D/A of BTL tape