The Vocoder,         The Voder

Sound Spectrograph,  Visible-Speech Synthesis

Formant Synthesis,   Synthesis by Segment Assembly

Vocal-Tract Models,  Vocal-Cord Models

Linear Predictive Coding (LPC)

The Vocoder, 10-filter analysis of 250-3000 Hz speech band plus
fundamental frequency tracker. Reconstruction of speech with hiss
and buzz source. The first vocoder.

1939 Dudley, H., "The Vocoder", Bell Labs. Rec., 17, 122-126
     (1939).  (I,K)

     SSSHP 120.4 Tape: "University of Michigan Examples," 12/61.
          ("Mary had ..., Silly Willy", various settings)
          7" reel, fair quality, need copy of master
          (Maxey Tape T61.2) Another copy on SSSHP 119.3.

1940 Dudley, H., "The carrier nature of speech", Bell System
     Tech. J., 19, 495-515 (1940). Photographs of Vocoder and
     Voder, additional information. (B)

Keyboard-operated 10-filter Vocoder. Demonstrated at the 1939 New
York World's Fair and the 1940 San Francisco World's Fair. US Patent
No. 2,121,142.  (SSSHP 53-57 Photos.  Five 8x10 photographs of
the Voder exhibits at the 1939 and 1940 World's Fairs.)

Artifact: one of the Voders demonstrated in October 1961, at
     retirement of H.W. Dudley. (SPEECH ANALYSIS SYNTHESIS AND
     PERCEPTION, J.L. Flanagan, Springer-Verlag, NY, 1965, p.211)
     [Ed: Voders could not be found by Lucent Technologies or
      AT&T historians in 2006.]

1939 Dudley, H., R.R. Riesz, and S.S.A. Watkins, "A synthetic
     speaker", J. Franklin Inst., Philadelphia, 227, 739-764
     (1939).  (B,I,K)

          EXHIBITS (New York, San Francisco) 1939-40 era"
          Unknown narrator introduces Mr. Garrett, to explain the
          Voder, and Miss Helen Harper, to operate it.

          (syn, 4 inflections: "She saw me."; syn, 3 voices:
          "Greetings everybody."; syn, 2 voices: "Mary had a
          little lamb.  Its fleece was white as snow.  And
          everywhere that Mary went, the lamb was sure to go.")

          (syn: "ha, ha, ha"; syn, creaky voice: "Yes, I feel
          very old."; syn, monotone and vibrato: "a"; syn,
          musical scales: "a")

          (syn, song Auld Lang Syne: "Should auld acquaintance be
          forgot, and ...  days of auld lang syne.")
          Cassette, good quality, copy of BTL tape

     SSSHP 119.2 Tape: "Lawrence - PAT / U of M, 12-15-61"
          (syn: "She likes..."; "Don't tell the boys."; "The
          Judge paid the check.")
          7" reel, good quality (Maxey Tape T61.1)

     SSSHP 120.3 Tape: "University of Michigan Examples," 12/61.
          ("The judge paid the check.")
          7" reel, fair quality, need copy of master
          (Maxey Tape T61.2)

     SSSHP 32.1 Tape: Demo for "Review of text-to-speech
          conversion for English," D. H. Klatt, JASA 82.3, 9/87.
          (syn: "Good evening radio audience. Good afternoon
          radio audience.")
          Cassette, Klatt MIT A/D and D/A

Invention of sound spectrograph during World War II by R.K.
Potter.  Studies of correspondence between speech sounds and
events in the acoustic spectrum, noise bursts, formant movements.
Basis for spectral synthesis, conversion of spectral patterns to
sound (R.K. Potter).  Also see Pattern Playback, by F.S. Cooper
(Haskins Laboratories).

1946 Koenig, W.H., H.K. Dunn, and L.Y. Lacy, "The sound
     spectrograph", J. Acoust. Soc. Amer. 18, 19-49 (1946). (I,K)

1947 Potter, R.K., G.A. Kopp, and H.C. Green, VISIBLE SPEECH, Van
     Nostrand Co., New York (1947).  (K)

1948 Young, R.W., "Review of U.S. Patent 2,432,123, Translation
     of visual symbols, R.K. Potter, assignor (9 December 1947)",
     J. Acoust. Soc. Amer., 20, 888-889 (1948).  (I,K)

1952 Peterson, G.E., and H.L. Barney, "Control methods used in a
     study of the vowels", J. Acoust. Soc. Amer., 24, 175-184
     (1952).  (I,K)

Speech synthesis from manually created patterns similar to those
of the sound spectrograph. See Haskins Laboratories for later

1948 Schott, L. O., "A playback for visible speech," Bell Lab.
     Record, 26, 333-339, August 1948.


1951 Demonstration at centennial ceremonies of A.G. Bell's birth.


Manual Control. RLC manually-tuned circuits for producing 
vowels, diphthongs, and liquids. "Additional arrangements" for 
circuit adjustments in rapid succession for words.

1922 Stewart, J.Q., "An electrical analogue of the vocal organs",
     Nature, 110, 311-312 (1922). Words "mama, Anna, wow-wow, yi-
     yi". (B,K)

     Wire recording? (of reconstruction?)

1924 Fletcher, H., Demonstration for the New York Electrical
     Society, February, 1924. Vowels, words "mamma, papa".

     Wire recording?

Parallel formant synthesizer.

1956 Borst, J.M., "Use of spectrograms for speech analysis and
     synthesis", J. Audio Engr. Soc., 4, 13-23 (1956). Parallel
     formant arrangement.  (I)

     Tape ?

1957 Flanagan, J.L., "Note on the design of 'terminal-analog'
     speech synthesizers", J. Acoust. Soc. Amer., 29, 306-310
     (1957). Mathematics of formant synthesizer design.  (B,I,K)

Synthesis by Rule. First phonemic synthesis-by-rule program. 
Series 3-formant synthesizer excited by either impulse train or 
noise. Input was string of phonetic-state specifications. (K)
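
A series (cascade) formant synthesizer of the kind described above
can be sketched briefly. The Python below is an illustration under
stated assumptions, not the Kelly-Gerstman program, which was never
published. It assumes a standard two-pole digital resonator, a
10 kHz sampling rate, formant frequencies near the Peterson-Barney
averages for the vowel /a/, and arbitrary bandwidths. All function
names are hypothetical.

```python
import math
import random


def resonator_coeffs(freq_hz, bw_hz, fs):
    """Coefficients of a two-pole resonator with unity gain at DC."""
    r = math.exp(-math.pi * bw_hz / fs)          # pole radius from bandwidth
    c = -r * r
    b = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / fs)
    a = 1.0 - b - c                              # normalizes DC gain to 1
    return a, b, c


def resonate(x, freq_hz, bw_hz, fs):
    """Filter x with y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs)
    y1 = y2 = 0.0
    out = []
    for s in x:
        y = a * s + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def excitation(n, fs, f0_hz=None, seed=0):
    """Impulse train at f0_hz when voiced, otherwise white noise."""
    if f0_hz:
        period = int(round(fs / f0_hz))
        return [1.0 if i % period == 0 else 0.0 for i in range(n)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def synthesize(formants, n, fs=10000, f0_hz=None):
    """Pass the excitation through each formant resonator in series."""
    signal = excitation(n, fs, f0_hz)
    for freq_hz, bw_hz in formants:
        signal = resonate(signal, freq_hz, bw_hz, fs)
    return signal


# Assumed values: /a/-like formants (Hz) with illustrative bandwidths.
vowel = synthesize([(730, 60), (1090, 90), (2440, 120)], n=1000, f0_hz=100)
```

Calling synthesize() without f0_hz switches the source from the
impulse train to noise, which corresponds to the two excitation
types named in the description.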

1961 Kelly, J.L., and L.J. Gerstman, "An artificial talker driven
     from a phonetic input", J. Acoust. Soc. Amer., 33, Suppl.
     1, S35 (A), (1961). Unpublished program. See Gerstman's work
     at Haskins Laboratories (MATTINGLY 1968, pp.40-42).  (I,K)

     Copy of program ?

     Record: "Synthesized Speech", Bell Telephone Labs, 1961.
          33 1/3 rpm flexible diskette (In SSSHP USA BTL file
          "Phonograph Records")

     SSSHP 123 Tape: "BTL Demo Record/Hamlet, Daisy," 4" reel,
          7.5 ips. (Maxey Tape T61.5)
          ("He saw the cat., Hamlet... thanks for listening.") 
           copy of above "Synthesized Speech", stylus noise
               **** Need copy of master tape. ****

     SSSHP 32.16 Tape: Demo to accompany "Review of Text-to-speech
          conversion for English," D.H. Klatt, JASA 82.3, 9/87.
          ("To be, or not to be, ... by opposing, end them.")
          Cassette, Klatt MIT A/D and D/A

     Kelly, J., and L. Gerstman, "Synthesis of speech from code
     signals," U.S. Patent 3,158,685 (1964).  (K)

1963 Flanagan, J.L., C.H. Coker, and C.M. Bird, "Digital computer
     simulation of a formant-vocoder speech synthesizer", 15th
     Ann. Meeting Audio Engr. Soc., Preprint 307 (1963). Input
     was specification of synthesizer control values at about 40
     or 50 times per second. Control values came from manual
     analysis of sound spectrograms. Purpose was to determine
     synthesizer design most applicable to automatic formant
     analysis.  (B)

     Tape?   ("Men strive but seldom become rich." and other

1965 Denes, P.B., "'On-line' computing in speech research", J.
     Acoust. Soc. Amer., 38, 934 (A), (1965).  (I)

     Coker, C.H., and P. Cummiskey, "On-line computer control of
     a formant synthesizer", J. Acoust. Soc. Amer., 38, 940 (A),
     (1965).  Hardware synthesizer controlled from DDP-24 via D/A
     converters.  Operational amplifier implementation of formant
     circuits.  Synched chopper for gain control.  Controlled
     three formants and bandwidths.  (I)

     Tape ?

1968 Gold, B., and L.R. Rabiner, "Analysis of digital and analog
     formant synthesizers," IEEE Trans. Audio Electroacoust.,
     AU-16, No. 1, March, 1968, pp 81-94.  Important for its
     comparison of higher-pole correction in analog and simulated
     series formant synthesizers.

     Tape ?

1968 Rabiner, L.R., "Digital-formant synthesizer for
     speech-synthesis studies", J. Acoust. Soc. Amer., 43,
     822-828 (1968). FORTRAN IV simulation of serial formant
     synthesizer.  (B, I)

1968 Rabiner, L.R., "Speech synthesis by rule: an acoustic domain
     approach", (1) Ph.D. thesis, MIT, Cambridge MA, June 1967.
     (2) 1967 Conf. on Speech Communication and Processing, MIT,
     Cambridge, MA, Nov 6-8, 1967.  (3) Bell Syst. Tech J. 47,
     17-38 (1968).  Critically-damped 2nd order smoothing filters
     of formant transitions, variable time constants.  (K) Simulated
     synthesizer using BLODI compiler on IBM 7094.

          COPY NO. 2-9, T 67.2".
          (9 sen: "The fourth chapter swings...3 fresh perch.";
          8-sen story: "This is the story about a man... Bob enjoys
          his life.")
          7" reel, high quality, IBM's copy of master
                  ****  use for master  ****

1969 Rabiner, L.R., "A model for synthesizing speech by rule",
     IEEE Trans. Audio Electroacoust., AU-17, 7-13 (1969).  (B,I)

     Tape ?  ("What does Bob do?", VCV intelligibility test
              tapes, sentences for intelligibility tests)

1971 Rabiner, L.R., L.B. Jackson, R.W. Schafer, and C.H. Coker,
     "A hardware realization of a digital formant speech
     synthesizer", IEEE Trans. Commun. Tech., COM-19, 1016-1020
     (1971). TTL circuit implementation of digital series formant
     synthesizer. Up to 12.8 K samples/sec, 24-bit processing. (B)
     No tape recording.

Speech synthesis by assembling segments of human speech.
Techniques included the use of a special tape editing machine
that could electronically copy a specified segment of speech.

1953 Harris, C.M., "A study of the building blocks in speech", J.
     Acoust. Soc. Amer., 25, 962-969 (1953). Refers to test tapes
     prepared for listening panels.  (B,I,K)


Static Model. Transmission line model of vocal tract based on 
measurements of x-ray photographs of the human vocal tract and 
implemented as series of L/C tuned circuits. Vowel sounds only.

1950 Dunn, H.K., "The calculation of vowel resonances, and an
     electrical vocal tract", J. Acoust. Soc. Amer., 22, 740-753
     (1950). Photograph, spectrograms of 11 vowels. (B,I,K)

     Tape ?

Dynamic Model. First simulated analog vocal tract synthesizer.  
Synthesis by rule from stored tables of area functions for each 
phonetic segment and a linear interpolation scheme.  (K)
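
The two ingredients named above, stored area functions and linear
interpolation between them, can be illustrated as follows. This is a
minimal sketch and not a reconstruction of the original program. It
assumes each area function is a list of tube cross-sections ordered
from glottis to lips, and it uses one common sign convention for the
junction reflection coefficients of a lossless-tube model. The area
values and function names are hypothetical.

```python
def interpolate_area(area_a, area_b, frac):
    """Linearly interpolate two area functions; frac runs from 0 to 1."""
    if len(area_a) != len(area_b):
        raise ValueError("area functions must have the same number of sections")
    return [(1.0 - frac) * a + frac * b for a, b in zip(area_a, area_b)]


def reflection_coeffs(areas):
    """Reflection coefficient at each junction between adjacent sections.

    Assumed convention: r = (A_k - A_{k+1}) / (A_k + A_{k+1}).
    """
    return [(a0 - a1) / (a0 + a1) for a0, a1 in zip(areas, areas[1:])]


# Hypothetical area tables (cm^2) for two phonetic segments.
segment_1 = [2.0, 2.0, 2.0, 2.0]
segment_2 = [1.0, 3.0, 4.0, 2.0]

# Area function halfway through the transition, and its junctions.
midpoint = interpolate_area(segment_1, segment_2, 0.5)
junctions = reflection_coeffs(midpoint)
```

A uniform tube gives zero reflection at every junction, so only
changes in cross-section shape the resonances.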

     Copy of program?  (6/89 Flanagan says it is not available)

1962 Kelly, J.L., Jr., and C.C. Lochbaum, "Speech synthesis",
     Proc. Fourth Intern. Congr. Acoust., Paper G42,1-4 (1962)

     Proc. Speech Comm. Seminar, Paper F7, Speech Transm. Lab.,
     Royal Inst. of Tech., Stockholm (1962).  (I)

     Record: "Synthesized Speech", Bell Telephone Labs, 1961.
          33 1/3 rpm flexible diskette (In SSSHP USA BTL file
          "Phonograph Records")

     SSSHP 123 Tape: "BTL Demo Record/Hamlet, Daisy," 4" reel,
          7.5 ips. (Maxey Tape T61.5)
          ("Bicycle Built For Two" song with synthetic piano by
            Max V. Mathews)
           copy of above "Synthesized Speech", stylus noise
               **** Need copy of master tape. ****

     Tape ?  (Dennis Klatt contacted Lochbaum in 1987 but no tape
              could be found. 6/89 Flanagan says tape cannot be

Simulation model of the vibration of the vocal cords.

1968 Flanagan, J.L., and L.L. Landgraf, "Self-oscillating source
     for vocal-tract synthesizers", IEEE Trans. Audio
     Electroacoust., AU-16, 57-64 (1968). Single-mass model. (B,I,K)

     Tape ?  (/ia/)

1969 Flanagan, J.L., and L. Cherry, "Excitation of vocal-tract
     synthesizers", J. Acoust. Soc. Amer., 45, 764-769 (1969). (I)

     Tape ?

1972 Ishizaka, K., and J.L. Flanagan, "Synthesis of voiced sounds
     from a two-mass model of the vocal cords," Bell Sys. Tech.
     J. 51, 1233-1268 (1972).  (B,K)

     Tape ? (/i,e,a,o/)

1975 Flanagan, J.L., K. Ishizaka, and K.L. Shipley, "Synthesis of
     speech from a dynamic model of the vocal cords and vocal
     tract," Bell Syst. Tech. J. 54, 485-506 (1975). Use of model
     with Coker/Umeda articulatory synthesizer. (K)  Photocopy in
     SSSHP USA BTL file.

     16-mm movie of model operation  (copy?)

1976 Flanagan, J.L., and K. Ishizaka, "Automatic generation of
     voiceless excitation to a vocal cord - vocal tract speech
     synthesizer", IEEE Trans. Acoust., Speech and Signal Proc.
     ASSP-24, 163-170 (1976).  (K)

     SSSHP 50: "J.L. Flanagan and K.I. Ishizaka, Vocal Cord-Vocal
          Tract Synthesizer, ASA Meeting, Austin, TX,
          4-10-75". Some print-through echoes.
          (syn, syllables x3:  "uhpa', a'buh, ha, ha'ha")

          (syn, x2, all noise gen active:  "She saw the house.")
          (syn, x2, no glottal noise gen:  "She saw the house.")
          (syn, x2, no constr. noise gen.: "She saw the house.")
          (syn, x2, no noise gen:          "She saw the house.")
          (syn, x2, all noise gen:         "This is a tes(t).")
          (syn, x2, /t/ aspirated:         "This is a test.")

          (syn, x2: "We were away a year ago.")
          Cassette, good quality, copy of Flanagan tape
                   ****  use for master  ****

     SSSHP 91.17 Tape: "MIT - DEMO TAPE 1, 10/90"
          (syn, x3: VCV's; syn, x2: "She saw the house.  This is
          a test.  We were away a year ago.  I am a computer.")
          7" reel, 7.5 ips, good quality, copy of Klatt tape

     SSSHP 32.12 Tape: Demo to accompany "Review of Text-to-speech
          conversion for English," D.H. Klatt, JASA 82.3, 9/87.
          ("She saw the house. This is the test.")
          Cassette, Klatt MIT A/D and D/A of SSSHP 91.17

1978 Flanagan, J.L., and K. Ishizaka, "Computer model to
     characterize the air volume displaced by the vibrating vocal
     cords," J. Acoust. Soc. Amer. 63, 1558-1563 (1978).  (K)

     Tape ?

1980 Flanagan, J.L., K. Ishizaka and K.L. Shipley, "Signal models
     for low bit-rate coding of speech," J. Acoust. Soc. Amer. 68(3),
     780-791 (Sept. 1980). Preliminary study of a combined model
     of vocal source and vocal tract. (Photocopy in SSSHP USA BTL
     file.)

1971 Atal, B.S., and S.L. Hanauer, "Speech analysis and synthesis
     by linear prediction of the speech wave", J. Acoust. Soc.
     Amer., 50, 637-655 (1971). Demonstration diskette.  (B)
     (Reprint and diskette in "Phonograph Records", SSSHP USA
     BTL file.)

     Tape ? ("May we all learn a yellow lion roar. It's time we
            rounded up that herd of Asian cattle. We were away a
            year ago.  Why do I owe you a letter.", various
            parameter variations)
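
For readers unfamiliar with linear prediction, the core computation
can be sketched as follows. The code uses the autocorrelation method
with the Levinson-Durbin recursion, chosen here for brevity; it is
not presented as the formulation used in the paper above. The
function names are hypothetical, and the predictor is assumed to
have the form x[n] ~ a[1]*x[n-1] + ... + a[p]*x[n-p].

```python
def autocorrelation(x, max_lag):
    """Short-time autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the predictor coefficients.

    Returns (coeffs, err): coeffs[k-1] multiplies x[n-k], and err is
    the residual prediction-error energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            raise ValueError("frame has no energy left to predict")
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


# Autocorrelation of an ideal first-order process with coefficient 0.5.
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

In an analysis-synthesis system the coefficients from each frame
define an all-pole filter, which is then driven by a pulse or noise
source to resynthesize the speech.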

1975 Atal, B.S. and M.R. Schroeder, "Recent advances in
     predictive coding: applications to speech synthesis," in
     SPEECH COMMUNICATION, ed by G. Fant, Almqvist and Wiksell,
     Uppsala Sweden, Vol. I, 27-31 (1975).  Errors in formant
     frequency upon resynthesis.  (K)

1982 Atal, B.S., and J.R. Remde, "A new model of LPC excitation
     for producing natural-sounding speech at low bit rates",
     Proc. ICASSP-82, 614-617 (1982). Multipulse LPC. (K) Two

     SSSHP 32.14 Tape: Demo to accompany "Review of Text-to-speech
          conversion for English," D.H. Klatt, JASA 82.3, 9/87.
          (syn/natural: "Where is Dennis sitting? This field of
          beets is ripe and ready.")
          Cassette, Klatt MIT A/D and D/A of BTL tape
