 

MASSACHUSETTS INSTITUTE OF TECHNOLOGY (MIT)


Dept. of Electrical Engineering and
Research Laboratory of Electronics
Cambridge, MA 02139

(Acoustics Laboratory until 1956)
 
 
ARTIFACTS: All early hardware synthesizers were destroyed.
 
 
CONTENTS:

HISTORY

GENERAL/SURVEYS

SPEECH ANALYSIS (1949 -    )

EVT - ELECTRICAL VOCAL TRACT (1951 - 1957)

DAVO - DYNAMIC ANALOG VOCAL TRACT (1956 -    )

POVO - FORMANT SPEECH SYNTHESIS (1952 - 1968)

FORMANT SYNTHESIS FROM PHONETIC INPUT (1970 - 1988)

KLATTALK TEXT TO SPEECH SYSTEM (1980 - 1989)

MITALK TEXT TO SPEECH SYSTEM (mid 1960s - 1979)

BIOGRAPHIES


-------------------------------------------------------------
HISTORY

The Research Laboratory of Electronics (RLE) is the postwar
successor to the World War II Radiation Laboratory, which
contributed so much to the development of radar and to other
military needs. Allowing the multidisciplinary RLE to continue
as an entity separate from the Electrical Engineering and 
Physics departments was a radical idea in 1946, but the RLE
has more than proven its value in the intervening years. Just
one of many fields of research in RLE has been the study of
human speech and hearing. ("RLE: The First 50 Years," John 
Mattill, Technology Review, MIT, pp. MIT 8-10, Feb/Mar 1997)

"Research on speech synthesis at MIT began around 1951, when
Gunnar Fant was a visitor at the Acoustics Laboratory. Under
supervision of Gunnar Fant and Kenneth Stevens, an electrical
analog of the vocal tract (the EVT, below) was developed by
Stanley Kasowski."  (KNS)

(Quoted material from "KNS" is from a personal communication 
from K.N. Stevens to H.D. Maxey, October, 1990, in SSSHP USA 
MIT file.)


-------------------------------------------------------------
GENERAL/SURVEYS


1987 Klatt, D.H., "Review of text-to-speech conversion for
     English," J. Acoust. Soc. Amer., 82.3, Sept. 1987, 737-793.
     (K - one of the three histories on which the SSSHP is based).
     Online, this site.

     SSSHP 32 Tape: "Text-to-Speech History, D. Klatt, ASA Demo, 
          Copy 2, 2/87". 
               (36 samples of speech synthesis)
          Cassette, good quality, Klatt MIT A/D and D/A
          (see Transcription of Recordings for technical details
          of D/A conversion.)



-------------------------------------------------------------
PROJECT: SPEECH ANALYSIS (1949 -     )


1961 Stevens, K.N. and A.S. House, "An acoustical theory of vowel
     production and some of its implications," J. Speech and
     Hearing Res. 4, 303-320 (1961).  Predicts formant amplitudes
     from formant frequencies.  (K)


1963 Stevens, K.N. and A.S. House, "Perturbation of vowel
     articulation by consonantal context: an acoustical study",
     J. Speech and Hearing Res. 6, 111-128 (1963). Best data on
     formant bandwidths.  (K)


1972 Stevens, K.N., "The quantal nature of speech: evidence from
     articulatory-acoustic data," in HUMAN COMMUNICATION: A
     UNIFIED VIEW, ed by E.E. David and P.B. Denes, McGraw-Hill,
     New York, 51-66 (1972).  (K) Quantal theory of consonant
     place of articulation.


1974 Klatt, D.H., and R.A. Stefanski, "How does a mynah bird
     imitate human speech?", J. Acoust. Soc. Am., 55.4, April
     1974, 822-832. Broad-band spectrograms and computer
     generated spectra of a trained Indian Hill mynah bird,
     named Ig-wog, and its tutor.  See SSSHP USA MIT file.

     SSSHP 93.1 Tape: "MIT - Demo Tape 3, 10/90"
     (tutor at low level, bird, tutor, bird, 1:48 min: "I'd like
     a grape.  Learning what to say.  You're looking well.
     Hello.  Who are you?  Ig-ig-wog.  I just saw a zebra.")
     7" reel, good quality, copied from Klatt collection


Grammar of a language represented by a set of rules:


1965 Chomsky, N., ASPECTS OF THE THEORY OF SYNTAX, MIT Press,
     Cambridge, Mass. (1965).  (I)


1968 Chomsky, N., LANGUAGE AND MIND, Harcourt, New York (1968)
     (I)


1968 Chomsky, N., and M. Halle, THE SOUND PATTERN OF ENGLISH,
     Harper, New York (1968).  (I,K)


Models for prosodic rules


1967 Lieberman, P., INTONATION, PERCEPTION AND LANGUAGE, MIT
     Press, Cambridge, Mass.


-------------------------------------------------------------
PROJECT: EVT - ELECTRICAL VOCAL TRACT (1951 - 1957)


Thirty-five section RLC transmission-line model. Manual switching
of fixed inductors and capacitors. Vowel sounds only.

"This model consisted of about 35 LC sections, each of which
could be shifted through a range of values of L/C (keeping the
product LC constant) to simulate a range of cross-sectional areas
for each 1/2-cm section. Excitation of the model was provided by
a pulse generator whose output was shaped by an RC filter
circuit. This analog device was used to synthesize a range of
steady-state vowels, and it provided data for several papers that
examined the relation between vocal-tract shape and formant
frequencies."  (KNS)
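The constant-product constraint in the quotation is what the usual
acoustic transmission-line analogy predicts: a short tube section of
length l and area A has acoustic inductance L = rho*l/A and
capacitance C = A*l/(rho*c^2), so LC depends only on l and c while
L/C varies as 1/A^2. The sketch below illustrates that textbook
analogy only; it is not a reconstruction of the EVT component
values. The air density, the sound speed, and the function names are
assumptions made for the example.

```python
# Illustrative only: one lossless tube section modeled as one LC stage.
RHO = 1.14e-3      # assumed air density, g/cm^3
C_SOUND = 3.5e4    # assumed speed of sound, cm/s
SECTION_LEN = 0.5  # section length in cm, as in the quoted description
N_SECTIONS = 35    # approximate section count, as in the quoted description


def section_lc(area_cm2, length_cm=SECTION_LEN):
    """Return (L, C) in acoustic units for a tube section of the given area."""
    inductance = RHO * length_cm / area_cm2
    capacitance = area_cm2 * length_cm / (RHO * C_SOUND ** 2)
    return inductance, capacitance


def uniform_tube_formants(count=3):
    """Quarter-wave resonances (Hz) of a uniform tube closed at one end."""
    total_len = N_SECTIONS * SECTION_LEN
    return [(2 * n - 1) * C_SOUND / (4 * total_len) for n in range(1, count + 1)]


if __name__ == "__main__":
    for area in (1.0, 4.0):
        ind, cap = section_lc(area)
        # The product is the same for both areas; the ratio changes 16-fold.
        print(area, ind * cap, ind / cap)
    print(uniform_tube_formants())
```

With these assumed constants, 35 sections of 1/2 cm form a 17.5-cm
tube whose uniform-area resonances fall near 500, 1500, and 2500 Hz,
the familiar neutral-vowel pattern.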


1952 Kasowski, S.E., A Speech Sound Synthesizer, S.M. Thesis,
     Dept. of Elect. Engr., MIT, 1952.


1953 Stevens, K.N., S. Kasowski, and C.G.M. Fant, "An electrical
     analog of the vocal tract", J. Acoust. Soc. Amer., 25,
     734-742 (1953). Photograph, spectrograms. (B,I,K)


1955 Stevens, K.N., and A.S. House, "Development of a
     quantitative description of vowel articulation", J. Acoust.
     Soc. Amer., 27, 484-493 (1955)  (B,I,K)

     SSSHP 90.9 Tape: "MIT - Machines That Talk, Spring 1960"
          (vowel sounds)
          7" reel, good quality, copy of RLE tape 7-36

     SSSHP 93.4 Tape: "MIT - Demo Tape 3, 10/90"
          (syn: 14 vowels)
          7" reel, good quality, selections from RLE tape 7-6

     Additional copies of six vowels on SSSHP 119.4, 120.5.


Nasal consonants by the EVT. "A later addition to the vocal-tract
analog was a circuit to simulate the nasal cavities, also
consisting of a series of LC sections. With this addition, the
analog device could be adjusted to synthesize nasal consonants
and vowels with various amounts of coupling to the nasal
cavities."  (KNS)


1956 House, A.S., and K.N. Stevens, "Analog studies of the
     nasalization of vowels," J. Speech and Hearing Disorders,
     21, 578-585.


1957 House, A.S., "Analog studies of nasal consonants," J. Speech
     and Hearing Disorders, 22, 190-204.

     SSSHP 93.5 Tape: "MIT - Demo Tape 3, 10/90"
          (syn: 14 vowels and nasal consonants)
          7" reel, good quality, selections from RLE tape 7-11


-------------------------------------------------------------
PROJECT: DAVO - DYNAMIC ANALOG VOCAL TRACT (1956 -    )


"A few years subsequent to the development of the static vocal
tract analog (EVA), George Rosen designed and constructed an
electronic vocal tract analog in which the various LC sections
could be controlled by voltages in a time-varying manner."  (KNS)

An electronically controlled vocal tract synthesizer, first
controlled from a matrix of preset values for six vocal tract
configurations.  A phrase was assembled by splicing tape
recordings of short segments (phoneme-pairs) generated by the
synthesizer.


1958 Rosen, G., "Dynamic analog speech synthesizer", J. Acoust.
     Soc. Amer., 30, 201-209 (1958).  (I,K) Variable L and C
     (Miller effect), consonant-vowels from potentiometer matrix
     selected by switch.


1960 Rosen, G., Dynamic Analog Speech Synthesizer, Sc.D. thesis,
     Dept. of Elect. Engr., MIT, 1960.

     SSSHP 90.10 Tape: "MIT - Machines That Talk, Spring 1960"
          (syn: "a, sh, sha", CVs, segments and complete "The
          voice of DAVO")
          7" reel, good quality, copy RLE tape 7-36


Later modifications added a nasal tract called "DANA", Dynamic
Analog NAsal tract.

"A later addition to this dynamic analog of the vocal tract was a
component to simulate the nasal cavities.  Synthesis of some
phrases and of a song using the dynamic analog synthesizer was
demonstrated at a meeting of the Acoustical Society of America in
1961.  The control signals for this synthesis demonstration were
derived from theory and by trial and error." (KNS)


1961 House and Stevens


1961 Hecker, M.H.L., Construction and Evaluation of a Dynamic
     Analog of the Nasal Cavities, SM thesis, M.I.T., 1961.


1962 Hecker, M.H.L., "Studies of nasal consonants with an
     articulatory speech synthesizer," J. Acoust. Soc. Amer. 34,
     179-188 (1962).  Side branch to simulate nasal tract.
     Controlled by a tape recording of control signals created by
     hand by Kenneth Stevens and Arthur House.  Demonstrated at the
     fall 1961 meeting of the Acoust. Soc. of America. (K)

1961 SSSHP 91.1 Tape: "MIT - Demo Tape 1, 10/90"
          (syn: "This is the voice of DAVO at MIT."; Alphabet
          Song: "A, B, C, ...")
          7" reel, good quality, copy of spliced addition to RLE
          tape 7-36

     SSSHP 127.10a Tape: "Machines That Talk, MIT, 2/62"
          (syn: same as SSSHP 91.1 plus "Tech is Hell")

     SSSHP 32.11 Tape: Demo to accompany "Review of Text-to-
          speech conversion for English," D.H. Klatt, JASA 82.3,
          Sept. 1987.
          (syn: "This is the voice of DAVO at MIT."; Alphabet
          Song: " A, B, C ...")
          Cassette, Klatt MIT A/D and D/A

     SSSHP 83.10 Tape: Some Reminiscences on Speech Research,
          F.S. Cooper. Plastic diskette in IEEE Trans. A&E,
          AU-21.3, 6/73.
          ("This is the voice of DAVO, at MIT")
          Cassette from plastic diskette, some stylus noise


-------------------------------------------------------------
PROJECT: POVO - FORMANT SPEECH SYNTHESIS (1952 - 1968)


POVO, POle VOice Analog formant speech synthesizer.  Vacuum tube,
cascaded formants, straight-line segment function generator driven
from Teletype punched paper tape. Short segments, only. Phrases by
tape splicing of segments.

"Formant synthesis of speech at MIT was begun in 1952, and was
influenced by the ideas of Gunnar Fant, who had recently spent
some time at the Acoustics Laboratory. The initial formant
synthesizer was called POVO, and it was reported, in various
stages of its development, in several progress reports of the
Acoustics Laboratory at MIT, and in papers presented at the
Acoustical Society of America. In its early manifestations, the
parameters of this synthesizer could be set to fixed values, and
steady-state vowels could be synthesized.

In a later version, the parameters were manipulated by the openings
and closings of relays that were controlled by a punched paper
tape reader. At this stage, the synthesizer was only able to
produce vocalic sounds, and could synthesize phrases like "Where
are you?" and "We were away." There were also some initial
attempts to produce fricative consonants. Nasal consonants were
synthesized by Nakata in 1958."  (KNS)


1952 Stevens, K.N., "The perception of sounds shaped by resonant
     circuits", Sc.D. Thesis, MIT, Cambridge MA, 1952.


1952 Stevens, K.N., "Vowel synthesis by variable resonant
     circuits", Acoustics Laboratory Quarterly Progress Report,
     M.I.T., Oct-Dec 1952, p. 17. Perception experiment, trying
     many different combinations of formant frequencies.

     SSSHP 93.3 Tape: "MIT - Demo Tape 3, 10/90"
          (syn: 12 vowels from Set 15F, 10 vowels from Set 16A)
          7" reel, good quality, selections from RLE tape 7-2


1956 Stevens, K.N., "Synthesis of speech by electrical analog
     devices", J. Audio Eng. Soc., 4, 2-8, Jan 1956. Relays
     controlled from punched paper tape.

     SSSHP 90.6 Tape: "MIT - Machines That Talk, Spring 1960"
          (syn, various settings: "Where are you?"; "Far, far
          far away.")
          7" reel, good quality, copy RLE tape 7-36

     SSSHP 120.6 Tape: "Univ. of Michigan - Examples," Comm. Sci.
          Dept., Dec 1961. (Maxey Tape T61.2)
          ("Where are you? We are far, far away.", variations)
          Another copy on SSSHP 119.5
             

1959 Nakata, K., "Synthesis and perception of nasal consonants",
     J. Acoust. Soc. Amer., Vol 31, No. 6, June 1959, pp. 661-
     666. POVO description.


1961 Heinz, J.M., and K.N. Stevens, "On the properties of
     voiceless fricative consonants," J. Acoust. Soc. Amer. 33,
     No. 5, May 1961, 589-596 (1961). (K)


"A system for computer control of the synthesizer was developed
in 1962 by W.L. Henke, in an SM thesis. The control signals were
stored in digital form and were read out through digital to
analog converters to yield time-varying formant frequencies and
other synthesis parameters.

New synthesis hardware was developed in 1965 by Tomlinson, who
used solid state circuits to develop the synthesizer called
SPASS. The cascaded all-pole formant synthesizer was realized
with analog computer techniques, and the circuit parameters were
controlled with digital signals from a small computer. Control
signals could be specified with a 'light pen' (on a cathode ray
tube display). A number of phrases and syllables were synthesized
with SPASS."  (KNS)


1963 Henke, W.L., "Computer control of a terminal analog speech
     synthesizer", S.M. Thesis, Dept of Elect. Engr., MIT. Input
     was a list of control-signal break points. The TX-0 computer
     provided control data every millisecond to a D/A converter
     that produced analog control signals for POVO.


     Artifact: TX-0 computer is in Boston Computer Museum as of
     1990.
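The break-point input described for Henke's system can be pictured
as straight-line interpolation between listed (time, value) pairs,
sampled on a fixed one-millisecond grid. The sketch below is a
minimal illustration of that idea, not a reconstruction of the TX-0
program; the function name, the data layout, and the example
formant values are assumptions.

```python
from bisect import bisect_right


def sample_breakpoints(breakpoints, step_ms=1):
    """Linearly interpolate (time_ms, value) break points on a fixed grid."""
    if len(breakpoints) < 2:
        raise ValueError("need at least two break points")
    times = [t for t, _ in breakpoints]
    values = [v for _, v in breakpoints]
    if any(t1 <= t0 for t0, t1 in zip(times, times[1:])):
        raise ValueError("break-point times must be strictly increasing")

    samples = []
    t = times[0]
    while t <= times[-1]:
        # Index of the break point that ends the segment containing t.
        i = min(bisect_right(times, t), len(times) - 1)
        t0, t1 = times[i - 1], times[i]
        frac = (t - t0) / (t1 - t0)
        samples.append(values[i - 1] + frac * (values[i] - values[i - 1]))
        t += step_ms
    return samples


if __name__ == "__main__":
    # Hypothetical first-formant track in Hz: rise over 4 ms, then hold.
    print(sample_breakpoints([(0, 500), (4, 700), (6, 700)]))
```

Each output sample would correspond to one value handed to the D/A
converter; a real control program would run one such track per
synthesis parameter.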


1965 Tomlinson, R.S., An Improved Solid State Terminal-Analog
     Speech Synthesizer, M.S. Thesis, MIT, June 1965. Series
     formant synthesizer implemented with operational amplifiers,
     digital control from on-line PDP-1. Operator used light pen on
     CRT display. (Copy with SSSHP books.)

     Tomlinson, R.S., "SPASS - an improved terminal-analog speech
     synthesizer", JASA 38, 940, 1965. (I)

     SSSHP 91.2 Tape: "MIT - Demo Tape 1, 10/90"
          (syn, x2: "buh, duh, guh."; "Where are you?")
          7" reel, good quality, copied from RLE tape 7-62


"A substantial further advance in speech synthesis with a formant
synthesizer was made with the doctoral thesis of L.R. Rabiner.
He implemented a complete synthesis by rule system, using
computers available at Bell Laboratories at Murray Hill, New
Jersey.  The input to the synthesizer was a discrete phonetic
string, indications of stress, and where sentence and word
boundaries and pauses occur.  Rabiner developed extensive tables
of formant targets, source characteristics, and temporal
characteristics for each phonetic unit, as well as procedures for
generating fundamental frequency contours.  A number of short and
long sentences were synthesized by rule, and their intelligibility
was evaluated." (KNS)


1968 Rabiner, L. R., Speech Synthesis by Rule: An Acoustic
     Domain Approach, Ph.D. thesis, M.I.T., 1968. See SSSHP USA
     BTL file for synthesis examples.


1968 Gold, B., and L.R. Rabiner, "Analysis of digital and analog
     formant synthesizers," IEEE Trans. Audio and Electro. AU-16,
     81-94 (1968).  (K) Comparison of spectra. A digital
     simulation of a formant synthesizer creates a somewhat
     different spectrum because of the image poles in the D/A
     process.
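One way to see the kind of spectral difference noted above: the
response of a sampled resonator repeats at multiples of the sample
rate, so its poles effectively recur at image frequencies and raise
the response toward the Nyquist frequency. The sketch below compares
a two-pole analog resonator with a digital one whose poles are
mapped by z = exp(sT). It illustrates the general effect only and
does not reproduce the paper's analysis; the formant values, the
sample rate, and the function names are assumptions.

```python
import cmath
import math


def analog_resonator_gain(f_hz, freq_hz, bw_hz):
    """Magnitude of a two-pole analog resonator with unity gain at 0 Hz."""
    pole = complex(-math.pi * bw_hz, 2.0 * math.pi * freq_hz)
    s = complex(0.0, 2.0 * math.pi * f_hz)
    numerator = pole * pole.conjugate()
    return abs(numerator / ((s - pole) * (s - pole.conjugate())))


def digital_resonator_gain(f_hz, freq_hz, bw_hz, sample_rate_hz):
    """Magnitude of the sampled resonator a / (1 - b/z - c/z^2)."""
    t = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # unity gain at 0 Hz, matching the analog version
    z_inv = cmath.exp(-2j * math.pi * f_hz * t)
    return abs(a / (1.0 - b * z_inv - c * z_inv ** 2))


if __name__ == "__main__":
    # Assumed example: 500-Hz formant, 60-Hz bandwidth, 10-kHz sampling.
    for f in (500.0, 2500.0, 4500.0):
        print(f,
              analog_resonator_gain(f, 500.0, 60.0),
              digital_resonator_gain(f, 500.0, 60.0, 10000.0))
```

Near the formant peak the two responses agree closely; close to half
the sample rate the digital response is noticeably higher than the
analog one.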


-------------------------------------------------------------
PROJECT: FORMANT SYNTHESIS FROM PHONETIC INPUT (1970 - 1988)


Synthesis by rule from phonetic input, simulated on a digital
computer. The computer simulated the previous POVO synthesizer by
solving a system of difference equations.
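As a rough picture of what "solving a system of difference
equations" means here, each formant can be modeled as a second-order
digital resonator, y[n] = a*x[n] + b*y[n-1] + c*y[n-2], and several
resonators can be chained in cascade. The sketch below uses that
standard resonator form; it is not the MIT program itself, and the
formant frequencies, bandwidths, sample rate, and function names are
assumptions chosen for illustration.

```python
import math


def resonator_coefficients(freq_hz, bw_hz, sample_rate_hz):
    """Coefficients for y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalizes the gain to 1 at 0 Hz
    return a, b, c


def resonate(signal, freq_hz, bw_hz, sample_rate_hz):
    """Run one formant resonator over a list of samples."""
    a, b, c = resonator_coefficients(freq_hz, bw_hz, sample_rate_hz)
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def cascade(signal, formants, sample_rate_hz):
    """Pass the signal through one resonator per (freq_hz, bw_hz) pair."""
    for freq_hz, bw_hz in formants:
        signal = resonate(signal, freq_hz, bw_hz, sample_rate_hz)
    return signal


if __name__ == "__main__":
    rate = 10000
    # Assumed source: an impulse every 10 ms (a 100-Hz pulse train).
    source = [1.0 if n % 100 == 0 else 0.0 for n in range(1000)]
    vowel = cascade(source, [(500, 60), (1500, 90), (2500, 150)], rate)
    print(len(vowel), max(vowel))
```

Time-varying speech would follow from updating the formant
frequencies between samples or frames; the sketch holds them fixed
for simplicity.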

"Formant synthesis was picked up by Dennis Klatt at MIT in the
late 1960's, and the steady progress of speech synthesis by rule
in the period 1970 to 1986 has been chronicled by Klatt in his
1987 review paper. The MIT activity in this area, leading to
MITalk, was described in a book by Allen, Hunnicutt, and Klatt, in
1987."  (KNS)


1970 Klatt, D.H., "Synthesis of stop consonants in initial
     position," J. Acoust. Soc. Amer. 47, Suppl. 1, S93 (1970).
     95% correct for CV nonsense syllables.  (K)


1971 Klatt, D.H., "A theory of segmental duration in English,"
     paper for 82nd Meeting of ASA, Oct 19-22, 1971. Synthesis by
     rule from phonetic spelling with stress marks, durations by
     rule.  First attempt at synthesis by rule. Paper contains a
     listing of phonetic input for demonstration tape. Copy in
     SSSHP 96 Reprints.

     SSSHP 91.5 Tape: "MIT - Demo Tape 1, 10/90". Demo tape for
          above paper, dated 11/15/71.
          (syn, 13 sen, 1:52 min: "You are listening to a
          demonstration of ... acoustic cues.")
          7" reel, good quality, copied from Klatt collection


1972 Klatt, D.H., "Acoustic theory of terminal analog speech
     synthesis", Proc. ICASSP-72, 131-135 (1972).  (K) Proposed
     hybrid synthesizer (cascade and parallel formants)


1974 Maeda, S., "A characterization of fundamental frequency
     contours of speech," RLE QPR 114, MIT, 193-211 (1974). Basis
     for Klattalk F0 gestures.  (K)


1975 Klatt, D.H., "Voice onset time, frication, and aspiration in
     word-initial consonant clusters," J. Speech Hearing Res. 18,
     686-706 (1975). Measurements.  (K)


1976 Klatt, D.H., "Structure of a phonological rule component for
     a speech synthesis by rule program," IEEE Trans. Acoust. Sp.
     and Signal Proc. ASSP-24, 391-398 (1976). Segment durations,
     F0 contour, and allophonic variation by rule.  (K)

     SSSHP 93.2 Tape: "MIT - Demo Tape 3, 10/90". Source tape for
          SSSHP 32.21.
          (syn, 26 sec: "'Twas the night before Christmas ... long
          winter's nap.")
          7" reel, good quality, copied from Klatt collection

     SSSHP 32.21 Tape: Demo to accompany "Review of Text-to-speech
          conversion for English," D.H. Klatt, JASA 82.3, Sept.
          1987.
          ("'Twas the night before Xmas ... danced in their
          heads")
          Cassette, Klatt MIT A/D and D/A


1976 Letter-to-phoneme rules (Hunnicutt) and phoneme-to-speech
     program (Klatt) licensed to Telesensory Systems Inc. for a
     reading machine for the blind (see SSSHP USA Telesensory
     Systems outline).


1979 Klatt, D.H., "Synthesis by rule of segmental durations in
     English sentences," in FRONTIERS OF SPEECH COMMUNICATION
     RESEARCH, ed. by B. Lindblom and S. Ohman, Academic Press,
     New York, 287-300 (1979).  (K)


     Klatt, D.H., "Synthesis by rule of consonant-vowel
     syllables", Speech Comm Group Working Papers 3, MIT,
     Cambridge, MA, 93-104 (1979). Modified locus theory.  (K)

     SSSHP 91.13 Tape: "MIT - Demo Tape 1, 10/90". Phonemic input
          with stress markings.
          (1:29 min: "This recording is a demonstration of speech
          synthesis by rule and automatic text to speech
          conversion ... developed at MIT.")
          7" reel, good quality, copied from Klatt tape, "Klatt
          Synthesis-by Rule Program 6/79"


1980 Klatt, D.H., "Software for a cascade/parallel formant
     synthesizer", J. Acoust. Soc. Amer. 67, 971-995 (1980).  (K)
     Publication of these FORTRAN programs enabled many speech
     laboratories that lacked the engineering support to build and
     maintain hardware synthesizers to carry out speech synthesis
     experiments. The programs were subsequently improved by Diane
     Kewley-Port and made available to researchers (see SSSHP USA
     Indiana University page).


Synthesis of Japanese.


1984 Aoki, C., D.H. Klatt, and H. Kawasaki, "Analysis and
     synthesis of Japanese Speech," presentation to the 107th
     meeting of the Acoust. Soc. of America, Norfolk, VA, May 10,
     1984. Spectrographic analysis of three native speakers and a
     preliminary set of synthesis rules for the DECtalk synthesizer.
     See SSSHP 96 Reprints.

     SSSHP 92.7 Tape: "MIT - Demo Tape 2, 10/90". Demo of DECtalk
          1.8 at Workshop on Digital Signal Processing at
          Marrakech, and early sample of Japanese.
          (male Japanese syn: " ... arimasen.")
          7" reel, good quality, copied from Klatt collection


     Klatt, D.H., and C. Aoki, "Synthesis by rule of Japanese,"
     J. Acoust. Soc. Amer., 76 Suppl. 1, S2.  (K) Preliminary
     synthesis results. Paper has translation of Japanese
     synthesis demonstration. See SSSHP 96 Reprints.

     SSSHP 92.9 Tape: "MIT - Demo Tape 2, 10/90".
          (male syn narration, 47 sec: "You are about to hear a
          demonstration of Japanese DECtalk. As ... five
          seconds."; male Japanese syn, 5 sen, 43 sec: "Kore wa
          syo ... sarete iru.")
          7" reel, good quality, copied from Klatt collection


Synthesis of a female voice


1986 Unsuccessful modification of DECtalk 3.0 male voice to
     simulate a female voice.

     SSSHP 32.9 Tape: Demo to accompany "Review of Text-to-speech
          conversion for English," D.H. Klatt, JASA 82.3, Sept.
          1987.
          (2 sen: "I am the standard ... a female voice")
          Cassette, Klatt MIT A/D and D/A


1986 Klatt, D.H., "Detailed spectral analysis of a female voice,"
     J. Acoust. Soc. Amer., 80 Suppl. 1, S97 (1986).  (K)
     Synthetic male narration, Diane B.'s ("DB") natural voice,
     and synthetic female copy of Diane B.'s voice.

     SSSHP 92.5 Tape: "MIT - Demo Tape 2, 10/90"
          (syn narration: "Analysis and synthesis of a female
          voice, Demo 1. Diane B. sustaining a vowel at nominally
          ..."; female syn, x2: vowels, "?a-?a-?a-?a-?a".)
          7" reel, good quality, copied from Klatt collection

     SSSHP 32.10 Tape: Demo to accompany "Review of Text-to-speech
          conversion for English," D.H. Klatt, JASA 82.3, Sept.
          1987.
          (female syn, natural speech: "?a-?a-?a-?a-?a".)
          Cassette, Klatt MIT A/D and D/A


-------------------------------------------------------------
PROJECT: KLATTALK TEXT TO SPEECH SYSTEM (1980 - 1989)


A laboratory text-to-speech system that combined the Hunnicutt
letter-to-phoneme rules, a 6000-word exception dictionary, a
parser, and Klatt synthesis by rule.  Formant data for one male and
one female voice; other voices by scaling.  (K)


1980 Hunnicutt, S., "Grapheme-to-phoneme rules, a review," STL
     QPSR 2-3, RIT, Stockholm, 38-60 (1980).  (K)


Proposal for a commercial product for computer voice from a
phonetic specification.


1980 Klatt, D.H., "The Klatt-Talk KT-1 Phonemic Synthesizer",
     Speech Incorporated, 61 Pleasant St., Brookline MA 02146,
     April 1, 1980. Unpublished. (Copy in SSSHP 96 Reprints.)

     SSSHP 92.2 Tape: "MIT - Demo Tape 2, 10/90"
          (male syn, 6 very long sen, 1:18 min: "The lens buyer
          must approach ... faster than f4.5 is really needed.";
          male syn, 1:01 min: "Yesterday morning, at 11:00, I
          bought an American flag ... dead of winter.")
          7" reel, good quality, copied from Klatt collection

     SSSHP 92.3 Tape: "MIT - Demo Tape 2, 10/90". First female
          voice.
          (male and female syn, 35 sec: "In many applications, it
          is advantageous to be able to synthesize two distinct
          voices. For example, ... for women.")
          7" reel, good quality, copied from Klatt collection


1981 Right to market software obtained from MIT.  (K)


1981 Klatt, D.H., "A text-to-speech conversion system," Proc.
     AFIPS Office Automation Conf., 51-61, 1981.  (K) The full
     text-to-speech system, written in the C programming language.


1982 SSSHP 92.6 Tape: "MIT - Demo Tape 2, 10/90"
          (male syn, 1:01 min: "Thank you for inviting me to the
          Second International ... Toronto, Canada, 1982. ...
          Digital Equipment Corp. is currently evaluating ...";
          female syn: "Hello."; child syn: "Hi.")
          7" reel, good quality, copy of Klatt tape: "Klattalk
          full text-to-speech in C, 1982"


1982 All rights were sold to Digital Equipment Corporation, which
     developed the commercial DECtalk synthesizer (see SSSHP USA
     DEC file.)


1987 Klatt, D.H., "How KLATTALK became DECtalk: An academic's
          experiences in the business world," Speech Tech '87,
          New York, April 1987. Copy in SSSHP 96 Reprints.


-------------------------------------------------------------
PROJECT: MITALK TEXT TO SPEECH SYSTEM  (mid 1960s - 1979)


Outgrowth of work on a reading machine for the blind in the
Cognitive Information Processing Group under Prof. Sam Mason.
The text-to-speech component was started by Prof. Francis Lee,
who based the approach on recursive morphological decomposition of
English words.  Prof. Jon Allen took over the text-to-speech
component in 1970.  (For more historical details, see front matter
in 1987 book on MITalk and letter from Prof. J. Allen to H.D.
Maxey, 10/25/90, SSSHP USA MIT correspondence file.)
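The idea of recursive morphological decomposition can be pictured
with a toy example: try to cover a written word with entries from a
morph lexicon, recursing on whatever remains after each match. The
sketch below is an illustration only and is not the MITalk
algorithm; the tiny lexicon and the function name are hypothetical,
and spelling changes at morph boundaries are ignored.

```python
# Hypothetical toy lexicon, for illustration only.
MORPHS = {"un", "help", "ful", "ness", "read", "er", "s", "re", "able"}


def decompose(word, lexicon=MORPHS):
    """Return a list of morphs covering `word`, or None if no cover exists."""
    if word == "":
        return []
    # Try longer prefixes first so that larger lexicon entries are preferred.
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in lexicon:
            rest = decompose(word[end:], lexicon)
            if rest is not None:
                return [head] + rest
    return None


if __name__ == "__main__":
    for w in ("unhelpfulness", "readers", "xyz"):
        print(w, decompose(w))
```

A word that cannot be covered returns None; in a full system such a
word would be a natural candidate for letter-to-sound rules instead.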


1968 Allen, J., "A study of the specification of prosodic
     features of speech from a grammatical analysis of printed
     text", Unpublished Ph.D. dissertation, (1968).  (I)


1969 Lee, F.F., "Reading machine: from text to speech," IEEE
     Trans. Audio and Electro., AU-17, 275-282 (1969).
     Decomposition of written words into morphemes, which greatly
     reduces the dependence on letter-to-sound rules. (K) Use of
     Holmes, Mattingly, and Shearme synthesis by rule technique,
     programmed by T.P. Barnwell.

     SSSHP 91.7 Tape: "MIT - Demo Tape 1, 10/90"
          (male syn: "Hello ladies and gentlemen, how are you? I
          will now count from one to ten, ... I will now say
          Humpty Dumpty, ... together again.")
          7" reel, good quality, copy Klatt history tape of 9/1/71


1971 Allen, J., "Speech synthesis from unrestricted text," Proc.
     IEEE International Convention, New York, March 22, 1971.
     Taped presentation on the technical problems to be solved was
     reproduced in the following tape.

     SSSHP 82.2 Tape: "IEEE Soundings: The Human Voice and the
          Computer, August 1, 1971."
          (male syn, 3 sen: "This paper presents a method for
          synthesizing ... applications for this process.")
          Cassette, need copy of master


1972 Letter-to-phoneme rules and Chomsky & Halle rules for 
     assignment of stress were developed by Hunnicutt and Carroll.
     A program for rule implementation was written by Francis X. 
     (Frank) Carroll.  Rule files for letter-to-phoneme and stress 
     assignment were supplied by M. Sharon (Sheri) Hunnicutt.


1973 Allen, J., "Speech synthesis from unrestricted text", SPEECH
     SYNTHESIS: BENCHMARK PAPERS IN ACOUSTICS, J.L. Flanagan and
     L.R. Rabiner, Eds., Dowden Hutchinson and Ross, Inc.,
     Stroudsburg (1973)  (B)


1976 Letter-to-phoneme rules (Hunnicutt) and phoneme-to-speech
     program (Klatt) licensed to Telesensory Systems Inc. for a
     reading machine for the blind (see SSSHP USA Telesensory
     Systems outline).


     Allen, J., "Synthesis of speech from unrestricted text",
     Proc. IEEE 64, 422-433 (1976). Morpheme dictionary.  (K)


     Hunnicutt, S., "Phonological rules for a text-to-speech
     system", Am. J. Comp. Ling. Microfiche 57, 1-72 (1976).
     Letter-to-sound rules.  (K)


     Hunnicutt, S., "A New Morph Lexicon for English," Proc.
     COLING, Ottawa, Canada (1976).


1977 O'Shaughnessy, D., "Fundamental frequency by rule for a
     text-to-speech system", Proc. ICASSP-77, 571-574 (1977).
     Better F0 contour.  (K)


1979 Allen, J., S. Hunnicutt, R. Carlson, and B. Granstrom,
     "MITalk-79: The MIT Text-to-Speech System", J. Acoust. Soc.
     Amer. 65, Suppl. 1, S130 (1979). Final form of MITalk
     demonstrated at 1979 meeting of the Acoust. Soc. of Amer.,
     Boston. Morpheme dictionary of 12,000 items.  (K)

     SSSHP 93.6 Tape: "MIT - Demo Tape 3, 10/90". First four
          synthetic sentences are source for SSSHP 32.30.
          (Jon Allen, 8:20 min: explanation of system; syn, 19
          long sen, 2:54 min: "Speech is so familiar a feature
          of daily life ... cultural function."; syn, 9 long sen,
          1:35 min: "We usually take for granted our ability to
          produce and understand speech ... than any other.")
          7" reel, good quality, copy of "Allen-MITalk 79, Oct 4,
          1979"

     SSSHP 32.30 Tape: Demo to accompany "Review of Text-to-speech
          conversion for English," D.H. Klatt, JASA 82.3, Sept.
          1987.
          (male syn, 4 sen: "Speech is so familiar a feature ...
          learning to walk.")
          Cassette, Klatt MIT A/D and D/A

     SSSHP 91.14 Tape: "MIT - Demo Tape 1, 10/90"
          (male syn, 2:32 min: "The remaining speech that you
          will hear ... such as MITalk-79.")
          7" reel, good quality, copied from "Klatt Synthesis by
          Rule Program, 6/79"

     SSSHP 93.10 Tape: "MIT - Demo Tape 3, 10/90"
          (male syn, "?Wind and the Sun. The North Wind and ...
          ... overcoat ... stronger than he was.")
          7" reel, good quality but first words missing, copied
          from Klatt collection

     SSSHP 33.1 Tape: "MITalk, TSI Comparisons (Pisoni Tests)
          1980, D.B. Pisoni, 9/26/80"
          (MITalk, Northwind Passage, 35 sec: "The North Wind and
          the Sun.  The North Wind and the Sun were arguing one
          day ...  stronger than he was.")
          Cassette, fair quality, some print-through, copy of N.R.
          Dixon copy of master

     SSSHP 114 Tape: "MITalk 79 speech from development, 1978-1979,
          MIT, June 2001." Variations of synthesis during
          development work.
          Sect 1: "Man and Machines", Woody Allen, Virginia Woolf,
                  text about galaxy, text about fire.
          Sect 2: Selection from book LANGUAGE, by Sapir.
          Sect 3: Test material for evaluation of MITalk.
          Sect 4: Variations of "The North Wind and the Sun."
          Analog copy of Digital Analog Tape compiled Apr. 2001 at
          Royal Institute of Technology from Sheri Hunnicutt's 
          collection. Cassette.

     SSSHP 115 Tape: "MITalk '79. Part of Senior thesis work in 
          Prof. Jonathan Allen's group at MIT by Alex Waibel. No 
          titles, backup copy." Video of MITalk demonstration.
          VHS videotape, 1/2" tape, 7" reel, from Sheri Hunnicutt
          

1987 Allen, J., S. Hunnicutt, and D.H. Klatt, FROM TEXT TO
     SPEECH: THE MITALK SYSTEM, Cambridge Univ. Press, Cambridge
     UK (1987).  (K) Complete description of MITalk system.


-------------------------------------------------------------
BIOGRAPHIES


(For more details, and list of publications, see SSSHP 97
Curriculum Vitae, in SSSHP USA MIT file.)


JONATHAN ALLEN

1956 A.B., Dartmouth College, Hanover, NH
1957 M.S. in Engineering, Dartmouth College, Hanover, NH
1957/58 Henry Fellowship, mathematics, Cambridge Univ., England
1962 Bell Telephone Laboratories, human factors computer-based
     testing
1966 Supervisor of Human Factors Engineering
1968 Supervisor of Speech Processing Systems Dept.
     Ph.D. in Elect. Engr., MIT
1968 Dept. of Elect. Engr., MIT
1975 Professor of Electrical Engineering, MIT
1981 Director, Research Laboratory of Electronics, MIT
2000 Deceased, Massachusetts


FRANCIS X. (FRANK) CARROLL


M. SHARON (SHERI) HUNNICUTT

1964 A.B. in mathematics, Gettysburg College, Gettysburg, PA
1965-71 teacher and researcher in mathematics
1967 M.S. in mathematics, Univ. of New Mexico, Albuquerque
1971 Research staff, Research Laboratory of Electronics, MIT
1981 Research staff, Dept. of Speech Comm. and Music Acoustics,
       Royal Institute of Technology (RIT), Stockholm, Sweden
1988 Ph.D., RIT
1989 Docent, KTH (RIT), Stockholm
1997 Lektor, KTH


DENNIS H. KLATT

1960 B.S. in Electrical Engineering, Purdue Univ., Lafayette, IN
1961 M.S. in Electrical Engineering, Purdue Univ., Lafayette, IN
1962 Res. Asst., Comm. Sciences Lab., Univ. of Michigan, Ann Arbor
1964 Ph.D. in Communication Science, Univ. of Michigan, Ann Arbor
     NIH Postdoctoral Fellow
1965 Asst. Prof. of Elect. Engr., Speech Comm. Group, Research
     Laboratory of Electronics, MIT
1979 Senior Research Scientist, Dept. of Elect. Engr.
1989 Deceased, Cambridge, MA


FRANCIS FAN LEE

1950 S.B. in Elect. Engr., MIT
1951 S.M. in Elect. Engr., MIT
1953 Research Engineer, Servomechanisms Lab., MIT
1955 BIZMAC Division, Radio Corporation of America
1956 Mgr. Advanced Systems, UNIVAC Division, Sperry-Rand Corp.,
       Philadelphia, PA
1966 Ph.D. in Elect. Engr., MIT
1966 Elect. Engr. Dept., MIT
     Professor of Elect. Engr., MIT
     Professor Emeritus, MIT


KENNETH N. STEVENS

1945 B.A.Sc. in Engineering Physics, Univ. of Toronto, Canada
1948 M.A.Sc. in Engineering Physics, Univ. of Toronto, Canada
1951 Instructor, MIT
1952 Sc.D. in Electrical Engineering, MIT
     Research Staff Member, MIT
1962/63 Researcher, Royal Inst. of Tech., Stockholm
1963 Professor of Electrical Engineering, MIT
1969 Visiting Professor, University College London
 
 
-------------------------------------------------------------
CONTRIBUTIONS AND REVIEW BY:

Prof. Kenneth N. Stevens
Research Laboratory of Electronics
Massachusetts Institute of Technology
Cambridge, MA 02139


Smithsonian Speech Synthesis History Project
National Museum of American History | Archives Center
Smithsonian Institution