ELECTROTECHNICAL LABORATORY (ETL)
Ministry of International Trade and Industry
1-1-4 Umezono, Sakura-mura, Niiharu-gun
Ibaragi 305, Japan
CONTENTS:
HISTORY
STAGE 1: STATIC VOWELS AND NASAL MURMURS (1963-1966)
STAGE 2: VOCAL TRACT SIMULATOR MODEL 1 (1965-1967)
STAGE 3: TEXT-TO-SPEECH IN ENGLISH, MODEL 2 (1967-1968)
STAGE 4: COMPUTER SIMULATION OF SYSTEM (1968-1969)
BIOGRAPHIES
-------------------------------------------------------------
HISTORY OF THE ORGANIZATION
ETL was founded by the Japanese government more than a hundred
years ago for fundamental research in electricity and electronics.
Its original charter directed it to provide leadership in the
research field, with researchers having great freedom to choose
their own objectives. The speech synthesis project was undertaken
because of its challenge and its usefulness to man-machine
communication technology.
The speech synthesis project was carried out by the Speech
Synthesis Group in the Acoustics Section headed by Eiichi Matsui,
who provided the group's research direction. The supervisor of
the Speech Synthesis Group was Ryunen Teranishi.
The Speech Synthesis Group had two objectives:
- Synthesis of continuous speech using a vocal tract
analog simulator controlled by digital computer
- Completely automatic English speech synthesis from
English orthography
The project can be described in four stages. The speech
synthesizer used in each stage was:
Stage 1: Acrylic-made acoustic model of vocal tract (Plastic
Vocal Tract Model), for getting elementary data for
designing the later electronic vocal tract simulator
Stage 2: A composite type speech synthesis system composed of
analog computer vocal tract simulator controlled by a
digital computer
Stage 3: An improved version of Stage 2
Stage 4: Software simulation of vocal tract
-------------------------------------------------------------
STAGE 1: SYNTHESIS OF STATIC VOWELS AND NASAL MURMURS (1963-1966)
In this period, Teranishi and Umeda experimented with a plastic
model of the human vocal tract and nasal tract to obtain design
data for the later electronic simulation.
Artifact: Plastic Vocal Tract Model (see ETL News photos),
in possession of R. Teranishi in 1989
1964 Teranishi, R., "Speech synthesis with an acoustic vocal
tract model", ETL News, No. 171, April 1964. In Japanese.
Experiments with the mechanical model and plans for the
analog computer simulation. Photograph of plastic model.
(SSSHP 25 reports)
1965 Teranishi, R., "Normalizing the voice production factors
about the VTM (Vocal Tract Model)", ETL News, No. 181, Feb.
1965. In Japanese. Photograph of plastic model. Model can
produce male, female, or child's voice. (SSSHP 25 reports)
1966 Umeda, N., and R. Teranishi, "Phonemic feature and vocal
feature - synthesis of speech sounds, using an acoustic
model of vocal tract", J. Acoust. Soc. Japan, 22, 4, 195-203,
1966.
SSSHP 26.1 Diskette: "Speech Synthesis by Rule", Electrotechnical
Laboratory, 1969.
(Japanese vowels "a, i, u, e, o", various voices)
Plastic diskette, 33 1/3 rpm
SSSHP 39.1 Tape: (R. Teranishi's copy of SSSHP 26 Diskette)
Cassette, good quality
SSSHP 69 Tape: "Tape 3, Speech sounds of the ETL acoustic
model in 1965", copied from master by Hiroshi Omura,
ETL, June 27, 1989.
(Japanese vowels "a, i, u, e, o", male, female, child;
nasalized vowels)
Cassette, good quality, use for master
-------------------------------------------------------------
STAGE 2: SYNTHESIS OF CONTINUOUS JAPANESE SPEECH BY VOCAL TRACT
SIMULATOR MODEL 1 (1965 - 1967)
The first synthesizer was an analog computer simulation of a
17-section vocal tract and fixed nasal tract. It was simulated
as a ladder network of inverse-L-type LC units. Analog
multipliers were controlled every five milliseconds from digital
code from a tape separately prepared by an IBM 7090 computer.
The Hitachi-built synthesizer used 71 operational amplifiers and
22 multipliers. The multipliers were fixed-resistor networks
switched by photoconductor/neon-lamp pairs, controlled from the
digital tape. (For Hitachi speech synthesis, see SSSHP JAPAN
Hitachi file.)
With computer control, the synthesizer could produce arbitrary
sound sequences, in contrast to the plastic vocal tract model
which was limited to producing only sustained sounds. The
synthesizer was designed by Matsui and Teranishi, and the FORTRAN
control program of the model was designed by Matsui, Suzuki,
Umeda, and Omura. The linguistic, phonetic, and articulatory
specifications were done by Matsui, Suzuki, Umeda and Omura.
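The ladder-network idea can be illustrated numerically. The following
Python sketch is not the ETL/Hitachi circuit or its FORTRAN control
program. It assumes that each of the 17 sections is a series acoustic
mass L followed by a shunt compliance C, that the tube is uniform and
lossless, and that the lips are an ideal open end. The constants (air
density, speed of sound, tract length, area) and the function names are
illustrative assumptions, not values taken from the ETL reports.

```python
import math

# Illustrative constants (assumptions, not from the ETL reports).
RHO = 1.2          # air density, kg/m^3
C_SOUND = 350.0    # speed of sound, m/s
TRACT_LEN = 0.175  # total tract length, m
AREA = 5.0e-4      # uniform cross-sectional area, m^2
N_SECTIONS = 17    # number of sections, as in Model 1


def lip_flow_denominator(freq_hz, n=N_SECTIONS):
    """Return D(f), where U_lips / U_glottis = 1 / D(f).

    Each section is a series inductance followed by a shunt capacitance
    (an assumed reading of "inverse L"); the lips are at zero pressure.
    """
    w = 2.0 * math.pi * freq_hz
    sec_len = TRACT_LEN / n
    ind = RHO * sec_len / AREA                   # acoustic mass per section
    cap = AREA * sec_len / (RHO * C_SOUND ** 2)  # acoustic compliance per section
    # One section as a chain matrix [[sa, j*sb], [j*sc, sd]] (real sa..sd).
    sa, sb, sc, sd = 1.0 - w * w * ind * cap, w * ind, w * cap, 1.0
    a, b, c, d = 1.0, 0.0, 0.0, 1.0              # start from the identity
    for _ in range(n):
        a, b, c, d = (a * sa - b * sc, a * sb + b * sd,
                      c * sa + d * sc, d * sd - c * sb)
    return d


def resonances_hz(f_max=4000.0, step=5.0):
    """Locate zeros of D(f) below f_max by a coarse scan plus bisection."""
    found = []
    f_lo = step
    d_lo = lip_flow_denominator(f_lo)
    while f_lo < f_max:
        f_hi = f_lo + step
        d_hi = lip_flow_denominator(f_hi)
        if d_lo * d_hi < 0.0:                    # sign change: a zero lies between
            lo, hi, d_left = f_lo, f_hi, d_lo
            for _ in range(50):
                mid = 0.5 * (lo + hi)
                d_mid = lip_flow_denominator(mid)
                if d_left * d_mid <= 0.0:
                    hi = mid
                else:
                    lo, d_left = mid, d_mid
            found.append(0.5 * (lo + hi))
        f_lo, d_lo = f_hi, d_hi
    return found


if __name__ == "__main__":
    for f in resonances_hz():
        print(round(f, 1))
```

Under these assumptions the ladder has four resonances below 4 kHz,
the lowest within about five percent of the quarter-wavelength value
c / (4 x length) = 500 Hz for an ideal uniform tube. The discrete
ladder only approximates the continuous tube, so exact agreement is
not expected.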
1965 Teranishi, R., "Dynamic analog speech synthesizer", ETL
News, No. 189, Oct. 1965. In Japanese. Produces speech
sequences and sounds more natural than plastic model.
Photograph of synthesizer, waveforms of vowel. (SSSHP 25
reports)
1966 Matsui, E., "Speech synthesizer controlled by computer",
ETL News, No. 197, June 1966. In Japanese. First results.
Photograph of synthesizer, synthesizer circuit diagram, list
of control data, list of sample words, problems to be
solved. (SSSHP 25 reports)
SSSHP 67 Tape: "Tape 1, Synthesized speech reported in the
1966 ETL News #197", copied from master by Hiroshi
Omura, ETL, June 27, 1989.
(Japanese vowels, 15 CV's, 4 words, 9 phrases)
Cassette, good quality, samples from paper
1967 "Robot Raconteur", Electronics, Feb. 6, 1967. Photograph.
SSSHP 26.2-3 Diskette: "Speech Synthesis by Rule", Electrotechnical
Laboratory, 1969.
(Japanese story, The Peach Boy (Momotaro no ohanashi),
several sentences, Japanese "tongue twisters")
Plastic diskette, 33 1/3 rpm
SSSHP 39 Tape: (R. Teranishi's copy of SSSHP 26 Diskette)
Cassette, good quality, use instead of diskette
SSSHP 71 Tape: "Tape 5, Japanese 'tongue twisters' copied
from the original tape", copied from master by Hiroshi
Omura, ETL, June 27, 1989.
(four phrases, three tempos each: "tonari no kyaku ...
tokkyo kyokakyoku")
Cassette, good quality, *** use for master ***
-------------------------------------------------------------
STAGE 3: TEXT-TO-SPEECH IN ENGLISH USING VOCAL TRACT SIMULATOR
MODEL 2 (1967-1968)
First demonstrated text-to-speech system for English. Some of
the rules were later used in the Bell Laboratories text-to-speech
system (K). (See SSSHP USA Bell Telephone Laboratories file.)
This second model was an improved synthesizer, also built by
Hitachi, using high speed analog multipliers to eliminate the
moisture sensitivity of the Model 1's photoconductors and to
reduce noise. A higher-frequency spectrum allowed simulation of
female or child's voice. A pi-network, rather than the earlier
inverse-L network, was used, allowing exact simulation of
mid-vocal-tract closures. Control data, changed every 5 milliseconds,
fed D/A converters to supply control voltages for the analog
multipliers. A dictionary with a 1500-word vocabulary was
sufficient for children's fairy tales. About 30 computer runs
were made in preparing the following demonstration tape.
The synthesizer was designed by Matsui, Suzuki, and Omura, and
the control program by Matsui, Teranishi, Suzuki, and Omura. The
linguistic, phonetic and articulatory specifications were by
Teranishi and Umeda, who also did the dictionary and parser.
1968 Teranishi, R., "Read aloud English story", ETL News, No.
222, July 1968. In Japanese. Project of speech synthesis by
hardware vocal tract model is finished. Photograph of Model
II, discussion of the pronouncing dictionary and the
syntactic analysis, prospects for speech research, listing
of sample input sentences and intermediate string of
phonetic symbols (in alphanumeric computer characters).
Sonagram of "Once upon a time". (SSSHP 25 reports)
Program Listing: PL/I source code for converting input
sentences into control signal sequences for ETL's analog
vocal tract simulator Model 2. (SSSHP 36)
Program Listing: Sample input and output for PL/I program.
(SSSHP 37)
1968 Teranishi, R., Umeda, N., "Use of pronouncing dictionary in
speech synthesis experiments", Proc. Sixth Intern. Congr.
Acoust., Tokyo, Japan, Aug. 1968, Paper B-5-2, B155-B158.
(B,K)
1968 Matsui, E., Suzuki, T., Umeda, N., Omura, H., "Synthesis of
fairy tales using an analog vocal tract", Proc. Sixth
Intern. Congr. Acoust., Tokyo, Japan, Aug. 1968, Paper
B-5-3, B159-B162. (K)
Reprint: "Grimm's Fairy Tales to Read Aloud", Wonder Books,
Inc., 1963. Pages 122 and 123 synthesized. (SSSHP 27)
1968 Umeda, N., and R. Teranishi, "The parsing program for
automatic text-to-speech synthesis developed at the
Electrotechnical Laboratory in 1968", IEEE Trans. ASSP-23, 183-188
(1975). (K)
SSSHP 70 Tape: "Tape 4, 'Sleeping Beauty' copied from the
original tape", copied from master by Hiroshi Omura,
ETL, June 27, 1989.
(English, male voice, 12 sen: Sleeping Beauty, "Once
upon a time ... King and Queen ... and fall down
dead."; child voice, 4 sen: "Once upon ... a daughter")
Cassette, good quality, *** use for master ***
SSSHP 26.4 Diskette: "Speech Synthesis by Rule", Electrotechnical
Laboratory, 1969.
(English, male voice, 4 sen: Sleeping Beauty, "Once
upon a time...King and Queen...have a child - a
daughter")
Plastic diskette, 33 1/3 rpm
SSSHP 39 Tape: (R. Teranishi's tape of SSSHP 26 Diskette)
Cassette, better quality than diskette
SSSHP 38 Tape: "Tape 4, 'Sleeping Beauty', synthesized by
rule, Electrotechnical Laboratory, Tokyo, August,
1968", copy of copy, R. Teranishi 3/2/89.
(English, male voice, 12 sen: Sleeping Beauty, "Once
upon a time ... King and Queen ... and fall down
dead."; child voice, 4 sen: "Once upon ... a daughter")
3" reel, good quality
SSSHP 32.24 Tape: "Text-to-Speech History, D. Klatt, ASA
Demo, Copy 2, 2/87". Demo to accompany "Review of
Text-to-speech conversion for English," D.H. Klatt,
JASA 82.3, Sept. 1987.
(English, male voice, 3 sen: Sleeping Beauty, "Once
upon a time ... King and Queen ... your wish shall be
fulfilled")
Cassette, good quality, Klatt MIT A/D and D/A
SSSHP 137 Tape: "Sleeping Beauty", E. Matsui et al., ETL,
Aug 1968. Another copy.
-------------------------------------------------------------
STAGE 4: COMPUTER SIMULATION OF VOCAL ORGAN SYSTEM (1968-1969)
A software-only simulation by Matsui that assembled "sound
elements" synchronized to pitch periods. Each sound element was
a waveform of the impulse response of the simulated vocal tract,
computed by using the Fourier Transform. The simulation was
implemented in PL/I on an IBM S/360 Model 75 computer; synthesis
took about 20 times real time.
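The pitch-synchronous assembly described above can be sketched in
Python. This is an illustration, not Matsui's PL/I program: the
sampling rate, element length, formant frequencies and bandwidths, the
all-pole response standing in for the simulated vocal tract, and the
simple overlap-add of elements are all assumptions chosen to keep the
example short.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz (assumed)
N_FFT = 256         # samples per sound element (assumed)
# Hypothetical (centre frequency, bandwidth) pairs in Hz for a neutral vowel.
FORMANTS = ((500.0, 60.0), (1500.0, 90.0), (2500.0, 120.0))


def tract_response(freq_hz):
    """All-pole frequency response standing in for the simulated tract."""
    s = 2j * math.pi * freq_hz
    h = 1.0 + 0j
    for fc, bw in FORMANTS:
        pole = complex(-math.pi * bw, 2.0 * math.pi * fc)
        h *= (pole * pole.conjugate()) / ((s - pole) * (s - pole.conjugate()))
    return h


def sound_element():
    """Impulse response obtained by an inverse discrete Fourier transform."""
    half = N_FFT // 2
    spec = [0j] * N_FFT
    for k in range(half + 1):
        spec[k] = tract_response(k * SAMPLE_RATE / N_FFT)
    spec[half] = complex(spec[half].real, 0.0)  # Nyquist bin must be real
    for k in range(1, half):
        spec[N_FFT - k] = spec[k].conjugate()   # Hermitian symmetry: real output
    element = []
    for n in range(N_FFT):
        acc = sum(spec[k] * cmath.exp(2j * math.pi * k * n / N_FFT)
                  for k in range(N_FFT))
        element.append(acc.real / N_FFT)
    return element


def synthesize(pitch_hz, duration_s):
    """Place one sound element at the start of every pitch period and sum."""
    element = sound_element()
    total = int(round(duration_s * SAMPLE_RATE))
    period = int(round(SAMPLE_RATE / pitch_hz))
    signal = [0.0] * (total + N_FFT)
    for start in range(0, total, period):
        for i, value in enumerate(element):
            signal[start + i] += value
    return signal[:total]


if __name__ == "__main__":
    wave = synthesize(pitch_hz=100.0, duration_s=0.1)
    print(len(wave), "samples")
```

With a constant pitch the output becomes periodic once the first
element has fully decayed. Changing the pitch only changes the spacing
between elements, and changing the sound means swapping in a different
element at the next pitch period.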
1968 Matsui, E., "Computer-simulated vocal organs", Proc. 6th
Int. Congr. Acoust., Tokyo, Japan, August 1968, Paper B-5-1,
B151-B154. (I)
1968 "Speech Researches in the Electrotechnical Laboratory",
handout at 6th ICA, Tokyo, Japan, 1968. (SSSHP 14)
SSSHP 68 Tape: "Tape 2, Dr. Matsui's demo in 1968", copied
from master by Hiroshi Omura, ETL, June 27, 1989.
(syn, English: "Computer simulated vocal organs.")
Cassette, good quality, tape was played at ICA
-------------------------------------------------------------
BIOGRAPHIES
EIICHI MATSUI
1944 Graduated, Tohoku Univ., Sendai
1951 Research assistant, Tohoku Univ., Sendai
1953 ETL Japan (except 1962-63 NBS Washington, D.C., USA)
1960 Ph.D. in Electroacoustics, Tohoku Univ., Sendai
1962-63 National Bureau of Standards, Washington, D.C., USA
1972 Professor, Sizuoka Univ.
1984 Professor, Fukui Institute of Technology
HIROSHI OMURA
1963 B.A. in Elect. Engineering, Nippon University
1965 M.A. in Elect. Engineering, Nippon University
1965 ETL, Japan
TORAZO SUZUKI
1949 B.A. in Electrical Engineering, Musashi Inst. of Technology
1949 ETL, Japan
1988 retired from ETL
RYUNEN TERANISHI
1950 B.A. and M.A. in Psychology, The Univ. of Tokyo
Research assistant, Hokkaido Univ., Sapporo
1958 ETL Japan (except 1965-66 P.D. Fellow, NRC Ottawa)
1962 Ph.D. in Psychology, The Univ. of Tokyo
1965-66 P.D. Fellow, National Research Council of Canada, Ottawa
1973 Professor, Kyushu Inst. of Design, Fukuoka
1975 Biography, photo in IEEE Trans ASSP, April 1975
1988 Fellow, Acoust. Soc. of America
NORIKO UMEDA
1957 B.A. in Linguistics, Univ. of Tokyo
1959 M.A. in Linguistics, Univ. of Tokyo
1962 Ph.D. in Linguistics, Univ. of Tokyo
ETL Japan
1969 Bell Labs, Murray Hill, NJ
1975 Biography, photo in IEEE Trans. ASSP, April 1975
1983 Prof. and Chairman, Dept. of Linguistics, 719 Broadway,
New York University, New York, NY 10003
1987 Director, Institute for Speech and Language Sciences,
New York University.
-------------------------------------------------------------
CONTRIBUTIONS AND REVIEW BY:
Dr. Takayuki Nakajima, Director
Machine Understanding Division
Electrotechnical Laboratory
1-1-4 Umezono
Tsukuba Science City 305, Japan
Prof. Ryunen Teranishi
Kyushu Institute of Design
4-9-1 Shiobaru, Minami-ku
Fukuoka-shi, 815 Japan
(History details were obtained from T. Nakajima and R. Teranishi,
in personal communications to H.D. Maxey in 1988/89. See SSSHP
JAPAN Electrotechnical Lab. file.)