Transcription of Recordings - p. 3

SSSHP 48: "TEXT-TO-SPEECH SYNTHESIS USING DYADS, M.Y. LIBERMAN
          & J.P. OLIVE, 12-88"

SOURCE:  Received 3/23/89 from Dr. James L. Flanagan, AT&T Bell
Laboratories. See SSSHP USA Bell Telephone Laboratories file.
Cassette, good quality.

CONTENTS: Text-to-speech using synthetic dyads, segments of human
speech stored as linear-prediction area parameters. Features are
good pronunciation of foreign names and other languages (example
of Mandarin Chinese). Example of man/machine interaction using a
speech recognition program.

   (syn, 2:11 min: "Hello, I am a system for real time translation
   of unrestricted text into speech, developed in the Information
   Principles Laboratory at AT&T Bell Labs. My input is ordinary
   text, and my output is the kind of speech that you are hearing
   now. My inventors think that my word pronunciation and
   intelligibility are the best around. I can pronounce proper
   names of many nationalities correctly, for instance ... easy
   for us to produce new voices, accents, or languages ...

   ... a system for Mandarin Chinese that sounds like this ...

   ... This is the Bell Laboratories Flight Information System.
   May I help you?")

   (human: "I want one first class seat on flight number three one
   to Denver on Sunday.")

   (syn: "I'm sorry, the flight is not available on Sunday.")

   (human: "I would like to leave on Saturday.")

   (syn: "Please specify your departure time.")

   (human: "I want to depart at nine AM.")

   (syn: "Flight number three one ... available.")

END:  **********



SSSHP 49: "DIGITAL SPEECH CODING, R. V. COX, 4-29-88"

SOURCE:  Received 3/23/89 from Dr. James L. Flanagan, AT&T Bell
Laboratories. See SSSHP USA Bell Telephone Laboratories file.
Cassette, good quality, some echo from print-through.

CONTENTS:  Several vocoders for coding human speech from 64
kilobit/sec to 2.4 kilobit/sec at AT&T Bell Laboratories. Human
narration is processed with different vocoder techniques.


1. PULSE CODE MODULATION (PCM) AT 64 KILOBITS/SEC.

   (human: "The speech samples that will be used ... well
   established standard for long distance communications.")


2. ADAPTIVE DIFFERENTIAL PULSE CODE MODULATION (ADPCM) AT 32
KILOBITS/SEC.

   (human: "This is 32 kilobits/second ...  a recently proposed
   standard for digital telephony.")


3. ADAPTIVE BIT-ASSIGNMENT SUB BAND CODING (SBC) AT 16
KILOBITS/SEC.

   (human: "This is 16 kilobits/second ... sub band coding.")


4. STOCHASTICALLY EXCITED LINEAR PREDICTIVE CODING (CELP) AT 8
KILOBITS/SEC.

   (human: "This is 8 kilobits/second ... for digital cellular
   mobile radio ... stochastically excited linear predictive
   coding.")


5. STOCHASTICALLY EXCITED LINEAR PREDICTIVE CODING (CELP) AT 4.8
KILOBITS/SEC.

   (human: "This is 4.8 kilobits/second ... stochastically excited
   linear predictive coder.")


6. LINEAR PREDICTIVE VOCODER (LPC 10 E) AT 2.4 KILOBITS/SEC.

   (human:  "This is 2.4 kilobits/second, a standard digital rate
   required in high frequency band radio links. ... provide
   synthetic sounding speech ... linear predictive vocoder.")
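
The six bit rates above can be compared with a little arithmetic. The
following is a minimal sketch, not part of the recording: it computes
each coder's compression ratio relative to the 64 kilobit/sec PCM
sample and the average number of bits per speech sample. The 8000
samples/sec rate is an assumption (typical for telephone-band speech)
and is not stated in this entry; the function and variable names are
illustrative only.

```python
# Bit rates (kilobits/sec) of the six coders listed in this entry.
CODERS = [
    ("PCM", 64.0),
    ("ADPCM", 32.0),
    ("SBC", 16.0),
    ("CELP", 8.0),
    ("CELP", 4.8),
    ("LPC 10 E", 2.4),
]

# ASSUMPTION: telephone-band sampling rate; the entry does not state it.
ASSUMED_SAMPLE_RATE_HZ = 8000


def coder_summary(coders=CODERS, sample_rate_hz=ASSUMED_SAMPLE_RATE_HZ):
    """Return (name, kbps, ratio vs. first coder, average bits per sample)."""
    reference_kbps = coders[0][1]
    rows = []
    for name, kbps in coders:
        ratio = reference_kbps / kbps
        bits_per_sample = kbps * 1000.0 / sample_rate_hz
        rows.append((name, kbps, ratio, bits_per_sample))
    return rows


if __name__ == "__main__":
    for name, kbps, ratio, bits in coder_summary():
        print(f"{name:9s} {kbps:5.1f} kbit/s  {ratio:5.2f}:1  {bits:4.2f} bits/sample")
```

Under the assumed sampling rate, the coders at 8 kilobits/sec and below
spend one bit or less per sample on average, which is only possible
because they transmit model parameters rather than individual samples.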

END:  **********



SSSHP 50: "J.L. Flanagan and K.I. Ishizaka, Vocal Cord-Vocal
          Tract Synthesizer, ASA Meeting, Austin, TX, 4-10-75"

SOURCE:  Received 3/23/89 from Dr. James L. Flanagan, AT&T Bell
Laboratories. See SSSHP USA Bell Telephone Laboratories file.
Cassette, some print-through echoes.

CONTENTS: Demonstration tape from Acoustical Society of America
meeting, Austin, Texas, April 10, 1975. Use of a vocal cord-
vocal tract synthesizer with the Coker text-to-speech articulatory
model at AT&T Bell Laboratories.

   (syn, syllables x3: "uhpa', a'buh, ha, ha'ha")

   (syn, x2, all noise generation active: "She saw the house.")
   (syn, x2, no glottal noise generation: "She saw the house.")
   (syn, x2, no constriction noise gen.:  "She saw the house.")
   (syn, x2, no noise generation:         "She saw the house.")
   (syn, x2, all noise generation:        "This is a tes(t).")
   (syn, x2, /t/ aspirated:               "This is a test.")

   (syn, x2: "We were away a year ago.")

END:  **********



SSSHP 51: "AT&T BELL LABS VODER FROM WORLD'S FAIR EXHIBITS
          (New York, San Francisco) 1939-40 era"

SOURCE:  Received 3/23/89 from Dr. James L. Flanagan, AT&T Bell
Laboratories. See SSSHP USA Bell Telephone Laboratories file.
Cassette, good quality, -10vu, a word (probably "comparatively")
seems to be missing from the narration.

CONTENTS:  Keyboard-operated 10-filter Vocoder.  Demonstrated at
the 1939 New York World's Fair and the 1940 San Francisco World's
Fair. US Patent No. 2,121,142. Unknown narrator introduces Mr.
Garrett, to explain the Voder, and Miss Helen Harper, to operate
it. Synthesis examples are each introduced by Mr. Garrett. Having
Mr. Garrett say the phrase first greatly helped the listeners to
understand the Voder.


   (human:  "There are ten filter circuits in the Voder, and
   combined with the two energy sources, they give a total of
   twenty separate components to be used in building up speech
   sounds. But now let's have Mr. Garrett and Miss Harper ...")

   (syn, 4 inflections: "She saw me.")

   (syn, 3 voices: "Greetings everybody.")

   (syn, 2 voices: "Mary had a little lamb. Its fleece was
   white as snow. And everywhere that Mary went, the lamb was
   sure to go.")

   (syn: "ha, ha, ha")

   (syn, creaky voice: "Yes, I feel very old.")

   (syn, monotone and vibrato: "a")

   (syn, musical scales: "a")

   (syn, song Auld Lang Syne: "Should auld acquaintance be forgot,
   and ... days of auld lang syne.")

END:  **********



SSSHP 52: "TEXT-TO-SPEECH SYNTHESIS FROM AN ARTICULATORY MODEL,
          C.H. Coker, 1-28-80"

SOURCE:  Received 3/23/89 from Dr. James L. Flanagan, AT&T Bell
Laboratories. See SSSHP USA Bell Telephone Laboratories file.
Cassette, good quality.

CONTENTS: AT&T Bell Laboratories text to speech using an
articulatory synthesizer.


   (syn, 18 sec: "Hello, I am a computer. I'm happy to ...
   demonstration of the way I speak.  Pretty good, isn't it?  When
   you consider that English is not my native language, and I'm
   still learning.  Next year, I will sound even better.")

END:  **********



SSSHP 58: "SPEECH CODING, demonstration recording to accompany
          paper of the same name, IEEE Trans. on Comm., Vol.
          COM-27, No. 4, April 1979, pp. 710-737"

SOURCE:  Received 3/23/89 from Dr. James L. Flanagan, AT&T Bell
Laboratories. See SSSHP USA Bell Telephone Laboratories file.
Phonograph record, 7" plastic diskette, 33 1/3 rpm

CONTENTS:

Examples of various types of vocoders.  A description of the
contents is in Appendix A of the referenced paper (copy in SSSHP
98 Reprints). This material is not included in the SSSHP.

END:  **********



SSSHP 59: "SYNTHETIC VOICES FOR COMPUTERS, demonstration
          recording to accompany paper of the same name, IEEE
          Spectrum, Vol. 7, No. 10, October 1970, pp. 22-45"

SOURCE:  Received 3/23/89 from Dr. James L. Flanagan, AT&T Bell
Laboratories. See SSSHP USA Bell Telephone Laboratories file.
Phonograph record, 7" plastic diskette, 33 1/3 rpm, 2 copies.

CONTENTS:  See cassette copy, SSSHP 99.

END:  **********



SSSHP 61: "NTT-4, Sadaoki Furui (NTT), Apr. 12, 1989"

SOURCE: Received 4/12/89 from Dr. Sadaoki Furui, NTT. See SSSHP
JAPAN Nippon Telegraph and Telephone Corp.  file.  7" reel, good
quality.

CONTENTS: Text-to-speech conversion system, LSP/CV synthesis in
1984. Synthesis of Japanese female voice. See SSSHP JAP NTT file
for transcription in kanji.

   (syn, 2 sen: "Jishinnoyoona toppatsugenshooo yochisurukotowa
   muzukashii. Jishinto iumonoga hontouni toppatsutekideatte,
   soreno okoru maeni zenchootekina monoga zenzen nanimo
   arawarenaitoiumonode arunaraba, yochiwa genritekini
   fukanoodearu.")

END:  **********



SSSHP 62: "SPEECH SYNTHESIS BY RULE SUPPLEMENTARILY USING NATURAL
          SPEECH SEGMENTS, NEC Corp."

SOURCE:  Received May 25, 1989, from Mr. Yukio Mitome, NEC Corp.,
Kawasaki, Japan.  See SSSHP JAPAN NEC Corp. file. Cassette, good
quality, -10vu.

CONTENTS:  Terminal analog speech synthesizer in 1968.  Small
digital computer (NEAC-L3) controlled a serial synthesizer.
Unvoiced natural speech segments on magnetic drum. Synthesis of
male and female speech.

   (human narration, in English)

   (syn, 5 variations: "towadako ewa aomori kara basu de ikimasu.")

   (syn, 7 phrases: "...")

END:  **********



SSSHP 63: "A TERMINAL ANALOG SPEECH SYNTHESIZER IN A SMALL
          COMPUTER (TV COMMERCIAL), NEC Corp."

SOURCE:  Received May 25, 1989, from Mr. Yukio Mitome, NEC Corp.,
Kawasaki, Japan.  See SSSHP JAPAN NEC Corp. file. Cassette, -15vu.

CONTENTS: Terminal analog speech synthesizer in 1971.
Serial-parallel synthesizer controlled by a NEAC-3100 computer.
Linear formant transitions.  Analog formant circuits tuned by
pulse width modulation. Synthesis of male Japanese voice.

   (human narration in English, 6 sec)

   (dialog between human child and synthesizer, 1:23 min)

   (human narration, 10 sec: "Hardware, software, ... NEC ...")

   (repeat of dialog between child and synthesizer, 1:23 min)

   (repeat of human narration, 10 sec: "Hardware, software, ...
   NEC ...")

   (syn only, 2:09 min: "Konbanwa ... desu.")

END:  **********



SSSHP 64: "JAPANESE SPEECH SYNTHESIS SYSTEM IN A BOOK READER FOR
          THE BLIND, NEC Corp."

SOURCE:  Received May 25, 1989, from Mr. Yukio Mitome, NEC Corp.,
Kawasaki, Japan.  See SSSHP JAPAN NEC Corp. file. Cassette, good
quality, -15vu.

CONTENTS:  Text to speech from image scanner to synthesized
speech, 1986.  Project sponsored by Ministry of International
Trade and Industry, Japan, as technology development for
handicapped persons. Synthesis of male Japanese voice.

   (human narration, in English, 14 sec)

   (syn, part of the novel, Bocchan, 1:08 min)

END:  **********



SSSHP 65: "SPEECH SYNTHESIS BY RULE and EFFICIENT SPEECH CODING,
          Hitachi, Ltd."

SOURCE: Received from Mr. S. Takeda, Hitachi, Ltd., Tokyo, Japan,
July 1, 1989.  See SSSHP JAPAN Hitachi, Ltd., file.  Hitachi
master tapes were transferred to Digital Audio Tape (DAT) in May,
1989. This cassette was made from the DAT in June, 1989. Cassette,
good quality.

CONTENTS: Narration by female in English


1. CASSY, CONFIGURATIONAL ANALOGUE SPEECH SYNTHESIZER, 1968

Demonstration tapes for Ichikawa, A., and Nakata, K., "Speech
synthesis by rule", Proc. 6th Int. Congr. Acoust., Tokyo, Japan,
paper B-5-6, B171-174, Aug 1968, and Nakata, K., A. Ichikawa, and
T. Miura, "Vocal tract analog speech synthesis by rule",
Preprints of Speech Symposium Kyoto, paper A-5, pp 1-5, Aug 1968.
Original recording Aug 1968, on 5" reel.  Song is "Ue o muite
arukou", or "Sukiyaki".

   A. (syn, male/female/child, short sentences in Japanese, x2, 35
      sec)

   B. (syn, song in Japanese, 1:45 min)


2. SYNTHESIS RULES FOR EXPRESSION OF EMOTIONS, 1967

Demonstration tape for Nakayama, T., A. Ichikawa, and T. Miura,
"Generalization of control rule of buzz source to improve the
naturalness of synthetic speech," 6th Int. Congr. Acoust., paper
B-5-5, B167-170, Aug 1968.  Sentence "Aoi ie o uru" ("To sell a
blue house.") with 9 variations in emphasis.  Original adult
female speech is converted to adult male voice by rule.  Original
recording Mar 1967, on 5" reel.  Use left channel.

   A. (syn, declarative sentence:          "Aoi ie o uru.")

   B. (syn, interrogative sentence:        "Aoi ie o uru.")

   C. (syn, emphasis on the first phrase:  "Aoi ie o uru.")

   D. (syn, emphasis on the second phrase: "Aoi ie o uru.")

   E. (syn, emphasis on the third phrase:  "Aoi ie o uru.")

   F. (syn, anger:                         "Aoi ie o uru.")

   G. (syn, joy:                           "Aoi ie o uru.")

   H. (syn, sorrow:                        "Aoi ie o uru.")

   I. (syn, anger with a loud voice:       "Aoi ie o uru.")


3. MULTIPLEXED SPEECH SYNTHESIZER FOR SPEECH SYNTHESIS BY RULE

Demonstration tape for Kimura, Y., A.  Ichikawa, K.  Nakata, T.
Hyodo, and T. Aso, "Development of audio response unit",
Information Processing in Japan, 12, 1-7, Dec 1972.  Original
recording Feb 1972, on 7" reel.  Part A is output sentences for
seat reservations.  Part B is words for use in output sentences
for seat reservations.  Low level (-20 vu), use left channel.

   A. (syn, sen in Japanese, 2:20 min)

   B. (syn, words in Japanese, 2:05 min)


4. MULTIPLEXED SPEECH SYNTHESIS METHOD FOR SPEECH SYNTHESIS BY
   RULE

Demonstration tape for Ichikawa, A., and K. Nakata, "A method of
a speech segments generation in the speech synthesis of
mono-syllables edition," Transactions of the Institute of
Electronics and Communication Engineers of Japan, D, 58-D, 9,
522-529, Sep 1975, in Japanese. Part A, synthesis by rule
sentences, was originally recorded Aug 1975, on 7" reel. Part B,
corporation names synthesized and inserted in natural speech
sentences, was originally recorded Feb 1974, on 7" reel. Level -10
vu, use left channel.

   A. (syn, female Japanese, syn by rule, 13 sen)

   B. (syn, female Japanese, Corp. names in natural sen, 45 sec)


5. SPEECH SYNTHESIS BY RULE FOR JAPANESE WORDS, 1984

Speech synthesis using PARCOR synthesizer. Fundamental frequency
contours were produced based on Fujisaki's Model (2nd order linear
system). Male Japanese synthesis of names of places, people, and
corporations.  Original recording Mar 1984, on cassette. Level
-5vu.

   (syn, male Japanese, 18 words)
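
The "2nd order linear system" mentioned above refers to the way the
Fujisaki model builds a fundamental frequency contour: phrase commands
(impulses) and accent commands (steps) are each passed through a
critically damped second-order filter and summed on a logarithmic
frequency scale. The sketch below follows the commonly published form
of the model. It is not the Hitachi implementation; the constants
alpha, beta and gamma and all command values are illustrative
assumptions, and the function names are hypothetical.

```python
import math


def phrase_response(t, alpha=3.0):
    """Impulse response of the second-order phrase filter (zero for t < 0)."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t, beta=20.0, gamma=0.9):
    """Step response of the second-order accent filter, clipped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def log_f0(t, base_hz, phrases, accents):
    """ln F0(t) = ln Fb + sum Ap*Gp(t-T0) + sum Aa*(Ga(t-T1) - Ga(t-T2)).

    phrases: list of (amplitude, onset_sec)
    accents: list of (amplitude, start_sec, end_sec)
    """
    value = math.log(base_hz)
    for amp, onset in phrases:
        value += amp * phrase_response(t - onset)
    for amp, start, end in accents:
        value += amp * (accent_response(t - start) - accent_response(t - end))
    return value


def f0_contour(times, base_hz, phrases, accents):
    """Fundamental frequency in Hz at each time in `times` (seconds)."""
    return [math.exp(log_f0(t, base_hz, phrases, accents)) for t in times]


if __name__ == "__main__":
    # ASSUMPTION: illustrative commands for a short word, not measured data.
    times = [i * 0.05 for i in range(21)]
    contour = f0_contour(times, 100.0, [(0.5, 0.1)], [(0.4, 0.3, 0.6)])
    for t, f0 in zip(times, contour):
        print(f"{t:4.2f} s  {f0:6.1f} Hz")
```

Before any command begins the contour sits at the base frequency; each
accent command raises it for the duration of the step, and the phrase
component decays slowly over the utterance.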


6. SPEECH SYNTHESIS BY RULE FOR JAPANESE SENTENCES

Demonstration tape for Takeda, S., "A study of methods for
improving quality of female speech produced by residual-excited,
rule-based synthesis system," Inst. of Elect. Info. and Comm.
Engineers Technical Report (Speech), SP89-3, 17-24, May 1989, in
Japanese. Original recording May 1989, on cassette. Level -10vu,
both channels.

   (syn, sample scientific sentence, 13 sec)


7. LSI PARCOR VOCODER

The first speech synthesis chips in Japan. Demonstration tapes for
Sampei, T., A. Asada, and K. Nakata, "High quality PARCOR speech
synthesizer," IEEE Trans. CE., 26, 8, 353-358, Aug 1980. Original
recordings Apr 1980 (A), May 1980 (B and C), and Sep 1980 (D), all
on cassette. Level -10vu.

Synthesis at 2.4 kilobits per second.

   A. (syn, male English words, x3: "Car. Enemy. Play. Player.
      Off. Dial. -?-. -?-. Tire. Push. -?-. Two. Danger. Over.
      -?-.")

   B. (syn, female Japanese, 50 words and phrases)

   C. (syn, female Japanese, 3 sen)

Synthesis at 9.6 kilobits per second.

   D. (syn, female Japanese, 2 sen, x2)

      (syn, female English, x2:  "It's time to take your medicine.
      It's time for your appointment.")

      (syn, male Japanese, The Hyakunin-isshu, selected wakas, 55
      sec)


8. LPC PARCOR VOCODER USING HITACHI DSP CHIPS

Demonstration tape for Nakata, K.  and T. Miyamoto, "An
implementation of real time PARCOR analysis by high speed signal
processors," 11th Int. Congr. Acoust., Paris, 125-128, 1983.
Original recording Apr 1982, on cassette. Left channel contains
input speech. Right channel contains LPC VOCODER speech at 9.6
kilobits per second.

   A. (male Japanese, speaker 1, 10 sen)

   B. (male Japanese, speaker 2, 10 sen)

   C. (female Japanese, speaker 1, 10 sen)

   D. (female Japanese, speaker 2, 10 sen)


9. TOR VOCODER

Demonstration tape for Ichikawa, A., S. Takeda, and Y. Asakawa,
"A speech coding method using thinned-out residual," ICASSP85,
25.7, 961-974, Mar 1985. Original recording Apr 1985, on cassette.
Level -10 vu. Samples are input speech, followed by Vocoder at 16,
9.6, and 8 kilobits per second.

   A. (female Japanese, input and 3 rates, 4 sen)

   B. (male Japanese, input and 3 rates, 4 sen)

Demonstration tape for Miyamoto, T., K. Kondo, T. Suzuki, Y.
Asakawa, and A. Ichikawa, "Single DSP 8 kbps speech codec,"
ICASSP86, 33.10, 1717-1720, Apr 1986. English sentence, input
speech followed by Vocoder at 8 kilobits per second. Level -10vu.

   C. (female, input and 8 kbps: "Japanese women tend to hide
      their mouth with their hands when laughing.")

      (male, input and 8 kbps: "Tokyo is Japan's largest city,
      holding almost ten percent of Japan's total population.")

END:  **********



SSSHP 66: "SMITHSONIAN SPEECH SYNTHESIS PROJECT, #808, HASKINS
          LABORATORIES, Dr. Patrick W. Nye"

SOURCE: Received 8/1/89 from Dr. Patrick W. Nye, Haskins
Laboratories. See SSSHP USA Haskins Laboratories file. 7" reel,
good quality.

CONTENTS:


   (human: "These recordings from the sound archives of Haskins
   Laboratories, were assembled during the months of February and
   March, 1989. The recordings are introduced by Patrick Nye.")


1. The first utterance produced by the Pattern Playback, circa
1949. Copy of spectrogram from Bell Telephone Laboratories.

   (syn: "Eat at Joe's. ")


2. Sample of synthetic versions of Harvard sentences recorded
circa 1950 from the Pattern Playback.

   (syn: "The gift of speech was denied the poor child. Never
   kill a snake with your bare hands. Death marks the end of our
   efforts. The gift of speech was denied the poor child. Never
   kill a snake with your bare hands.")


3. Demonstration accompanying paper by Franklin Cooper, given at
the National Academy of Sciences, October 10, 1950.

   (syn, 3 variations: "Many are taught to breathe through the
   nose."; syn, x2: "A large size of stockings is hard to sell.";
   syn: "Bae, dae, gae.")


4.  Typical Pattern Playback synthesis of this period included
three methods, illustrated by the following samples.  First is
transmission through a transparent (photographic film) spectrogram
of natural speech.  Second is reflection from a carefully painted
copy of a natural spectrogram.  Third is reflection from a
hand-painted simplified spectrogram.

   (syn, 3 methods, x3: "Never kill a snake.")


5. Four other sentences produced by the Pattern Playback, circa
1953.

   (syn: "These days a chicken leg is a rare dish.  It's easy to
   tell the depth of a well.  Four hours of steady work faced us.
   A large size of stockings is hard to sell.")


6. Studies on the Pattern Playback of the role of consonant-
vowel transitions in the perception of stop and nasal consonants,
1954.

   (syn: "Ba, da, ga. Pa, ta, ka. Am, an, ang. Ba, pa, am.")

1955 Delattre, P., A.M. Liberman, and F.S. Cooper, "Acoustic loci
     and transitional cues for consonants," J. Acoust. Soc. Amer.
     27, 769-774 (1955). Locus theory for CV syllables.  (K)


7. The following two sentences were synthesized by rule by Pierre
Delattre in 1955.

   (syn: "Oh my aching back. A big bad man demanding money can
   kill you, bang bang.")


8. The musical composition "Scotch Plaid" composed by Pierre
Delattre and played on the Pattern Playback, about 1957.

   (syn, Scotch Plaid, 3 speeds, 30 sec)


9. Study on Pattern Playback of incremental variations of
   formant transitions of stop consonants, 1957.

   (syn, 13 CV's: "Bae,..., dae,..., gae.")


10. Synthesis by rule on Pattern Playback by Frances Ingemann,
1959.

   (syn: "I synthesized this by rule without looking at a
   spectrogram. Can you understand it?")


11. Octopus formant synthesizer, 1958.

   (syn, x2: "Box, socks, sacks, sack, sash, gash, gas, guess,
   yes.")


12. Last words of the Octopus, on its performance, in 1958.

   (syn: "No comment.")


13. The Voback, a Voder-type synthesizer. Synthesis prepared by
rule by Pierre Delattre, 1957.

   (syn, 2 rates: "Alexander's an intelligent conversationalist.")


14. Syllables on the Voback, for a demonstration tape made for
the 4th International Congress of Phonetic Sciences, Helsinki,
1961.

   (syn, x2: "Fa, tha, sa, sha.  Sha, dza, cha, dza.")


15. Examples of the fundamental frequency and rate control
available with the Voback's vocoder, 1963. The vocoder's control
voltages were recorded and replayed on a multi-speed 22-track
tape recorder. The original utterance comes first, followed by
Voback output with modified intonation or speed.

   (English, original and Voback with different intonation:
   "Wouldn't you like to know? The play ended happily.")

   (Thai, original and Voback with different intonation:
   "Nau .....")

   (English paragraph, original and Voback at slower speed: "Solar
   photography is ... only star.")


16. Haskins parallel formant synthesizer. Synthesis by Jane
   Gaitenby, 1967.

   (syn: "There are many books of Christmas stories for children,
   but this one deserves your very special attention. Will you
   have another helping of roast beef? No, thank you, the last one
   was absolutely disgusting. Se tu, fini. Or do you prefer, se
   tu, fini. ... from a very big machine. This is the kind of
   speech we can expect from a reading machine. Three stress
   levels are used in this demonstration. Three stress levels
   are used in this demonstration.")


17. Examples of synthesis by rule of General American English
with the parallel formant synthesizer, from the thesis of Ignatius
Mattingly in 1968.

   (syn, 3:59 min:  "You are listening to speech ...  (V's and
   VCV's) ...  stress and intonation is also distinctive ...
   (single and polysyllabic words and phrases) ... Give me a
   breath of fresh air. A penny saved is a penny earned. A bird in
   the hand is worth two in the bush. The rain in Spain falls
   mainly in the plain. Now is the time for all good men to come
   to the aid of the party. Are you nearly ready? Emily Post said,
   'If two are served, all may begin'. The most difficult step in
   the study of language is the first step. If you receive a
   malicious or annoying phone call ...  telephone business
   office.  Ignatius Mattingly.")


18.  Haskins formant synthesis.  Presented at IEEE Int.  Conf. on
Communications, Boulder CO, June 9-11, 1969.  Umeda parsing
procedure with Mattingly synthesis-by-rule.  First passage was
synthesized by art using hand selected phonetics, stress, etc. and
increasing speaking rate toward the end of the passage. The second
passage was synthesized entirely by rule.

   (syn, The North Wind and the Sun:  "The North Wind and the Sun
   were arguing one day about which of them ...  wrapped in a warm
   coat ...  Sun was the stronger of the two.")

   (syn, The Bluejay by Mark Twain, 1:22 min:  "Some early
   research in animal communication, by Mark Twain.  ...  stuck
   for a word.")


19.  Haskins formant synthesis.  Demonstration of word and stress
rules for a reading machine, by Gaitenby, Sholes, and Kuhn, at the
Conference on Speech Communication and Processing, Newton, Mass.,
1972.

   (syn, 2:21 min:  "This is the voice of the synthesizer at
   Haskins Laboratories.  There are two main parts of Mattingly's
   speech synthesis program.  The first part consists of a table
   of standard American English phonemes, and the second part
   consists of digital instructions for combining the phonemes
   into syllables with reasonable intonation.  ...  (12 example
   phrases) ...  so many other possibilities.")


20. Haskins formant synthesis. Examples of Modified Rhyme Test
stimuli for consonant intelligibility studies, 1973, by Nye and
Gaitenby.

   (syn, 8 sen: "Number 1, please mark the word, sill. Number 2,
   please mark the word, look. Number 3, please mark the word,
   best. ... mark the word, dust.")


21. Haskins formant synthesis by rule, 1973.  Examples from
reading comprehension tests for a prototype reading machine
service for the blind.

   (syn, 30 sen, 4:55 min: "Just as a plan to climb ...  and
   cross the ocean at top speed, and baseball players improve
   their batting averages, records are made and broken year after
   year in driving tunnels.  Tunnel men long remember tales of
   tunnels that were hard to ....  always lurking.")


22. Haskins formant synthesis. Examples from intelligibility tests
of synthetic monosyllable words in short, syntactically normal
sentences. By Nye and Gaitenby, 1974. Haskins Anomalous sentences.

   (syn, 7 sen: "The wrong shot led the farm. The black cup
   ran the spring. ....  The rich paint said the land.")


23. Extract from a textbook, synthesized on the OVE-III, firstly
   by rule, and then by art. Sample produced in 1976.

   (syn, 1:24 min: "Perception. The first step in the study of
   perception is to describe our immediate experience of the
   physical world in which we live. ... context in which they
   occur.")

END:  **********



SSSHP 67: "TAPE 1, SYNTHESIZED SPEECH REPORTED IN THE 1966 ETL
          NEWS #197, Hiroshi Ohmura, ETL"

SOURCE:  Received October 7, 1989, from Dr. Takayuki Nakajima,
Director, Machine Understanding Division, Electrotechnical
Laboratory, Tsukuba Science City, Japan. See SSSHP JAP Electro
Technical Laboratory file.

CONTENTS:  Synthesized speech as reported in the 1966 ETL News
#197.  Narration in English.  Synthesis of Japanese.  Cassette,
good quality.


1. Five Japanese vowels.

   (syn, x2: "A, i, u, e, o.")


2. Nasal /m/ in CV-syllables. -10vu

   (syn, x2: "Ma, mi, mu, me, mo.")


3. Nasal /n/ in CV-syllables. -10vu

   (syn, x2: "Na, ni, nu, ne, no.")


4. Fricative /s/ in CV-syllables. -10vu

   (syn, x2: "Sa, si, su, se, so.")


5. Words: cow, horse, dog, wild boar.

   (syn, x2: "Usi, uma, inu, inoshishi.")


6. Phrases: Nice meal. Nice sashimi. Nice sushi. Sweet pear.

   (syn, x2: "Umai meshi. Oishii sashimi. Umai osushi. Amai
   nashi.")


7. Phrases: Blue sea. Shallow sea. Cold morning. Heavy stone.
Small house.

   (syn, x2: "Aoi umi. Asai umi. Samui asa. Omoi ishi. Semai
   ie.")

END:  **********



SSSHP 68: "TAPE 2, DR. MATSUI'S DEMO IN 1968, Hiroshi Ohmura ETL"

SOURCE:  Received October 7, 1989, from Dr. Takayuki Nakajima,
Director, Machine Understanding Division, Electrotechnical
Laboratory, Tsukuba Science City, Japan. See SSSHP JAP Electro
Technical Laboratory file.

CONTENTS: Synthesized speech. Dr. Matsui reported in ICA paper
B-5-1 in 1968. Computer assembled speech in English. Narration in
English. Cassette, good quality, -5vu.

   (syn, x3: "Computer simulated vocal organs.")

END:  **********



SSSHP 69: "TAPE 3, SPEECH SOUNDS OF THE ETL ACOUSTIC MODEL IN
          1965, Hiroshi Ohmura ETL"

SOURCE:  Received October 7, 1989, from Dr. Takayuki Nakajima,
Director, Machine Understanding Division, Electrotechnical
Laboratory, Tsukuba Science City, Japan. See SSSHP JAP Electro
Technical Laboratory file.

CONTENTS: Synthesized speech sounds of an acoustic model
simulating a human vocal tract. Dr. Umeda and Dr. Teranishi
reported in the 1965 ETL News #181, and paper "Phonemic feature
and vocal feature," J.A.S.J. 22, 4, pp 195-203, 1966.  Cassette,
good quality.


1. Japanese vowels for male, female, and child.

   (syn, 3 voices, x2: "A, i, u, e, o.")


2. Unnatural vowel sound of child's vocal tract and male's glottal
source.

   (syn, x2: "A, i, u, e, o.")


3. Unnatural vowel sound of male's vocal tract and child's glottal
source.

   (syn, x2: "A, i, u, e, o.")


4. Vowels and nasalized vowels.

   (syn, vowel and nasalized vowel, x2: "A, a, i, i, u, u, e, e,
   o, o.")

END:  **********



SSSHP 70: "TAPE 4, 'SLEEPING BEAUTY' COPIED FROM THE ORIGINAL
          TAPE, Hiroshi Ohmura ETL"

SOURCE:  Received October 7, 1989, from Dr. Takayuki Nakajima,
Director, Machine Understanding Division, Electrotechnical
Laboratory, Tsukuba Science City, Japan. See SSSHP JAP Electro
Technical Laboratory file.

CONTENTS: Grimm's fable, Sleeping Beauty, copied from the original
tape. Narration in English, synthesis in English.  Cassette, good
quality, use for master.


   (syn, male, 12 sen, 2:09 min: "Once upon a time ...  King and
   Queen ...  and fall down dead.")

   (syn, child, 4 sen, 31 sec: "Once upon ... a daughter.")

END:  **********



SSSHP 71: "TAPE 5, JAPANESE 'TONGUE TWISTERS' COPIED FROM THE
          ORIGINAL TAPE, Hiroshi Ohmura ETL"

SOURCE:  Received October 7, 1989, from Dr. Takayuki Nakajima,
Director, Machine Understanding Division, Electrotechnical
Laboratory, Tsukuba Science City, Japan. See SSSHP JAP Electro
Technical Laboratory file.

CONTENTS: Japanese tongue twisters, copied from the original tape.
Narration in English, synthesis in Japanese. Cassette, good
quality, use for master. Four phrases, three tempos each.

   (syn: "Tonari no kyaku ha yoku kaki kuu kyakuda.")

   (syn: "Bouzu ga byoubu ni zyouzu ni bouzu no e o kaita.")

   (syn: "Namagome namamugi namatamago.")

   (syn: "Toukyou tokkyo kyokakyoku.")

END:  **********



SSSHP 72: "SMITHSONIAN SEMINAR, H.D. MAXEY 6/87"

SOURCE: Donated by H.D. Maxey, 5/27/90.

CONTENTS:  Demonstration tape for seminar on the Smithsonian
Speech Synthesis History Project at the Smithsonian. Most of the
samples are taken from tape SSSHP32, "Text-to-Speech History, D.
Klatt, ASA Demo, Copy 2, 2/87".  Demo to accompany "Review of
Text-to-speech conversion for English," D.H. Klatt, JASA 82.3,
Sept. 1987. Outline of talk is in the SSSHP files. Cassette, good
quality.


1. The Voder, Bell Telephone Laboratories, 1939.  Manually operated
vocoder. Copy of SSSHP 32.1.

   (Radio Announcer: "Will you please make the Voder say for our
   Eastern listeners, 'Good evening, radio audience'?")

   (syn:  "Good evening, radio audience.")

   (Radio Announcer:  "And now for our Western listeners say,
   'Good afternoon, radio audience.'")

   (syn:  "Good afternoon, radio audience.")


2. DAVO, Massachusetts Institute of Technology's Electronic Vocal
Tract, 1958.  Articulatory synthesizer with hand prepared control
data. Copy of SSSHP 32.11.

   (syn: "This is the voice of DAVO at MIT."; song: "A, B, C, ...")


3. Haskins Laboratories' Pattern Playback, 1951. Fifty-channel
spectrum synthesizer at multiples of 120 Hz. Copy of SSSHP 32.2.

   (syn: "These days a chicken leg is a rare dish.  It's easy to
   tell the depth of a well.  Four hours of steady work faced us.
   A large size in stockings is hard to sell.")
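
The "multiples of 120 Hz" description above fixes the frequency grid of
the synthesizer. As a small sketch, the channel frequencies can be
listed directly. It assumes the fifty channels are the first fifty
integer multiples (120 Hz, 240 Hz, and so on), which this entry does
not state explicitly; the function name is illustrative.

```python
# Values taken from the entry above.
NUM_CHANNELS = 50
SPACING_HZ = 120


def channel_frequencies(num_channels=NUM_CHANNELS, spacing_hz=SPACING_HZ):
    """Frequencies at integer multiples of spacing_hz.

    ASSUMPTION: the first channel sits at 1 x spacing_hz.
    """
    return [k * spacing_hz for k in range(1, num_channels + 1)]


if __name__ == "__main__":
    freqs = channel_frequencies()
    print(f"{len(freqs)} channels from {freqs[0]} Hz to {freqs[-1]} Hz")
```

Under that assumption the top channel lies at 6000 Hz.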


4. Royal Institute of Technology's OVE-I, 1953. Hand controlled
formant synthesis. Copy of SSSHP 32.4.

   (syn: "How are you?  I love you.")


5. PAT, Signals Research and Development Establishment, England,
1953. Hand prepared formant synthesis. Copy of SSSHP 32.3.

   (syn: "What did you say before that?  Tea or coffee?  What have
   you done with it?")


6. PAT, Edinburgh University, 1962. Hand prepared formant
synthesis. Copy of SSSHP 32.5.

   (syn: "Welcome to the Stockholm Speech Communication Seminar.")


7. Royal Institute of Technology's OVE-II, 1961. Hand prepared
formant synthesis. Copy of SSSHP 32.7.

   (syn, male: "I enjoy the simple life.")
   (human: "I enjoy the simple life.")
   (syn, fem: "He knows just what he wants.")
   (human: "He knows just what he wants.")


8. TASS-II, International Business Machines Corp., 1965. Hand
prepared formant synthesis. First sample is repeated, each time
controlling an additional parameter. Narration by N.R. Dixon.

   (syn, 7 variations: "Inventory Number 902, Series A.")

   (syn: "Now this is Old Ironjaw talkin'.")


9. First computer-based phonemic synthesis by rule program, Bell
Telephone Laboratories, 1961. Copy of SSSHP 32.16.

   (syn: "To be, or not to be. That is the question. Whether it is
   nobler in the mind to suffer the slings and arrows of
   outrageous fortune or to take arms against a sea of troubles,
   and by opposing, end them.")


10.  TASS-II, phonetic synthesis by rule using diphone
concatenation, International Business Machines Corp, 1968. Copy of
SSSHP 32.18.

   (syn:  "The number you dialed, ME1-5280, has been changed. The
   new number is PA6-1347.  This is a recording.  I'm sorry,
   you've reached this office by mistake.  Please consult your
   directory and dial again.")


11. First full text to speech system, Electro Technical
Laboratory, Japan, 1968. Copy of SSSHP 32.24.

   (syn: "Once upon a time, there lived a King and Queen who had no
   children.  Not a day passed but that the Queen did not say, 'If
   only we had a child.' One day as the Queen was walking beside
   the river, a little fish lifted its head out of the water and
   said, 'Dear Queen, your wish shall be fulfilled.'")


12. Commercial system, PROSE 2000, Speech Plus Inc., 1982. Copy of
SSSHP 32.32.

   (syn:  "Four hours of steady work faced us.  A large size in
   stockings is hard to sell.  The boy was there when the sun
   rose.  A rod is used to catch pink salmon.")


13. Klattalk system, Massachusetts Institute of Technology, 1983.
Basis for DECtalk system. Copy of SSSHP 32.33.

   (syn: "Text-to-speech systems are beginning to be applied in
   many ways, including aids for the handicapped, medical aids,
   and teaching devices.  The first kind of aid to be considered
   is a talking aid for the vocally handicapped.  According to the
   American Speech and Hearing Association, there are over one
   million people in the United States who are unable to speak for
   one reason or another.  Any person in this group who can use a
   typewriter keyboard, or point at some kind of communication
   board, is a potential user of a text-to-speech system.")


14. Several of the DECtalk voices, Digital Equipment Corp., 1986.
Copy of SSSHP 32.35.

   (syn: "I am Beautiful Betty, the standard female voice.  Some
   people think I sound a bit like a man.")

   (syn: "I am Huge Harry, a very large person with a deep voice.
   I can serve as an authority figure.")

   (syn: "My name is Kit the Kid, and I am about ten years old,
   and I sound like a boy or a girl.")

   (syn: "I am Whispering Wendy and I have a very breathy voice
   quality. Can you understand me even though I am whispering?")


15. DECtalk speaking at about 300 words/minute, 1986. Copy of
SSSHP 32.36.

   (syn: "The following is a list of topics in today's news.  In
   the sports world the Red Sox lost to Detroit.  First round
   matches were played in Wimbledon tennis tournament.  Arnold
   Palmer won the Senior Golf Tourney in La Grove, Penn.  In local
   news, there was a five alarm fire in Cambridge.")

END:  **********



SSSHP 73: "MAXEY SMITHSONIAN SEMINAR 5/26/89"

SOURCE: Donated by H.D. Maxey, 5/27/90.

CONTENTS:  Demonstration tape for progress report seminar on the
Smithsonian Speech Synthesis History Project at the Smithsonian.
A few selections from the tape collection, covering a time span
of about 50 years.  Outline of talk is in the SSSHP files.
Cassette, good quality.


1. H.M. Truby and TASS-II formant synthesizer, International
Business Machines Corp., 1964. Truby seems to be teaching the
machine to utter vowel sounds.

   (human and syn: "er, ae, aw, ou, uh, u, oo, ie")


2. The Voder, Bell Telephone Laboratories, 1939.  Manually operated
vocoder. From SSSHP 51. Mr. Garrett demonstrating the Voder, which
is operated by Miss Helen Harper. Good quality in a 50-year-old
tape.

   (human: "There are ten filter circuits in the Voder ...")

   (syn: "... feel very old.")


3. OVE-II, Royal Institute of Technology, 1962.

   (syn: "Welcome to the Stockholm Speech Communication Seminar.
   Hello, is Docent Fant there? ... 23 65 20?")


4. First full text to speech system, Electro Technical Laboratory,
Japan, 1968. From SSSHP 70.

   (syn, male, 12 sen, 2:09 min: "Once upon a time ...  King and
   Queen ...  wish shall be fulfilled.")


5. Klattalk system, Massachusetts Institute of Technology, 1983.
Basis for DECtalk system of the Digital Equipment Corp. Currently
the best commercial system for General American English.  (SSSHP
32.33)

   (syn: "Text to speech systems ...text to speech system.")

END:  **********



SSSHP 77: "JSRU DEMO SYNTHESIS 1965. Synthesis demonstration from
          JSRU, England, prepared by John Holmes, 1965 plus other
          brief excerpts of unknown origin."

SOURCE: Received from Dr. K. N. Stevens, MIT, July 31, 1990. Copy
of D.H. Klatt's copy, received Oct 7, 1986, from Dr. John
Holmes, JSRU England.  See SSSHP UK JSRU file.  5" reel. Master
tape is at JSRU.

CONTENTS:  Human narration, synthesis of British English.


1. Comparison of synthesis and a natural sentence, John Holmes
using his parallel formant synthesizer at JSRU, 1973, to closely
copy a natural utterance.

   (syn?: "I enjoy the simple life, as long as there is plenty of
   comfort.")

   (syn?: "I enjoy the simple life, as long as there is plenty of
   comfort.")

   (syn?: "I enjoy the simple life, as long as there is plenty of
   comfort.")


2. Formant speech synthesis by rule at JSRU, 1965.  Associated
written description is JSRU Ref. JU/SCM/54, 26th March, 1965,
which contains a listing of the computer input for the example
synthesis. A copy is in SSSHP UK JSRU file.

   (human: "This is a demonstration record of speech synthesis by
   rule, made at the Joint Speech Research Unit in March, 1965.
   The demonstrations are explained in an associated written
   description.")

The demonstrations are of three types:

    i: Direct synthesis. Control parameters are manually copied
       from a spectrogram of a natural utterance.

   ii: Synthesis by rule from phonetic input with copied timing.
       Synthesis durations, only, are taken from a natural
       utterance.

  iii: Synthesis by rule from phonetic input with stress marks.
       Rules control pitch and timing.


   2.1 (syn, type i, "A bird in the hand is worth two in the
       bush.")

   2.2 (syn, type ii, "A bird in the hand is worth two in the
       bush.")

   2.3 (syn, type iii, "A bird in the hand is worth two in the
       bush.")

   2.4 (syn, type ii, "The process of amplitude modulation of a
       high frequency carrier wave has long been familiar to
       communications engineers, and, for many years, it has been
       the most commonly used method for transmitting the
       waveforms of speech and music, over radio telephony
       channels.")

   2.5 (syn, type iii, "It was the last thing I expected to find
       there.")

   2.6 (syn, type iii, "Did you come by motorcar?")

   2.7 (syn, type iii, "I'm going home now.")

   2.8 (syn, type iii, "Someone, somewhere, wants a letter from
       you.")

   2.9 (syn, type iii, "I've called several times and never found
       you there.")

   2.10 (syn, type iii, "Like most old people, he was fond of
       talking about old days, and that he had known hosts of
       interesting and important men, had a tenacious memory, and
       spoke the most finished English. It was a pleasure to
       listen to his reminiscences.")


3. Comparison of synthesis and a natural sentence, John Holmes
using the OVE-II formant synthesizer at RIT, 1961, to closely
copy a natural utterance of male and female speech.

   (human:    "I enjoy the simple life.")
   (male syn: "I enjoy the simple life.")
   (human:    "I enjoy the simple life.")

   (human:   "He knows just what he wants.")
   (fem syn: "He knows just what he wants.")

END:  **********



SSSHP 80: "SCP MASTER #942, IBM T67.1".

SOURCE: Donated by H.D. Maxey, Aug 23, 1990, from IBM tape
collection. See SSSHP USA AFCRL (Air Force Cambridge Research
Laboratories) file. 7" reel, good quality.

CONTENTS: Human speech source tape for testing various speech
processing techniques.  Prepared by Caldwell Smith, AFCRL, L.G.
Hanscom Field, Bedford, Mass., May 1967. Tape was sent to requesting
companies for use with their speech processing equipment. The
resulting tapes were sent to AFCRL for inclusion in the Speech
Analysis/Synthesis Survey Tape that was presented to the 1967
Conference on Speech Communication & Processing, MIT, November
6-8, 1967 (see SSSHP 81 tape.) The following transcription
accompanied this source tape.


LOG SHEET:  SPEECH RECORDINGS FOR ANALYSIS/SYNTHESIS SURVEY


1. CALIBRATIONS:

a.  The recordings are full-track on 1/4-inch magnetic tape,
recorded at 7-1/2 ips tape speed.  NAB equalization was used.

b.  At the start of the tape, a 10 kHz tone of approximately 1
minute duration is recorded for head alignment check.  It is
followed by a 1 kHz tone of approximately 10 sec duration,
recorded at 0 vu reference level.  The average speech peaks are
recorded approximately -3 db relative to this tone.

c.  At the conclusion of the tape, a frequency calibration test
run is recorded.  The measured frequency response for each tape
was calibrated with this test run, and the overall record/playback
response calibration is inclosed with each reel of tape.


2. SPEAKERS:

a. Three male speakers were recorded, representing neutral pitch
(C.H.), low pitch (J.C.), and high pitch (B.H.).

b. Each section of text is read by the three speakers in turn.
(There are minor differences where word groups were read twice by
some speakers.)


3. TEXT OVERVIEW:

a. Sentence clusters. Each cluster consists of a group of three
sentences, with a total of 39 sentences by each of the three
speakers.

b. Word clusters. Each cluster is a group of three words, which
include digits, spondees, iambs, etc., totalling approximately
fifty words read by each speaker.

c. Sentence List. A list of sixteen sentences is read by each of
the three speakers. Total time: 3:40 min.

d. Approximate total running time for text: 19 min.


4. TEXT DETAILS:


a. SENTENCE CLUSTERS

   (1) (human: "He took a walk every morning. Give me a breath of
       fresh air. Perhaps you did measure the changes.")

   (2) (human: "Your shouting was inexcusable. He came down
       through the chimney. The jumps of four girls were
       measured.")

   (3) (human: "Yawning often shows boredom. The sixth grade had a
       picnic. We've changed the measures.")

   (4) (human: "The treasure chest was found. Give the cash box to
       me. Your jumping thrilled him.")

   (5) (human: "The usher changed our places. The bigot excused
       himself. You've been measuring the width.")

   (6) (human: "We gazed at the azure sky. Bring some gin for him.
       You've got three fresh perch.")

   (7) (human: "The thin dish has been chipped. He gave me a
       corsage. Your jingle was first.")

   (8) (human: "Did you extinguish the fire? Measure exactly an
       inch. He jumped out of the bath.")

   (9) (human: "Your dog is chewing my shoe. Give the thin boy
       some fudge. He walked with pleasure.")

  (10) (human: "They march in precision. She enjoyed his first
       song. We gave you three books.")

  (11) (human: "The pressure was too much. Your gift is a
       birthday cake. Just bring his revision.")

  (12) (human: "The fourth chapter swings. Your vision has gone
       bad. We met at the junction.")

  (13) (human: "Did she change her decision? The waves looked
       threatening. You forgot my book.")


b. WORD CLUSTERS

   (1) (human: "One. Two. Three.")

   (2) (human: "Four. Five. Six.")

   (3) (human: "Seven. Eight. Nine.")

   (4) (human: "Ten. Them. Give.")

   (5) (human: "Earn. Chest. Show.")

   (6) (human: "Jaw. Move. Ice.")

   (7) (human: "Ease. Cap. Sidewalk.")

   (8) (human: "Cowboy. Pancake. Airplane.")

   (9) (human: "Baseball. Greyhound. Hotdog.")

  (10) (human: "Mousetrap. Toothbrush. Duckpond.")

  (11) (human: "About. Certain. Father.")

  (12) (human: "Nation. People. Magazine.")

  (13) (human: "Introduce. Operate. Memorize.")

  (14) (human: "Telephone. Condition. Delicate.")

  (15) (human: "Potato. Optimum. Delicious.")

  (16) (human: "Generation. Additional. Possibility.")

  (17) (human: "International. Immediately. (speaker's name)")


c. SENTENCE LIST ONE

   (1) (human: "His remarks are too dense.")

   (2) (human: "Please return that bar stool.")

   (3) (human: "Set Debbie's car for speed.")

   (4) (human: "Palm trees grow very tall.")

   (5) (human: "Prudent thieves look harmless.")

   (6) (human: "His partner took a jar.")

   (7) (human: "She gets to watch parties.")

   (8) (human: "Chicken farms are for eggs.")

   (9) (human: "The old red barn burned down.")

  (10) (human: "Let's sit in the cool bar.")

  (11) (human: "The man tore his dark suit.")

  (12) (human: "Jack took part of the fish.")

  (13) (human: "The cook has a red shawl.")

  (14) (human: "The tall man took your seat.")

  (15) (human: "The next show starts in March.")

  (16) (human: "Hand me the blue teapot.")

END:  **********



SSSHP 81: "SPEECH ANALYSIS/SYNTHESIS DEMONSTRATION, COPY NO. 2-9,
          T 67.2"

SOURCE: Donated by H.D. Maxey, Aug 23, 1990, from IBM tape
collection. See SSSHP USA AFCRL (Air Force Cambridge Research
Laboratories) file. 7" reel, good quality.

CONTENTS:  Results of many companies' processing of tape SSSHP 80,
compiled for a demonstration tape at the 1967 Conference on Speech
Communication and Processing, MIT, Nov 6-8, 1967. Prepared by
Caldwell Smith, AFCRL, L.G. Hanscom Field, Bedford, Mass., Nov
1967. For the purposes of this history project, only the synthesis
examples are transcribed.

Total running time: 22:45 min


1. REAL-TIME 2400 BPS SYSTEMS

   A. Collins Radio Co. Vocoder
   B. LM Ericsson Vocoder
   C. Lincoln Laboratory Vocoder
   D. Bell Telephone Labs. Cepstrum Vocoder
   E. Texas Instruments Co. Vocoder
   F. Philco Co. Vocoder
   G. U.S. Army Vocoder
   H. Philco Co. Vocoder with VRS
   I. AFCRL Vocoder


2. BASEBAND VOCODERS

   A. Texas Instruments Co. 9600 BPS Voice-Excited Vocoder
   B. Nippon Electric Co. Partial Vocoder (Analog VEV)
   C. Nippon Electric Co. Analog Double-Excitation Vocoder
   D. IBM France 9600 BPS Voice-Excited Vocoder
   E. AFCRL Analog Baseband Vocoder, pitch-excited


3. LOW-DATA-RATE REAL-TIME SPEECH COMPRESSION METHODS

   A. Philco Co. 1200 BPS system
   B. AFCRL 900 BPS pattern-matching method


4. FORMANT VOCODER

   A. Philco Co. Formant Vocoder, analog


5. CEPSTRUM VOCODER EXAMPLES - BELL TELEPHONE LABORATORIES

   A. Real-time 2400 BPS Cepstrum Vocoder
   B. Simulated 2400 BPS Cepstrum Vocoder
   C. High-resolution Analog Cepstrum Vocoder


6. COMPUTER PROCESSING METHODS

   A. Sylvania Co. Simulated 2400 BPS Vocoder


   B. Royal Institute of Technology (Sweden) OVE III - Synthesis
      by Matching. Matching of spectra on tape SSSHP 80.

   (syn: "Did you extinguish the fire? Measure exactly an inch. He
   jumped out of the bath. Your dog is chewing my shoe. Give the
   thin boy some fudge. He walked with pleasure. They march in
   precision. She enjoyed his first song. We gave you three
   books.")


   C. IBM France 2400 BPS Vocoder

   (syn, male, English: "Today is Tuesday, November 7. One, two,
   three, four, five, six.")

   (syn, female, English: "One, two, three, four, five, six.")

   (syn, female, French sen: " ... ")


   D. IBM USA Diphone Synthesis

   (syn: "The number you dialed, ME1-5280, has been changed. The
   new number is PA6-1347. This is a recording. The number you
   dialed, UN4-4482, has been temporarily disconnected. Thank you.
   I'm sorry, you've reached this office by mistake. Please
   consult your directory and dial again.")


7. SYNTHESIS-BY-RULE

   A. Examples by L. Rabiner, Bell Telephone Laboratories

   (syn: "The fourth chapter swings. Your vision has gone bad. We
   met at the junction. They march in precision. She enjoyed his
   first song. We gave you three books. We gazed at the azure sky.
   Bring some gin for him. You've got three fresh perch.

   This is the story about a man. The man who works in a factory.
   What is his name? The man's name is Bob. What does Bob do? Bob
   helps make steel on the big machine. Bob works from 9 to 5,
   then he goes home to his wife and family. Bob enjoys his
   life.")


   B. Examples by I. Mattingly, Haskins Laboratories

   (syn: "He took a walk every morning. Give me a breath of fresh
   air. Perhaps you did measure the changes. Your shouting was
   inexcusable. He came down through the chimney. The jumps of
   four girls were measured. Yawning often shows boredom. The
   sixth grade had a picnic. We've changed the measures.

   If you receive a malicious or annoying phone call, hang up.
   Don't keep talking, that is what the caller wants. If the calls
   persist, please contact your telephone business office.")


   C. Examples by M. Haggard, Haskins Laboratories

   (syn, 23 sec:  "Speech is extraordinarily resistant to
   distortion and disturbances of many kinds.  Not only does it
   provide a most effective means of communication in the ...
   noises but it may be deliberately distorted in the laboratory,
   yet remain intelligible.  ...  essential attributes of speech
   spectra ...  nonredundant phonetic description.")

   (syn, 19 sen: "He took a walk every morning. ... We've changed
   the measures.")

END:  **********

Smithsonian Speech Synthesis History Project
National Museum of American History | Archives Center
Smithsonian Institution