ELOQUENT TECHNOLOGY, INC. (ETI)
1988 24 Highgate Circle, Ithaca, NY 14850
1995 2389 N. Triphammer Rd., Ithaca, NY 14850
2001 SpeechWorks International, Inc.
CONTENTS:
History
General/Surveys
The Delta System (1983-1995)
Synthesis of American English Dialects (1989-1995)
The Phone-and-Transition Model of Speech Timing (1990-1992)
The Syllt Program for Generating Synthetic Stimuli (1993)
Multi-Voice Speech Synthesis (1993-1998)
Voice Quality (1997)
Multi-Dialect and Multi-Language Speech Synthesis (1997-2001)
Biographies
-------------------------------------------------------------
HISTORY
Eloquent Technology, Inc. (ETI), a company exclusively focused on
the development and marketing of multi-voice, multi-language text-
to-speech systems, was an outgrowth of the research conducted by
Susan R. Hertz at Cornell University between 1974 and 1983 (see
SSSHP USA Cornell University and A Personal Narrative by Susan
Hertz). In 1983, Hertz started doing business as "Eloquent Tech-
nology," focusing on the development of the Delta System, a sophis-
ticated research and development tool for expressing and testing
programs that convert text to speech. Hertz worked on the Delta
System with two part-time consultants. In 1988, Eloquent Technology
was incorporated, and three full-time employees were hired, who
worked out of Hertz's home. In 1995, after the company had grown
to six employees, it moved from Hertz's house to an outside office,
and it eventually grew to seventeen full-time employees.
Between 1988 and 2001, the company worked on a variety of research
projects in the areas of multi-voice, multi-dialect, and multi-
language speech synthesis by rule. The particular linguistic models
developed, which included language-universal and dialect-universal
components, form the basis for the ETI-Eloquence text-to-speech
system, which has been marketed since 1995 and, at the time of this
writing, is available for twelve languages on multiple computer
platforms, including small hand-held devices.
In August 1996, the company formed a strategic partnership with IBM,
which acquired certain portions of the technology developed by ETI
and ultimately incorporated them into its ViaVoice line of speech
products. See A Personal Narrative by Susan Hertz for a more complete
history of Eloquent Technology, Inc. In January 2001, Eloquent Tech-
nology, Inc. merged with SpeechWorks International, Inc., which is
continuing to market the ETI-Eloquence system, and is also combining
portions of it with SpeechWorks' concatenative text-to-speech system,
Speechify, and other speech technology.
-------------------------------------------------------------
GENERAL/SURVEYS
1997 Hertz, S.R., "The Technology of Text-to-Speech", Speech Tech-
nology, CI Publishing, 18-21.
2000 Hertz, S.R., Younes, R.J., and Hoskins, S.R., "Space, Speed,
Quality, and Flexibility: Advantages of Rule-Based Speech Syn-
thesis", Conference Proceedings, AVIOS 2000, May 22-24, 2000,
San Jose, CA, 217-227. (2000) (Copy in SSSHP USA Eloquent
Technology, Inc. file.)
-------------------------------------------------------------
PROJECT: THE DELTA SYSTEM (1983-1995)
The Delta System is a software tool for text-to-speech rule develop-
ment for any human language. The Delta System includes a special
programming language and interactive environment specifically de-
signed to build and manipulate a multi-tiered utterance represen-
tation called a delta. In a delta, the relationships between all
relevant (user-definable) abstract linguistic units (e.g., phrases,
words, syllables, phonemes) as well as quantitative phonetic values
(e.g., formant frequencies, amplitudes, durations) can be explicitly
represented. The DeltaTools interactive environment can be used to
trace program (rule) execution and to experiment by listening to the
synthetic speech output produced with different acoustic values.
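The multi-tiered structure described above can be sketched in a few
lines of Python. This is only an illustration of the idea, not the
Delta programming language or its actual data structures: the class
names (Token, Delta), the numbered boundary points used to align
tiers, and the duration values are all assumptions made for this
example.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    tier: str      # tier name, e.g. "word", "phoneme", "duration_ms"
    value: object  # a symbol or a numeric phonetic value
    start: int     # boundary point at the token's left edge
    end: int       # boundary point at the token's right edge


@dataclass
class Delta:
    """Hypothetical multi-tier utterance representation."""
    tokens: list = field(default_factory=list)

    def add(self, tier, value, start, end):
        self.tokens.append(Token(tier, value, start, end))

    def tier(self, name):
        """All tokens on one tier, in insertion order."""
        return [t for t in self.tokens if t.tier == name]

    def within(self, outer, tier_name):
        """Tokens on tier_name that lie inside the span of outer."""
        return [
            t for t in self.tier(tier_name)
            if outer.start <= t.start and t.end <= outer.end
        ]


# Build a tiny delta for the word "cat" (durations are made up).
delta = Delta()
delta.add("word", "cat", 0, 3)
delta.add("syllable", "kaet", 0, 3)
for i, (phoneme, ms) in enumerate([("k", 70), ("ae", 150), ("t", 90)]):
    delta.add("phoneme", phoneme, i, i + 1)
    delta.add("duration_ms", ms, i, i + 1)

# Because tiers share boundary points, a rule can ask which phonemes
# and durations belong to a given word.
word = delta.tier("word")[0]
total_ms = sum(t.value for t in delta.within(word, "duration_ms"))
```

Sharing boundary points across tiers is one simple way to make the
relationships between abstract units and quantitative values
explicit; other alignment schemes would serve the same purpose.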
1985 Hertz, S.R., Kadin, J. and Karplus, K., "The Delta rule develop-
ment system for speech synthesis from text", Proceedings of the
IEEE, 73, Special Issue on Man-Machine Speech Communication,
1589-1601. (1985) (Copy in SSSHP USA Eloquent Technology, Inc.
file.)
1990 Hertz, S.R., "The Delta programming language: an integrated
approach to non-linear phonology, phonetics, and speech synthe-
sis", in J. Kingston and M. Beckman (eds.), Papers in Labora-
tory Phonology I: Between the Grammar and the Physics of Speech,
Cambridge University Press. (1990) (Copy in SSSHP USA Eloquent
Technology, Inc. file.)
-------------------------------------------------------------
PROJECT: SYNTHESIS OF AMERICAN ENGLISH DIALECTS (1989-1995)
Between 1989 and 1995, we worked on a variety of projects involving
the synthesis of American English dialects, including a dialect of
Black English, General American, Brooklyn, Boston, and Alabama. In
this project, we aimed to further test the validity of our nucleus-
based phone-and-transition model of speech timing (see Phone-and-
Transition Model of Speech Timing, below), and to extract dialect-
universal generalizations for multi-dialect speech synthesis by rule.
Toward these ends, we developed a multi-dialect relational database
that contained both higher-level linguistic information and detailed
spectral and durational information about formants, voicing, frica-
tion, and aspiration for a variety of utterance types in each dialect.
From information in the database, multi-tiered utterance representa-
tions could be automatically generated for synthesis with the Delta
System (above).
1990 Hertz, S.R., "A modular approach to multi-dialect and multi-
language speech synthesis using the Delta System", Proceedings
of the Workshop on Speech Synthesis, European Speech Communica-
tion Association, Autrans, France, 225-228. (Copy in SSSHP USA
Eloquent Technology, Inc. file.)
1994 Hertz, S.R., Zsiga, E.C., de Jong, K.J., Gries, P., Lockwood,
K.E., "From database to speech: a multi-dialect relational data-
base integrated with the ETI-Eloquence synthesis technology",
Conference Proceedings of the Second ESCA/IEEE Workshop on
Speech Synthesis, 45-48. (Copy in SSSHP USA Eloquent Technology,
Inc. file.)
SSSHP 172.1 Tape: "Eloquent Technology, Inc. speech synthesis
samples"
(syn "The colors of the rainbow are red, orange, yellow,
green, blue and violet.") Southern American English.
Cassette, good quality.
-------------------------------------------------------------
PROJECT: PHONE-AND-TRANSITION MODEL OF SPEECH TIMING (1990-1992)
Building on our observations about formant timing patterns during
our earlier rule-based multi-language speech synthesis research
between 1978 and 1990, we developed a new model of speech timing,
called the phone-and-transition model. The phone-and-transition
model is based on a segmentation of speech into independent phone
and formant transition units, rather than abutted phoneme-sized
units that incorporate the transitions, as in more conventional
models for rule-based synthesis. The separate phone and transition
units are grouped into higher level units, such as phonemes, syl-
lable nuclei, and syllables. The model was derived from observa-
tions of the durational behavior of sonorants before voiced and
voiceless obstruents, diphthongs in fast and slow speech, and the
timing of aspiration. In addition to more straightforward expres-
sion of timing patterns in a particular language, the model has
made possible the direct expression of a variety of acoustic uni-
versals, and it has helped us increase our understanding of the
relationship between phonology and phonetics.
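As a rough illustration of the segmentation described above, the
following Python sketch stores phones and formant transitions as
separate units, each tagged with the higher-level unit it belongs
to. The Unit class, the assignment of transitions to the nucleus,
the durations, and the scaling rule are all assumptions made for
this example; the sketch shows only that a timing rule can address
phone units without touching transitions, and makes no claim about
how the actual rules treated either kind of unit.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    kind: str      # "phone" or "transition"
    label: str
    dur_ms: float
    group: str     # higher-level unit: "onset", "nucleus", "coda"


def scale_phones(units, group, factor):
    """Return a copy in which only the phone units of group are scaled."""
    scaled = []
    for u in units:
        hit = u.kind == "phone" and u.group == group
        dur = u.dur_ms * factor if hit else u.dur_ms
        scaled.append(Unit(u.kind, u.label, dur, u.group))
    return scaled


def total(units):
    """Total duration of a sequence of units, in milliseconds."""
    return sum(u.dur_ms for u in units)


# A "bat"-like syllable with hypothetical durations.
syllable = [
    Unit("phone", "b", 60, "onset"),
    Unit("transition", "b>ae", 40, "nucleus"),
    Unit("phone", "ae", 100, "nucleus"),
    Unit("transition", "ae>t", 40, "nucleus"),
    Unit("phone", "t", 80, "coda"),
]

# Hypothetical rule: lengthen the nucleus phone by 50 percent.
longer = scale_phones(syllable, "nucleus", 1.5)
```

In a model built from abutted phoneme-sized units, the same rule
would have to stretch the transitions along with the steady-state
portion, since the two are not represented separately.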
1991 Hertz, S.R., "Streams, phones, and transitions: toward a
phonological and phonetic model of formant timing", Journal
of Phonetics, 19, Special Issue on Speech Synthesis and Pho-
netics, edited by R. Carlson. (1991) (Copy in SSSHP USA
Eloquent Technology, Inc. file.)
1992 Hertz, S.R. and Huffman, M.K., "A nucleus-based timing model
applied to multi-dialect speech synthesis by rule", Proceed-
ings of the International Conference on Spoken Language Pro-
cessing, 2, 1171-1174. (1992) (Copy in SSSHP USA Eloquent
Technology, Inc. file.)
-------------------------------------------------------------
PROJECT: THE SYLLT PROGRAM FOR GENERATING SYNTHETIC STIMULI (1993)
Syllt (Syllable Tool) is a partial phone-to-speech program designed
to be used in conjunction with the Delta System (above) for teaching
and research in acoustic phonetics and speech synthesis. Written in
the Delta programming language, Syllt takes a string of phonetic
symbols representing a CVC (consonant-vowel-consonant), CV, VC, or
VCV utterance as input, and creates a multi-tiered utterance repre-
sentation (delta) containing the phonological and acoustic structure
of the utterance as output. From the acoustic values, parameter
values for a Klatt-style formant synthesizer are automatically
derived. The output deltas can be modified either interactively
with simple Delta System commands, or automatically with built-in or
user-defined Delta language procedures. Syllt can also quickly
implement stepwise changes to a delta to generate stimulus continua
or matrices.
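The idea of a stimulus continuum or matrix can be sketched as
follows. This is not Syllt or Delta code: the function names, the
parameter names (f2_onset_hz, vot_ms), and all values are
hypothetical, and the sketch assumes evenly spaced steps, which the
description above does not specify.

```python
def continuum(base, param, start, end, steps):
    """Return `steps` copies of base with param stepped from start to end."""
    if steps < 2:
        raise ValueError("a continuum needs at least two steps")
    stimuli = []
    for i in range(steps):
        stimulus = dict(base)  # copy, so the base stimulus is untouched
        stimulus[param] = start + (end - start) * i / (steps - 1)
        stimuli.append(stimulus)
    return stimuli


def matrix(base, param_a, values_a, param_b, values_b):
    """Return a 2-D grid of stimuli varying two parameters at once."""
    return [
        [{**base, param_a: a, param_b: b} for b in values_b]
        for a in values_a
    ]


# Hypothetical base stimulus; the values are placeholders.
base = {"f1_onset_hz": 300, "f2_onset_hz": 1100, "vot_ms": 10}

# An 8-step continuum of F2 onset frequency from 1100 to 1800 Hz.
stimuli = continuum(base, "f2_onset_hz", 1100, 1800, 8)

# A 2 x 3 matrix crossing F2 onset with voice onset time.
grid = matrix(base, "f2_onset_hz", [1100, 1800], "vot_ms", [10, 30, 50])
```

Each stimulus is an independent copy, so the base values remain
available for building further continua.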
1995 Hertz, S.R. and Zsiga, E.C., "The Delta System with Syllt:
Increased capabilities for teaching and research in phonetics",
Proceedings ICPhS 95 Stockholm, 2, 322-325. (Copy in SSSHP USA
Eloquent Technology, Inc. file.)
-------------------------------------------------------------
PROJECT: MULTI-VOICE SPEECH SYNTHESIS (1993-1998)
Between 1994 and 1998, we added to our formant-based synthesis rule
sets for different languages a universal "voice filter" component
that operates on the acoustic parameter values produced by the rules
to generate the desired voice quality, including male, female, and
child. A variety of parameters can be set to modify the male,
female, and child voices to produce a virtually limitless set of
voices. These parameters include breathiness, roughness, speed,
volume, pitch baseline, and degree of pitch fluctuation.
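The general shape of such a component, a stage that takes the
parameter values produced by the rules and adjusts them for a
chosen voice, might look like the Python sketch below. It is a
guess at the idea, not a description of the actual voice filter:
the Voice class, the frame fields, the formulas, and the numbers
are all assumptions, and breathiness and roughness are left out
because the text above does not say how they were realized.

```python
from dataclasses import dataclass


@dataclass
class Voice:
    pitch_baseline_hz: float  # baseline the output pitch is centered on
    pitch_fluctuation: float  # 1.0 keeps the rule-generated excursions
    speed: float              # values above 1.0 shorten durations
    volume_db: float          # gain added to every frame


def apply_voice(frames, voice, rule_baseline_hz):
    """Map rule-generated frames onto the settings of one voice."""
    out = []
    for frame in frames:
        # Pitch movement relative to the baseline the rules assumed.
        excursion = frame["f0_hz"] - rule_baseline_hz
        out.append({
            "f0_hz": voice.pitch_baseline_hz
                     + voice.pitch_fluctuation * excursion,
            "amp_db": frame["amp_db"] + voice.volume_db,
            "dur_ms": frame["dur_ms"] / voice.speed,
        })
    return out


# Two hypothetical rule-generated frames around a 100 Hz baseline.
frames = [
    {"f0_hz": 110, "amp_db": 60, "dur_ms": 10},
    {"f0_hz": 130, "amp_db": 60, "dur_ms": 10},
]

# A higher, flatter, faster, slightly louder voice (made-up values).
voice = Voice(pitch_baseline_hz=200, pitch_fluctuation=0.5,
              speed=2.0, volume_db=3)
shaped = apply_voice(frames, voice, rule_baseline_hz=100)
```

Keeping the voice settings separate from the rules is what lets the
same rule output be rendered in many different voices.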
SSSHP 172.4 Tape: "Eloquent Technology, Inc. speech synthesis
samples"
(syn 3:11 "Hello. My name is Reed. I believe we may have
met before. Would you like to meet my sister Shelley?
... Goodbye, and thanks for listening.")
Cassette, good quality.
-------------------------------------------------------------
PROJECT: VOICE QUALITY (1997)
With the aim of improving the naturalness of our formant-based
synthesis by rule, we conducted experiments in which we hand-
modified certain rule-generated parameter values related to voice
quality and prosody, such as spectral tilt and fundamental
frequency. Highly natural-sounding speech was achieved for the
target sentences. Through our experimentation, we determined that
we could abstract away from certain details of the original model
utterances, such as some formant target misalignments, without
degrading the speech quality. In particular, we were able to
structure the synthetic utterances in accordance with the phone-
and-transition model underlying our rules, suggesting that rule-
based formant synthesis within this model has the potential to
sound highly natural.
SSSHP 172.2 Tape: "Eloquent Technology, Inc. speech synthesis
samples"
(syn 0:01 "Today's a spectacular day.")
Cassette, good quality.
-------------------------------------------------------------
PROJECT: MULTI-DIALECT AND MULTI-LANGUAGE SPEECH SYNTHESIS
(1997-2001)
Between late 1996 and 2001, we used the Delta System (above) to
develop text-to-speech synthesis rules for thirteen languages/
dialects -- German, Parisian and Canadian French, Castilian and
Mexican Spanish, General American and British English, Finnish,
Brazilian Portuguese, Mandarin Chinese, Japanese, and Korean.
(The Romanization portions of Chinese, Korean, and Japanese,
which generate a Romanized string on the basis of the original
input characters, were implemented in C++ rather than Delta.)
Building on the linguistic models we developed between 1990 and
1996 (see Synthesis of American English Dialects and The Phone-
and-Transition Model of Speech Timing, above), the rules for each
language are divided into language-universal, language-specific
(dialect-universal), and dialect-specific rule modules.
1990 Hertz, S.R., "A modular approach to multi-dialect and multi-
language speech synthesis using the Delta System", Proceedings
of the Workshop on Speech Synthesis, European Speech Communi-
cation Association, Autrans, France, 225-228. (Copy in SSSHP
USA Eloquent Technology, Inc. file.)
1999 Hertz, S.R., Younes, R.J., and Zinovieva, N., "Language-uni-
versal and language-specific components in the multi-language
ETI-Eloquence text-to-speech system", Proceedings of the XIV
International Congress of Phonetic Sciences, San Francisco,
CA, Aug. 1-7, 2283-2286. (1999) (Copy in SSSHP USA Eloquent
Technology, Inc. file.)
SSSHP 172.3 Tape: "Eloquent Technology, Inc. speech synthesis
samples"
(syn 1:46 "Hi, my name is Reed. I'm an American. I speak
English ... \Vce=Speaker=Antti\ Hei. Nimeni on Antti.
Olen suomalainen. Puhutko suomea? 1, 2, 3, 4, 5, 6, 7,
8, 9, 10.")
Cassette, good quality.
-------------------------------------------------------------
BIOGRAPHIES
KENNETH DE JONG
1984 B.A., English, Calvin College, Grand Rapids, MI
1987 M.A., linguistics, Ohio State University, Columbus, OH
1991 Ph.D., linguistics, Ohio State University, Columbus, OH
1991 Postdoctoral Fellow, Phonetics Lab, Univ. of California,
Los Angeles, CA
1992 Visiting Asst. Professor, Department of Linguistics,
University of California, Los Angeles, CA
1993 Visiting Scholar, Department of Modern Languages and
Linguistics, Cornell University, Ithaca, NY
1993 Research Linguist, Eloquent Technology, Inc., Ithaca, NY
1994 Visiting Asst. Professor, Department of Linguistics,
Indiana University, Bloomington, IN
1995 Asst. Professor, Department of Linguistics, Indiana
University, Bloomington, IN
PAUL GRIES
SUSAN R. HERTZ
1972 B.A., linguistics and German, Univ. of California, Davis
1974 SRS tool and rule development
1975 M.A., general linguistics, computer science, and Germanic
linguistics, Cornell University, Ithaca, NY
1979 Ph.D., linguistics, Cornell University, Ithaca, NY
1979 Acting Assistant Professor, Department of Modern Languages
and Linguistics, Cornell University, Ithaca, NY
1983 Delta System tool and rule development; ETI-Eloquence
product development
1983 President and CTO, Eloquent Technology, Inc.
1985 Acting Assistant Professor, Department of Modern Languages
and Linguistics, Cornell University, Ithaca, NY
1986 Senior Research Associate, Department of Modern Languages
and Linguistics, Cornell University, Ithaca, NY
1996 Adjunct Associate Professor, Department of Linguistics,
Cornell University, Ithaca, NY
2001 Director and Lead Scientist, Text-to-Speech Technologies,
SpeechWorks International, Inc., Ithaca, NY
STEVE R. HOSKINS
1982 B.E.E., Univ. of Delaware, Newark, DE
1982 Electrical/Software Engineer for Raytheon Equipment Div.,
Fenwal Electronics, Allen Bradley
1991 Research/Teaching Assistant, Linguistics Department, Univ.
of Delaware, Newark
1993 M.A., linguistics, Univ. of Delaware, Newark
1994 Research Assistant, Applied Sciences and Engineering
Laboratories, Wilmington, DE
1997 Ph.D., linguistics, Univ. of Delaware, Newark
1997 Postdoctoral Researcher, Applied Sciences and Engineering
Laboratories, Wilmington, DE
1999 Computational Linguist, Eloquent Technology, Inc., Ithaca
1999 Visiting Scholar, Linguistics Department, Cornell Univ.,
Ithaca, NY
2001 Text-to-Speech Scientist and Software Developer, SpeechWorks
International, Inc.
MARIE K. HUFFMAN
1982 B.A., linguistics, Univ. of California, Riverside, CA
1985 M.A., linguistics, Univ. of California, Riverside, CA
1989 Ph.D., linguistics, Univ. of California, Los Angeles, CA
1989 Postdoctoral Fellow, Speech Communication group, Massa-
chusetts Institute of Technology, MA
1991 Visiting Scholar, Dept. of Modern Languages and Linguis-
tics, Cornell University, Ithaca, NY
1991 Research Linguist, Eloquent Technology, Inc., Ithaca, NY
1993 Asst. Professor, Dept. of Linguistics, State University of
New York, Stony Brook, NY
1999 Assoc. Professor, Dept. of Linguistics, State University of
New York, Stony Brook, NY
JAMES A. KADIN
1978 B.A., mathematics, Ithaca College, Ithaca, NY
1981 M.S., computer science, Cornell University, Ithaca, NY
1983 Delta System development, Eloquent Technology, Inc., Ithaca
1988 Ph.D., computer science, Cornell University, Ithaca, NY
1989 Asst. Professor, Dept. of Computer Science, University of
Maine, Orono, ME
1994 Software Engineering Manager, Alacare Home Health Services,
Inc., Birmingham, AL
1996 Research Scientist, Mouse Genome Informatics, The Jackson
Laboratory, Bar Harbor, ME
KEVIN KARPLUS
1974 B.A., mathematics, Michigan State University, MI
1976 M.S., mathematics, Stanford University, CA
1983 Ph.D., computer science, Stanford University, CA
KATHERINE E. LOCKWOOD
1993 B.A., linguistics, Cornell University, Ithaca, NY
1993 English dialect rule development and English dialect
research, Eloquent Technology, Inc., Ithaca, NY
2000 M.S., speech-language pathology, Ithaca College, Ithaca, NY
2001 Speech Therapist, Franziska Racker Centers, Ithaca, NY
REBECCA J. YOUNES
1976 Princeton University
1979 B.A., linguistics, University of Minnesota, Minneapolis
1982 M.A., linguistics, The University of Texas at Austin, TX
1982 Center for Arabic Study Abroad, Cairo, Egypt
1982 Instructor, Dept. of English, Birzeit University, Birzeit,
The West Bank
1996 Computational Linguist, Eloquent Technology, Inc., Ithaca
2001 Text-to-Speech Scientist and Software Developer, SpeechWorks
International, Inc., Ithaca
NINA ZINOVIEVA
1976 M.A., linguistics, Philological Faculty, Moscow State
Lomonosov University
1976 Researcher, Laboratory of Phonetics and Speech Communication,
Philological Faculty, Moscow State Lomonosov University
1986 Ph.D., linguistics, Philological Faculty, Moscow State
Lomonosov University
1994 Senior Researcher, Accent Inc., CA
1997 Computational Linguist, Eloquent Technology, Inc., Ithaca
1999 Computational Linguist, Lernout & Hauspie, Burlington, MA
2001 Senior Voice User Interface Engineer, Comverse Technology,
Cambridge, MA
ELIZABETH ZSIGA
1986 B.A., linguistics, Wesleyan University, Middletown, CT
1988 M.A., linguistics, Yale University, New Haven, CT
1992 Lecturer, Department of Linguistics, Yale University,
New Haven, CT
1993 Ph.D., linguistics, Yale University, New Haven, CT
1993 Research Linguist, Eloquent Technology, Inc., Ithaca
1994 Asst. Professor, Dept. of Linguistics, Georgetown
University, Washington, DC
1999 Assoc. Professor, Dept. of Linguistics, Georgetown
University, Washington, DC
-------------------------------------------------------------
CONTRIBUTIONS AND REVIEW BY:
Dr. Susan R. Hertz (2001)
Cornell University and SpeechWorks International, Inc.
(Quoted material is from a personal communication from Susan R.
Hertz to H.D. Maxey, January 23, 2002, for this history. See
SSSHP USA Eloquent Technology, Inc. file.)