History of the Project

These online pages document the Smithsonian Speech Synthesis History Project (SSSHP), an effort to collect and preserve tape recordings, technical records, artifacts, and other supporting materials from the development of speech synthesis technology. The records provide a history of the contribution of speech synthesis to an understanding of human speech production and perception, and a history of the development of computer voice output.

Work on automatic analysis/synthesis of human speech (e.g., codecs, vocoders) is not included in the SSSHP unless the synthesis parameters were artificially modified for voice-response applications.

The SSSHP, running from 1986 to 2002, was conducted by H. David Maxey and other volunteers from the speech research community, in collaboration with Bernard S. Finn, Division of Information Technology and Communications of the Smithsonian Institution. H. D. Maxey directed the technical aspects of the program and coordinated contacts with an Advisory Committee and the speech research laboratories. Elliot Sivowitch and Harold Wallace were Smithsonian liaisons for the project, with assistance from other Smithsonian experts as needed. John Fleckner, of the Smithsonian tape archive, took responsibility for the recordings. The items received are in the Smithsonian Speech Synthesis History Project Collection, Call No. ACNMAH 0417.

The Advisory Committee, consisting of the following members, critiqued the plan of work for the SSSHP and made many valuable contributions to the effort. Affiliations shown are as of 1986.

Franklin Cooper   Haskins Laboratories, USA
N. Rex Dixon   International Business Machines Corp., USA
Gunnar M. Fant   Royal Institute of Technology, Sweden
James L. Flanagan   AT&T Bell Laboratories, USA
Shizuo Hiki   Tohoku University, Japan
John N. Holmes   Joint Speech Research Unit, England
Dennis H. Klatt   Massachusetts Institute of Technology, USA
Ronald R. Kline   IEEE Center for the History of Electrical Engineering, USA
Kazuo Nakata   Tokyo University of Agriculture and Technology, Japan

Preliminary outlines of each laboratory's research were constructed by H. D. Maxey from the following histories of speech synthesis. References from these histories are identified in the laboratory outlines by the letters B, I, or K within parentheses.

B:    for the Benchmark book, SPEECH SYNTHESIS: BENCHMARK PAPERS IN ACOUSTICS, J. L. Flanagan and L. R. Rabiner, eds., Dowden, Hutchinson and Ross, Inc., Stroudsburg, 1973. Reprints of important papers on speech synthesis.
I:    for the Ignatius Mattingly survey, "Speech Synthesis for Phonetic and Phonological Models", I. G. Mattingly, CURRENT TRENDS IN LINGUISTICS, T. A. Sebeok, ed., Vol. 12, Mouton, the Hague, 1974. Online, this site.
K:    for the Klatt survey, "Review of Text-to-Speech Conversion for English", D. H. Klatt, Journal of the Acoustical Society of America, 82.3, 737-793, Sept 1987. Online, this site.

In many cases the preliminary outlines were edited and completed by contacts for the laboratories, as noted at the end of each outline.


IMPORTANT NOTE: We are no longer collecting objects for this collection. The submission instructions are provided for historical reference only.

The following instructions were included with request letters sent to laboratories asking for copies of historical material. In later years, the development of the Internet allowed requests and responses to be exchanged electronically.

It is important that any confidential information contributed to the Smithsonian be clearly identified. Smithsonian archivists are able to make special arrangements if restrictions are required. All other material will be considered non-confidential and copies will be made available to the general public for educational, non-commercial purposes. Some of the material may be extracted and republished in the form of histories and demonstration tapes, or made available on the Internet.

We are particularly interested in obtaining:

  • information on additional research projects

  • corrections and additions to the research outlines

  • copies of tape recordings for the referenced projects

  • copies of technical records of how the speech was created

  • information on related artifacts, and their availability to the Smithsonian

To submit material to the Smithsonian Speech Synthesis History Project, please email a notice of availability to the Editor, H. David Maxey. He will receive tape recordings and project records for evaluation and integration into the collection before forwarding them to the Smithsonian. For artifacts, please send him a description for evaluation by the Smithsonian. If the Smithsonian can accept an object, special shipping arrangements will be made directly with the owner.

Permission Statement

The contributed material will need to be accompanied by the following statement, signed by the owner or a legal representative:

"(Owner's name) retains copyright ownership of the donated historical material on speech synthesis development, but grants permission to the Smithsonian Institution to make copies for archival purposes, to present the records, images, and tape recordings on the Internet, and to provide copies to the general public for educational, non-commercial purposes."


Please review the following before making copies of tape recordings.

Notes on copying tape recordings

7" Reels (Preferred). The medium of preference for the Smithsonian archives is a 7" reel of good quality 1.5 mil tape (1200 feet), recorded at 7.5 ips (30 minutes in one direction). The 1.5 mil thickness reduces print-through effects and is more durable.

Cassettes. If reel-to-reel recorders are not available, the second choice is a good quality standard-size cassette, recorded at 1-7/8 ips, normal bias, 120 µs equalization, without Dolby or other compensation. Preference is again for 1.5 mil tape (a full cassette will record for 30 minutes in one direction). An example is TDK brand, type D60 cassette. If copies are obtained on cassettes, the Smithsonian will re-record them on 7" reels for archival storage.
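The playing times quoted for reels and cassettes follow from tape length and tape speed. The short Python sketch below works through that arithmetic. The function names are hypothetical and chosen only for illustration, and the cassette length is derived from the stated speed and playing time rather than taken from any manufacturer's specification.

```python
INCHES_PER_FOOT = 12.0
SECONDS_PER_MINUTE = 60.0


def run_time_minutes(length_feet: float, speed_ips: float) -> float:
    """One-direction playing time, in minutes, for a tape of the given
    length (feet) at the given speed (inches per second)."""
    return length_feet * INCHES_PER_FOOT / speed_ips / SECONDS_PER_MINUTE


def tape_length_feet(minutes: float, speed_ips: float) -> float:
    """Tape length, in feet, needed to play for the given number of
    minutes at the given speed (inches per second)."""
    return minutes * SECONDS_PER_MINUTE * speed_ips / INCHES_PER_FOOT


# 7" reel: 1200 feet at 7.5 ips
reel_minutes = run_time_minutes(1200, 7.5)

# Cassette: 30 minutes in one direction at 1-7/8 ips (1.875 ips)
cassette_feet = tape_length_feet(30, 1.875)

print(f"Reel: {reel_minutes:.1f} minutes in one direction")
print(f"Cassette: {cassette_feet:.2f} feet for 30 minutes")
```

The reel works out to 32 minutes, which agrees with the nominal 30 minutes quoted above and leaves a small margin for leader and audio labels. The cassette figure implies about 281 feet of tape.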

Physical Label. Each tape should be labeled with an identifying title, the name of the organization, the name of the person making the copy, and the date.

Audio Labels. Each tape should contain the following audio labels:

  • An audio header with an identifying title, the name of the organization, the name of the person making the copy, and the date
  • An audio identification for each entry
  • An audio trailer of "End of demonstration tape"

The voice quality of the audio labels is not important as they will not be used on subsequent demonstration tapes.

Please rewind master! Please rewind the master tape once before copying to reduce print-through and to separate layers of tape that may be sticking. Rewinding at Play speed is safer (lower tape tension).

Shipping. Anti-magnetic cans are desirable, particularly for overseas shipments. For domestic shipments, packing in strong cardboard boxes with at least 3 inches of padding should be sufficient to guard against stray magnetic fields or mechanical damage.


The outlines serve as a permanent index to the collection of project records, publications, and tape recordings. The following is a summary of the desired contents. A good example is the outline for the Joint Speech Research Unit (JSRU).

Outline Format

  1. Organization name and address
  2. Brief history of organization
  3. List of Projects
    • Beginning and ending year
    • Paragraph or two for each project or important stage in a project. High-level description of project, objectives, and accomplishments.
    • A few key technical references (books, papers, reports). Include the first reference and the most comprehensive.
    • Please identify tape recordings that accompany the technical references with a few words of the contents. Tape recordings with number labels starting with "T" (e.g., "T87.1.27") in the existing outlines refer to copies already available to the Smithsonian. Higher quality and more comprehensive copies are desirable.
    • If possible, provide sufficient technical records to allow a future researcher to reconstruct the work.
    • Please identify, with a brief description, any associated artifacts (research machines, commercial machines).
  4. Biographies of referenced workers. A list, with years, of education and places of employment.
  5. Contact person, address, and email address


Each tape or disk recording, record, artifact, or reprint was assigned a sequential SSSHP number as received, starting with SSSHP 1 in 1988. In addition, speech samples within a tape or disk recording are denoted with sequential numbers. For example, synthetic speech sample "SSSHP 66.15" can be found as the 15th entry on tape SSSHP 66.

Some of the laboratory outlines refer to recordings in H. D. Maxey's collection for which better copies were sought. Maxey recordings are designated by the letter "T", followed by the year received and the order in which the tape was received that year. For example, recording T87.3 was the third one received in 1987. As above, T87.3.5 refers to sample number 5 on Maxey tape T87.3.
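For readers who want to index these references programmatically, the two numbering schemes can be sketched as a small parser. This is only an illustration under stated assumptions: the names RecordingRef and parse_ref are hypothetical, the exact spacing and punctuation accepted are guesses based on the examples above, and the two-digit year of a Maxey tape is kept as written rather than expanded to a four-digit year.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecordingRef:
    scheme: str                 # "SSSHP" or "T" (Maxey collection)
    year_2digit: Optional[int]  # year received, as written; "T" scheme only
    item: int                   # SSSHP item number, or order received that year
    sample: Optional[int]       # speech sample within the recording, if given


# Assumed patterns, inferred from "SSSHP 66.15" and "T87.3.5"
_SSSHP_RE = re.compile(r"^SSSHP\s+(\d+)(?:\.(\d+))?$")
_MAXEY_RE = re.compile(r"^T(\d{2})\.(\d+)(?:\.(\d+))?$")


def _opt_int(text: Optional[str]) -> Optional[int]:
    """Convert an optional regex group to an int, preserving None."""
    return int(text) if text is not None else None


def parse_ref(text: str) -> RecordingRef:
    """Parse a reference such as 'SSSHP 66.15' or 'T87.3.5'."""
    text = text.strip()
    m = _SSSHP_RE.match(text)
    if m:
        return RecordingRef("SSSHP", None, int(m.group(1)), _opt_int(m.group(2)))
    m = _MAXEY_RE.match(text)
    if m:
        return RecordingRef("T", int(m.group(1)), int(m.group(2)), _opt_int(m.group(3)))
    raise ValueError(f"unrecognized recording reference: {text!r}")


print(parse_ref("SSSHP 66.15"))
print(parse_ref("T87.3.5"))
print(parse_ref("T87.3"))
```

Under these assumptions, "SSSHP 66.15" parses as item 66, sample 15, and "T87.3.5" parses as year 87, tape 3, sample 5. A reference without a sample number, such as "T87.3", names the whole recording.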


H. David Maxey, IEEE Senior Member
Compiler and Editor - 2002

