NMAH | Smithsonian Speech Synthesis History Project (ss

History of the Project

These online pages document the Smithsonian Speech Synthesis History Project (SSSHP), an effort to collect and preserve tape recordings, technical records, artifacts, and other supporting materials of the development of speech synthesis technology. The records provide a history of the contribution of speech synthesis to an understanding of human speech production and perception, and a history of the development of computer voice-output. Work on automatic analysis/synthesis of human speech (e.g., codecs, vocoders) is not being included in the SSSHP unless the synthesis parameters were artificially modified for voice-response applications.

The SSSHP, running from 1986 to 2002, was conducted by H. David Maxey and other volunteers from the speech research community, in collaboration with Bernard S. Finn, Division of Information Technology and Communications of the Smithsonian Institution. H. D. Maxey directed the technical aspects of the program and coordinated contacts with an Advisory Committee and the speech research laboratories. Elliot Sivowitch and Harold Wallace were Smithsonian liaisons for the project, with assistance from other Smithsonian experts as needed. John Fleckner, of the Smithsonian tape archive, took responsibility for the recordings. The items received are in the Smithsonian Speech Synthesis History Project Collection, Call No. ACNMAH 0417.

The Advisory Committee, consisting of the following members, critiqued the plan of work for the SSSHP, making many valuable contributions to the effort. Affiliations shown are as of 1986.

Franklin Cooper		Haskins Laboratories, USA
N. Rex Dixon		International Business Machines Corp., USA
Gunnar M. Fant		Royal Institute of Technology, Sweden
James L. Flanagan		AT&T Bell Laboratories, USA
Shizuo Hiki		Tohoku University, Japan
John N. Holmes		Joint Speech Research Unit, England
Dennis H. Klatt		Massachusetts Institute of Technology, USA
Ronald R. Kline		IEEE Center for the History of Electrical Engineering, USA
Kazuo Nakata		Tokyo University of Agriculture and Technology, Japan

Preliminary outlines of each laboratory's research were constructed by H. D. Maxey from the histories of speech synthesis listed at the end of this page. References from these histories are identified in the laboratory outlines by the letters B, I, or K within parentheses.

In many cases the preliminary outlines were edited and completed by contacts for the laboratories, as noted at the end of each outline.

INSTRUCTIONS FOR SUBMITTING MATERIAL

IMPORTANT NOTE: We are no longer collecting objects for this collection. The submission instructions are provided for historical reference only.

The following instructions were included with request letters to laboratories for copies of historical material. In later years, development of the Internet allowed electronic request and response.

It is important that any confidential information contributed to the Smithsonian be clearly identified. Smithsonian archivists are able to make special arrangements if restrictions are required. All other material will be considered non-confidential and copies will be made available to the general public for educational, non-commercial purposes. Some of the material may be extracted and republished in the form of histories and demonstration tapes, or made available on the Internet.

We are particularly interested in obtaining:

information on additional research projects

corrections and additions to the research outlines

copies of tape recordings for the referenced projects

copies of technical records of how the speech was created

information on related artifacts, and their availability to the Smithsonian

To submit material to the Smithsonian Speech Synthesis Project, please email a notice of availability to the Editor, H. David Maxey. He will be receiving tape recordings and project records for evaluation and integration into the collection before forwarding them to the Smithsonian. For artifacts, please send him a description for evaluation by the Smithsonian. If the Smithsonian can accept an object, special shipping arrangements will be made directly with the owner.

Permission Statement

The contributed material will need to be accompanied by the following statement, signed by the owner or a legal representative:
"(Owner's name) retains copyright ownership of the donated historical material on speech synthesis development, but grants permission to the Smithsonian Institution to make copies for archival purposes, to present the records, images, and tape recordings on the Internet, and to provide copies to the general public for educational, non-commercial purposes."

Tapes

Please review the following before making copies of tape recordings.

Notes on copying tape recordings

7" Reels (Preferred). The medium of preference for the Smithsonian archives is a 7" reel of good quality 1.5 mil tape (1200 feet), recorded at 7.5 ips (30 minutes in one direction). The 1.5 mil thickness reduces print-through effects and is more durable.

Cassettes. If reel-to-reel recorders are not available, the second choice is a good quality standard-size cassette, recorded at 1-7/8 ips, normal bias, 120 µs equalization, without Dolby or other compensation. Preference is again for 1.5 mil tape (a full cassette will record for 30 minutes in one direction). An example is TDK brand, type D60 cassette. If copies are obtained on cassettes, the Smithsonian will re-record them on 7" reels for archival storage.
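The playing times quoted above follow from tape length and tape speed. The following is a minimal sketch of that arithmetic; the function names are illustrative and not part of the original instructions. Computed exactly, 1200 feet at 7.5 ips runs 32 minutes, so the 30 minutes quoted for the reel reads as a nominal figure. The cassette length is not stated in the instructions; the value below is only what 30 minutes at 1-7/8 ips implies.

```python
from fractions import Fraction


def playing_time_minutes(length_feet, speed_ips):
    """One-direction playing time in minutes for a tape of the given length."""
    inches = Fraction(length_feet) * 12          # 12 inches per foot
    seconds = inches / Fraction(speed_ips)       # ips = inches per second
    return float(seconds / 60)


def tape_length_feet(minutes, speed_ips):
    """Tape length in feet needed to record for the given time in one direction."""
    inches = Fraction(speed_ips) * Fraction(minutes) * 60
    return float(inches / 12)


# 7" reel: 1200 feet recorded at 7.5 ips
reel_minutes = playing_time_minutes(1200, Fraction(15, 2))

# Cassette: 30 minutes per side at 1-7/8 ips (length is derived, not quoted)
cassette_feet = tape_length_feet(30, Fraction(15, 8))

print(f"reel: {reel_minutes} min, cassette: {cassette_feet} ft")
```

Fractions are used so that speeds such as 1-7/8 ips are represented exactly before the final conversion to a floating-point result.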

Physical Label. Each tape should be labeled with an identifying title, the name of the organization, the name of the person making the copy, and the date.

Audio Labels. Each tape should contain the following audio labels:

An audio header with an identifying title, the name of the organization, the name of the person making the copy, and the date

An audio identification for each entry

An audio trailer of "End of demonstration tape"

The voice quality of the audio labels is not important as they will not be used on subsequent demonstration tapes.

Please rewind master! Please rewind the master tape once before copying to reduce print-through and to separate the layers of tape that may be sticking. Rewinding at Play speed is safer (lower tape tension).

Shipping. Anti-magnetic cans are desirable, particularly for overseas shipments. For domestic shipments, packing in strong cardboard boxes with at least 3 inches of padding should be sufficient to guard against stray magnetic fields or mechanical damage.

Outlines

The outlines serve as a permanent index to the collection of project records, publications, and tape recordings. The following is a summary of the desired contents. A good example is the outline for the Joint Speech Research Unit (JSRU).

Outline Format

Organization name and address

Brief history of organization

List of Projects

Beginning and ending year

Paragraph or two for each project or important stage in a project. High-level description of project, objectives, and accomplishments.

A few key technical references (books, papers, reports). Include the first reference and the most comprehensive.

Please identify tape recordings that accompany the technical references with a few words of the contents. Tape recordings with number labels starting with "T" (e.g., "T87.1.27") in the existing outlines refer to copies already available to the Smithsonian. Higher quality and more comprehensive copies are desirable.

If possible, provide sufficient technical records to allow a future researcher to reconstruct the work.

Please identify, with a brief description, any associated artifacts (research machines, commercial machines).

Biographies of referenced workers. A list, with year, of education and place of employment.

Contact person, address, and email address

LOG OF ITEMS RECEIVED

Each tape or disk recording, record, artifact, or reprint was assigned a sequential SSSHP number as received, starting with SSSHP 1 in 1988. In addition, speech samples within a tape or disk recording are denoted with sequential numbers. For example, synthetic speech sample "SSSHP 66.15" can be found as the 15th entry on tape SSSHP 66.

Some of the laboratory outlines refer to recordings in H. D. Maxey's collection, recordings for which better copies were sought. Maxey recordings are designated by the letter "T", followed by the year received and the order the tape was received that year. For example, recording T87.3 was the third one received in 1987. As above, reference to speech sample T87.3.5 refers to sample number 5 on Maxey tape T87.3.
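The two numbering schemes described above are regular enough to parse mechanically. The sketch below is one possible way to do so, assuming identifiers are written exactly as in the examples ("SSSHP 66.15", "T87.3.5"); the `ItemRef` type and `parse_ref` function are hypothetical names introduced for illustration and are not part of the project's records.

```python
import re
from typing import NamedTuple, Optional


class ItemRef(NamedTuple):
    series: str            # "SSSHP" (Smithsonian log) or "T" (Maxey collection)
    year: Optional[int]    # two-digit year received; Maxey tapes only
    item: int              # sequential number (within the year for "T" tapes)
    sample: Optional[int]  # entry number on the recording, if given


# Assumed written forms, taken from the examples in the text above.
_SSSHP = re.compile(r"^SSSHP\s+(\d+)(?:\.(\d+))?$")
_MAXEY = re.compile(r"^T(\d{2})\.(\d+)(?:\.(\d+))?$")


def parse_ref(text: str) -> ItemRef:
    """Parse an SSSHP or Maxey 'T' reference into its numeric parts."""
    text = text.strip()
    m = _SSSHP.match(text)
    if m:
        item, sample = m.groups()
        return ItemRef("SSSHP", None, int(item), int(sample) if sample else None)
    m = _MAXEY.match(text)
    if m:
        year, item, sample = m.groups()
        return ItemRef("T", int(year), int(item), int(sample) if sample else None)
    raise ValueError(f"unrecognized reference: {text!r}")


print(parse_ref("SSSHP 66.15"))  # 15th entry on tape SSSHP 66
print(parse_ref("T87.3.5"))      # sample 5 on the third Maxey tape of 1987
```

A reference without a sample number, such as "T87.3", parses with `sample` set to `None`, which keeps whole-tape and single-sample references in one type.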

H. David Maxey, IEEE Senior Member
Compiler and Editor - 2002


B:		for the Benchmark book, SPEECH SYNTHESIS: BENCHMARK PAPERS IN ACOUSTICS, J. L. Flanagan and L. R. Rabiner, eds., Dowden, Hutchinson and Ross, Inc., Stroudsburg, 1973. Reprints of important papers on speech synthesis.

I:		for the Ignatius Mattingly survey, "Speech Synthesis for Phonetic and Phonological Models", I. G. Mattingly, CURRENT TRENDS IN LINGUISTICS, T. A. Sebeok, ed., Vol. 12, Mouton, The Hague, 1974. Online, this site.

K:		for the Klatt survey, "Review of Text-to-Speech Conversion for English", D. H. Klatt, Journal of the Acoustical Society of America, 82.3, 737-793, Sept 1987. Online, this site.


Smithsonian Speech Synthesis History Project
National Museum of American History | Archives Center
Smithsonian Institution
