NMAH | Smithsonian Speech Synthesis History Project (im

are matched. If these elements are naturally spoken words, the output rate cannot exceed ordinary speaking rates without distortion, and (on account of the difficulty of abutting the words closely without losing intelligibility) will actually be considerably slower. Yet the blind user would be happy with an output several times faster than natural speech. A potential solution is suggested by the fact that speech can be synthesized at two or three times normal speed without much loss of intelligibility (Cooper et al. 1969).

In addition to these practical applications, synthetic speech is used for psychological research into the nature of speech perception itself. Synthetic stimuli can be produced which are simple and closely controlled, and which in some cases could not have been produced by a human speaker at all. Such stimuli have been used to study categorical and continuous perception of speech sounds (Liberman et al. 1957), differences between perception of speech and non-speech signals (Liberman et al. 1961) and hemispheric localization of speech processing (Shankweiler and Studdert-Kennedy 1967). These investigations, apart from their intrinsic interest, are an essential preliminary to synthesis by rule.

Another research application of synthesis is the close imitation of natural speech utterances. It is of some interest to know just how faithfully a particular synthesizer can simulate natural speech, without any assumptions being made about the structure of speech or language beyond those built into the synthesizer. Such investigations explore the limitations of the synthesizer. If the best imitations which could be achieved in this way were indeed quite poor, this fact would discourage any endeavor making use of synthetic speech. Fortunately, at least one investigator, Holmes (1961; see also Holmes et al. 1964), using a formant synthesizer, has been able to synthesize sentences which are extremely natural and virtually impossible to distinguish from their originals. This is good evidence that progress in other applications of speech synthesis is at any rate not limited by the quality of the synthesizer.

2. HISTORICAL DEVELOPMENT OF SYNTHESIS BY RULE TECHNIQUES

The applications we have just summarized are quite recent. The traditional motivation for research in speech synthesis has been simply to explain how man uses his vocal tract to produce connected speech. In a broad sense, such research is synthesis by rule, though it was a long time before the notion of a rule became obvious, and the importance of an explicit formulation of the rules was recognized.

The idea of an artificial speaker is very old, an aspect of man's long-standing fascination with humanoid automata. Gerbert (d. 1003), Albertus Magnus (1198-1280) and Roger Bacon (1214-1294) are all said to have built speaking heads (Wheatstone 1837). However, historically attested speech synthesis begins with Wolfgang von Kempelen (1734-1804), who published an account of his twenty years of research in 1791 (see also Dudley and Tarnoczy 1950). Von Kempelen's
