
 KLATT 1987, p. 737b 
 

CONTENTS

Abstract and Citation 737a
Illustrations List
Introduction below
A. Linguistic framework 738
 
I. Phonemes-to-speech conversion 739
A. Early synthesizers: Copying speech 741
1. The source-filter theory of speech generation 742
2. Models of the vocal tract transfer function 742
3. Models of the voicing source 744
4. Articulatory models 747
5. Automatic analysis/resynthesis of natural waveforms 749
B. Acoustic properties of phonetic segments 749
C. Segmental synthesis-by-rule programs 752
1. Formant-based rule programs 752
2. Articulation-based rule programs 756
3. Rule compilers 757
4. Concatenation systems 758
D. Prosody and sentence-level phonetic recoding 759
1. Intensity rules 760
2. Duration rules 760
3. Fundamental frequency rules 761
4. Allophone selection 763
 
II. Text-to-phonemes conversion 767
A. Text formatting 768
B. Letter-to-phoneme conversion 768
1. Prediction of lexical stress from orthography 771
2. Exceptions to the rules 772
3. Morphemic decomposition 772
4. Proper names 773
C. Syntactic analysis 773
D. Semantic analysis 774
 
III. Hardware implementation 775
 
IV. Perceptual evaluation of text-to-speech systems 775
A. Intelligibility of isolated words 776
B. Intelligibility of words in sentences 777
C. Reading comprehension 777
D. Naturalness 778
E. Suitability for a particular application 778
 
V. Special applications 779
A. Talking aids for the vocally handicapped 779
B. Training aids 780
C. Reading aids for the blind 780
D. Medical applications 780
 
VI. Conclusions 781
Acknowledgments 783
Appendix: Demonstration 783
Notes 786
Bibliography
 
 

INTRODUCTION

The intent of this review is to trace the history of progress toward the development of systems for converting text to speech, giving credit along the way to those responsible for the important ideas that have led to present successes. Emphasis is placed on the theory behind current algorithms. The account of this theory, in conjunction with an extensive bibliography, can serve to bring someone new to the field "up to speed" fairly rapidly, even though to some extent existing commercial systems are hidden behind a cloud of proprietary trade secrets. Perceptual data that measure the intelligibility of current systems are summarized, and a brief attempt is made to estimate the potential of the technology for practical application, especially in areas of social need. A final purpose of this undertaking is to identify the weakest links in present systems for the conversion of unrestricted
 

Smithsonian Speech Synthesis History Project
National Museum of American History | Archives Center
Smithsonian Institution