INTERNATIONAL BUSINESS MACHINES CORP. (IBM) - continued
Armonk, New York
SSSHP 155 IBM TASS-II ELECTRONIC CHARACTERIZATION - INDEX
A collection of circuit diagrams and calibration information.
Using this information on function generator operation, synthesizer
circuits, and calibration, it should be possible to recreate by
simulation the synthetic speech produced from the function generator
patterns for the 1965/1966 time period. The function generator
patterns would need to be scanned and resampled to 20 dpi along the
pattern and 150 dpi or more across the pattern (for better than 1%
accuracy). Some imaginative programming should be able to pull out
individual channel values from the scan file and scale them to the
frequency and amplitude values required by the TASS-II controls.
The diphone library (Lib 4) for this period was not preserved
separately, but was converted into diphone library 5 with
calibration suitable for the TASS-III synthesizer.
CONTENTS
Page Description
1. TASS-II Block Diagram and Level Calibration
The block "HP 466A" was a wide-band 20 dB amplifier. The
output of the synthesizer was the sum of four signals from
the four paths through the synthesizer. Four calibration
diphones allowed the level of each of the four paths to be
set individually. The diphones were:
Chan:    1   2   3   4   5   6   7  B1  B2  Output      Level
CALEV1  52  34  42  50  72  14  14   0   0  Buzz only   -15.4 dBm
CALEV2  52  34  42  50  14  14  76   0   0  Fn only     -24.5 dBm
CALEV3  52  34  42  50  14  74  14   0   0  Asp only    -19.0 dBm
CALEV4  52  34  42  50  14  74  31   0   1  Fric. only  -30.0 dBm
These values tune the synthesizer to a calibration vowel and
the voice fundamental to 120 Hz.
The order of the filters F6 to KH was determined by trial and
error to maximize the signal-to-noise ratio (S/N = 47.6
dB). If one filter boosts the signal in a frequency range,
the following filter should attenuate the signal in the same
frequency range to reduce the peak-to-peak signal swing.
F1 should be late in the series to prevent control-signal
feed-through (called "thump") from being boosted by later
filters. The high-Q filters F2 and F3 must not be tuned
within about 200 Hz of one another if there is any excitation
because the cascaded gain will cause high peak-to-peak
signals and distortion.
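
The four calibration levels can be compared numerically. The
following is a minimal sketch, not part of the original
documentation: it converts the dBm figures from the table above to
power in milliwatts and to amplitude ratios relative to the
buzz-only path. The names are illustrative, and the amplitude
ratios assume all four levels were measured across the same load.

```python
# Output levels of the four calibration diphones, in dBm (from the table).
CAL_LEVELS_DBM = {
    "CALEV1": -15.4,  # buzz only
    "CALEV2": -24.5,  # Fn only
    "CALEV3": -19.0,  # aspiration only
    "CALEV4": -30.0,  # fricative only
}


def dbm_to_milliwatts(level_dbm):
    """Convert a level in dBm to power in milliwatts."""
    return 10.0 ** (level_dbm / 10.0)


def amplitude_ratio(level_dbm, reference_dbm):
    """Amplitude ratio of one level to a reference (same load assumed)."""
    return 10.0 ** ((level_dbm - reference_dbm) / 20.0)


# Amplitude of each path relative to the buzz-only path.
ratios = {
    name: amplitude_ratio(level, CAL_LEVELS_DBM["CALEV1"])
    for name, level in CAL_LEVELS_DBM.items()
}
```

A simulation could use such ratios to set the relative gain of each
of the four paths before summing them.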
2A,B. Hiss Generator
A reverse-biased diode was used as a noise source. The noise
signal was amplified and symmetrically clipped at a variable
amplitude determined by Channel 6 (Ah).
2C. Compensation Network for Hiss Generator
The Hiss amplitude was made a log-function of the Channel 6
control voltage.
3A,B. Buzz Generator
The first voice source was a 100-microsecond pulse of
variable amplitude, as set by Channel 5 of the synthesizer.
The frequency was varied by controlling the voltage (from
Channel 4) on a unijunction transistor pulse generator.
3C. Attachment to Buzz Generator Schematic
Later modification of the circuits.
3D. Hiss (Noise) Modulator
When voiced fricatives were to be synthesized, this circuit
modulated the amplitude of the Hiss Generator, reducing it to
zero 25% of the time.
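
For simulation, the modulator might be approximated as a gate
applied to the noise. The sketch below is only an illustration: it
assumes the gating is synchronized to the voice fundamental and
that the zero interval falls at the start of each pitch period,
neither of which is stated here. Uniform random numbers stand in
for the diode noise, and all names are hypothetical.

```python
import random


def hiss_gate(sample_index, sample_rate, f0, off_fraction=0.25):
    """Return 0.0 during the gated-off part of each pitch period, else 1.0.

    Assumption: the gate is pitch-synchronous and the off interval
    occupies the first `off_fraction` of every period.
    """
    phase = (sample_index * f0 / sample_rate) % 1.0
    return 0.0 if phase < off_fraction else 1.0


def modulated_hiss(n_samples, sample_rate, f0, seed=0):
    """Uniform noise (a stand-in for the diode noise) multiplied by the gate."""
    rng = random.Random(seed)
    return [
        hiss_gate(n, sample_rate, f0) * rng.uniform(-1.0, 1.0)
        for n in range(n_samples)
    ]


# One tenth of a second of gated noise at a 120 Hz fundamental.
gates = [hiss_gate(n, 12000, 120.0) for n in range(1200)]
noise = modulated_hiss(1200, 12000, 120.0)
```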
4A. Ramp Voice Generator, Ramp Generator
A later voice source generating a more natural ramp waveform.
This part of the circuit generated the ramp waveform at a
frequency determined by Channel 4 of the synthesizer (F0).
The output went to the circuit 4B.
4B. Ramp Voice Generator, Amplitude Modulator
This circuit controlled the amplitude of the voice generator
by chopping the signal at a high rate. The low-pass filter
recovered the waveform.
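
The principle can be illustrated numerically: chopping a signal
with an on-fraction d and then low-pass filtering leaves roughly d
times the original signal. In this rough sketch the low-pass filter
is replaced by a moving average over one chopping period, which is
a simplification and not the filter in the schematic. The function
names and values are illustrative.

```python
def chop(signal, period, duty):
    """Pass the signal for the first `duty` fraction of each chop period."""
    on_samples = round(period * duty)
    return [x if (n % period) < on_samples else 0.0
            for n, x in enumerate(signal)]


def moving_average(signal, width):
    """Crude low-pass: running mean over the last `width` samples."""
    out = []
    acc = 0.0
    for n, x in enumerate(signal):
        acc += x
        if n >= width:
            acc -= signal[n - width]
        out.append(acc / width)
    return out


# A constant input chopped at 30% on-time settles to 0.3 after filtering.
chopped = chop([1.0] * 200, period=20, duty=0.3)
smoothed = moving_average(chopped, width=20)
```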
4C. Ramp Voice Generator, Amplitude Control
This circuit controlled the circuit of 4B from Channel 5 of
the synthesizer (A0). The separate "dither out" circuit was
for introducing a small amount of random noise into the
frequency and amplitude of the voice generator.
4D. Ramp Voice Generator, Noise Modulator
This circuit modulated the Hiss Generator when synthesizing
voiced fricatives.
4E,F. Preference Test of Ramp Voice Wave Shape
Results of an A/B preference test of a range of wave shapes. The
tape recording was played over the telephone. The highest ranking
was for a percent-rise/percent-fall ratio of 5 and a percent-on
ratio of 40.
5A. Formant Circuits
Many of the synthesizer filters are formed from a single
complex pole pair resonance. This sheet shows the
relationship between the pole location and the frequency
response. The cascade of filters F6 through KH in Figure 1
represents the resonances of the vocal tract for voiced sounds.
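
As an illustration of the pole-location relationship, the sketch
below evaluates the magnitude response of a single complex pole
pair normalized to unity gain at DC. It uses the standard high-Q
approximation that a 3 dB bandwidth of B Hz places the poles at a
real part of -pi*B; this is textbook material and is not taken from
the sheet itself. The function name is illustrative.

```python
import math


def resonator_gain_db(freq_hz, center_hz, bandwidth_hz):
    """Gain in dB of H(s) = (sigma^2 + wd^2) / ((s + sigma)^2 + wd^2).

    The poles sit at s = -sigma +/- j*wd, with sigma = pi * bandwidth
    and wd = 2 * pi * center frequency (high-Q approximation).
    """
    sigma = math.pi * bandwidth_hz
    wd = 2.0 * math.pi * center_hz
    s = complex(0.0, 2.0 * math.pi * freq_hz)
    numerator = sigma ** 2 + wd ** 2
    denominator = (s + sigma) ** 2 + wd ** 2
    return 20.0 * math.log10(abs(numerator / denominator))


# Example: the fixed F4 resonance (3500 Hz, 140 Hz bandwidth).
peak_db = resonator_gain_db(3500.0, 3500.0, 140.0)
dc_db = resonator_gain_db(0.0, 3500.0, 140.0)
```

For a narrow resonance the peak gain is close to the ratio of
center frequency to bandwidth, here about 25 (roughly 28 dB).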
6. Nasal Formant
A fixed resonance of 250 Hz with a 3 dB bandwidth of 100 Hz
for simulating the resonance of the nasal tract. When the
Nasal Amplitude control (Channel 7) was absent, the input to
the filter was taken to zero with an RC time-constant.
7. TASS-II F4
This circuit provided a fixed resonance of 3500 Hz with a 3
dB bandwidth of 140 Hz when synthesizing a male voice.
8. Filters F5 and KH
Filter F5 was tuned to 4000 Hz for a male voice to match the
4000 Hz setting for KH. The combined transfer function
compensated for missing higher frequency resonances of human
speech.
9. TASS-II F6
This circuit provided additional high frequency boost to
compensate for miscellaneous losses in the circuitry. The
center frequency was 4000 Hz with a bandwidth of 800 Hz.
10. TASS-II VP1 and VP2
Low-pass filters to shape the voice spectrum. VP1 was 50 Hz;
VP2 was 2500 Hz.
11. TASS-II VP3
High-pass filter to shape the voice spectrum. VP3 was 300
Hz.
12A,B. Summing Amps/Output Amp
Summing amplifiers Sigma 1, 2, and 3. Sigma 1 includes an 800
Hz RC high-pass to filter the Hiss Generator for aspiration
through the formant chain. Last block is an audio amplifier
with level control.
13A. Relay Board
Relay driver that operated with binary control signals B1 and
B2.
13B,C. Specification for 1-millisecond relay.
14A,B. Formant Generator, Variable Gain Amplifier
Basic tunable filter for formants F1, F2, F3. Frequency
changes as internal gain is varied by variable-ratio chopping
at points "to Diode Mod #1" and "to Diode Mod #2". Varying
the gain at two points in cascade causes the filter frequency
to vary linearly with the chopping pulse-ratio.
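
One way to see why two gain points give a linear tuning law:
assuming the filter behaves like a two-integrator loop (an
assumption made for illustration, not something stated on the
schematic), its resonant frequency is proportional to the square
root of the product of the two gains. Driving both gains with the
same pulse ratio d then makes the frequency proportional to d,
whereas varying only one gain would give a square-root law. The
3000 Hz full-scale value below is a hypothetical placeholder.

```python
import math


def resonant_frequency(f_full_scale_hz, gain1, gain2):
    """Resonant frequency of an assumed two-integrator loop.

    f is proportional to sqrt(gain1 * gain2); f_full_scale_hz is the
    frequency when both gains equal 1.
    """
    return f_full_scale_hz * math.sqrt(gain1 * gain2)


F_FULL = 3000.0  # hypothetical full-scale frequency, not a TASS-II value

# Both gains chopped with the same pulse ratio: linear in the ratio.
both = [resonant_frequency(F_FULL, d, d) for d in (0.25, 0.5, 1.0)]

# Only one gain chopped: square-root law instead.
single = [resonant_frequency(F_FULL, d, 1.0) for d in (0.25, 0.5, 1.0)]
```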
14C,D. Formant Generator, Pulse Ratio Modulator & Choppers
Circuit for generating the chopping voltage for circuits
14A,B as a function of Channel 1, 2, and 3 control voltages.
14E,F. Q-Control
A light bulb/photoconductor combination was used to reduce
the Q (increase the bandwidth) of formant F2, under the
condition of binary controls B1=1 and B2=0.
15A. Fricative Filter, Audio Section
Tunable filter for fricative zero and poles P1 and P2.
Implemented with electrically variable inductors. Only one
of the two 3 kHz high pass filters was used.
15B. Fricative Filter, Control Circuit
Single Channel 7 control signal tunes the fricative zero and
poles in prescribed paths to simulate /sh, s, th, f/.
15C. Specification for fricative filter inductors.
16. TASS-II Function Generator
Cathode ray tube scanner of the 7-channel function generator.
Normal scan rate was 100 sweeps per second (a 10 millisecond
sampling period). See also p.20, below.
17. TASS-II Card Punch Coupler
The control pattern could be stepped in 0.05-inch increments
(20 samples/inch) so that the channels could be scanned and
the values digitized and punched into cards. This sampling
was equivalent to 100 samples/second when the control pattern
was moving at the normal 5 inches/second.
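
The equivalence is simple arithmetic, shown here as a short check.
The variable names are illustrative.

```python
STEP_INCHES = 0.05           # pattern step between digitized samples
PATTERN_SPEED_IN_PER_S = 5   # normal pattern speed, inches per second

samples_per_inch = 1.0 / STEP_INCHES
samples_per_second = samples_per_inch * PATTERN_SPEED_IN_PER_S
sampling_period_ms = 1000.0 / samples_per_second
```

The result, 100 samples per second or a 10 millisecond period,
matches the normal scan rate of the function generator.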
18. TASS-II Tape Readback System
Digital values from the computer were written to a magnetic
tape in a continuous stream, without record gaps. This
circuit converted the digital values to time-domain patterns
that simulated the output of the function generator
photomultiplier output. In this way, the computer could
control the synthesizer.
19A. Calibration Data for Synthesizer-II
Calibration data for the TASS-II synthesizer in 1966. The
calibration is in terms of digital values from the computer
vs value of the synthesizer function being controlled.
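
In a simulation, calibration curves of this kind could be applied
by interpolating between tabulated points. The sketch below uses
piecewise-linear interpolation. The function name is hypothetical,
and the example table holds normalized placeholder points, NOT
values read from the calibration sheets; real points would have to
be taken from the graphs and tables listed below.

```python
from bisect import bisect_right


def count_to_value(count, table):
    """Linearly interpolate a control value from (count, value) pairs.

    `table` must be sorted by count; counts outside the table raise
    ValueError rather than being extrapolated.
    """
    counts = [c for c, _ in table]
    if not counts[0] <= count <= counts[-1]:
        raise ValueError("count outside calibration table")
    i = bisect_right(counts, count)
    if i == len(counts):
        return table[-1][1]
    (c0, v0), (c1, v1) = table[i - 1], table[i]
    return v0 + (v1 - v0) * (count - c0) / (c1 - c0)


# Placeholder table (normalized 0..1), NOT real calibration data.
PLACEHOLDER_TABLE = [(12, 0.0), (55, 0.5), (98, 1.0)]
midpoint = count_to_value(33.5, PLACEHOLDER_TABLE)
```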
19B-D. Frequency of peak amplitude vs digital value for first
three formants, F1, F2, and F3.
19E. Frequency of Buzz/Voice Generator vs digital value.
19F,G. Amplitude of Buzz/Voice Generator vs digital value, log
and linear scales.
19H. Amplitude of Hiss Generator vs digital value, log scale.
19I. Amplitude of Nasal Formant vs digital value, log and linear
scales.
19J. Frequency of peak amplitude vs digital value for pole 1 (P1)
of the Fricative Filter.
When Channel 7 control voltage was not being supplied, the
filter returned to a value of 5000 Hz. There was a
hysteresis in the tuning so that the peak frequency was
higher than the calibration as the control count decreased.
This small effect should be imperceptible in synthetic
fricatives.
19K. Bandwidth of Fricative Filter
3 dB bandwidth of Fricative Filter zero (Z), and poles (P1
and P2) at three tuning points, for three fricatives /SH, SX,
FX/.
19L. Frequency tuning of Fricative Filter
Channel 7 tuned pole P1. Zero and pole P2 were made to move
in relation to P1 according to the graph. As P1 was tuned
above 5000 Hz, P2 was tuned more rapidly to a high value to
eliminate it from the transfer function.
19M-O. Fricative Filter transfer functions for standard fricative
diphones.
19P. F5+KH Transfer Function
Transfer function for the KH filter and the theoretical
transfer functions for correcting a four-pole synthesizer for
the missing higher-frequency poles (from Fant). The 4 kHz
function was used for the synthesizer's male voice.
19Q. TASS-II Frequency Response
Transfer function of the formant chain. Bandwidths were
reset temporarily to 100 Hz to evaluate the effectiveness of
the spectral correction (formant amplitudes should all be the
same).
19R,S. TASS-II Count to Value Calibration Tables for Ah and Fh
Calibration values for the non-linear graphs 19H and 19J-L.
Tracking formulas for Fricative Filter Z and P2 vs P1.
19T. S-Plane Equations for IBM TASS-II Fricative Filter
20. TASS-II Function Generator Scanning and Calibration
Calibration pattern with digital values from pattern-to-
computer conversion process. Digital values can be used to
relate function generator patterns to tuning of the
synthesizer. The digital values are decimal, ranging from 12
to 98.
As shown in the figure, the CRT scan (see also p.16) started at
the bottom of the pattern, creating a series of pulses in the
photomultiplier as the larger black lines were crossed. The
electronics created a synthesizer control voltage for each
channel that was proportional to the time between the scan
reaching the lower black line of a channel and the scan reaching
the lower edge of the 1/16" black masking tape representing the
desired value for that channel. Any second pulse within that
channel from another piece of black masking tape would be
ignored, with the following two exceptions: black tape in the
exact positions marked "B1" and "B2" would be detected and a
voltage generated to operate binary controls B1 and B2.
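
The timing rule described above can be sketched in code for use in
a simulation. This is an illustration under simplifying
assumptions: pulse times for one sweep are given as a list, the
value is returned as a fraction of the interval between one
channel's lower line and the next, and the B1/B2 exceptions are
left out. The function name and the example times are hypothetical.

```python
def decode_channel(pulse_times, baseline_time, next_baseline_time):
    """Return a channel's value as a fraction of its scan interval.

    Only the first pulse after the channel's lower black line counts;
    any later pulse within the same channel is ignored. Returns None
    when no tape pulse falls inside the channel.
    """
    for t in sorted(pulse_times):
        if baseline_time < t < next_baseline_time:
            span = next_baseline_time - baseline_time
            return (t - baseline_time) / span
    return None


# Hypothetical sweep: channel occupies 1.0 to 5.0 ms, tape pulses at
# 2.0 ms and 3.0 ms. The 3.0 ms pulse is a second pulse and is ignored.
value = decode_channel([3.0, 2.0], 1.0, 5.0)
```

The resulting fraction would still need to be scaled to the digital
range and then to synthesizer units using the calibration data.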
H. D. Maxey, 2001