Analysis of the Energy Distribution in Speech¹



By I. B. CRANDALL and D. MacKENZIE 



Synopsis: The frequency distribution of energy in speech has been determined for six speakers, four men and two women, for a 50-syllable sentence of connected speech, and also for a list of 50 disconnected syllables. The speech was received by a condenser transmitter whose voltage output, amplified 3,000 fold, was impressed on the grids of twin single stage amplifiers. The unmodified output of one of these amplifiers was measured by a thermocouple and was a known function of the total energy received by the transmitter, corrections being made for the slight variation with frequency of the response of the circuit. The output of the other amplifier was limited by a series resonant circuit to a narrow band of frequencies, the energy in this band being measured by a second thermocouple. The damping of the resonant circuit was so chosen that sufficient resolving power and sufficient energy sensitiveness were obtained over the range from 75 to 5,000 cycles per second; and 23 frequency settings were made to cover this range. For each syllable simultaneous readings were recorded on the two thermocouples at each frequency setting. The consecutive syllables were pronounced deliberately by each speaker, maintaining as nearly as possible the normal modulation of the voice. Corrections were applied to offset the unavoidable variations in total energy incidental to repetition of a given syllable. 13,800 observations were made for all speakers. The energy distribution curves obtained are essentially the same for connected as for disconnected speech, and indicate that differences between individuals are more important than variations due to the particular test material chosen. A composite curve drawn from the individual curves shows a great concentration of speech energy in the low frequencies, a result which would not be expected from data previously published by others. The actual results contain a factor due to standing waves between the speaker's mouth and the transmitter, a complication always present in telephoning; this could not be eliminated.
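The bookkeeping described in the synopsis can be illustrated with a minimal sketch. Two things here are assumptions rather than statements from the paper. First, the count is formed as speakers times syllables times frequency settings, which is consistent with the 13,800 observations quoted. Second, the correction for variation between repetitions is modeled as dividing each band reading by the simultaneous total-energy reading; the synopsis does not give the formula actually used. All function and constant names are hypothetical.

```python
# Illustrative sketch only: the names and the form of the correction are
# assumptions, not taken from the paper.

SPEAKERS = 6                 # four men and two women
CONNECTED_SYLLABLES = 50     # the 50-syllable sentence
DISCONNECTED_SYLLABLES = 50  # the list of disconnected syllables
FREQUENCY_SETTINGS = 23      # settings covering 75 to 5,000 cycles per second


def observation_count() -> int:
    """Count one observation per speaker, syllable, and frequency setting."""
    syllables = CONNECTED_SYLLABLES + DISCONNECTED_SYLLABLES
    return SPEAKERS * syllables * FREQUENCY_SETTINGS


def band_fraction(band_reading: float, total_reading: float) -> float:
    """Return band energy as a fraction of the simultaneous total energy.

    Dividing by the total read at the same instant is one plausible way
    (assumed here) to offset loudness differences between repetitions.
    """
    if total_reading <= 0:
        raise ValueError("total energy reading must be positive")
    return band_reading / total_reading


print(observation_count())
```

Under this assumed normalization, a syllable repeated more loudly at one frequency setting would contribute the same fraction as a quieter repetition, so readings taken at different settings become comparable.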



The rate of energy output in speech for the normally modulated voice was determined from the readings for total energy and was found to be about 125 ergs per second.
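For readers more used to SI units, the quoted rate can be converted with a short sketch. The only fact used is that 1 erg equals 10^-7 joule, so 1 erg per second equals 0.1 microwatt; the function name is hypothetical.

```python
# 1 erg = 1e-7 joule, hence 1 erg/s = 1e-7 watt = 0.1 microwatt.
ERGS_PER_SECOND_PER_MICROWATT = 10.0


def ergs_per_second_to_microwatts(rate_erg_per_s: float) -> float:
    """Convert a power given in ergs per second to microwatts."""
    return rate_erg_per_s / ERGS_PER_SECOND_PER_MICROWATT


speech_power_uw = ergs_per_second_to_microwatts(125.0)
print(f"{speech_power_uw} microwatts")
```

On this conversion the figure of about 125 ergs per second corresponds to about 12.5 microwatts.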



In the study of speech and its reproduction by mechanical apparatus it is necessary to consider its composition from several different points of view. We desire first of all to know the actual frequency distribution of the total energy in speech, as well as the separate distributions for each individual sound. We also desire to know the apparent distribution of energy, that is, the distribution as perceived by the ear. Finally, we wish to know the importance of each frequency, that is, the contribution to "articulation" or "quality" in the exact reproduction of speech which can be traced to the energy of each elementary band of frequencies in the speech range. In all three cases certain frequency functions are used to represent these distributions. The advantage of considering these different frequency distribution functions separately has already been indicated by one of the present writers.²



1 Reprinted from The Physical Review, N.S., Vol. XIX, No. 3, March, 1922.

2 "The Composition of Speech," Phys. Rev., X, p. 74, 1917.






