The Bell System Technical Journal 



July, 1931 



Some Physical Characteristics of Speech and Music * 



By HARVEY FLETCHER 



Kinematic and statistical descriptions of the physical aspects of speech
and music are given in this paper. As the speech or music proceeds, the
kinematic description consists in giving the principal melodic stream,
namely, the pitch variation and also the intensity and the quality variations.
For speech and song, the quality changes are principally described by giving,
besides the main melodic stream, two secondary melodic streams corresponding,
respectively, to the resonant pitches of the throat and mouth cavities.
To this must also be added the positions of the stops and the high-pitched
components of the fricative consonant sounds as functions of the time. The
statistical description consists in giving the average, the peak, and the
probable variations of the power involved as the various kinds of speech and
music proceed. These general ideas are illustrated by numerous experimental
data taken by various instrumental devices which have been evolved
in the Laboratories during the past fifteen years.



A speech or musical sound is transmitted from the mouth of a speaker
or from a musical instrument through the air to the ear of the
listener by means of a pressure wave, a succession of condensations
and rarefactions of the air. Such a wave spreads in all directions
away from the source of sound and soon encounters solid objects which
cause reflections. These reflected waves combine with the original
one and thus modify the pressure changes taking place at any point.
In this paper we shall be concerned chiefly with the pressure changes
which take place before reflections occur.
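
The combination of a direct wave with a reflected one may be illustrated by a minimal numerical sketch. It is not taken from the paper: the single reflecting surface, the 1000-cycle tone, the reflection gain of 0.5, the delay of 0.001 second, and the function names are all assumptions chosen for illustration. Time is measured from the arrival of the direct wave at the listening point.

```python
import math


def source_tone(t, freq_hz):
    """Unit-amplitude sine tone that starts at t = 0 (silent before that)."""
    if t < 0:
        return 0.0
    return math.sin(2 * math.pi * freq_hz * t)


def pressure_at_listener(t, freq_hz=1000.0, reflection_gain=0.5, delay_s=0.001):
    """Pressure at one point: direct wave plus a single delayed reflection.

    All parameter values are illustrative assumptions:
      reflection_gain -- fraction of the amplitude surviving the reflection
      delay_s         -- extra travel time of the reflected path, in seconds
    """
    direct = source_tone(t, freq_hz)
    # The reflected wave is a weaker copy of the direct wave, arriving later.
    reflected = reflection_gain * source_tone(t - delay_s, freq_hz)
    return direct + reflected


if __name__ == "__main__":
    # Before the reflection arrives, only the direct wave is present.
    print(pressure_at_listener(0.00025))
    # After it arrives, the two waves add at the listening point.
    print(pressure_at_listener(0.00125))
    # A different delay changes the relative phase and hence the sum.
    print(pressure_at_listener(0.00175, delay_s=0.0015))
```

For times earlier than the assumed delay the function returns the direct wave alone, which corresponds to the pressure changes before reflections occur.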



Speech is composed of fundamental sounds called vowels and
consonants. As a conversation proceeds there is a constant shifting
from one of these sounds to another, only one of them being sounded
at one time. Most of these sounds may be continued as a steady
tone and hence may be designated as continuants. The others require
that the sound stream be interrupted and are therefore called stops.
The first class includes the long and short vowels, the diphthongs, the
semi-vowels, and the fricative consonants, the sounds a, i, ou, l and s
being typical, respectively, of each of these groups. The pure stops
are p, t, ch, and k. In producing the corresponding voiced stops,
b, d, j and g, the voiced stream is not entirely interrupted, although
the tones from the vocal cords are very much subdued. A conversation,



* Presented as invited paper in Symposium on Acoustics, American Phys. Soc.,
Dec. 30-31, 1930, Cleveland, Ohio. Published in Rev. of Modern Physics, April, 1931.






