SPEECH 






the intelligibility threshold for speech is markedly 

 raised by low-frequency tones, as might be expected 

 from the phenomena of masking in general. The 

 effect is maximal at and above the frequency of the 

 masking tone. High-frequency tones, on the other 

 hand, have little masking effect. In the case of masking 

 by white noise, the rise in threshold is directly pro- 

 portional to noise intensity, at least above sound 

 pressure levels of 40 db. The speech-to-noise energy 

 ratio at the masked threshold is broadly constant over 

 a wide range of sound intensities (52). 
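
The proportionality described above can be made concrete with a small numerical sketch. It assumes, as the text states, that above about 40 db the speech-to-noise ratio at the masked threshold is roughly constant, so that the threshold expressed in decibels rises one-for-one with the noise level. The function name `masked_threshold_db` and the ratio value used in the example are illustrative assumptions, not figures taken from the cited study (52).

```python
def masked_threshold_db(noise_level_db, ratio_at_threshold_db, lower_limit_db=40.0):
    """Estimate the masked speech threshold (db) under white noise.

    Assumes a constant speech-to-noise ratio at threshold, which the
    text reports only for noise levels above about 40 db.
    ratio_at_threshold_db is a caller-supplied, hypothetical value.
    """
    if noise_level_db < lower_limit_db:
        raise ValueError("constant-ratio assumption not stated below 40 db")
    # Constant ratio in db means threshold = noise level + ratio.
    return noise_level_db + ratio_at_threshold_db


# Illustrative ratio of -5 db (an assumption, not a measured value).
low = masked_threshold_db(50.0, -5.0)
high = masked_threshold_db(70.0, -5.0)
print(high - low)  # a 20 db rise in noise gives a 20 db rise in threshold
```

Whatever ratio is supplied, the difference between two thresholds equals the difference between the two noise levels, which is the sense in which the ratio is 'broadly constant.'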



Masking is generally considered to be a purely 

 peripheral effect. There is however evidence to 

 suggest that it may have a central component (53,

 69, 79, 80). Thus Licklider (79) has shown that the

 phase relations of speech and noise at the two ears

 affect the intelligibility of speech at any given value 

 of the speech-to-noise energy ratio. More recently, 

 Hirsh (53) has stressed that the binaural masked 

 threshold for speech depends upon the interaural 

 phase relations of the speech and those of the noise;

 when these are identical, the threshold is high and 

 both speech and noise have the same location. But 

 when the interaural phase relation of the speech is 

 reversed relative to that of the noise, the threshold is 

 low and there is a difference in location. Some 

 relations of masking to recruitment and other factors 

 relevant to clinical audiometry have been examined 

 ('7, 54, 55).



Binaural Speech Perception 



The relations between binaural listening, auditory 

 localization and the perception of speech have been 

 intensively studied. If information is conveyed to the 

 subject simultaneously from two different sources,

 interpretation of the competing messages is markedly 

 affected by the degree of horizontal separation of the 

 sound sources (15, 16, 101). Further, it has been

 shown that the effects of feeding two different messages

 simultaneously, one to each ear, are very

 different from those that obtain if the same messages 

 are 'mixed' on a tape recording and the two ears

 stimulated identically (21, 24). In the case of a 

 'mixed' message, some degree of separation can be 

 effected but interpretation is markedly inadequate. 

 In the case of independent messages, on the other 

 hand, no difficulty is experienced in listening to one 

 or the other message at will or in repeating it con- 

 currently. Under these conditions, however, the 

 subject can report virtually no information conveyed 

 by the 'rejected' message, other than the tongue

 (English or foreign) and the sex of the speaker ('statistical

 recognition'). But if both messages are very

 brief some short-term 'storage' of information may be

 demonstrated. Thus in an experiment by Broadbent 

 (16) three digits (say 736) were fed to one ear and

 three different digits (say 245) simultaneously to the 

 other. It was found that the subject could as a rule 

 repeat all six digits correctly, although almost always 

 in the order 736245 or 245736. This phenomenon is 

 interpreted by Broadbent in terms of a short-term 

 storage mechanism adapted to deal with the re- 

 strictions imposed upon the organism by limited 

 channel capacity. Some further implications of this 

 view have been discussed (17-19).
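
The two report orders mentioned above can be set beside the order in which the digits actually arrive. The following is a minimal sketch using the example digits from the text; the function names `ear_by_ear_orders` and `pair_by_pair_order` are hypothetical labels introduced here, not terms used by Broadbent.

```python
def ear_by_ear_orders(first_ear, second_ear):
    """Return the two reports that group all digits from one ear together."""
    a = "".join(str(d) for d in first_ear)
    b = "".join(str(d) for d in second_ear)
    return [a + b, b + a]


def pair_by_pair_order(first_ear, second_ear):
    """Return the report that follows the order of arrival, pair by pair."""
    return "".join(f"{x}{y}" for x, y in zip(first_ear, second_ear))


one_ear = [7, 3, 6]    # digits fed to one ear
other_ear = [2, 4, 5]  # digits fed simultaneously to the other ear

print(ear_by_ear_orders(one_ear, other_ear))   # ['736245', '245736']
print(pair_by_pair_order(one_ear, other_ear))  # '723465'
```

The ear-by-ear reports are the ones the text describes as almost always given; the pair-by-pair string is shown only for contrast, as the sequence that would follow the simultaneous pairs in time.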



The integration of data from the two ears in per- 

 ception of a unitary 'acoustic field' has been studied 

 by Cherry & Taylor (24). In the first place, they have 

 attempted to measure the time required to 'switch 

 attention' from one ear to the other in listening to

 simultaneous messages. Their curves relating articulation

 score to switching frequency are found to show

 a sharp dip (indicating marked deterioration of

 recognition) at a switching period of between 0.2 and 

 ; sec, depending on the observer (fig. 1). This dip

 is held by the authors to mark the transition between 

 switching of attention from ear to ear and binaural 

 listening to what is taken to be unitary speech. At the 

 same time, other explanations of the effect cannot be 

 ruled out. In the second place, they have investigated

 the range of delay between two identical

 messages fed independently to the two ears consistent

 with perception of a single speech source.

 With delay periods varying between 1 and 50 msec, 

 certain subjective effects of binaural directiveness

 with apparent shift in location of the sound sources 

 were reported. At an interval of about 15 msec, 

 however, a striking phenomenon is encountered: the 

 hitherto unitary sound source appears suddenly to

 dissociate into two independent sources of sound 

 (fig. 2). This is of particular interest in so far as the 

 delay period is surprisingly long, far exceeding any

 period of delay which could occur under natural 

 conditions of audition. Further studies of these and 

 related effects may well throw valuable light on the

 organization of the 'acoustic field.' 
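
The delay arrangement described above, in which the same message reaches the two ears with a fixed time offset, might be set up for a sampled signal along the following lines. This is a sketch under stated assumptions: the function name `interaural_pair`, the sampling rate and the three-sample test signal are all hypothetical, and nothing here reproduces the apparatus of Cherry & Taylor (24).

```python
def interaural_pair(signal, delay_msec, sample_rate_hz):
    """Return (leading, lagging) channels carrying the same message.

    The lagging channel is the leading one shifted later by delay_msec;
    both are padded with silence so they have equal length.
    """
    delay_samples = round(delay_msec * sample_rate_hz / 1000)
    if delay_samples < 0:
        raise ValueError("delay must not be negative")
    silence = [0.0] * delay_samples
    leading = list(signal) + silence   # message starts at once
    lagging = silence + list(signal)   # message starts after the delay
    return leading, lagging


# 15 msec is the interval named in the text; 8000 Hz is an assumed rate.
message = [1.0, 2.0, 3.0]
leading, lagging = interaural_pair(message, 15, 8000)
print(len(leading), lagging.index(1.0))  # 123 120
```

Varying `delay_msec` over the 1 to 50 msec range mentioned in the text would give the series of conditions under which listeners reported first a shift in apparent location and then a split into two sources.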



It is most improbable that these various binaural 

 effects in simultaneous speech perception can be 

 explained in terms of masking or related forms of 

 peripheral interference. They would appear to 

 depend on a central selective mechanism closely 

 related to 'attention,' the neurophysiological basis 

 of which is unknown. It is true that some possible 



