


BELL SYSTEM TECHNICAL JOURNAL 



TABLE II 

 Occurrence of Parts of Speech 



* Derived from data on less than 500 conversations. 



treated separately in the analysis for speech sounds. An exception to
this is that each form of the auxiliary verbs "be," "can," "may," etc.,
was counted as a separate word.



It is of interest to find that of approximately 80,000 words so obtained,
only 2,240, or less than 3 per cent, are different words. If each of the
modifications of a word is counted as a different word the number of
different words is increased to 2,822; but even on this basis less than
4 per cent of the total words are different words. Even among the nouns
the number of different nouns is only a tenth of the total number of
nouns. The five minor parts of speech shown in the last four lines of
Table II form only 5 per cent of the different words and yet make up
57 per cent of the total words. The nouns, which constitute 46 per cent
of the different words, contribute only 15 per cent of the total words.
Such figures indicate clearly that conversation is based on a framework
built up of a relatively small number of different words, arranged in
many patterns, which supports the more variegated words which convey
most of the meaning.
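
The percentages quoted above follow directly from the word counts, and the arithmetic can be checked with a short sketch. The constant and function names below are illustrative assumptions and do not come from the original study. The sample sentence is likewise invented for demonstration.

```python
# Counts quoted in the text. The names are illustrative, not the authors'.
TOTAL_WORDS = 80_000            # "approximately 80,000 words"
DIFFERENT_WORDS = 2_240         # different words, modifications merged
DIFFERENT_WITH_FORMS = 2_822    # each modification counted separately


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def different_word_percent(words):
    """Percentage of a word list made up of different words."""
    return percent(len(set(words)), len(words))


root_pct = percent(DIFFERENT_WORDS, TOTAL_WORDS)
forms_pct = percent(DIFFERENT_WITH_FORMS, TOTAL_WORDS)

print(f"different words: {root_pct:.1f} per cent")          # 2.8, under 3
print(f"with modifications: {forms_pct:.1f} per cent")      # 3.5, under 4

# A made-up eight-word sample with six different words.
sample = "i will see if i can see you".split()
print(f"sample: {different_word_percent(sample):.0f} per cent")  # 75
```

A short sample gives a far higher proportion of different words than the 80,000-word total, since the same small framework of words recurs as more conversation is recorded.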



A more detailed idea of this framework is given by Tables III-a and
III-b, which contain a list of the words which were observed in at least
1 per cent of the conversations. In Table III-a the words are arranged
in order according to the total number of times they were recorded.
This is approximately, but not quite, the same as the order of the
number of conversations in which they occurred as may be seen by
examining the numbers following each word. In Table III-b the same words
are arranged alphabetically, for ease in reference. The list comprises
737 words out of the 2,240 different words recorded. The importance
of the list lies in the fact, as will be shown later, that these words almost
completely determine the relative frequency with which the elementary



