What is Information Theory?



information capacity and, in fact, the information capacity of a storage system is defined by the equation

C = log n,

where n is the number of distinguishable states. This makes the capacity of a compound storage system equal to the capacity of a basic storage unit multiplied by the number of units in the system. If the logarithm is taken to the base 2, then C is the equivalent number of binary storage units (bits); and if the logarithm is taken to base 10, then the information capacity is given in units called Hartleys. For example, the capacity of a knob with 32 click positions is equal to that of five two-position switches (five bits). A ten-position knob, on the other hand, has a capacity of one Hartley, and two ten-position knobs capable of being placed in 100 different states have an information capacity of two Hartleys. Since storage elements which are binary in nature (two positions) are much less susceptible to error and are easier to mechanize, it is more common to deal with binary units (bits) of information than with decimal units of information (Hartleys).
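The capacity formula and the knob examples above can be checked directly. The sketch below is illustrative only; the function name `capacity` is mine, not the lecture's, but the arithmetic follows the definition C = log n exactly.

```python
import math

def capacity(n, base=2):
    """Information capacity C = log n for n distinguishable states.
    base=2 gives bits; base=10 gives Hartleys."""
    return math.log(n, base)

# A knob with 32 click positions equals five two-position switches:
print(capacity(32))            # 5.0 bits

# A ten-position knob holds one Hartley; two such knobs,
# with 100 joint states, hold two Hartleys:
print(capacity(10, base=10))   # 1.0 Hartley
print(capacity(100, base=10))  # 2.0 Hartleys

# Capacities of combined units add, since log(n * m) = log n + log m:
print(capacity(10, base=10) + capacity(10, base=10))  # also 2.0
```

The additivity in the last line is why the compound system's capacity is the unit capacity multiplied by the number of units.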



So far we have discussed information storage and, correspondingly, information capacity. There is an important distinction, however, between information capacity and information content. The information content of a message may be defined as the minimum capacity required for storage. To illustrate this important point, consider a two-state message such as a reply to some question which admits only yes or no. If someone in this auditorium is asked, "Are you a doctor?", then a reply admits of two possible message states and it will certainly be possible to store the reply in one binary storage unit. Intuition tells us that the message contains one bit of information, for, by itself, it cannot be stored any more efficiently. However, our previous discussion has demonstrated that a bit of information should substantially reduce uncertainty. In view of the fact that most of the people in this auditorium are doctors, I could simply guess "yes" for each person questioned and be correct most of the time. Thus, one would expect the average information per question to be less than one bit, as indeed it is.
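The "less than one bit" intuition can be made quantitative with the binary entropy function. A minimal sketch follows; the 7-in-8 proportion of doctors used below is an assumed figure for illustration, not one given in the text.

```python
import math

def binary_entropy(p):
    """Average information, in bits, per yes/no answer when 'yes'
    occurs with probability p: H = -p*log2(p) - (1-p)*log2(1-p)."""
    if p in (0.0, 1.0):
        return 0.0  # a certain answer carries no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# An unbiased answer carries a full bit:
print(binary_entropy(0.5))    # 1.0

# If, say, 7 of every 8 listeners are doctors (assumed figure),
# each answer carries well under one bit on average:
print(binary_entropy(7 / 8))  # about 0.544 bits
```

The more lopsided the odds, the less each individual reply tells us, which is exactly why guessing "yes" works so well in a room full of doctors.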



Using a numerical example from Woodward, suppose that 128 people in this auditorium are questioned and the 128 binary


