A Primer on Information Theory 43



This means that information functions can be estimated successfully as soon as the more common occurrences are categorized. The remaining infrequent occurrences will not contribute very much, and that contribution can be easily bracketed between values based on numbers of categories which are certainly too small and too large.



5. Small Effects of Small Variations in Probability — The curve of the function F(p) = -p log p has a flat top. Small changes in probability in this region have small effects.
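The flatness can be seen numerically; a minimal sketch in Python (the function name F and the sample points are ours, chosen for illustration):

```python
import math

def F(p):
    # F(p) = -p log2(p), the per-category contribution to H, in bits
    return -p * math.log2(p)

# The maximum of F lies at p = 1/e; nearby values of F barely differ.
for p in (0.25, 1 / math.e, 0.45):
    print(round(F(p), 3))
```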



Consider the simplest case, of two categories. If their probabilities are equal, then H = 1. If the ratio of the probabilities is 1:2, then H = .92. If the ratio is 1:3, a very considerable deviation from equality, H is still .81.
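These values can be checked directly; a short Python sketch (the helper name H is ours):

```python
import math

def H(probs):
    # Shannon uncertainty in bits: H = -sum of p log2(p)
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(round(H([1/2, 1/2]), 2))  # equal probabilities
print(round(H([1/3, 2/3]), 2))  # ratio 1:2
print(round(H([1/4, 3/4]), 2))  # ratio 1:3
```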



For a larger number of categories, the insensitivity of H against probability distortion is still more pronounced. If one replaces equiprobable alternatives by probabilities staggered arithmetically or geometrically, stipulating only that the span between the extreme values be not more than one order of magnitude, then the resulting changes in H are quite small.
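A sketch of this comparison in Python (the choice of eight categories and the particular staggering schemes are ours, not from the text):

```python
import math

def H(probs):
    # Shannon uncertainty in bits
    return -sum(p * math.log2(p) for p in probs if p > 0)

n = 8  # equiprobable case: H = log2(8) = 3 bits

# Weights staggered so the extreme values differ by one order of magnitude.
arith = [1 + 9 * k / (n - 1) for k in range(n)]   # arithmetic, from 1 to 10
geom = [10 ** (k / (n - 1)) for k in range(n)]    # geometric, from 1 to 10

for weights in (arith, geom):
    total = sum(weights)
    print(round(H([w / total for w in weights]), 3))
```

Both staggered distributions stay within a few tenths of a bit of the equiprobable value of 3 bits.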



This implies that the assumption of equiprobability, which gives an upper bound as stated in rule 1, will not stray very far from the true value unless probabilities are radically unbalanced. The stretch bracketed between an upper bound based on equiprobability and a lower bound based on a distortion undoubtedly stronger than the real one will not be very large.
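To illustrate the bracketing, a small hypothetical example in Python (both distributions are invented for illustration):

```python
import math

def H(probs):
    # Shannon uncertainty in bits
    return -sum(p * math.log2(p) for p in probs if p > 0)

n = 4
upper = math.log2(n)                  # rule 1: equiprobability gives the upper bound

true = [0.40, 0.28, 0.20, 0.12]       # the (in practice unknown) true distribution
distorted = [0.70, 0.15, 0.10, 0.05]  # deliberately more skewed than the truth
lower = H(distorted)

# The true value is bracketed between the two bounds.
print(round(lower, 2), round(H(true), 2), round(upper, 2))
```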



6. Alternative Ways of Estimating Information Functions — In systems with several nodes, the compound information functions can always be estimated in several ways. For instance, in a two-node communication system, the quantity which is the function of greatest interest, the amount of information transmitted, T(x;y), can be computed in three alternative ways: as the difference between input uncertainty and equivocation, as the difference between output uncertainty and ambiguity, or as the difference between the sum of the uncertainties of input and output and the uncertainty of their union. It is usually worthwhile to inspect the data very carefully to establish which of this set of functions can be most easily and most accurately computed. In many cases, the quantities most readily computed are not those which result directly from the plan of observation or experimentation. For instance, in most experiments it would be natural to measure output uncertainty and ambiguity, but it is easier to measure input uncertainty and equivocation.
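The agreement of the three routes can be verified on any joint distribution; a Python sketch using an invented 2×2 joint distribution p(x, y):

```python
import math

def H(probs):
    # Shannon uncertainty in bits
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint distribution over a binary input x and binary output y.
joint = [[0.30, 0.10],
         [0.05, 0.55]]

px = [sum(row) for row in joint]                             # input marginal
py = [sum(joint[i][j] for i in range(2)) for j in range(2)]  # output marginal

Hx, Hy = H(px), H(py)
Hxy = H([p for row in joint for p in row])  # uncertainty of the union

equivocation = Hxy - Hy   # H(x|y), by the chain rule
ambiguity = Hxy - Hx      # H(y|x)

T1 = Hx - equivocation    # input uncertainty minus equivocation
T2 = Hy - ambiguity       # output uncertainty minus ambiguity
T3 = Hx + Hy - Hxy        # sum of uncertainties minus uncertainty of the union
print(round(T1, 3), round(T2, 3), round(T3, 3))  # all three agree
```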



7. Substitution of Related Quantities — In many cases where it is not practical to compute the proper information measures, one can compute information measures associated with related quantities. Take the case of estimating the amount of information which an individual can transmit after a single glance at a display. This quantity is very difficult to determine; but it is fairly easy to determine the amount of information which can be elicited from an individual by a short interrogation procedure after he has had a glance at the display. This function is not quite the one we want, but it is presumably closely related to it. Another example: in the case of mental arithmetic, we have no way of estimating the actual amount of information processed, but we can readily estimate the amount of information which must be processed if computations are done in the way in which the subject claims he computes. In cases of this kind one will use the measurable quantity instead of the desired one. Of



