The Cryptographic Approach to the Problem of Protein Synthesis 






Since RNA serves as an intermediary between DNA and proteins, we have here
two problems. First, how is RNA formed by DNA? Second, how are proteins
synthesized by RNA? The first problem may turn out not to be very difficult
because of the close similarity between the two molecules. For example,
RNA may be a non-regenerated half of DNA with small changes in the sugars
and in one of the bases. It may be that the presence of the additional oxygen
atom in RNA's sugar is responsible for the failure to form a double-stranded
configuration. However, we still do not know the answer to this question.



The second problem, concerning the synthesis of proteins by RNA molecules,
presents more of a challenge to the imagination. How can a sequence formed
by four different units (four bases) be translated in a unique way into a sequence
formed by twenty units (twenty amino acids)? Here is a possibility which
seems to us to be very likely. Suppose one plays a game of poker in which
only three cards are dealt, and pays attention only to the suits of the cards. How
many different hands can one have? Well, one can have a 'flush', i.e. three
cards of the same suit. There are four different flushes: three hearts, three
spades, etc. Then one can have a 'pair', i.e. two cards of the same suit and
one of a different suit. How many of those are there? One has four choices for the
suit of the pair, and three choices for the third card. Thus, there are altogether
twelve possibilities. The poorest hand will be a 'bust', i.e. three different suits.
There are four different busts: no hearts, no diamonds, etc. We have altogether
twenty different possibilities. This 'magic number' 20 is just the number of
amino acids participating in the primary process of protein synthesis. We
may imagine that each amino acid in the synthesized protein is determined
by a triplet of bases in the RNA template.
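The count of twenty can be checked by direct enumeration. What follows is only a minimal sketch in Python and not part of the original argument: the names BASES and classify are introduced here for illustration, the letters A, C, G, U stand in for the four suits, and it is assumed that the order of the cards within a hand is disregarded.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Assumption: the four RNA bases play the role of the four suits.
BASES = "ACGU"


def classify(hand):
    """Name a three-card hand by the number of distinct suits it holds."""
    distinct = len(set(hand))
    return {1: "flush", 2: "pair", 3: "bust"}[distinct]


# Unordered triplets: card order is ignored and repeated suits are allowed.
hands = list(combinations_with_replacement(BASES, 3))
counts = Counter(classify(hand) for hand in hands)

# Number of flushes, pairs, busts, and the total number of hands.
print(counts["flush"], counts["pair"], counts["bust"], len(hands))
```

Under these assumptions the enumeration gives 4 flushes, 12 pairs and 4 busts, in agreement with the total of twenty obtained above.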



Since the distances between neighboring amino acids in the extended
polypeptide chain are equal to the distances between neighboring bases in the
polynucleotide chain (both being equal to 3.7 Å), it was at first natural to suppose
that the correlation between the two chains is of the kind shown in Fig. 2,



[Fig. 2. RNA template.]



where individual bases are shown by circles and the amino acids by triangles.
This represents the so-called overlapping code, in which neighboring amino
acids have two bases of the RNA template in common. If the transfer of
information from nucleic acid to protein is carried out according to such an
overlapping code, there must exist a definite inter-symbol correlation between
the amino acids constituting protein molecules. Thus, for example, if a certain
amino acid is determined by two adenines and some other base, its neighbors
will preferentially be amino acids whose template triplets also contain adenine.
In order to see whether or not such a correlation between the



