
 amino acid sequence is shown in Fig. 7-5. The sequence has 

 an untranslated 3' end of 277 bases following a TAA stop 

 codon and has three to four probable polyadenylation sites 

 prior to the beginning of a nine base poly A-tail. 

 N-terminal amino acid sequence



The first 7 N-terminal amino acid residues of purified 

 SP-I were found to be GEAMTRN. These amino acids

 corresponded to the deduced amino acid sequence starting at 

 position number 24 (underlined in Fig. 7-5).
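
The position check described above can be sketched in a few lines of Python. This is only an illustration: the precursor string below is a hypothetical placeholder (23 arbitrary residues followed by the reported peptide), not the deduced sequence of Fig. 7-5, and the function name `find_peptide` is an assumption introduced here.

```python
def find_peptide(protein: str, peptide: str) -> int:
    """Return the 1-based position of the first occurrence of `peptide`
    in `protein`, or 0 if the peptide is absent."""
    # str.find returns -1 when there is no match, so adding 1 yields 0.
    return protein.upper().find(peptide.upper()) + 1


# Hypothetical placeholder, NOT the real deduced sequence:
# 23 arbitrary leader residues, then the reported N-terminal peptide.
toy_precursor = "M" + "A" * 22 + "GEAMTRN" + "KLV"

start = find_peptide(toy_precursor, "GEAMTRN")
print(f"Peptide begins at residue {start}")
```

Applied to the actual deduced sequence, a result of 24 would mean that the first 23 residues precede the N-terminus of the purified protein.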

 Analysis of cDNA coding for the S-protein



GCG sequence analysis. The GCG sequence analysis 

 program (26) was used for analysis of the presumed mature

 S-protein cDNA sequence starting at amino acid residue number

 24. The overall isoelectric point was 5.96 for the 

 bifunctional protein (isoelectric points of 5.04 for the 

 AroD functional domain and 6.99 for the AroE functional 

 domain). An M_r of 60,388 (556 amino acids) was determined

 for the proposed mature protein (M_r of 28,515, 260 amino

 acids, for the AroD functional domain and M_r

 of 34,651, 320 amino acids, for the AroE functional domain).

 There were no remarkable differences in amino acid 

 composition when the AroD and AroE domains were compared. 

 The bias of codon usage was similar to that of other higher 

 plant genes, with A and T being favored in the third 

 position of most alternative codons. Rare codons were TGC 

 (one), CGC (zero), CCG (zero), and ACG (two).
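
A codon tally of the kind summarized above can be approximated with the short Python sketch below. It is not the GCG program's method. The coding sequence `toy_cds` is a made-up example, and the helper names `codon_counts` and `third_position_at_fraction` are assumptions introduced for illustration.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons, reading from the first base of `cds`."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def third_position_at_fraction(counts: Counter) -> float:
    """Fraction of codons whose third base is A or T."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    at = sum(n for codon, n in counts.items() if codon[2] in "AT")
    return at / total


# Hypothetical coding sequence, NOT the S-protein cDNA.
toy_cds = "ATGGGTGAAGCTATGACTCGTAATTGCACGTAA"

counts = codon_counts(toy_cds)
for codon in ("TGC", "CGC", "CCG", "ACG"):
    print(codon, counts[codon])  # a Counter returns 0 for absent codons
print(round(third_position_at_fraction(counts), 3))
```

Run on the real cDNA, the per-codon counts would correspond to the rare-codon figures quoted above, and a third-position A/T fraction above 0.5 would reflect the reported bias.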






