Along with many other features in the cosmids, this 
effort turned up new potential operon(s) covering 
13 consecutive long ORFs without homologues in 
any databases. This is by far the longest operon 
found in E. coli or Salmonella. 
To increase the number of oligonucleotides syn- 
thesized in parallel and decrease the cost of synthe- 
sis, solid-phase DNA synthesis has been multi- 
plexed. Synthesis in this new system occurs on the 
surface of pins that dip into appropriate common 
troughs containing reagent, including modified 
monomers. A solenoid array controls these pins set 
on a 96-well 9-mm spacing, and a stepping motor 
positions the troughs. The synthetic scale has been 
miniaturized ~1,000-fold to the 100-pmol 
range. The products pass quality control tests, in- 
cluding priming dideoxy-sequencing reactions, 
polymerase chain reactions, kinase labeling, and gel 
electrophoresis. 
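
The pin-and-trough arrangement described above can be pictured as a per-cycle schedule: in each cycle, every pin that still needs a base dips into the trough holding that monomer. The sketch below is only an illustration under stated assumptions, not the laboratory's control software. It assumes one base is added per cycle and that chains grow from the 3' end, as in conventional solid-phase chemistry; the function name `dip_schedule` and its data layout are hypothetical.

```python
from collections import defaultdict


def dip_schedule(oligos):
    """Group pin indices by monomer trough for each synthesis cycle.

    oligos: list of sequences written 5'->3', one per pin.
    Assumption (not from the report): one base is coupled per cycle and
    chains grow from the 3' end, so each sequence is read from its last base.
    Returns a list with one dict per cycle: {base: [pin indices to dip]}.
    """
    n_cycles = max((len(seq) for seq in oligos), default=0)
    schedule = []
    for cycle in range(n_cycles):
        troughs = defaultdict(list)
        for pin, seq in enumerate(oligos):
            if cycle < len(seq):  # pins with finished oligos stay raised
                base = seq[len(seq) - 1 - cycle]
                troughs[base].append(pin)
        schedule.append(dict(troughs))
    return schedule


if __name__ == "__main__":
    for cycle, troughs in enumerate(dip_schedule(["ACG", "TG", "G"]), start=1):
        print(cycle, troughs)
```

Grouping pins by trough in this way is what lets a single set of common reagent troughs serve many different sequences in parallel.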
Computational Methods 
The large number of sequence films produced by 
multiplexing are digitized with film scanners, allowing 
linkage of automatic base assignments and 
high-resolution images in a database. The computer 
program REPLICA uses internal standards from mul- 
tiplexing to establish lane alignment, lane-specific 
reaction rules, and deconvolution models respon- 
sive to variations in the interband distance at differ- 
ent points on the sequencing runs. Images with 
overlapping data can be viewed side by side to facili- 
tate decision making and automatic multisequence 
alignments. Image, contig, and sequence assign- 
ments from a multigigabyte database now correlate 
and display in <4 s. A sequence assembly system for 
ordered and shotgun data called GTAC has been improved 
and benchmark tested, using data from 12 
cosmid projects with >300 kb of raw data each and 
from various simulations. The run times are nearly 
linear with increasing project size (N). A sample 
data set of 2 million bases of raw DNA sequence 
with 2% error assembles in 9 h on a desktop com- 
puter (DEC VS 3100). 
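
As a rough worked figure, the quoted benchmark (2 million raw bases in 9 h) corresponds to an average of about 220,000 bases per hour, or roughly 62 bases per second. The short calculation below only restates that arithmetic; the function name `assembly_throughput` is illustrative, and the result is an average over the whole run rather than a property of any particular assembly step.

```python
def assembly_throughput(raw_bases, hours):
    """Return average throughput as (bases per hour, bases per second)."""
    per_hour = raw_bases / hours
    per_second = per_hour / 3600  # 3600 seconds in an hour
    return per_hour, per_second


if __name__ == "__main__":
    per_hour, per_second = assembly_throughput(2_000_000, 9)
    print(f"{per_hour:,.0f} bases/h, {per_second:.1f} bases/s")
```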
Linking DNA and Protein Databases 
To test predictions of protein structure and abun- 
dance based on genomic sequence and consensus 
motifs, proteins are sequenced directly from two- 
dimensional gel spots. This also allows the two- 
dimensional databases on gene expression levels for 
various cell types and cell environments to be corre- 
lated to candidate DNA, RNA, and protein regulatory 
motifs. About 400 E. coli amino-terminal sequences 
have been obtained and compared with all DNA da- 
tabase sequences translated in all six frames. About 
20% are unknown or match uncharacterized ORFs. 
Results on total cell extracts and various subcellular 
fractions indicate that proteins present anywhere 
from 100,000 copies to <1 copy per cell can be 
analyzed in this system. 
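
The comparison of amino-terminal sequences against DNA translated in all six frames can be sketched as follows. This is a minimal illustration assuming the standard genetic code and exact substring matching; it is not the search program used in this work, and the names `six_frame_translations` and `frames_matching` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return {'+1': ..., '+2': ..., '+3': ..., '-1': ..., '-2': ..., '-3': ...}."""
    dna = dna.upper()
    frames = {}
    for sign, strand in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand[offset:])
    return frames


def frames_matching(dna, peptide):
    """List the frames whose translation contains the peptide exactly."""
    return [f for f, prot in six_frame_translations(dna).items() if peptide in prot]


if __name__ == "__main__":
    print(frames_matching("ATGGCCAAATAA", "MAK"))
```

Substring matching is used here because an experimentally determined amino terminus need not coincide with the first codon of an annotated ORF; a practical search would also need to tolerate sequencing errors, which this sketch does not.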
Measurements of Genomes Using Single Ion Channels 
Since patch-clamp conductance can measure sub- 
tle changes in single protein-ion channels in biologi- 
cal lipid bilayers, the possibility of studying DNA 
passing through or past such channels has been ex- 
plored. This may not only provide a view of interest- 
ing genome movements — such as viral DNA injec- 
tion, conjugative single-stranded DNA transfer, and 
polymerase function — but may ultimately lead to 
ways to read long DNA sequences rapidly. The initial 
focus will be on λ DNA injection into LamB protein 
pores. 
Pathological Genome Sequences 
Dr. Church's laboratory has begun surveying hu- 
man DNA polymorphisms in G protein-coupled re- 
ceptor genes associated with behavioral abnor- 
malities and differential drug responsiveness. In 
conjunction with the genome group at Collabora- 
tive Research Inc., the multiplex system is being 
applied to the analysis of genomes of Mycobacteria 
species involved in tuberculosis and leprosy. The 
differences between the E. coli standard strain (K12 
EMG2) and related pathogenic enterics are being 
explored by subtractive methods, in collaboration 
with laboratories at Massachusetts General Hospital. 
Dr. Church is also Assistant Professor of Genet- 
ics at Harvard Medical School. 
Article 
Sikorav, J.-L., and Church, G.M. 1991. Complementary 
recognition in condensed DNA: accelerated DNA 
renaturation. J Mol Biol 222:1085-1108. 
