Computer-Assisted Multiplex DNA Sequencing 



G. M. Church, G. Gryan, S. Kieffer-Higgins, L. Mintz, M. J. Rubenfield, 



and M. Temple 



Department of Genetics, Howard Hughes Medical Institute, Harvard Medical School, 



Boston, MA 02138-3800 



(617)732-7562 



Several laboratories are sequencing genomes (ranging from 1 to 15 Mbp) from each 

 phylogenetic kingdom. The genome closest to completion is E. coli (20% of 4.7 Mbp). 

 These sequences will define consensuses for classes of protein domains, evolutionary 

 conservation, and change. While participating in this quest, we have developed a new 

 multiplex DNA sequencing method [Church et al., Science 240, 185-188 (1988)]. In 

 multiplex DNA sequencing, 480 sequencing reaction sets, each tagged with specific 

 oligonucleotides, are run on a single gel in 12 pools of 40 and transferred to a 

 membrane. We hybridize 75 such membranes simultaneously. The resulting sequence 

 film images are digitized, and sequence interpretations are superimposed on the 

 enhanced two-dimensional images for editing. The computer program (REPLICA) uses 

 internal standards from multiplexing to establish lane alignment and lane-specific 

 reaction rules by discriminant analysis. The automatic reading phase takes one hour per 

 film (3 kb) on a VAXstation. Images with overlapping data can be viewed side by side to 

 facilitate decision making. Hash-table-based routines for linking up shotgun sequences 

 in the megabase range are compatible in speed with the rest of the software. 
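
The hash-table approach to linking shotgun sequences can be illustrated with a minimal sketch. This is not the REPLICA code; it assumes exact k-mer matching on one strand only, and the function names, the word length k, and the min_shared threshold are hypothetical choices made for illustration. Each k-mer is stored in a hash table with the reads and offsets where it occurs; read pairs that share several k-mers at one consistent relative shift are reported as candidate links.

```python
from collections import defaultdict


def build_kmer_index(reads, k):
    """Map each k-mer to the (read_id, offset) positions where it occurs."""
    index = defaultdict(list)
    for read_id, seq in enumerate(reads):
        for offset in range(len(seq) - k + 1):
            index[seq[offset:offset + k]].append((read_id, offset))
    return index


def candidate_overlaps(reads, k):
    """Count shared k-mers per read pair (i < j), grouped by relative shift.

    The shift is offset_in_i - offset_in_j, so a positive shift means
    read j starts that many bases into read i.
    """
    index = build_kmer_index(reads, k)
    hits = defaultdict(lambda: defaultdict(int))
    for positions in index.values():
        # Positions were appended in read order, so i <= j always holds here.
        for a in range(len(positions)):
            for b in range(a + 1, len(positions)):
                (i, off_i), (j, off_j) = positions[a], positions[b]
                if i == j:
                    continue  # ignore repeats within a single read
                hits[(i, j)][off_i - off_j] += 1
    return hits


def best_links(reads, k=8, min_shared=3):
    """Return sorted (i, j, shift, shared_kmers) for well-supported pairs."""
    links = []
    for (i, j), shifts in candidate_overlaps(reads, k).items():
        shift, count = max(shifts.items(), key=lambda kv: kv[1])
        if count >= min_shared:
            links.append((i, j, shift, count))
    return sorted(links)


if __name__ == "__main__":
    # Toy reads: the last 10 bases of read 0 equal the first 10 of read 1.
    reads = ["ACGTACGGTCAGTTCA", "GGTCAGTTCAGGCATT", "TTTTTTTTTTTTTTTT"]
    for link in best_links(reads, k=4, min_shared=3):
        print(link)
```

Index construction is linear in the total number of bases, which is what makes a hash table attractive at the megabase scale; a practical linker would also need to handle reverse complements, sequencing errors, and highly repeated k-mers, all of which this sketch leaves out.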






