Published online before print
June 12, 2003, 10.1101/gr.529803 Genome Res. 13:1765-1774, 2003 ©2003 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/03 $5.00
Methods
Identification of Promoter Regions in the Human Genome by Using a Retroviral Plasmid Library-Based Functional Reporter Gene Assay
1 Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA
2 Stanford Medical Informatics, Stanford University School of Medicine, Stanford, California 94305, USA
3 Stanford Human Genome Center, Stanford University School of Medicine, Stanford, California 94305, USA
4 Department of Computer Science, Stanford University, Stanford, California 94305, USA
Attempts to identify regulatory sequences in the human genome have involved experimental and computational methods such as cross-species sequence comparisons and the detection of transcription factor binding-site motifs in coexpressed genes. Although these strategies provide information on which genomic regions are likely to be involved in gene regulation, they do not give information on their functions. We have developed a functional selection for promoter regions in the human genome that uses a retroviral plasmid library-based system. This approach enriches for and detects promoter function of isolated DNA fragments in an in vitro cell culture assay. By using this method, we have discovered likely promoters of known and predicted genes, as well as many other putative promoter regions based on the presence of features such as CpG islands. Comparison of sequences of 858 plasmid clones selected by this assay with the human genome draft sequence indicates that a significantly higher percentage of sequences align to the 500-bp segment upstream of the transcription start sites of known genes than would be expected from random genomic sequences. We also observed enrichment for putative promoter regions of genes predicted in at least two annotation databases and for clones overlapping with CpG islands. Functional validation of randomly selected clones enriched by this method showed that a large fraction of these putative promoters can drive the expression of a reporter gene in transient transfection experiments. This method promises to be a useful genome-wide function-based approach that can complement existing methods to look for promoters.
With the sequencing of the human genome nearly complete, intense efforts
are being made to annotate the genome at sites such as the National Center for
Biotechnology Information (NCBI), Ensembl, and University of California Santa
Cruz (UCSC). Most of the annotation is of protein-coding regions of genes,
which comprise less than 4% of the human genome. The large-scale discovery and
study of regulatory sequences in the human genome remain a considerable
challenge. Regulatory sequences make up a small fraction of the genome that is
noncoding. Cis-regulatory elements such as promoters, enhancers,
insulators, silencers, matrix attachment regions, and locus control regions
play crucial roles in regulating the levels, sites, and timing of gene
expression (Pennacchio and Rubin
2001
Finding and dissecting critical promoter regions has been done for one or a
few genes by performing deletion analyses on suspected promoters in plasmid
constructs and transfecting them into cultured cells
(Myers et al. 1986
Sequence analysis of coregulated genes is another way of mining for
putative promoters and other regulatory elements, as coexpression of genes
often occurs by the use of common cis-acting sequences
(Roth et al. 1998
This paper describes a retroviral plasmid library-based approach to identify promoter regions in the human genome. This method selects for and detects the promoter function of isolated DNA fragments from a complex mixture in a cell culture assay. We analyzed the sequences of plasmid clones selected in this assay by using the draft sequence of the human genome. We identified promoters of known genes and predicted genes, as well as many other likely promoter regions based on the presence of features such as CpG islands. The combination of this retroviral plasmid library-based assay with computational analysis of sequences of putative promoter-containing clones promises to be a useful function-based approach that can complement existing methods to look for promoter regions and other regulatory elements.
We developed and tested a cell-based functional assay to identify promoter regions in the human genome. This assay uses a library of genomic DNA fragments cloned into a retroviral vector, which is advantageous because a single retroviral integration event occurs per target cell in an efficient manner. We constructed a genomic DNA library in a Moloney murine leukemia virus (MMLV) plasmid pSKF-6 (Fig. 1A) by cloning human genomic DNA fragments of 300-600 bp upstream of a promoterless green fluorescent protein (GFP) reporter gene. pSKF-6 is a self-inactivating (SIN) vector with a defective 3' long terminal repeat (LTR) that replaces the functional 5' LTR upon reverse transcription of the virus, resulting in the transcriptional inactivation of the provirus in the infected cells. This eliminates the possibility of transcriptional activation of GFP by viral LTRs.
We transfected this human genomic library into the Phoenix-Eco packaging cell line to convert DNA to the RNA virus (Fig. 1B). HeLa-Eco target cell lines were infected with this viral library and screened by using fluorescence-activated cell sorting (FACS) for clones that can drive the expression of GFP. We isolated genomic DNA from sorted cells that were GFP-positive and recovered inserts containing putative promoters by PCR-amplification with primers that recognize the vector. We subcloned the pool of amplified PCR products, sequenced the insert from each clone, and performed detailed sequence analysis for 858 putative promoter-containing clones. We retested 130 clones individually for promoter activity in transient transfection reporter assays.
Selection for Putative Promoter Region Clones by FACS
Testing our system with the vector pSKF-6 with no promoter inserts
consistently gave ≤0.2% GFP+ cells (Fig.
2A). This indicates an occurrence of minimal positive position
effects due to integration of the viral DNA into regions of the genome
containing regulatory elements. This is in agreement with other studies that
used gene- or promoter-trap retrovirus-based vectors, where the frequency of
integrations that led to the expression of a reporter gene from an endogenous
active promoter in the host cell was between 0.04% and 0.5%
(Von Melchner et al. 1990
Sequence Analysis of 858 GFP+ Clones
Discovery of Putative Promoter Regions of Known Genes
Of 858 sequences that we defined from clones selected by this method, 6% of GFP+ low clones and 15% of GFP+ high sequences aligned to the 2-kb segment upstream of the transcription start site of a known gene (category A in Table 1). Interestingly, 70% (61/87) of the sequences in category A overlap with CpG islands. Eighty-three percent (72/87) of the GFP+ sequences in category A align with the region just 500 bp upstream of the transcription start site of a known gene. Although some sequences, such as the putative promoter region of the ephrin-A4 gene, fall completely upstream of the transcription start site, others, such as the putative promoter region of the ribosomal protein L38 gene, contain the promoter and part of the 5' UTR of the gene (Table 2). This is not unexpected, as the 300-600-bp genomic DNA fragments cloned into the library are large enough to contain a promoter and extend into the 5' UTR of a gene. The newly identified promoter sequences belong to a variety of genes coding for enzymes (e.g., phosphoglucomutase1), calcium binding proteins (e.g., peflin), membrane binding proteins (e.g., ubiquilin2), ribosomal proteins (e.g., S24), and histone family members (e.g., H2BFL; Table 2, Suppl. Table 3).
Based on the total number of genes in the Refgene table from UCSC (14,402 genes), and the size of the August 2001 assembly of the human genome (2.88 Gb), we estimate that 0.3% of the human genome sequence falls within 500 bp upstream of the transcription start site of a Refgene. In our experiments, we identified 4% (15/418) GFP+ low clones in this category. This fraction (4%) is significantly higher than the rate of 0.3% expected if there was no selection for promoters (P < 2.24 × 10⁻³², χ² test). This result suggests that our method significantly enriched for DNA fragments that aligned to a 500-bp region just upstream of a Refgene transcription start site.
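The background estimate and the enrichment test described above can be sketched as follows. This is a minimal illustration, not necessarily the computation used in the study: it assumes one non-overlapping 500-bp window per Refgene and a two-cell χ² goodness-of-fit test with 1 degree of freedom, and the function names are hypothetical.

```python
import math

def expected_promoter_fraction(n_genes, window_bp, genome_bp):
    """Fraction of the genome lying in one upstream window per gene.

    Assumes the windows do not overlap (an illustrative simplification).
    """
    return n_genes * window_bp / genome_bp

def chi2_two_cell(hits, total, p_expected):
    """Chi-square goodness-of-fit test for hits vs. non-hits (1 d.f.).

    Returns the statistic and its upper-tail P value. For 1 degree of
    freedom the chi-square survival function equals erfc(sqrt(x / 2)).
    """
    misses = total - hits
    exp_hits = total * p_expected
    exp_misses = total * (1.0 - p_expected)
    stat = ((hits - exp_hits) ** 2 / exp_hits
            + (misses - exp_misses) ** 2 / exp_misses)
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Values quoted in the text: 14,402 Refgenes, 500-bp windows, 2.88-Gb assembly.
fraction = expected_promoter_fraction(14_402, 500, 2.88e9)
# GFP+ low clones: 15 of 418 aligned to the 500-bp upstream window;
# 0.3% is the rounded background rate stated in the text.
stat, p_value = chi2_two_cell(15, 418, 0.003)
print(f"background fraction = {fraction:.4f}")
print(f"chi-square = {stat:.1f}, P = {p_value:.2e}")
```

The exact P value depends on details such as the unrounded background rate and any continuity correction, so the printed value need not reproduce the figure reported in the text.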
Similarly, we identified 13% (57/440) GFP+ high clones that aligned to a 500-bp region upstream of the transcription start site of a Refgene. This fraction (13%) is not only significantly higher than 0.3%, but is also significantly higher than the observed ratio for the GFP+ low clones (P < 1 × 10⁻⁶, Fisher's Exact Test), indicating that the enrichment is better when cells are selected in the higher fluorescence range. This is not surprising considering that GFP+ high cells are more distinctly separated from the background fluorescence than GFP+ low cells. In one of our experiments, a single round of sorting of the top 10% of GFP+ high cells indicated that 8% of GFP+ clones contain putative promoter sequences that align within 500 bp upstream of the transcription start site of a known gene. Sorting the GFP+ high cells a second time after expanding them in culture resulted in 21% of GFP+ clones aligning to 500 bp upstream of the transcription start site of a known gene, with some of the putative promoter sequences being present more than once (data in Suppl. Table 4). These results indicate that a second round of selection provides better enrichment for promoters and minimizes false positives selected by FACS.
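The comparison between GFP+ high and GFP+ low clones can be illustrated with a small pure-Python Fisher's exact test on the 2 × 2 table of aligned versus non-aligned clones. This is a sketch under stated assumptions: the two-sided P value is computed by summing the probabilities of all tables no more likely than the observed one, which is one common convention and may differ from the variant used in the study. The function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table whose
    probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that x of the col1 "hits" fall in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))

# Counts from the text: 57 of 440 GFP+ high and 15 of 418 GFP+ low clones
# aligned to the 500-bp window upstream of a Refgene start site.
p_value = fisher_exact_two_sided(57, 440 - 57, 15, 418 - 15)
print(f"Fisher's exact P = {p_value:.2e}")
```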
Identification of Likely Promoters Based on Gene Predictions and CpG Islands
In humans, CpG islands are found at the 5' ends of at least half of
the genes and often contain the promoter and one or more exons
(Cross and Bird 1995
Detection of Other Sequences That Are Likely Promoters
The first exon and first intron are likely locations for an alternate
promoter for driving the expression of an alternative transcript of a gene.
Indeed, alternative promoters in exon 1 or intron 1 have been described for
several genes, such as NADH-cytb5 reductase and MDM2, a gene amplified in
cancer (for review, see Ayoubi and Van de
Ven 1996
Our analysis indicates that an additional 9% (77/858) of the sequences
align within the 2-kb upstream region of a gene predicted in only one
annotation database. An additional 8% (71/858) of sequences overlap with
retroviral-like transposable elements that are identified by the RepeatMasker
program (A. Smit and P. Green, unpubl.). About 8% of the human genome is
composed of retroviral-like LTRs that contain internal transcriptional
regulatory sequences for propagation by retrotransposition
(International Human Genome Sequencing
Consortium 2001
Additional Evidence for the Identification of Putative Promoters
We also searched the 858 sequences for a potential TATA-box (TA[T/A][T/A][T/A][T/A]). Although the TATA-box is generally around -30 bp from the transcription start site, we scanned the entire sequence, because the precise transcription start sites are unknown. Notably, 50% (427/858) of sequences have this TATA-box (Suppl. Table 2). This is significantly higher than what we would expect based on the nucleotide distribution of our sequences (P < 0.05, χ² test).
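The TATA-box scan can be expressed as a regular-expression search for the pattern TA[T/A][T/A][T/A][T/A]. The sketch below is illustrative only: it scans the given strand of each sequence (whether the reverse strand was also scanned is not stated here), the function names are hypothetical, and the two example sequences are made up.

```python
import re

# TA followed by four positions that are each T or A; the lookahead
# lets overlapping occurrences be reported as separate positions.
TATA_PATTERN = re.compile(r"(?=TA[TA]{4})")

def tata_positions(seq):
    """Return 0-based start positions of every TATA-box-like hexamer."""
    return [m.start() for m in TATA_PATTERN.finditer(seq.upper())]

def fraction_with_tata(seqs):
    """Fraction of sequences containing at least one match."""
    if not seqs:
        return 0.0
    return sum(1 for s in seqs if tata_positions(s)) / len(seqs)

# Hypothetical toy sequences, not clones from the study.
clones = ["GGCTATAAAAGGC", "GGCGCGCC"]
print(tata_positions(clones[0]))
print(fraction_with_tata(clones))
```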
Validation of 130 Clones in Transient Transfection Reporter Assays
For putative promoter regions within 2 kb upstream of a Refgene
transcription start site (category A), 86% (25/29) were able to drive the
expression of the reporter gene in this validation assay
(Fig. 3, Suppl. Table 5).
Fifty-six percent (10/18) of the clones that align to within 2 kb upstream of
predicted genes in two or more annotation tables (category B) were found to be
positive for luciferase activity. Seventy-three percent (19/26) of the clones
that are classified as putative promoters based on overlap with a CpG island
(category C) were luciferase or
We divided positive clones into three groups based on the fractional increase in activity of the reporter gene (Fig. 3; Suppl. Table 5). We observed likely promoters of varying strengths from all sequence analysis categories. It should be noted that more than half of those sequences with retrotransposable repeat elements that we tested showed promoter activity in this assay. We found that 32% (41/130) of the clones that we retested in the validation assay were not able to drive the expression of the reporter gene. This could be because they are not true promoters, and are part of the experimental noise that might be expected in a multistep enrichment process such as our method. Alternatively, it is possible that they are indeed promoters but are inactive in transient transfection assays in 293 cells, as they were originally discovered to have promoter activity in HeLa-Eco cells from our retroviral screen. The observation that 68% (89/130) of the clones can be functionally validated on a clone-by-clone basis in transient transfection reporter gene assays provides verification that our retroviral screen indeed selects for genomic DNA fragments that have promoter activity.
We present here a functional selection for potential promoter sequences in the human genome that uses a retroviral plasmid library in which the inserts are genomic restriction fragments that are subcloned upstream of a promoterless reporter gene. By using flow cytometry, this method selects for cells that have green fluorescence as a result of integration of a provirus containing a putative promoter that drives the expression of the GFP gene. The method is versatile, because a stock of a retroviral library can be used to isolate novel sequences with potential promoter activity in different cell lines as well as primary cells, and at distinct stages of development. In addition, the approach is likely to be generally useful for isolating other transcriptional regulatory elements, such as enhancers.
Strategies that employ retroviral promoter traps have been used in
mammalian cell lines and transgenic mice to isolate a few promoters, to
identify developmental genes, and to study differentially regulated genes
(Von Melchner et al. 1990
Although retroviruses can integrate their DNA into a large number of
genomic sites, it is unclear whether they have preferred integration targets
in the host genome. Some studies show that transcriptionally active DNA is not
a preferred target for retroviral integration, whereas other studies reveal
that retroviruses such as HIV preferentially integrate in active genes and
regional hotspots (Weidhaas et al.
2000
We identified promoters of known genes, predicted genes, and other sequences that are very likely to be promoters. We also obtained sequences that may not have promoter activity in the genome, but either behave as promoters in the context of our system or are background artifacts that might be generated during FACS sorting of large numbers of cells. It is likely that our method is biased towards selecting for stronger promoters, and that the presence of a suitable enhancer in our retroviral vector would improve the chances of selecting for weaker promoters.
On analyzing sequences from GFP+ high clones, we see approximately 50-fold
enrichment for sequences that align to the 500-bp region upstream of the
transcription start site of known genes, compared with what we would expect
from random genomic sequences. This observation by itself provides convincing
evidence that our approach enriches for promoters. For most known genes
(Refgenes), links from the human genome browser at UCSC
(Kent et al. 2002
A large number of analyzed sequences contain repeats such as LINES, SINES,
and LTRs that comprise more than 40% of the human genome
(Smit 1999
Because the method significantly enriches for promoter regions, it could be used on a more extensive scale to create databases of functional promoters in various species. It could also be useful for characterizing the DNA structure of promoters, by providing novel sequences in which to search for common motifs. Analysis of sequence data from our promoter-trapping scheme aids in the identification or confirmation of promoters, short first exons, and new genes in the genome. It also provides clues on alternative promoters that might mediate differential or tissue-specific gene expression. We are currently modifying the retroviral vector system to test whether this approach can be extended to identify other cis-regulatory sequences, such as enhancers and insulators, on a genome-wide scale. The generation of databases of regulatory sequences will be a useful resource for understanding the complexities of gene regulation.
Construction of Suitable Retroviral Vectors
We constructed retroviral vector pSKF-6 (Fig. 1A) derived from vector pSIN (Moloney murine leukemia virus-based) obtained from Dr. Gary Nolan, Stanford University. The U3 region in the 5' LTR, which contains transcriptional regulatory signals and the TATA box, is replaced by the CMV promoter to produce a higher titer of virus. The 3' LTR has a deletion of enhancer sequences and part of the TATA box in the U3 region. We inserted the gene encoding humanized green fluorescent protein (huGFP) in a 3' to 5' orientation relative to the viral LTRs. This ensures that the expression of huGFP is driven only by DNA that is subcloned into the multiple cloning site (MCS) and not by possible residual promoter activity from the defective viral LTR. The huGFP is a 722-bp enhanced codon-substituted humanized form of GFP that is 99% identical to EGFP (Clontech). Bovine growth hormone unidirectional polyadenylation signals downstream of GFP direct proper 3' end processing of the mRNA.
Construction of Control Vectors and a Genomic DNA Library
Production of High-Titer Retrovirus and Infection of Target Cell Lines
Selection and Sequencing of Putative Promoter Regions
Putative promoter-containing clones were recovered by genomic PCR in which we used Sin9F and Sin9R primers designed to the region flanking the BstXI sites in vector pSKF-6. Sin9F: 5'-acgcaagcttCAGTCTAGAGTCGGGCAGAT-3' and Sin9R: 5'-catactcgagCCTATAGGTGGGGTCTTTCA-3'. We subcloned the PCR products into the pBlue-TOPO vector (Invitrogen) and sequenced the inserts by using the T7 primer (5'-TAATACGACTCACTATAGGG-3'). Sequencing was done at the Stanford Human Genome Center on ABI 377 sequencers with Big Dye Terminator chemistry (ABI).
Sequence Analyses
For analyzing the fraction of sequences that are conserved between human and mouse, we downloaded the human/mouse alignment from http://genome.ucsc.edu/goldenpath/28jun2002/vsMm2/axtbest/. The "axtbest" alignments are filtered so that only the best alignment of any given region is kept. We checked whether the region of the human genome to which our sequences mapped was aligned to the mouse genome. To calculate the fraction within the aligned region, we divided the number of nucleotides that are in conserved regions by the total number of nucleotides that can be aligned to the human genome.
We searched sequences for the presence of a TATA-box (TA[T/A][T/A][T/A][T/A]) element. We calculated the expected number of clones in which the TATA-box element would appear by chance, based on the nucleotide frequencies of the clone sequences. The χ² test was used to determine whether the observed frequencies of the TATA-box element were significantly different (P < 0.05) from the expected.
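One way to obtain an expected number of clones carrying the TATA-box element by chance is sketched below. The null model is an assumption made for illustration, since the exact model is not specified here: each clone is treated as a sequence of independent nucleotides drawn at that clone's own base frequencies, and the start positions of the hexamer are treated as independent, which is only approximate because neighboring hexamers overlap. The function names are hypothetical.

```python
from collections import Counter

MOTIF_LEN = 6  # TA followed by four positions that are T or A

def motif_site_probability(seq):
    """Chance that one fixed position starts TA[T/A][T/A][T/A][T/A]."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        return 0.0
    p_a = counts["A"] / total
    p_t = counts["T"] / total
    return p_t * p_a * (p_a + p_t) ** 4

def prob_at_least_one(seq):
    """Approximate chance that the motif occurs somewhere in the sequence."""
    n_sites = max(0, len(seq) - MOTIF_LEN + 1)
    return 1.0 - (1.0 - motif_site_probability(seq)) ** n_sites

def expected_clones_with_motif(seqs):
    """Expected count of clones containing the motif under the null model."""
    return sum(prob_at_least_one(s) for s in seqs)

# Hypothetical 400-bp sequence with equal base frequencies.
print(round(prob_at_least_one("ACGT" * 100), 3))
```

The expected count from such a model can then be compared with the observed count by a χ² test, as described above.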
Validation of Putative Promoter-Containing Clones
Seventy thousand 293 or HeLa-Eco cells were seeded in 350 µL Complete DMEM in each of the wells of a 24-well plate. Following overnight incubation at 37°C, we used the Effectene reagent to transfect cells with 400 ng/well of firefly luciferase plasmid DNA containing the fragments to be tested. To normalize for transfection efficiency, the cells were cotransfected with 40 ng/well of Renilla luciferase control plasmid pRL-TK (Promega). Forty-eight h later the cells were lysed, and the lysates were assayed for firefly and Renilla luciferase activity in a Wallac 1420 plate luminometer (PerkinElmer). Plasmid pGL3-E+his3-ded was used as a reference standard where the average ratio of firefly luciferase counts to Renilla luciferase counts was set to 1. A putative promoter-containing clone was considered to be positive for luciferase if its activity threshold was greater than three times that of pGL3-E+his3-ded.
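The normalization and threshold described above can be sketched as a short calculation: each well's firefly counts are divided by its Renilla counts, the result is scaled so that the average reference ratio equals 1, and a clone is called positive when its scaled activity exceeds 3. The function names and all luminometer counts below are hypothetical.

```python
def mean_reference_ratio(reference_wells):
    """Average firefly/Renilla ratio over the reference-plasmid wells."""
    ratios = [firefly / renilla for firefly, renilla in reference_wells]
    return sum(ratios) / len(ratios)

def normalized_activity(firefly, renilla, ref_ratio):
    """Firefly/Renilla ratio of a test well, scaled so the reference is 1."""
    return (firefly / renilla) / ref_ratio

def is_positive(firefly, renilla, ref_ratio, fold_threshold=3.0):
    """Call a clone positive if its activity exceeds the fold threshold."""
    return normalized_activity(firefly, renilla, ref_ratio) > fold_threshold

# Made-up counts: two reference wells and two test clones.
ref_ratio = mean_reference_ratio([(1000, 500), (1200, 600)])
print(is_positive(9000, 1000, ref_ratio))  # scaled activity 4.5
print(is_positive(2500, 500, ref_ratio))   # scaled activity 2.5
```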
We tested a small number of clones by using the Galacto-Light Plus system
(Applied Biosystems), a chemiluminescent reporter gene assay for the detection
of
We thank Gary Nolan and his laboratory for providing us with the SIN retroviral vector, packaging cells, and advice on using the retroviral system; the Stanford FACS facility, the Sequencing Team of the Stanford Human Genome Center, Jun Li, and Gregory Ford for helpful advice, and Amit Indap for bioinformatics assistance. S.K.F. was supported by a Stanford University School of Medicine Dean's Postdoctoral Fellowship award for a portion of this work. The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.529803.
5 Present address: Pharmacogenomics, Bristol-Myers Squibb Pharmaceutical
Research Institute, Princeton NJ 08543, USA.
6 Corresponding author. Article published online before print in June 2003. [Supplemental material is available online at www.genome.org. The sequence data from this study have been submitted to GenBank under accession nos. AY270202-AY271252.]
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215: 403-410.
Antequera, F. and Bird, A. 1999. CpG islands as genomic footprints of promoters that are associated with replication origins. Curr. Biol. 9: 661-667.
Ayoubi, T. and Van de Ven, W. 1996. Regulation of gene expression by alternative promoters. FASEB J. 10: 453-460.
Bai, C., Connolly, B., Metzker, M.L., Hilliard, C.A., Liu, X., Sandig, V., Soderman, A., Galloway, S.M., Liu, Q., Austin, C.P., et al. 2000. Overexpression of M68/DcR3 in human gastrointestinal tract tumors independent of gene amplification and its location in a four-gene cluster. Proc. Natl. Acad. Sci. 97: 1230-1235.
Borges, K. and Dingledine, R. 2001. Functional organization of the GluR1 glutamate receptor promoter. J. Biol. Chem. 276: 25929-25938.
Cross, C. and Bird, A.P. 1995. CpG islands and genes. Curr. Opin. Genet. Dev. 5: 309-314.
Edelman, G.M., Meech, R., Owens, G.C., and Jones, F.S. 2000. Synthetic promoter elements obtained by nucleotide sequence variation and selection for activity. Proc. Natl. Acad. Sci. 97: 3038-3043.
Ferrigno, O., Virolle, T., Djabari, Z., Ortonne, J., White, R.J., and Aberdam, D. 2001. Transposable B2 SINE elements can provide mobile RNA polymerase II promoters. Nat. Genet. 28: 77-81.
Fickett, J.W. and Hatzigeorgiou, A.G. 1997. Eukaryotic promoter recognition. Genome Res. 7: 861-878.
Friedrich, G. and Soriano, P. 1991. Promoter traps in embryonic stem cells: A genetic screen to identify and mutate developmental genes in mice. Genes & Dev. 5: 1513-1523.
Hardison, R.C. 2000. Conserved noncoding sequences are reliable guides to regulatory elements. Trends Genet. 16: 369-372.
International Human Genome Sequencing Consortium. 2001. Initial sequencing and analysis of the human genome. Nature 409: 860-921.
International Mouse Genome Sequencing Consortium. 2002. Initial sequencing and analysis of the mouse genome. Nature 420: 520-562.
Jonsson, J., Wu, Q., Nilsson, K., and Phillips, R.A. 1996. Use of a promoter-trap retrovirus to identify and isolate genes involved in differentiation of a myeloid progenitor cell line in vitro. Blood 87: 1771-1779.
Kent, W.J. 2002. BLAT: The BLAST-like alignment tool. Genome Res. 12: 656-664.
Kent, W.J., Sugnet, C.W., Furey, T.S., Roskin, K.M., Pringle, T.H., Zahler, A.M., and Haussler, D. 2002. Human Genome Browser at UCSC. Genome Res. 12: 996-1006.
Lakso, M., Sauer, B., Mosigner, B., Lee, E.J., Manning, R.W., Yu, S.H., Mulder, K.L., and Westphal, H. 1992. Targeted oncogene activation by site-specific recombination in transgenic mice. Proc. Natl. Acad. Sci. 89: 6232-6236.
Medico, E., Gambarotta, G., Gentile, A., Comoglio, P.M., and Soriano, P. 2001. A gene trap vector system for identifying transcriptionally responsive genes. Nat. Biotech. 19: 579-582.
Myers, R.M., Tilly, K., and Maniatis, T. 1986. Fine structure genetic analysis of a
Ohler, U. and Niemann, H. 2001. Identification and analysis of eukaryotic promoters: Recent computational approaches. Trends Genet. 17: 56-60.
Pennacchio, L.A. and Rubin, E.M. 2001. Genomic strategies to identify mammalian regulatory sequences. Nat. Genet. Rev. 2: 100-109.
Praz, V., Perier, R., Bonnard, C., and Bucher, P. 2002. The Eukaryotic Promoter Database, EPD: New entry types and links to gene expression data. Nucleic Acids Res. 30: 322-324.
Pruitt, K.D. and Maglott, D.R. 2001. RefSeq and LocusLink: NCBI gene-centered resources. Nucleic Acids Res. 29: 137-140.
Roth, F.P., Hughes, J.D., Estep, P.W., and Church, G.M. 1998. Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation. Nat. Biotech. 16: 939-945.
Scherf, M., Klingenhoff, A., and Werner, T. 2000. Highly specific localization of promoter regions in large genomic sequences by PromoterInspector: A novel context analysis approach. J. Mol. Biol. 297: 599-606.
Schroder, A., Shinn, P., Chen, H., Berry, C., Ecker, J., and Bushman, F. 2002. HIV-1 integration in human genome favors active genes and local hotspots. Cell 110: 521-529.
Smit, A.F.A. 1996. The origin of interspersed repeats in the human genome. Curr. Opin. Genet. Dev. 6: 743-748.
Smit, A.F.A. 1999. Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr. Opin. Genet. Dev. 9: 657-663.
Speek, M. 2001. Antisense promoter of human L1 retrotransposon drives transcription of adjacent cellular genes. Mol. Cell. Biol. 21: 1973-1985.
Von Melchner, H., Reddy, S., and Ruley, H.E. 1990. Isolation of cellular promoters by using a retrovirus promoter trap. Proc. Natl. Acad. Sci. 87: 3733-3737.
Wan, Y. and Nordeen, S.K. 2002. Identification of genes differentially regulated by glucocorticoids and progestins using a Cre/loxP-mediated retroviral-promoter-trapping strategy. J. Mol. Endocrinol. 28: 177-192.
Weidhaas, J.B., Angelichio, E.L., Fenner, S., and Coffin, J.M. 2000. Relationship between retroviral DNA integration and gene expression. J. Virol. 74: 8382-8389.
Whitelaw, E. and Martin, D.I.K. 2001. Retrotransposons as epigenetic mediators of phenotypic variation in mammals. Nat. Genet. 27: 361-365.
http://www.ncbi.nlm.nih.gov/BLAST/; NCBI BLAST.
http://genome.ucsc.edu/downloads.html/; UCSC Genome Bioinformatics.
http://genome.ucsc.edu/goldenPath/06aug2001/database/; Golden Path build of draft human genome sequence.
Received June 14, 2002;
accepted in revised format April 1, 2003.