Genome Res. 14:18-28, 2004
©2004 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/04 $5.00
Letter
Mapping and Initial Analysis of Human Subtelomeric Sequence Assemblies
1 The Wistar Institute, Philadelphia, Pennsylvania 19104, USA
Physical mapping data were combined with public draft and finished sequences to derive subtelomeric sequence assemblies for each of the 41 genetically distinct human telomere regions. Sequence gaps that remain on the reference telomeres are generally small, well-defined, and for the most part, restricted to regions directly adjacent to the terminal (TTAGGG)n tract. Of the 20.66 Mb of subtelomeric DNA analyzed, 3.01 Mb are subtelomeric repeat sequences (Srpt), and an additional 2.11 Mb are segmental duplications. The subtelomeric sequence assemblies are enriched >25-fold in short, internal (TTAGGG)n-like sequences relative to the rest of the genome; a total of 114 (TTAGGG)n-like islands were found, 55 within Srpt regions, 35 within one-copy regions, 11 at one-copy/Srpt or Srpt/segmental duplication boundaries, and 13 at the telomeric ends of assemblies. Transcripts were annotated in each assembly, noting their mapping coordinates relative to their respective telomere and whether they originate in duplicated DNA or single-copy DNA. A total of 697 transcripts were found in 15.53 Mb of one-copy DNA, 76 transcripts in 2.11 Mb of segmentally duplicated DNA, and 168 transcripts in 3.01 Mb of Srpt sequence. This overall transcript density is similar (within 10%) to that found genome-wide. Zinc finger-containing genes and olfactory receptor genes are duplicated within and between multiple telomere regions.
Telomeres are extraordinarily dynamic chromosomal structures. They are essential for genome stability and faithful chromosome replication and mediate a host of key biological activities, including cell cycle regulation, cellular aging, movements and localization of chromosomes within the nucleus, and transcriptional regulation of subtelomeric genes (Blasco et al. 1999
A conserved, (TTAGGG)n tract forms the DNA component of each chromosome terminus in humans (Moyzis et al. 1988
Subtelomeric DNA and pericentromeric chromosome regions are preferential sites of segmentally duplicated DNA. Estimated to comprise
Large variant alleles of many human subtelomeric regions exist, and are believed to consist entirely of subtelomeric repeats (Wilkie et al. 1991
A complete reference sequence for each human subtelomere region is an essential starting point for analysis of their function and evolution. Here, we report the mapping and initial analysis of a complete set of subtelomeric sequence assemblies. The assemblies comprise both draft and finished public sequence accessions available as of August 1, 2003; the draft fragments are properly ordered, and each assembly is positioned relative to its respective telomere. These properties permit a comparison of subtelomeric sequence organization at each of the separate human telomeres, and the proper placement of transcripts relative to subtelomeric sequence elements and terminal (TTAGGG)n tracts.
Preparation and Mapping of Subtelomeric Assemblies
Subtelomeric clones and sequence accessions that were identified and connected to telomeres previously (Riethman et al. 2001
1.6 Mb (7.7%) of the assembled sequence was derived solely from half-YACs.
Figure 1 summarizes the present status of sequence completion for each subtelomeric region. Finished or draft sequence extends to the terminal (TTAGGG)n tract of reference sequences for 19 telomeres (2p, 4p, 7q, 8p, 8q, 9p, 9q, 10q, 11p, 11q, 15q, 16p, 17p, 17q, 18p, 18q, 21q, Xp/Yp, Xq/Yq). For four of these (8p, 9q, 11p, 16p), the completed reference sequence is that of the smallest of several polymorphic allelic variants (each variant differs in size by hundreds of kilobases). It is important to note that the current reference sequence for a given telomere region represents only one of several possible subtelomeric variants in the population for many of the telomeres (see Table 1). The variant regions appear to be comprised largely or wholly of segmentally duplicated subtelomeric sequences (Wilkie et al. 1991
Assemblies mapping to <20 kb or between 20 and 70 kb from the respective telomere are available for most of the remaining telomeres (Fig. 1). One telomere (20q) is marked by a sequence assembly that extends from single copy into a subtelomeric repeat region, but the size of this subtelomeric repeat region has not been determined. The five telomeres from the p-arms of the acrocentric chromosomes, which contain mainly repetitive DNA, were not characterized as part of this study. Half-YACs recovered from these regions, although somewhat unstable mitotically, are currently being used to characterize sequences contained in these heterochromatic telomere regions.
Sequence Organization of Subtelomeric DNA
The bulk of Srpt sequences are confined to the most distal regions of the subtelomere (Fig. 2), although there are several examples (2p, 2q, 5p, 7p, 8p, and 12p) where, in addition to a terminal block of Srpt, smaller Srpt segments are interspersed within the adjacent one-copy DNA and segmentally duplicated DNA. Several of the incompletely sequenced telomeres lack Srpt in the assembled sequence (Table 1); because Srpts were identified in the half-YACs derived from these telomeres, a small Srpt region confined close to the terminal (TTAGGG)n tract is anticipated for these telomeres. Segmental duplication blocks were often found adjacent to Srpts, but displayed a highly variable pattern of content and distribution at each chromosome end (Fig. 2; Table 1). Overall, 14.6% of the 20.66 Mb of subtelomeric DNA analyzed consisted of Srpt and 10.2% of segmentally duplicated DNA, for a total of 24.8% duplicated DNA of both types. Genome-wide, an estimated 5% of genomic DNA is believed to contain segmentally duplicated sequences (Bailey et al. 2002
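The composition percentages above follow directly from the sequence totals quoted in the text. The short sketch below reproduces that arithmetic; the constant and function names are illustrative choices, not part of the original analysis.

```python
# Sequence totals quoted in the text, in megabases (Mb).
TOTAL_MB = 20.66  # subtelomeric DNA analyzed
SRPT_MB = 3.01    # subtelomeric repeat sequence (Srpt)
SD_MB = 2.11      # segmental duplications


def percent_of_total(part_mb: float, total_mb: float = TOTAL_MB) -> float:
    """Return part_mb as a percentage of total_mb, rounded to one decimal place."""
    return round(100.0 * part_mb / total_mb, 1)


srpt_pct = percent_of_total(SRPT_MB)
sd_pct = percent_of_total(SD_MB)
combined_pct = percent_of_total(SRPT_MB + SD_MB)

print(f"Srpt: {srpt_pct}%  segmental duplication: {sd_pct}%  combined: {combined_pct}%")
```

Comparing the combined value with the estimated 5% genome-wide figure gives roughly a fivefold excess of duplicated sequence in the subtelomeric assemblies.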
Interstitial (TTAGGG)n-like sequence distribution was examined because of its potential role in subtelomeric recombination and telomere healing (Mondello et al. 2000
The 101 internal (TTAGGG)n-like sequence islands were analyzed in greater detail as shown in Figure 3. The figure indicates the sizes of the (TTAGGG)n-like sequence islands (x-axis), the number of occurrences for a given size of (TTAGGG)n tract (y-axis), the similarity of each island to a perfect (TTAGGG)n tract (percent divergence), and the location of each island within the subtelomeric sequence organization as defined above (Srpt, one-copy, and boundary). The internal subtelomeric (TTAGGG)n-like sequence islands ranged in size from 24 to 823 bp; most fell within a tight size range of 151-200 bp. Those shorter than this size tended to be in one-copy sequence regions, those longer in Srpt sequence. The boundary (TTAGGG)n islands ranged from 57 to 257 bp in size. There were 55 (TTAGGG)n-like sequence islands in Srpt, none in segmental duplications, and 35 in one-copy regions. Eleven (TTAGGG)n-like sequence islands were at boundaries (two at SD/Srpt, nine at Srpt/one-copy). Four (TTAGGG)n-like islands that occurred at the allele boundaries were within the internal Srpt regions of long subtelomeric alleles (and were counted as such for this analysis), but mapped to the precise coordinates of the termini of shorter alleles for these same telomeres (8p, 9q, 11p, 16p; see Fig. 2). This suggests that the longer alleles of these telomeres might have been formed by simple addition of a terminal subtelomeric sequence segment to a pre-existing telomere.
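To make the notion of a (TTAGGG)n-like island concrete, the sketch below scans a sequence for runs of perfect TTAGGG repeats on either strand. It is a simplified illustration only: the detection method and divergence scoring used for the analysis are not described in this section, and the function name and the `min_repeats` threshold are assumptions introduced here. Degenerate (divergent) repeats, which the analysis does include, would not be found by this exact-match approach.

```python
import re
from typing import List, Tuple


def find_telomere_like_islands(seq: str, min_repeats: int = 4) -> List[Tuple[int, int, str]]:
    """Return (start, end, strand) for runs of perfect TTAGGG repeats.

    Coordinates are 0-based and end-exclusive. Strand "-" marks runs that
    appear as CCCTAA, the reverse complement of TTAGGG.
    min_repeats is a hypothetical threshold chosen for illustration.
    """
    seq = seq.upper()
    islands = []
    for unit, strand in (("TTAGGG", "+"), ("CCCTAA", "-")):
        # Match min_repeats or more consecutive copies of the hexamer.
        pattern = re.compile("(?:%s){%d,}" % (unit, min_repeats))
        for match in pattern.finditer(seq):
            islands.append((match.start(), match.end(), strand))
    return sorted(islands)
```

Each reported interval could then be classified as Srpt, one-copy, or boundary by intersecting its coordinates with the annotated sequence blocks of the assembly.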
A comparison of the number of interstitial (TTAGGG)n-like islands found in subtelomeric DNA with those found genome wide shows that, in a normalized comparison (occurrences per 20.66 Mb), (TTAGGG)n-like islands are highly enriched (>25-fold) in subtelomeric regions. In addition, they tend to be both longer and more similar to perfect (TTAGGG)n tracts in subtelomeric DNA compared with elsewhere in the genome (Fig. 3). From an evolutionary perspective, this suggests that most subtelomeric interstitial (TTAGGG)n tracts have arisen more recently than those found elsewhere in the genome, have originated via a separate mechanism than (TTAGGG)n islands found elsewhere (e.g., see Azzalin et al. 2001
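The normalized comparison described above (occurrences per 20.66 Mb) amounts to scaling each raw count to a common window size and taking the ratio. A minimal sketch follows; the function names are illustrative, and the genome-wide island count and genome length are not given in this section, so they are left as inputs the reader must supply.

```python
SUBTELOMERIC_MB = 20.66  # total subtelomeric sequence analyzed (from the text)


def per_window(count: int, region_mb: float, window_mb: float = SUBTELOMERIC_MB) -> float:
    """Scale a raw count to the expected number of occurrences per window_mb."""
    if region_mb <= 0:
        raise ValueError("region_mb must be positive")
    return count * window_mb / region_mb


def fold_enrichment(sub_count: int, sub_mb: float, ref_count: int, ref_mb: float) -> float:
    """Ratio of normalized occurrences in the subtelomeric set to the reference set."""
    reference = per_window(ref_count, ref_mb)
    if reference == 0:
        raise ValueError("reference count must be nonzero")
    return per_window(sub_count, sub_mb) / reference
```

For example, `fold_enrichment(101, 20.66, genome_count, genome_mb)` would give the enrichment of the 101 internal islands once a genome-wide count and length are available.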
GC Content and Interspersed Repeat Composition of Subtelomeric Sequence Assemblies
Transcript Content of Subtelomeric Assemblies
Transcripts were annotated in each subtelomeric assembly, noting their mapping coordinates relative to their respective telomere, and whether they originate in duplicated DNA or single-copy DNA. We used a database of unique transcripts representing each Unigene cluster (Schuler 1997 124,000 Unigene clusters) for our initial annotation. Repeat-masked subtelomeric assemblies were analyzed by BLAST, and transcripts with matches >50 bp with 85% or greater identity were collected and parsed into a second database. Each transcript within this candidate database was compared with its cognate unmasked subtelomeric assembly using the program Spidey (Wheelan et al. 2001
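The candidate-selection step (matches >50 bp with 85% or greater identity) can be sketched as a filter over BLAST hits. The example below assumes hits are available in the 12-column tabular layout produced by current BLAST+ with `-outfmt 6`, with the assembly as query and the transcript as subject; the original analysis predates that tool, so the input format, the `Hit` type, and the function names are assumptions made for illustration.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str               # assembly identifier (assumed to be the BLAST query)
    subject: str             # transcript identifier (assumed to be the BLAST subject)
    percent_identity: float
    length: int              # alignment length in bp


def parse_tabular_hits(lines: Iterable[str]) -> List[Hit]:
    """Parse tab-separated hits; only the first four columns are used."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append(Hit(fields[0], fields[1], float(fields[2]), int(fields[3])))
    return hits


def candidate_transcripts(hits: Iterable[Hit], min_length: int = 50,
                          min_identity: float = 85.0) -> List[str]:
    """Return sorted transcript IDs with a match longer than min_length bp
    at min_identity percent identity or greater."""
    return sorted({h.subject for h in hits
                   if h.length > min_length and h.percent_identity >= min_identity})
```

The resulting identifiers correspond to the second, candidate database, whose members would then be aligned to the unmasked assembly with a spliced-alignment tool such as Spidey.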
A total of 941 subtelomeric transcripts were annotated in this manner, 697 from one-copy genomic regions and 244 from segmentally duplicated DNA and subtelomeric repeat DNA. Overall, the subtelomeric region is slightly enriched in Unigene transcripts (48 transcripts/Mb) relative to the genome-wide average (41 transcripts/Mb). The enrichment of transcripts in subtelomeric DNA is consistent with earlier studies (Saccone et al. 1993
Fifteen percent of the transcript matches localizing to one-copy regions either had apparent disruptions in their predicted ORFs or varied significantly (>1% in high-quality parts of the sequence) from the corresponding genomic sequence. These were designated "possible pseudogenes" (see Supplemental Table 3). However, given the frequency of sequence errors in the EST and mRNA databases, as well as the draft nature of parts of the assemblies, these numbers are likely to change as experimental validation of the transcript annotations proceeds.
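Transcript density can be broken down by sequence class using the transcript counts and sequence lengths quoted in the abstract. The sketch below is illustrative only; the names are introduced here, and values computed this way depend on the denominators used, so they need not reproduce the rounded per-Mb figures quoted in the text.

```python
# (transcripts, megabases) for each sequence class, as quoted in the text.
CLASSES = {
    "one-copy": (697, 15.53),
    "segmental duplication": (76, 2.11),
    "Srpt": (168, 3.01),
}


def density_per_mb(transcripts: int, megabases: float) -> float:
    """Transcripts per megabase of sequence."""
    return transcripts / megabases


def summarize(classes: dict) -> dict:
    """Return per-class densities plus a pooled value over all classes."""
    rows = {name: density_per_mb(n, mb) for name, (n, mb) in classes.items()}
    total_n = sum(n for n, _ in classes.values())
    total_mb = sum(mb for _, mb in classes.values())
    rows["all classes"] = density_per_mb(total_n, total_mb)
    return rows


for name, value in summarize(CLASSES).items():
    print(f"{name}: {value:.1f} transcripts/Mb")
```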
Similarly, an unknown but significant fraction of the transcripts embedded within the segmental duplications and subtelomeric repeats are likely to be pseudogenes (e.g., see Kermouni et al. 1995
Supplemental Table 3 identifies each of these transcripts, and Tables 3A and 3B summarize the subset of transcripts in duplicated DNA that correspond to named genes. Cross-boundary transcripts (Table 3B) contain part of a sequence from a duplicated genomic segment and part from a one-copy segment, or parts from a segmental duplication and from a subtelomeric repeat. These transcripts might represent transcribed pseudogenes generated by juxtaposition of progenitor transcript segments, or might generate new functionalities by virtue of exon shuffling upon duplication (Bailey et al. 2002
Subtelomeric gene families with members having nucleotide sequence similarity in the 70%-90% level include the immunoglobulin heavy-chain genes (found at 14q), olfactory receptor genes [one-copy regions of 1q, 5q, 10q, and 15q as well as previously characterized subtelomeric repeat DNA (1p, 6p, 8p, 11p, 15q, 19p, and 3q; Trask et al. 1998
Transcripts positioned closest to the telomere represent genes with the highest susceptibility to telomere deletions, rearrangements, and hypothesized position effects mediated by telomere (TTAGGG)n tract shortening and/or altered telomeric heterochromatin. Both the dosage (in the case of Srpt transcripts) and the true position of many of these genes relative to the telomere will be allele dependent, changing with different subtelomeric repeat composition and organization. Nonetheless, current data permit us to identify some representatives of most Srpt gene families, and nearly all of the most distal one-copy genes. The named one-copy transcripts closest to (within 100 kb of) the telomeric end of each assembly are shown in Table 4. These distal one-copy transcripts, along with the Srpt and segmental duplication transcripts described above, should comprise the segment of the human transcriptome most susceptible to telomere truncations, rearrangements, and telomere-associated position effects.
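Selecting the distal one-copy transcripts amounts to filtering annotated transcripts by sequence class and by distance from the telomeric end of the assembly. The sketch below assumes a simple record type; the `Transcript` fields, the class labels, and the function name are hypothetical and do not reflect the format of the actual annotation files.

```python
from typing import Iterable, List, NamedTuple


class Transcript(NamedTuple):
    name: str
    distance_to_telomere_bp: int  # distance from the telomeric end of the assembly
    region_class: str             # "one-copy", "Srpt", or "SD" (hypothetical labels)


def distal_one_copy(transcripts: Iterable[Transcript],
                    max_distance_bp: int = 100_000) -> List[Transcript]:
    """Return one-copy transcripts within max_distance_bp of the telomeric end,
    ordered from most distal (closest to the telomere) inward."""
    kept = [t for t in transcripts
            if t.region_class == "one-copy"
            and t.distance_to_telomere_bp <= max_distance_bp]
    return sorted(kept, key=lambda t: t.distance_to_telomere_bp)
```

Because the true distance of a gene from the telomere is allele dependent, the coordinates fed to such a filter apply only to the reference allele represented in the assembly.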
Preparation of Subtelomeric Assemblies and Subtelomeric Maps
Each subtelomeric assembly was prepared by DNA sequence comparison of finished sequence accessions, draft sequence accession pieces, and half-YAC (Riethman et al. 1989
Finished sequences for each telomere were used preferentially in the assemblies, with draft sequence fragments added as necessary to extend the assemblies. We used all or parts of NCBI assemblies from Build 34 first, then patched in draft sequences not included in the assembly. In regions in which NCBI Build 34 was inconsistent with our mapping data, we used individual accessions to complete the assemblies. The Sequencher assembler was used interactively to find and combine sequence overlaps among the imported pieces and between the half-YAC-derived sequences and the imported sequences. To obtain a contiguous assembly in this manner, it was often necessary to break sequence fragments in VNTR-like regions and introduce a gap in one of the overlapping fragments (in effect, incorporating the larger sequence of a polymorphic VNTR). Leftover draft sequence fragments were analyzed by BLAST to ensure that unique sequences were not missed in the assembly. A string of 100 Ns was placed between nonoverlapping but adjacent draft sequence fragments. By use of the mapping data associated with the half-YAC-derived cosmid contigs, it was possible to uniquely orient and position most draft-sequence fragments. Subsequent comparison of each subtelomeric assembly against itself using PatternHunter software (Ma et al. 2002
We did not make any special effort to trim high-quality overlapping sequence fragments (other than at the ends of overlapping draft fragments that were clearly error prone), but rather used the consensus from such overlap regions as our subtelomeric assembly. An N was placed at consensus positions where overlaps produced an ambiguous base (i.e., a SNP or a sequence error). Specific accessions as well as NCBI Build 34 contigs used in assembling each subtelomere sequence are indicated in Supplemental Table 1.
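Two of the conventions described above are easy to state in code: adjacent but nonoverlapping fragments are separated by a string of 100 Ns, and overlapping fragments contribute a consensus in which disagreeing positions become N. The sketch below illustrates both under simplifying assumptions (fragments already ordered and oriented, overlap strings already aligned without gaps); the function names are introduced here, and the actual assemblies were built interactively in Sequencher rather than with code of this kind.

```python
from typing import Sequence

GAP = "N" * 100  # spacer placed between nonoverlapping, adjacent fragments


def join_nonoverlapping(fragments: Sequence[str]) -> str:
    """Concatenate ordered, oriented fragments with a 100-N spacer between neighbors."""
    return GAP.join(fragments)


def overlap_consensus(a: str, b: str) -> str:
    """Column-wise consensus of two aligned, equal-length overlap strings.

    Positions where the two reads disagree (a SNP or a sequence error)
    are written as N.
    """
    if len(a) != len(b):
        raise ValueError("overlap strings must be aligned to equal length")
    return "".join(x if x == y else "N" for x, y in zip(a.upper(), b.upper()))
```

A downstream consumer of such an assembly can therefore treat a run of exactly 100 Ns as a gap of unknown size and an isolated N as an unresolved base.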
Analysis of Subtelomeric Sequence Composition and Organization
The output of each of these analyses was consolidated on a single interactive Genotator (Harris 1997
We thank the members of the International Human Genome Sequencing Consortium who participated in the sequencing of subtelomeric regions. Bob Moyzis, Jonathan Flint, and William Brown collaborated or provided reagents for the earlier stages of this work. John Rux and the Wistar Bioinformatics Facility provided programming and computational support. Financial support was provided by NIH HG00567 and CA 25874, and by the Commonwealth Universal Research Enhancement Program, PA Dept of Health. The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1245004.
4 Corresponding author.
2 Present Address: Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD 21218, USA;
3 Present Address: Cell and Molecular Biology Program, University of Pennsylvania, Philadelphia, PA 19104, USA.
[Supplemental material is available online at www.genome.org. Detailed maps, subtelomeric assemblies (FASTA format), and transcript annotations are also available at our laboratory Web site (http://www.wistar.upenn.edu/Riethman).]
Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25: 3389-3402.
Amann, J., Valentine, M., Kidd, V., and Lahti, J.M. 1996. Localization of Chl1-related helicase genes to human chromosome regions 12p11 and 12p13: Similarity between parts of these genes and conserved human telomere-associated DNA. Genomics 32: 260-265.
Azzalin, C.M., Nergadze, S.G., and Giulotto, E. 2001. Human intrachromosomal telomeric-like repeats: Sequence organization and mechanisms of origin. Chromosoma 110: 75-82.
Bailey, J.A., Yavor, A.M., Massa, H.F., Trask, B.J., and Eichler, E.E. 2001. Segmental duplications: Organization and impact within the current human genome project assembly. Genome Res. 11: 1005-1017.
Bailey, J.A., Gu, Z., Clark, R.A., Reinert, K., Samonte, R.V., Schwartz, S., Adams, M.D., Myers, E.W., Li, P.W., and Eichler, E.E. 2002. Recent segmental duplications in the human genome. Science 297: 1003-1007.
Baur, J.A., Zou, Y., Shay, J.W., and Wright, W.E. 2001. Telomere position effect in human cells. Science 292: 2075-2077.
Benson, G. 1999. Tandem repeats finder: A program to analyze DNA sequences. Nucleic Acids Res. 27: 573-580.
Blasco, M.A., Gasser, S.M., and Lingner, J. 1999. Telomeres and telomerase. Genes & Dev. 13: 2353-2359.
Bryan, T.M., Englezou, A., Gupta, J., Bacchetti, S., and Reddel, R.R. 1995. Telomere elongation in immortal human cells without detectable telomerase activity. EMBO J. 14: 4240-4248.
Carlson, M., Celenza, J.L., and Eng, F.J. 1985. Evolution of the dispersed SUC gene family of Saccharomyces by rearrangements of chromosomal telomeres. Mol. Cell Biol. 5: 2894-2902.
Cook, G.P., Tomlinson, I.M., Walter, G., Carter, N.G., Riethman, H.C., Winter, G., and Rabbitts, T.H. 1994. A map of the human immunoglobulin VH locus completed by analysis of the telomeric region of chromosome 14q. Nat. Genet. 7: 162-168.
Donaldson, K.M. and Karpen, G.H. 1997. Trans-suppression of terminal deficiency-associated position effect variegation in a Drosophila minichromosome. Genetics 145: 325-337.
Fan, Y., Newman, T., Linardopoulou, E., and Trask, B.J. 2002. Gene content and function of the ancestral chromosome fusion site in human chromosome 2q13-2q14.1 and paralogous regions. Genome Res. 12: 1663-1672.
Feuerbach, F., Galy, V., Trelles-Sticken, E., Fromont-Racine, M., Jacquier, A., Gilson, E., Olivo-Marin, J.C., Scherthan, H., and Nehrbass, U. 2002. Nuclear architecture and spatial positioning help establish transcriptional states of telomeres in yeast. Nat. Cell Biol. 4: 214-221.
Flint, J., Thomas, K., Micklem, G., Raynham, H., Clark, K., Doggett, N.A., King, A., and Higgs, D.R. 1997a. The relationship between chromosome structure and function at a human telomeric region. Nat. Genet. 15: 252-257.
Flint, J., Bates, G.P., Clark, K., Dorman, A., Willingham, D., Roe, B.A., Micklem, G., Higgs, D.R., and Louis, E.J. 1997b. Sequence comparison of human and yeast telomeres identifies structurally distinct subtelomeric domains. Hum. Mol. Genet. 6: 1305-1313.
Fourel, G., Revardel, E., Koering, C.E., and Gilson, E. 1999. Cohabitation of insulators and silencing elements in yeast subtelomeric regions. EMBO J. 18: 2522-2537.
Harris, N.L. 1997. Genotator: A workbench for sequence annotation. Genome Res. 7: 754-762.
Henson, J.D., Neumann, A.A., Yeager, T.R., and Reddel, R.R. 2002. Alternative lengthening of telomeres in mammalian cells. Oncogene 21: 598-610.
Ijdo, J.W., Lindsay, E.A., Wells, R.A., and Baldini, A. 1992. Multiple variants in subtelomeric regions of normal karyotypes. Genomics 14: 1019-1025.
International Human Genome Sequencing Consortium (IHGSC). 2001. Initial sequencing and analysis of the human genome. Nature 409: 860-921.
Kent, W.J., Sugnet, C.W., Furey, T.S., Roskin, K.M., Pringle, T.H., Zahler, A.M., and Haussler, D. 2002. The Human Genome Browser at UCSC. Genome Res. 12: 996-1006.
Kermouni, A., Van Roost, E., Arden, K.C., Vermeesch, J.R., Weiss, S., Godelaine, D., Flint, J., Lurquin, C., Szikorza, J.P., Higgs, D.R., et al. 1995. The IL-9 receptor gene (IL9R): Genomic structure and chromosomal localization in the pseudoautosomal region of the long arm of the sex chromosomes, and identification of IL9R pseudogenes at 9qter, 10pter, 16pter, and 18pter. Genomics 29: 371-382.
Kvaloy, K. 1993. "The long arm telomeres of the human sex chromosomes." Ph.D thesis, Wadham College, Department of Biochemistry, University of Oxford, UK.
Lundblad, V. and Wright, W.E. 1996. Telomeres and telomerase: A simple picture becomes complex. Cell 87: 369-375.
Ma, B., Tromp, J., and Li, M. 2002. PatternHunter: Faster and more sensitive homology search. Bioinformatics 18: 440-445.
Macina, R.A., Negorev, D.G., Spais, C., Ruthig, L.A., Hu, X-L., and Riethman, H.C. 1994. Sequence organization of the human chromosome 2q telomere. Hum. Mol. Genet. 3: 1847-1853.
Macina, R.A., Morii, K., Hu, X.-L., Negorev, D.G., Spais, C., Ruthig, L.A., and Riethman, H.C. 1995. Molecular cloning and RARE cleavage mapping of human 2p, 6q, 8q, 12q, and 18q telomeres. Genome Res. 5: 225-232.
Mah, N., Stoehr, H., Schulz, H.L., White, K., and Weber, B.H. 2001. Identification of a novel retina-specific gene located in a subtelomeric region with polymorphic distribution among multiple human chromosomes. Biochim. Biophys. Acta. 1522: 167-174.
Martin, C.L., Wong, A., Gross, A., Chung, J., Fantes, J.A., and Ledbetter, D.H. 2002. The evolutionary origin of human subtelomeric homologies or where the ends begin. Am. J. Hum. Genet. 70: 972-984.
Martin-Gallardo, A., Lamerdin, J., Sopapan, P., Friedman, C., Fertitta, A.L., Garcia, E., Carrano, A., Negorev, D., Macina, R.A., Trask, B.J., et al. 1995. Molecular analysis of a novel subtelomeric repeat with polymorphic chromosomal distribution. Cytogenet. Cell Genet. 71: 289-295.
McCulloch, R., Rudenko, G., and Borst, P. 1997. Gene conversions mediating antigenic variation in Trypanosoma brucei can occur on variant surface glycoprotein expression sites lacking 70-bp repeat sequences. Mol. Cell Biol. 17: 833-843.
Mefford, H.C. and Trask, B.J. 2002. The complex structure and dynamic evolution of human subtelomeres. Nat. Rev. Genet. 3: 91-102.
Mondello, C., Pirzio, L., Azzalin, C.M., and Giulotto, E. 2000. Instability of interstitial telomeric sequences in the human genome. Genomics 68: 111-117.
Monfouilloux, S., Avet-Loiseau, H., Amarger, V., Balazs, I., Pourcel, C., and Vergnaud, G. 1998. Recent human-specific spreading of a subtelomeric domain. Genomics 51: 165-176.
Morin, G.B. 1989. The human telomere terminal transferase enzyme is a ribonucleoprotein that synthesizes TTAGGG repeats. Cell 59: 521-529.
Moyzis, R.K., Buckingham, J.M., Cram, S., Dani, M., Deaven, L.L., Jones, M.D., Meyne, J., Ratliff, R.L., and Wu, J.R. 1988. A highly conserved repetitive DNA sequence, (TTAGGG)n, present at the telomeres of human chromosomes. Proc. Natl. Acad. Sci. 85: 6622-6626.
Murnane, J.P., Sabatier, L., Marder, B.A., and Morgan, W.F. 1994. Telomere dynamics in an immortal human cell line. EMBO J. 13: 4953-4962.
Pryde, F.E. and Louis, E.J. 1999. Limitations of silencing at native yeast telomeres. EMBO J. 18: 2538-2550.
Reston, J.T., Hu, X.-L., Macina, R.A., Spais, C., and Riethman, H. 1995. Structure of the terminal 300 kb of DNA from human chromosome 21q. Genomics 26: 31-38.
Riethman, H.C., Moyzis, R.K., Meyne, J., Burke, D.T., and Olson, M.V. 1989. Cloning human telomeric DNA fragments into Saccharomyces cerevisiae using a yeast-artificial-chromosome vector. Proc. Natl. Acad. Sci. 86: 6240-6244.
Riethman, H., Birren, B., and Gnirke, A. 1997. Preparation, manipulation, and mapping of high molecular weight DNA. In Genome analysis: A laboratory manual, Volume 1: "Analyzing DNA" (eds. B. Birren et al.), pp. 83-248. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Riethman, H.C., Xiang, Z., Paul, S., Morse, E., Hu, X.L., Flint, J., Chi, H.C., Grady, D.L., and Moyzis, R.K. 2001. Integration of telomere sequences with the draft human genome sequence. Nature 409: 948-951.
Rizki, A. and Lundblad, V. 2001. Defects in mismatch repair promote telomerase-independent proliferation. Nature 411: 713-716.
Ruiz-Herrera, A., Garcia, F., Azzalin, C., Giulotto, E., Egozcue, J., Ponsa, M., and Garcia, M. 2002. Distribution of intrachromosomal telomeric sequences (ITS) on Macaca fascicularis (Primates) chromosomes and their implication for chromosome evolution. Hum. Genet. 110: 578-586.
Saccone, S., De Sario, A., Weigant, J., Raap, A.K., Della Valle, G., and Bernardi, G. 1993. Correlations between isochores and chromosomal bands in the human genome. Proc. Natl. Acad. Sci. 90: 11929-11933.
Schuler, 1997. Pieces of the puzzle: Expressed sequence tags and the catalog of human genes. J. Mol. Med. 75: 694-698.
Smit, A.F.A. and Green, P. RepeatMasker home page. http://ftp.genome.washington.edu/RM/RepeatMasker.html
Sonnhammer, E.L.L. and Durbin, R. 1994. A workbench for Large Scale Sequence Homology Analysis. Comput. Applic. Biosci. 10: 301-307.
Trask, B.J., Friedman, C., Martin-Gallardo, A., Rowen, L., Akinbami, C., Blankenship, J., Collins, C., Giorgi, D., Iadonato, S., Johnson, F., et al. 1998. Members of the olfactory receptor gene family are contained in large blocks of DNA duplicated polymorphically near the ends of human chromosomes. Hum. Mol. Genet. 7: 13-26.
van Geel, M., Eichler, E.E., Beck, A.F., Shan, Z., Haaf, T., van der Maarel, S.M., Frants, R.R., and de Jong, P.J. 2002. A cascade of complex subtelomeric duplications during the evolution of the hominoid and Old World monkey genomes. Am. J. Hum. Genet. 70: 269-278.
van Overveld, P.G., Lemmers, R.J., Deidda, G., Sandkuijl, L., Padberg, G.W., Frants, R.R., and van Der Maarel, S.M. 2000. Interchromosomal repeat array interactions between chromosomes 4 and 10: A model for subtelomeric plasticity. Hum. Mol. Genet. 9: 2879-2884.
Wheelan, S.J., Church, D.M., and Ostell, J.M. 2001. Spidey: A tool for mRNA-to-genomic alignments. Genome Res. 11: 1952-1957.
Wilkie, A.O.M., Higgs, D.R., Rack, K.A., Buckle, V.J., Spurr, N.K., Fischel-Ghodsian, N., Ceccherini, I., Brown, W.R.A., and Harris, P.C. 1991. Stable length polymorphism of up to 260 kb at the tip of the short arm of human chromosome 16. Cell 64: 595-606.
ftp://ftp.ncbi.nih.gov/repository/UniGene/; database of best mRNA or EST sequence representative of each Unigene cluster.
Received February 6, 2003; accepted in revised form November 4, 2003.