Genome Res. 14:18-28, 2004
©2004 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/04 $5.00

Letter

Mapping and Initial Analysis of Human Subtelomeric Sequence Assemblies

Harold Riethman1,4, Anthony Ambrosini1, Carlos Castaneda1,2, Jeffrey Finklestein1,3, Xue-Lan Hu1, Uma Mudunuri1, Sheila Paul1 and Jun Wei1

1 The Wistar Institute, Philadelphia, Pennsylvania 19104, USA


    ABSTRACT
Physical mapping data were combined with public draft and finished sequences to derive subtelomeric sequence assemblies for each of the 41 genetically distinct human telomere regions. Sequence gaps that remain on the reference telomeres are generally small, well-defined, and for the most part, restricted to regions directly adjacent to the terminal (TTAGGG)n tract. Of the 20.66 Mb of subtelomeric DNA analyzed, 3.01 Mb are subtelomeric repeat sequences (Srpt), and an additional 2.11 Mb are segmental duplications. The subtelomeric sequence assemblies are enriched >25-fold in short, internal (TTAGGG)n-like sequences relative to the rest of the genome; a total of 114 (TTAGGG)n-like islands were found, 55 within Srpt regions, 35 within one-copy regions, 11 at one-copy/Srpt or Srpt/segmental duplication boundaries, and 13 at the telomeric ends of assemblies. Transcripts were annotated in each assembly, noting their mapping coordinates relative to their respective telomere and whether they originate in duplicated DNA or single-copy DNA. A total of 697 transcripts were found in 15.53 Mb of one-copy DNA, 76 transcripts in 2.11 Mb of segmentally duplicated DNA, and 168 transcripts in 3.01 Mb of Srpt sequence. This overall transcript density is similar (within ~10%) to that found genome-wide. Zinc finger-containing genes and olfactory receptor genes are duplicated within and between multiple telomere regions.


Telomeres are extraordinarily dynamic chromosomal structures. They are essential for genome stability and faithful chromosome replication and mediate a host of key biological activities, including cell cycle regulation, cellular aging, movements and localization of chromosomes within the nucleus, and transcriptional regulation of subtelomeric genes (Blasco et al. 1999; Feuerbach et al. 2002). Specialized functions involving telomeric and subtelomeric DNA have evolved in a wide range of eukaryotes; for example, frequent subtelomeric gene conversion provides diversity for surface antigens in trypanosomes (McCulloch et al. 1997), and rapidly evolving subtelomeric gene families confer selective advantages for closely related yeast strains (Carlson et al. 1985).

A conserved (TTAGGG)n tract forms the DNA component of each chromosome terminus in humans (Moyzis et al. 1988). A specialized enzyme called telomerase can lengthen the telomere repeat motif by adding motif-specific nucleotides in a DNA template-independent manner (Morin 1989). However, both telomerase-associated and telomerase-independent pathways for maintaining (TTAGGG)n repeats exist; the major telomerase-independent pathways are recombination based, sometimes involve coamplification of subtelomeric sequences along with the simple repeat tracts found at chromosome termini (Murnane et al. 1994; Bryan et al. 1995; Lundblad and Wright 1996; Henson et al. 2002), and can generate very long and heterogeneous stretches of (TTAGGG)n-containing repeats (Rizki and Lundblad 2001; Henson et al. 2002). Transcription of subtelomeric genes can be regulated by (TTAGGG)n tract length (Baur et al. 2001) and by subtelomeric repeat content and abundance, possibly by contributing specific sequence elements necessary for local silencing (Fourel et al. 1999; Pryde and Louis 1999) or by providing extended homology regions required for somatic pairing and heterochromatin formation (Donaldson and Karpen 1997).

Subtelomeric DNA, along with pericentromeric chromosome regions, is a preferential site of segmentally duplicated DNA. Estimated to comprise ~5% of the human genome, this class of low-copy repeat DNA is characterized by very high sequence similarity (90% to >99.5%) between homology tracts and by variable, but often very large, tract lengths (1 kb to >200 kb). These large homology segments have complicated mapping and sequencing efforts and caused a disproportionate number of assembly errors in the initial working draft sequence of the human genome. Segmental duplications can predispose associated chromosome segments to genetic instability and have been connected with several genetic diseases (Bailey et al. 2001). Evolutionarily recent duplicative transposition of these large DNA tracts has led to the generation of new gene families and to the formation of fusion transcripts with potentially new functions (Bailey et al. 2002). In this study, we define segmental duplications occurring in more than one subtelomeric region as "subtelomeric repeats," and refer to all others simply as segmental duplications.

Large variant alleles of many human subtelomeric regions exist, and are believed to consist entirely of subtelomeric repeats (Wilkie et al. 1991; Macina et al. 1994, 1995; Trask et al. 1998; Mefford and Trask 2002). For example, Wilkie et al. (1991) found that three alleles varying in length up to 260 kb exist at the 16p telomere. Trask et al. (1998) examined the structure and genomic distribution of a cosmid-sized block of segmentally duplicated subtelomeric DNA. They found that this block was consistently present at the 3q, 15q, and 19p telomeres in humans, was variably distributed at an additional subset of human telomeres, but was present in a single copy in nonhuman primate genomes. Similar studies have demonstrated more recently that the evolution of most primate subtelomeric regions has involved multiple, lineage-dependent duplications in recent evolutionary time (Martin et al. 2002; van Geel et al. 2002). The duplications have colonized many individual human subtelomeric regions in a variable fashion since the divergence of human and primate lineages.

A complete reference sequence for each human subtelomere region is an essential starting point for analysis of its function and evolution. Here, we report the mapping and initial analysis of a complete set of subtelomeric sequence assemblies. The assemblies comprise both draft and finished public sequence accessions available as of August 1, 2003; the draft fragments are properly ordered, and each assembly is positioned relative to its respective telomere. These properties permit a comparison of subtelomeric sequence organization at each of the separate human telomeres, and the proper placement of transcripts relative to subtelomeric sequence elements and terminal (TTAGGG)n tracts.


    RESULTS AND DISCUSSION
Preparation and Mapping of Subtelomeric Assemblies
Subtelomeric clones and sequence accessions that were identified and connected to telomeres previously (Riethman et al. 2001Go) were used in this study to nucleate the assembly of new and more complete subtelomeric draft/finished sequence contigs for these regions. Most of the sequence used in these assemblies (>98%) was acquired by the IHGSC as part of the finishing phase of the Human Genome Project, from clones contributed by our lab as well as from clones identified independently in the chromosome-specific projects and mapped relative to telomeres in our lab. Each sequence assembly is oriented from telomeric end (nucleotide position 1) to centromeric end. Maps and tables describing in detail the YAC, BAC, and cosmid clones supporting the sequence assemblies for each subtelomere region are provided as Supplemental material available online at www.genome.org (Suppl. Table 1; Suppl. Figs. 1ptel-Xqtel). Most of the subtelomeric sequences were ultimately derived from BAC sources (see Suppl. Table 1); ~1.6 Mb (7.7%) of the assembled sequence was derived solely from half-YACs.

Figure 1 summarizes the present status of sequence completion for each subtelomeric region. Finished or draft sequence extends to the terminal (TTAGGG)n tract of reference sequences for 19 telomeres (2p, 4p, 7q, 8p, 8q, 9p, 9q, 10q, 11p, 11q, 15q, 16p, 17p, 17q, 18p, 18q, 21q, Xp/Yp, Xq/Yq). For four of these (8p, 9q, 11p, 16p), the completed reference sequence is that of the smallest of several polymorphic allelic variants (each variant differs in size by hundreds of kilobases). It is important to note that, for many of the telomeres, the current reference sequence represents only one of several possible subtelomeric variants in the population (see Table 1). The variant regions appear to consist largely or wholly of segmentally duplicated subtelomeric sequences (Wilkie et al. 1991; Trask et al. 1998; Bailey et al. 2002).



Figure 1 Telomere sequence gaps. Distance from terminal (TTAGGG)n tract to the subtelomeric sequence assemblies for each telomere is indicated. (Blue dot) Subtelomeric assemblies adjoin terminal (TTAGGG)n tract for a reference telomere; (green dot) subtelomeric assemblies end within 20 kb of the terminal (TTAGGG)n tract; (magenta dot) subtelomeric assemblies end between 20 and 70 kb from the terminal (TTAGGG)n tract; (yellow dot) a half-YAC clone has not been identified, the sequence assembly for this telomere extends from single-copy DNA into the subtelomeric repeat region, but the size of the subtelomeric repeat region has not been determined. (Black rectangles) The five telomeres from the p-arms of the acrocentric chromosomes, which contain mainly repetitive DNA unstable in yeast and bacteria, were not characterized as part of this study.

 


Table 1. Summary of Subtelomeric Assemblies

 
Assemblies mapping to <20 kb or between 20 and 70 kb from the respective telomere are available for most of the remaining telomeres (Fig. 1). One telomere (20q) is marked by a sequence assembly that extends from single copy into a subtelomeric repeat region, but the size of this subtelomeric repeat region has not been determined. The five telomeres from the p-arms of the acrocentric chromosomes, which contain mainly repetitive DNA, were not characterized as part of this study. Half-YACs recovered from these regions, although somewhat unstable mitotically, are currently being used to characterize sequences contained in these heterochromatic telomere regions.

Sequence Organization of Subtelomeric DNA
The overall sequence organization of each subtelomeric assembly was evaluated initially in terms of subtelomeric repeats, segmental duplications, satellite sequences, and internal (TTAGGG)n-like sequence content. First, a BLAST-able database of the subtelomeric assemblies was created; Srpt sequences were defined as any nonself sequence match of >90% identity within the subtelomeric sequence database. Second, sequence comparisons between the subtelomeric assemblies and public databases were used to define additional homology segments; segmental duplications were defined as nonself sequence matches in the NR or HTGS databases, but absent in the Srpt database, that were >90% identity and >1 kb in length. The Whole Genome Shotgun Sequence Detection (WSSD) Database (Bailey et al. 2002) detects both subtelomeric repeat regions and segmental duplications, and was used to add segmental duplication regions where those sequences were not already identified by the BLAST analysis described above; for the most part, the regions identified by WSSD were consistent with our analyses, although each method missed a small percentage of the duplications. Finally, satellite-related and (TTAGGG)n-related sequences were identified using the high-sensitivity RepeatMasker parameters. TAR1 and SUBTEL are telomere-associated satellite sequences identified by RepeatMasker (other telomere-associated repeat sequences identified in the literature are components of the Srpt fraction of subtelomeric DNA). Each of these sequence elements is delineated in Figure 2; the total sizes of Srpt and segmental duplication within each assembly are indicated in Table 1.
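The two definitions above can be read as a small decision rule. The following Python sketch is only an illustration of that rule under stated assumptions: the `Hit` record and its field names are hypothetical and do not correspond to any BLAST output format, and the thresholds are the ones quoted in the text (>90% identity; >1 kb for segmental duplications).

```python
from dataclasses import dataclass

MIN_IDENTITY = 90.0    # percent identity threshold quoted in the text
MIN_SD_LENGTH = 1000   # bp; length threshold quoted for segmental duplications


@dataclass
class Hit:
    """One nonself alignment (hypothetical record used only for illustration)."""
    percent_identity: float
    length: int
    in_subtelomeric_db: bool  # matched within the database of subtelomeric assemblies
    in_public_db: bool        # matched in NR or HTGS


def classify_hit(hit: Hit) -> str:
    """Label a hit as 'Srpt', 'segmental_duplication', or 'unclassified'."""
    if hit.percent_identity <= MIN_IDENTITY:
        return "unclassified"
    # A match to another subtelomeric assembly takes precedence (Srpt).
    if hit.in_subtelomeric_db:
        return "Srpt"
    # Otherwise a public-database match must also exceed 1 kb.
    if hit.in_public_db and hit.length > MIN_SD_LENGTH:
        return "segmental_duplication"
    return "unclassified"
```

The sketch does not attempt to reproduce the WSSD cross-check, which the text describes as a separate, complementary source of duplication calls.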





Figure 2 Sequence organization of subtelomeric assemblies. The terminal fragment of each chromosome arm that contains a subtelomeric sequence assembly is depicted. (Yellow) Single-copy DNA; (green) segmentally duplicated DNA unique to a single subtelomere region; (blue) segmentally duplicated subtelomeric repeat DNA; (magenta) unsequenced subtelomeric DNA. Each black arrow within the rectangles depicts a single (TTAGGG)n-like sequence, with the arrow pointing in the 5' to 3' direction for the G-rich strand. The position of satellite sequences detected by RepeatMasker software is indicated above each telomere diagram. The positions of the shortest reference alleles for the 8p, 9q, 11p, and 16p telomeres are indicated by a black line segment terminating with an arrow beneath the sequence assemblies for these telomeres.

 
The bulk of Srpt sequences are confined to the most distal regions of the subtelomere (Fig. 2), although there are several examples (2p, 2q, 5p, 7p, 8p, and 12p) where, in addition to a terminal block of Srpt, there are additional smaller segments interspersed within the adjacent one-copy DNA and segmentally duplicated DNA. Several of the incompletely sequenced telomeres lack Srpt in the assembled sequence (Table 1); because Srpts were identified in the half-YACs derived from these telomeres, a small Srpt region confined to close to the terminal (TTAGGG)n is anticipated for these telomeres. Segmental duplication blocks were often found adjacent to Srpts, but displayed a highly variable pattern of content and distribution at each chromosome end (Fig. 2; Table 1). Overall, 14.6% of the 20.66 Mb of subtelomeric DNA analyzed was comprised of Srpt and 10.2% of segmentally duplicated DNA, for a total of 24.8% segmental duplications of both types. Genome-wide, an estimated 5% of genomic DNA is believed to contain segmentally duplicated sequences (Bailey et al. 2002Go), indicating a fivefold enrichment of segmentally duplicated DNA in the subtelomeric regions analyzed. The nucleotide sequence similarity of duplicons in both the Srpt and segmental duplications varied from 90% to >99%, and occurred in sequence blocks that often, but not always, had sharply defined boundaries; more extensive comparative analysis of these regions and related sequences in nonhuman primate species (e.g., see Fan et al. 2002Go; Martin et al. 2002Go) are required to investigate their origin and evolution in detail.
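The percentages and the fold enrichment quoted above follow directly from the sizes given in the abstract. A minimal Python check, using only numbers stated in the text (the variable names are illustrative):

```python
TOTAL_MB = 20.66                # subtelomeric DNA analyzed
SRPT_MB = 3.01                  # subtelomeric repeat sequence
SEGDUP_MB = 2.11                # other segmental duplications
GENOME_WIDE_DUP_PERCENT = 5.0   # genome-wide estimate cited in the text


def percent(part_mb: float, whole_mb: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part_mb / whole_mb


srpt_pct = percent(SRPT_MB, TOTAL_MB)
segdup_pct = percent(SEGDUP_MB, TOTAL_MB)
combined_pct = srpt_pct + segdup_pct
fold_enrichment = combined_pct / GENOME_WIDE_DUP_PERCENT

print(f"Srpt: {srpt_pct:.1f}%  SD: {segdup_pct:.1f}%  "
      f"combined: {combined_pct:.1f}%  enrichment: {fold_enrichment:.1f}-fold")
```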

Interstitial (TTAGGG)n-like sequence distribution was examined because of its potential role in subtelomeric recombination and telomere healing (Mondello et al. 2000; Azzalin et al. 2001; Ruiz-Herrera et al. 2002), and its hypothesized role as a boundary element for subtelomeric DNA compartments (Flint et al. 1997b). All significant RepeatMasker matches to the simple repeats (TTAGGG)n and (CCCTAA)n were counted as telomere-like sequence islands. A total of 114 matches were found within the subtelomeric sequence assemblies. The 5'-3' orientation of the G-rich strand of the repeat is normally toward the telomere in the (TTAGGG)n tracts at the ends of chromosomes. Most of the telomere-like sequence islands followed this strand orientation (106/114 islands). Thirteen (TTAGGG)n islands corresponded to the beginning of terminal (TTAGGG)n tracts at the 2p, 4p, 7q, 9p, 10q, 11q, 16p, 17q, 18p, 18q, 21q, Xp/Yp, and Xq/Yq telomeres; these were excluded from analysis of the interstitial, telomere-like sequence islands described below. The 5'-3' (TTAGGG) orientation of individual islands is indicated by the direction of the arrows representing the sequence islands in the telomere diagrams (Fig. 2).
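To make the orientation convention concrete, the sketch below scans a sequence for perfect hexamer runs with a regular expression. This is a deliberately simplified stand-in, not the RepeatMasker analysis used in the study: it finds only perfect repeats, whereas RepeatMasker also reports diverged matches. It assumes the assembly is written 5' to 3' starting at the telomeric end (nucleotide position 1), so that a (CCCTAA)n run on the written strand corresponds to a G-rich strand pointing toward the telomere. The minimum of four repeat units (24 bp) is an arbitrary choice for the example.

```python
import re

# Perfect hexamer runs only; at least 4 units (24 bp) is an arbitrary cutoff.
G_STRAND = re.compile(r"(?:TTAGGG){4,}")
C_STRAND = re.compile(r"(?:CCCTAA){4,}")


def find_islands(assembly: str):
    """Return sorted (start, end, orientation) tuples for perfect repeat runs.

    Coordinates are 0-based, end-exclusive. Orientation describes the
    direction in which the G-rich strand runs 5' to 3', assuming the input
    is written from the telomeric end toward the centromere.
    """
    seq = assembly.upper()
    islands = []
    for match in C_STRAND.finditer(seq):
        islands.append((match.start(), match.end(), "toward_telomere"))
    for match in G_STRAND.finditer(seq):
        islands.append((match.start(), match.end(), "toward_centromere"))
    return sorted(islands)


example = "CCCTAA" * 5 + "ACGTACGT" + "TTAGGG" * 4
for start, end, orientation in find_islands(example):
    print(start, end, orientation)
```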

The 101 internal (TTAGGG)n-like sequence islands were analyzed in greater detail as shown in Figure 3. The sizes of (TTAGGG)n-like sequence islands (x-axis), number of occurrences for a given size of (TTAGGG)n tract (y-axis), similarity of (TTAGGG)n-like sequence islands to a perfect (TTAGGG)n tract (percent Divergence), and location of (TTAGGG)n-like sequence islands within the subtelomeric sequence organization as defined above (Srpt, one-copy, and boundary) are indicated in Figure 3. The internal subtelomeric (TTAGGG)n-like sequence islands ranged in size from 24 to 823 bp; most were in a rather tight size range of 151-200 bp. Those shorter than this size tended to be in one-copy sequence regions, those longer in Srpt sequence. The boundary (TTAGGG)n islands ranged from 57 to 257 bp in size. There were 55 (TTAGGG)n-like sequence islands in Srpt, 0 in Segmental duplications, and 35 in one-copy regions. Eleven (TTAGGG)n-like sequence islands were at boundaries (two at SD/Srpt, nine at Srpt/one-copy). Four (TTAGGG)n-like islands that occurred at the allele boundaries were within the internal Srpt regions of long subtelomeric alleles (and were counted as such for this analysis), but mapped to the precise coordinates of the termini of shorter alleles for these same telomeres (8p, 9q, 11p, 16p; see Fig. 2). This suggests that the longer alleles of these telomeres might have been formed by simple addition of a terminal subtelomeric sequence segment to a pre-existing telomere.



Figure 3 Characteristics of Interstitial (TTAGGG)n-like sequences in subtelomeric assemblies. (TTAGGG)n-like sequences detected using RepeatMasker were classified according to origin within the defined subtelomeric sequence classes (Subtelomeric repeat, blue; single-copy, yellow; boundary, magenta; none were found within Segmental Duplications), similarity to a perfect (TTAGGG)n sequence (percent divergence), the length of the (TTAGGG)n tract (x-axis), and the number of occurrences for each size class (y-axis). The short black histograms show the distribution and relative abundance of nonsubtelomeric (TTAGGG)n hits in the human genome (627 hits in 3098 Mb of DNA, including an unusual cluster of 166 hits in a 9-Mb region of Yq11.22-Yq11.23), normalized to the 20.66 Mb of subtelomeric DNA analyzed for comparison.

 
A comparison of the number of interstitial (TTAGGG)n-like islands found in subtelomeric DNA with those found genome-wide shows that, in a normalized comparison (occurrences per 20.66 Mb), (TTAGGG)n-like islands are highly enriched (>25-fold) in subtelomeric regions. In addition, they tend to be both longer and more similar to perfect (TTAGGG)n tracts in subtelomeric DNA compared with elsewhere in the genome (Fig. 3). From an evolutionary perspective, this suggests that most subtelomeric interstitial (TTAGGG)n tracts have arisen more recently than those found elsewhere in the genome, have originated via a mechanism different from that of (TTAGGG)n islands found elsewhere (e.g., see Azzalin et al. 2001), or are under some selective pressure to maintain similarity to (TTAGGG)n (Flint et al. 1997b).
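The normalization described here (occurrences per 20.66 Mb) can be sketched as follows. The genome-wide figures, 627 nonsubtelomeric hits in 3098 Mb, are taken from the Figure 3 legend. The text does not spell out which subtelomeric island count, or which treatment of the Yq11.22-Yq11.23 cluster, enters the reported >25-fold figure, so the observed count is left as a parameter rather than fixed; the function names are illustrative.

```python
def normalized_hits(hits: int, region_mb: float, target_mb: float) -> float:
    """Scale a hit count from a region of region_mb to one of target_mb."""
    return hits * target_mb / region_mb


def fold_enrichment(observed: float, expected: float) -> float:
    """Ratio of observed hits to the normalized background expectation."""
    return observed / expected


# Genome-wide background scaled to the size of the subtelomeric data set.
background = normalized_hits(hits=627, region_mb=3098.0, target_mb=20.66)
print(f"expected nonsubtelomeric hits per 20.66 Mb: {background:.2f}")
```

Calling `fold_enrichment(n_islands, background)` with a chosen island count then gives the enrichment under that counting convention.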

GC Content and Interspersed Repeat Composition of Subtelomeric Sequence Assemblies
RepeatMasker (Smit and Green, RepeatMasker at http://ftp.genome.washington.edu/RM/RepeatMasker.html) was used to analyze the sequences for interspersed repeats and for GC content. The summary results of this analysis are shown in Figure 4, and the detailed breakdown is given in Supplemental Table 2. When taken as a whole, the subtelomeric one-copy regions had an elevated GC content (47.9%), whereas Srpt and segmentally duplicated regions had a slightly elevated GC content (44.0% and 43.0%, respectively), relative to the genome-wide average of 41.6%. However, there were wide fluctuations in GC content at individual telomeres, ranging from 62.5% GC content of the one-copy region of 1p to 37.5% GC content in the one-copy region of 3p (see Supplemental Table 2); several of the most GC-rich subtelomere regions contained one or more clusters of G-rich minisatellites. Similarly, interspersed repeat content, taken as a whole, did not display dramatic biases relative to the genome-wide averages (Fig. 4), but very large subtelomere-specific biases and sometimes strand biases were seen in LINE, SINE, LTR, and DNA repeat content (see Supplemental Table 2). However, no universal patterns of interspersed sequence composition emerged that clearly distinguish subtelomeric DNA from other regions of the human genome.



Figure 4 Sequence composition of subtelomeric assemblies. The GC percent and major interspersed repeat sequence composition of the subtelomeric assemblies are shown. The interspersed repeat classes were calculated independently for each strand, and the genome-wide averages were calculated from NCBI Build 34 of the human genome.

 
Transcript Content of Subtelomeric Assemblies
Transcripts were annotated in each subtelomeric assembly, noting their mapping coordinates relative to their respective telomere, and whether they originate in duplicated DNA or single-copy DNA. We used a database of unique transcripts representing each Unigene cluster (Schuler 1997; ftp://ftp.ncbi.nih.gov/repository/UniGene/; Hs.seq.uniq.Z file from the Unigene build available July 1, 2003, containing transcript sequences representing ~124,000 Unigene clusters) for our initial annotation. Repeat-masked subtelomeric assemblies were analyzed by BLAST, and transcripts with matches >50 bp with 85% or greater identity were collected and parsed into a second database. Each transcript within this candidate database was compared with its cognate unmasked subtelomeric assembly using the program Spidey (Wheelan et al. 2001). Those with >95% sequence identity over at least 50% of the transcript length were displayed on the Genotator browser (Harris 1997) and examined individually using Blixem (Sonnhammer and Durbin 1994). The same set of transcripts was displayed on the UCSC browser (Kent et al. 2002). The single transcript with the best nucleotide sequence match over the greatest proportion of the transcript in a given segment of the sequence was annotated. The complete set of transcripts with their corresponding coordinates within each subtelomere assembly, their percent identity within the matching sequence, and the proportion of the transcript covered by matching bases, is summarized in Table 2 and detailed in Supplemental Table 3.
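The two numerical screens in this pipeline can be written as simple predicates. The sketch below is illustrative only: the record types and field names are hypothetical and are not the output formats of BLAST or Spidey, and the thresholds are those quoted in the text.

```python
from dataclasses import dataclass


@dataclass
class BlastMatch:
    """Hypothetical summary of one BLAST match against a masked assembly."""
    length_bp: int
    percent_identity: float


@dataclass
class SplicedAlignment:
    """Hypothetical summary of one transcript-to-genome alignment."""
    percent_identity: float
    transcript_length: int
    aligned_transcript_bases: int


def passes_blast_screen(match: BlastMatch) -> bool:
    """First screen: match longer than 50 bp at 85% identity or greater."""
    return match.length_bp > 50 and match.percent_identity >= 85.0


def passes_alignment_screen(aln: SplicedAlignment) -> bool:
    """Second screen: >95% identity over at least 50% of the transcript."""
    coverage = aln.aligned_transcript_bases / aln.transcript_length
    return aln.percent_identity > 95.0 and coverage >= 0.5
```

The final step described in the text, choosing the single best transcript per genomic segment after visual inspection, involved manual review and is not modeled here.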


Table 2. Summary of Subtelomeric Transcripts

 
A total of 941 subtelomeric transcripts were annotated in this manner, 697 from one-copy genomic regions and 244 from segmentally duplicated DNA and subtelomeric repeat DNA. Overall, the subtelomeric region is slightly enriched in Unigene transcripts (48 transcripts/Mb) relative to the genome-wide average (41 transcripts/Mb). The enrichment of transcripts in subtelomeric DNA is consistent with earlier studies (Saccone et al. 1993; Flint et al. 1997a,b), although there is a great deal of variation in transcript concentration from telomere to telomere (Table 2).

Fifteen percent of the transcript matches localizing to one-copy regions either had apparent disruptions in their predicted ORFs or varied significantly (>1% in high-quality parts of the sequence) from the corresponding genomic sequence. These were designated "possible pseudogenes" (see Supplemental Table 3). However, given the frequency of sequence errors in the EST and mRNA database, as well as the draft nature of parts of the assemblies, these numbers are likely to change as experimental validation of the transcript annotations proceeds.

Similarly, an unknown but significant fraction of the transcripts embedded within the segmental duplications and subtelomeric repeats are likely to be pseudogenes (e.g., see Kermouni et al. 1995Go; Amann et al. 1996Go; Flint et al. 1997aGo), whereas others are likely to be members of gene families with many closely related, but nonidentical functional transcripts (e.g., Flint et al. 1997bGo; Mah et al. 2001Go; Fan et al. 2002Go). In most cases, it is very difficult to clearly identify pseudogenes in Srpt regions; there are many large-scale structural polymorphisms involving hundreds of kilobases of subtelomeric DNA, and it is likely that many variant copies of subtelomeric repeat loci exist in the human population, but are currently absent from sequence databases. For example, genomic Srpt loci that encode partial transcripts in a particular reference sequence might have cognate, unsequenced variant loci in the human population that encode full transcript sequences. Similarly, ESTs obtained from subtelomeric regions of some individuals will necessarily have precise sequences slightly different from those in the reference sequences if the EST was transcribed from a variant subtelomere segment absent in the current assembly. Finally, transcribed pseudogenes, as well as noncoding transcripts can clearly have important biological roles, and it is important to catalog them where found. A great deal of additional, detailed work is required to sort through each of the potential gene/pseudogene families embedded in Srpt and segmentally duplicated DNA to identify the genomic origins of particular transcripts and to determine whether/what fraction of the transcripts might encode functional proteins. Therefore, at this early stage, we think it is prudent to annotate all of the transcript matches to properly lay the groundwork for more detailed analyses.

Supplemental Table 3 identifies each of these transcripts, and Tables 3A and 3B summarize the subset of transcripts in duplicated DNA that correspond to named genes. Cross-boundary transcripts (Table 3B) contain part of a sequence from a duplicated genomic segment and part from a one-copy segment, or parts from a segmental duplication and from a subtelomeric repeat. These transcripts might represent transcribed pseudogenes generated by juxtaposition of progenitor transcript segments, or might generate new functionalities by virtue of exon shuffling upon duplication (Bailey et al. 2002Go; Fan et al. 2002Go); they include transcripts for an F-box protein, for a Zinc finger-containing protein, and for many unknown potential proteins (see Supplemental Table 3 for full list). It will ultimately be essential to acquire complete finished sequences for each distinct allele of each subtelomeric region in order to identify and analyze these genes and gene families, and to deconvolute the many instances of overclustered Unigenes and mRNAs derived from separate but highly similar duplicated genomic DNA fragments.


Table 3A. Named Transcripts in Srpt DNA

 


Table 3B. Named Cross-Boundary and Segmental Duplication Transcripts

 
Subtelomeric gene families with members having nucleotide sequence similarity in the 70%-90% range include the immunoglobulin heavy-chain genes (found at 14q), olfactory receptor genes [one-copy regions of 1q, 5q, 10q, and 15q as well as previously characterized subtelomeric repeat DNA (1p, 6p, 8p, 11p, 15q, 19p, and 3q; Trask et al. 1998)], and zinc-finger genes (4p, 5q, 8p, 8q, 12q, 19q). Transcripts for multiple members of these gene families were found within many of the individual subtelomeric regions (see Supplemental Table 3). The abundance of gene families in subtelomeric regions is a common feature of most eukaryotes, and may reflect a generally increased recombination and tolerance of subtelomeric DNA for rapid evolutionary change.

Transcripts positioned closest to the telomere represent genes with the highest susceptibility to telomere deletions, rearrangements, and hypothesized position effects mediated by telomere (TTAGGG)n tract shortening and/or altered telomeric heterochromatin. Both the dosage (in the case of Srpt transcripts) and the true position of many of these genes relative to the telomere will be allele dependent, changing with different subtelomeric repeat composition and organization. Nonetheless, current data permit us to identify some representatives of most Srpt gene families, and nearly all of the most distal one-copy genes. The named one-copy transcripts closest to (within 100 kb of) the telomeric end of each assembly are shown in Table 4. These distal one-copy transcripts, along with the Srpt and segmental duplication transcripts described above, should comprise the segment of the human transcriptome most susceptible to telomere truncations, rearrangements, and telomere-associated position effects.


Table 4. Named Transcripts in Distal One-Copy Regions

 

    METHODS
Preparation of Subtelomeric Assemblies and Subtelomeric Maps
Each subtelomeric assembly was prepared by DNA sequence comparison of finished sequence accessions, draft sequence accession pieces, and half-YAC (Riethman et al. 1989; Kvaloy 1993) and cosmid end sequences (Riethman et al. 2001) from a given subtelomeric region. We extended the size of each subtelomeric region centromerically to include 500 kb of DNA, making use of public clone contigs and sequence overlaps of clones adjacent to the initial contig. Draft sequences were broken into their component pieces and imported along with all finished sequences into a telomere-specific Sequencher file containing all half-YAC-derived sequences and cosmid contig end sequences available.

Finished sequences for each telomere were used preferentially in the assemblies, with draft sequence fragments added as necessary to extend the assemblies. We used all or parts of NCBI assemblies from Build 34 first, then patched in draft sequences not included in the assembly. In regions in which NCBI Build 34 was inconsistent with our mapping data, we used individual accessions to complete the assemblies. The Sequencher assembler was used interactively to find and combine sequence overlaps among the imported pieces and between the half-YAC-derived sequences and the imported sequences. To obtain a contiguous assembly in this manner, it was often necessary to break sequence fragments in VNTR-like regions and introduce a gap in one of the overlapping fragments (in effect, incorporating the larger sequence of a polymorphic VNTR). Leftover draft sequence fragments were analyzed by BLAST to ensure that unique sequences were not missed in the assembly. A string of 100 Ns was placed between nonoverlapping, but adjacent, draft sequence fragments. By use of the mapping data associated with the half-YAC-derived cosmid contigs, it was possible to uniquely orient and position most draft-sequence fragments. Subsequent comparison of each subtelomeric assembly against itself using PatternHunter software (Ma et al. 2002) revealed no instances of what appeared to be assembly-generated duplications in the sequence.
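The gap convention described above, a run of 100 Ns between adjacent, nonoverlapping fragments, can be illustrated with a short Python sketch. It assumes the fragments have already been ordered and oriented; the function names are hypothetical, and the sketch says nothing about how Sequencher itself handles gaps.

```python
GAP_LENGTH = 100          # number of Ns between adjacent fragments, per the text
GAP = "N" * GAP_LENGTH


def join_fragments(fragments):
    """Concatenate ordered, oriented fragments with a 100-N spacer between them."""
    return GAP.join(fragments)


def gap_intervals(fragments):
    """Return 0-based, end-exclusive (start, end) coordinates of each spacer."""
    intervals = []
    position = 0
    for fragment in fragments[:-1]:
        position += len(fragment)
        intervals.append((position, position + GAP_LENGTH))
        position += GAP_LENGTH
    return intervals


pieces = ["ACGTACGT", "GGCCGGCC", "TTAATTAA"]
scaffold = join_fragments(pieces)
print(len(scaffold), gap_intervals(pieces))
```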

We did not make any special effort to trim high-quality overlapping sequence fragments (other than at the ends of overlapping draft fragments that were clearly error prone), but rather used the consensus from such overlap regions as our subtelomeric assembly. An N was placed at consensus positions where the overlapping sequences disagreed and the base was therefore ambiguous (i.e., a SNP or a sequence error). Specific accessions as well as NCBI Build 34 contigs used in assembling each subtelomere sequence are indicated in Supplemental Table 1.
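The consensus rule for overlap regions can be sketched as follows. This is a simplified illustration, assuming the two overlapping sequences have already been aligned to equal length without indels (the real assemblies were built interactively in Sequencher); the function name `overlap_consensus` is hypothetical.

```python
def overlap_consensus(a, b):
    """Consensus of two equal-length, pre-aligned overlap sequences.

    Positions where the two sequences agree keep that base; positions
    where they disagree (a SNP or a sequence error) are written as N.
    Assumption: no alignment gaps; indels are outside this sketch.
    """
    if len(a) != len(b):
        raise ValueError("overlap sequences must be aligned to equal length")
    return "".join(x if x == y else "N" for x, y in zip(a.upper(), b.upper()))


if __name__ == "__main__":
    print(overlap_consensus("ACGTAC", "ACCTAC"))
```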

Analysis of Subtelomeric Sequence Composition and Organization
The sequence composition and organization of each subtelomeric assembly were analyzed as follows:

  1. RepeatMasker (Smit and Green; http://ftp.genome.washington.edu/RM/RepeatMasker.html) was used to detect interspersed and satellite repeat sequences, as well as the simple repeat (TTAGGG)n and overall GC content. The high-sensitivity setting (which requires a minimum match of 8 and a minimum score of 250) was used.
  2. Each repeat-masked subtelomeric assembly was used to query the NR, htgs, EST, and GSS divisions of GenBank (August 1, 2003).
  3. Tandem repeats were identified using Tandem Repeats Finder (Benson 1999).
  4. GC content was determined and graphed using a sliding window of 500 bp.
  5. A database comprising all of the subtelomeric assemblies was prepared and queried with each individual subtelomeric assembly using BLAST (Altschul et al. 1997) to identify subtelomeric repeat sequences (Srpt).
  6. To detect genes and potential genes, each masked assembly was used to query the NCBI database of sequences representative of Unigene clusters (Schuler 1997; Aug 1, 2003 database). Matches were mapped back to the unmasked assembly using Spidey (Wheelan et al. 2001) to generate gene models based upon these sequences.
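The windowed GC calculation in step 4 can be sketched in Python as below. Only the 500-bp window size comes from the text; the step size, the treatment of N positions (excluded from the denominator), and the function name `gc_windows` are assumptions made for illustration.

```python
def gc_windows(seq, window=500, step=500):
    """Return (start, gc_fraction) for each full window along seq.

    gc_fraction is (G + C) / (A + C + G + T) within the window, so N
    spacers and masked bases do not dilute the value. A window with no
    called bases yields None. The step size is an assumption; the text
    specifies only the 500-bp window.
    """
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        called = gc + chunk.count("A") + chunk.count("T")
        results.append((start, gc / called if called else None))
    return results


if __name__ == "__main__":
    demo = "GC" * 250 + "AT" * 250
    for start, frac in gc_windows(demo):
        print(start, frac)
```

Using a step smaller than the window gives overlapping windows and a smoother plot, at the cost of more points to graph.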

The output of each of these analyses was consolidated on a single interactive Genotator (Harris 1997) browser to permit convenient visual displays of the different sorts of analysis for each region. BLAST hits displayed on Genotator were analyzed at the sequence level using Blixem (Sonnhammer and Durbin 1994). For regions in which transcript density was high, Spidey outputs were also downloaded onto the UCSC genome browser (Kent et al. 2002) to more easily compare multiple related transcripts across a given region.


    Acknowledgements
 
We thank the members of the International Human Genome Sequencing Consortium who participated in the sequencing of subtelomeric regions. Bob Moyzis, Jonathan Flint, and William Brown collaborated or provided reagents for the earlier stages of this work. John Rux and the Wistar Bioinformatics Facility provided programming and computational support. Financial support was provided by NIH HG00567 and CA 25874, and by the Commonwealth Universal Research Enhancement Program, PA Dept of Health.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.


    Footnotes
 
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1245004.

4 Corresponding author.
E-MAIL Riethman@wistar.upenn.edu; FAX (215) 898-3868.

2 Present Address: Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD 21218, USA;

3 Present Address: Cell and Molecular Biology Program, University of Pennsylvania, Philadelphia, PA 19104, USA.

[Supplemental material is available online at www.genome.org. Detailed maps, subtelomeric assemblies (FASTA format), and transcript annotations are also available at our laboratory Web site (http://www.wistar.upenn.edu/Riethman).]


    REFERENCES

Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25: 3389-3402.

Amann, J., Valentine, M., Kidd, V., and Lahti, J.M. 1996. Localization of Chl1-related helicase genes to human chromosome regions 12p11 and 12p13: Similarity between parts of these genes and conserved human telomere-associated DNA. Genomics 32: 260-265.

Azzalin, C.M., Nergadze, S.G., and Giulotto, E. 2001. Human intrachromosomal telomeric-like repeats: Sequence organization and mechanisms of origin. Chromosoma 110: 75-82.

Bailey, J.A., Yavor, A.M., Massa, H.F., Trask, B.J., and Eichler, E.E. 2001. Segmental duplications: Organization and impact within the current human genome project assembly. Genome Res. 11: 1005-1017.

Bailey, J.A., Gu, Z., Clark, R.A., Reinert, K., Samonte, R.V., Schwartz, S., Adams, M.D., Myers, E.W., Li, P.W., and Eichler, E.E. 2002. Recent segmental duplications in the human genome. Science 297: 1003-1007.

Baur, J.A., Zou, Y., Shay, J.W., and Wright, W.E. 2001. Telomere position effect in human cells. Science 292: 2075-2077.

Benson, G. 1999. Tandem repeats finder: A program to analyze DNA sequences. Nucleic Acids Res. 27: 573-580.

Blasco, M.A., Gasser, S.M., and Lingner, J. 1999. Telomeres and telomerase. Genes & Dev. 13: 2353-2359.

Bryan, T.M., Englezou, A., Gupta, J., Bacchetti, S., and Reddel, R.R. 1995. Telomere elongation in immortal human cells without detectable telomerase activity. EMBO J. 14: 4240-4248.

Carlson, M., Celenza, J.L., and Eng, F.J. 1985. Evolution of the dispersed SUC gene family of Saccharomyces by rearrangements of chromosomal telomeres. Mol. Cell Biol. 5: 2894-2902.

Cook, G.P., Tomlinson, I.M., Walter, G., Carter, N.G., Riethman, H.C., Winter, G., and Rabbitts, T.H. 1994. A map of the human immunoglobulin VH locus completed by analysis of the telomeric region of chromosome 14q. Nat. Genet. 7: 162-168.

Donaldson, K.M. and Karpen, G.H. 1997. Trans-suppression of terminal deficiency-associated position effect variegation in a Drosophila minichromosome. Genetics 145: 325-337.

Fan, Y., Newman, T., Linardopoulou, E., and Trask, B.J. 2002. Gene content and function of the ancestral chromosome fusion site in human chromosome 2q13-2q14.1 and paralogous regions. Genome Res. 12: 1663-1672.

Feuerbach, F., Galy, V., Trelles-Sticken, E., Fromont-Racine, M., Jacquier, A., Gilson, E., Olivo-Marin, J.C., Scherthan, H., and Nehrbass, U. 2002. Nuclear architecture and spatial positioning help establish transcriptional states of telomeres in yeast. Nat. Cell Biol. 4: 214-221.

Flint, J., Thomas, K., Micklem, G., Raynham, H., Clark, K., Doggett, N.A., King, A., and Higgs, D.R. 1997a. The relationship between chromosome structure and function at a human telomeric region. Nat. Genet. 15: 252-257.

Flint, J., Bates, G.P., Clark, K., Dorman, A., Willingham, D., Roe, B.A., Micklem, G., Higgs, D.R., and Louis, E.J. 1997b. Sequence comparison of human and yeast telomeres identifies structurally distinct subtelomeric domains. Hum. Mol. Genet. 6: 1305-1313.

Fourel, G., Revardel, E., Koering, C.E., and Gilson, E. 1999. Cohabitation of insulators and silencing elements in yeast subtelomeric regions. EMBO J. 18: 2522-2537.

Harris, N.L. 1997. Genotator: A workbench for sequence annotation. Genome Res. 7: 754-762.

Henson, J.D., Neumann, A.A., Yeager, T.R., and Reddel, R.R. 2002. Alternative lengthening of telomeres in mammalian cells. Oncogene 21: 598-610.

Ijdo, J.W., Lindsay, E.A., Wells, R.A., and Baldini, A. 1992. Multiple variants in subtelomeric regions of normal karyotypes. Genomics 14: 1019-1025.

International Human Genome Sequencing Consortium (IHGSC). 2001. Initial sequencing and analysis of the human genome. Nature 409: 860-921.

Kent, W.J., Sugnet, C.W., Furey, T.S., Roskin, K.M., Pringle, T.H., Zahler, A.M., and Haussler, D. 2002. The Human Genome Browser at UCSC. Genome Res. 12: 996-1006.

Kermouni, A., Van Roost, E., Arden, K.C., Vermeesch, J.R., Weiss, S., Godelaine, D., Flint, J., Lurquin, C., Szikorza, J.P., Higgs, D.R., et al. 1995. The IL-9 receptor gene (IL9R): Genomic structure and chromosomal localization in the pseudoautosomal region of the long arm of the sex chromosomes, and identification of IL9R pseudogenes at 9qter, 10pter, 16pter, and 18pter. Genomics 29: 371-382.

Kvaloy, K. 1993. "The long arm telomeres of the human sex chromosomes." Ph.D. thesis, Wadham College, Department of Biochemistry, University of Oxford, UK.

Lundblad, V. and Wright, W.E. 1996. Telomeres and telomerase: A simple picture becomes complex. Cell 87: 369-375.

Ma, B., Tromp, J., and Li, M. 2002. PatternHunter: Faster and more sensitive homology search. Bioinformatics 18: 440-445.

Macina, R.A., Negorev, D.G., Spais, C., Ruthig, L.A., Hu, X.-L., and Riethman, H.C. 1994. Sequence organization of the human chromosome 2q telomere. Hum. Mol. Genet. 3: 1847-1853.

Macina, R.A., Morii, K., Hu, X.-L., Negorev, D.G., Spais, C., Ruthig, L.A., and Riethman, H.C. 1995. Molecular cloning and RARE cleavage mapping of human 2p, 6q, 8q, 12q, and 18q telomeres. Genome Res. 5: 225-232.

Mah, N., Stoehr, H., Schulz, H.L., White, K., and Weber, B.H. 2001. Identification of a novel retina-specific gene located in a subtelomeric region with polymorphic distribution among multiple human chromosomes. Biochim. Biophys. Acta. 1522: 167-174.

Martin, C.L., Wong, A., Gross, A., Chung, J., Fantes, J.A., and Ledbetter, D.H. 2002. The evolutionary origin of human subtelomeric homologies—or where the ends begin. Am. J. Hum. Genet. 70: 972-984.

Martin-Gallardo, A., Lamerdin, J., Sopapan, P., Friedman, C., Fertitta, A.L., Garcia, E., Carrano, A., Negorev, D., Macina, R.A., Trask, B.J., et al. 1995. Molecular analysis of a novel subtelomeric repeat with polymorphic chromosomal distribution. Cytogenet. Cell Genet. 71: 289-295.

McCulloch, R., Rudenko, G., and Borst, P. 1997. Gene conversions mediating antigenic variation in Trypanosoma brucei can occur on variant surface glycoprotein expression sites lacking 70-bp repeat sequences. Mol. Cell Biol. 17: 833-843.

Mefford, H.C. and Trask, B.J. 2002. The complex structure and dynamic evolution of human subtelomeres. Nat. Rev. Genet. 3: 91-102.

Mondello, C., Pirzio, L., Azzalin, C.M., and Giulotto, E. 2000. Instability of interstitial telomeric sequences in the human genome. Genomics 68: 111-117.

Monfouilloux, S., Avet-Loiseau, H., Amarger, V., Balazs, I., Pourcel, C., and Vergnaud, G. 1998. Recent human-specific spreading of a subtelomeric domain. Genomics 51: 165-176.

Morin, G.B. 1989. The human telomere terminal transferase enzyme is a ribonucleoprotein that synthesizes TTAGGG repeats. Cell 59: 521-529.

Moyzis, R.K., Buckingham, J.M., Cram, S., Dani, M., Deaven, L.L., Jones, M.D., Meyne, J., Ratliff, R.L., and Wu, J.R. 1988. A highly conserved repetitive DNA sequence, (TTAGGG)n, present at the telomeres of human chromosomes. Proc. Natl. Acad. Sci. 85: 6622-6626.

Murnane, J.P., Sabatier, L., Marder, B.A., and Morgan, W.F. 1994. Telomere dynamics in an immortal human cell line. EMBO J. 13: 4953-4962.

Pryde, F.E. and Louis, E.J. 1999. Limitations of silencing at native yeast telomeres. EMBO J. 18: 2538-2550.

Reston, J.T., Hu, X.-L., Macina, R.A., Spais, C., and Riethman, H. 1995. Structure of the terminal 300 kb of DNA from human chromosome 21q. Genomics 26: 31-38.

Riethman, H.C., Moyzis, R.K., Meyne, J., Burke, D.T., and Olson, M.V. 1989. Cloning human telomeric DNA fragments into Saccharomyces cerevisiae using a yeast-artificial-chromosome vector. Proc. Natl. Acad. Sci. 86: 6240-6244.

Riethman, H., Birren, B., and Gnirke, A. 1997. Preparation, manipulation, and mapping of high molecular weight DNA. In Genome analysis: A laboratory manual, Volume 1: "Analyzing DNA" (eds. B. Birren et al.), pp. 83-248. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

Riethman, H.C., Xiang, Z., Paul, S., Morse, E., Hu, X.L., Flint, J., Chi, H.C., Grady, D.L., and Moyzis, R.K. 2001. Integration of telomere sequences with the draft human genome sequence. Nature 409: 948-951.

Rizki, A. and Lundblad, V. 2001. Defects in mismatch repair promote telomerase-independent proliferation. Nature 411: 713-716.

Ruiz-Herrera, A., Garcia, F., Azzalin, C., Giulotto, E., Egozcue, J., Ponsa, M., and Garcia, M. 2002. Distribution of intrachromosomal telomeric sequences (ITS) on Macaca fascicularis (Primates) chromosomes and their implication for chromosome evolution. Hum. Genet. 110: 578-586.

Saccone, S., De Sario, A., Weigant, J., Raap, A.K., Della Valle, G., and Bernardi, G. 1993. Correlations between isochores and chromosomal bands in the human genome. Proc. Natl. Acad. Sci. 90: 11929-11933.

Schuler, G.D. 1997. Pieces of the puzzle: Expressed sequence tags and the catalog of human genes. J. Mol. Med. 75: 694-698.

Smit, A.F.A. and Green, P. RepeatMasker home page. http://ftp.genome.washington.edu/RM/RepeatMasker.html

Sonnhammer, E.L.L. and Durbin, R. 1994. A workbench for large scale sequence homology analysis. Comput. Applic. Biosci. 10: 301-307.

Trask, B.J., Friedman, C., Martin-Gallardo, A., Rowen, L., Akinbami, C., Blankenship, J., Collins, C., Giorgi, D., Iadonato, S., Johnson, F., et al. 1998. Members of the olfactory receptor gene family are contained in large blocks of DNA duplicated polymorphically near the ends of human chromosomes. Hum. Mol. Genet. 7: 13-26.

van Geel, M., Eichler, E.E., Beck, A.F., Shan, Z., Haaf, T., van der Maarel, S.M., Frants, R.R., and de Jong, P.J. 2002. A cascade of complex subtelomeric duplications during the evolution of the hominoid and Old World monkey genomes. Am. J. Hum. Genet. 70: 269-278.

van Overveld, P.G., Lemmers, R.J., Deidda, G., Sandkuijl, L., Padberg, G.W., Frants, R.R., and van Der Maarel, S.M. 2000. Interchromosomal repeat array interactions between chromosomes 4 and 10: A model for subtelomeric plasticity. Hum. Mol. Genet. 9: 2879-2884.

Wheelan, S.J., Church, D.M., and Ostell, J.M. 2001. Spidey: A tool for mRNA-to-genomic alignments. Genome Res. 11: 1952-1957.

Wilkie, A.O.M., Higgs, D.R., Rack, K.A., Buckle, V.J., Spurr, N.K., Fischel-Ghodsian, N., Ceccherini, I., Brown, W.R.A., and Harris, P.C. 1991. Stable length polymorphism of up to 260 kb at the tip of the short arm of human chromosome 16. Cell 64: 595-606.


    WEB SITE REFERENCES

ftp://ftp.ncbi.nih.gov/repository/UniGene/; database of best mRNA or EST sequence representative of each Unigene cluster.

Received February 6, 2003; accepted in revised form November 4, 2003.