Genome Research


Vol. 11, Issue 4, 540-546, April 2001

LETTER
Genome-Scale Compositional Comparisons in Eukaryotes

Andrew J. Gentles and Samuel Karlin1

Mathematics Department, Stanford University, Stanford, California 94305, USA

    ABSTRACT

We examined dinucleotide relative abundances and their biases in recent sequences of eukaryotic genomes and chromosomes, including human chromosomes 21 and 22, Saccharomyces cerevisiae, Arabidopsis thaliana, and Drosophila melanogaster. We found that dinucleotide relative abundances are remarkably constant across human chromosomes and within the DNA of a particular species. The dinucleotide biases differ between species, providing a genome signature that is characteristic of the bulk properties of an organism's DNA. We detail the relations between species genome signatures and suggest possible mechanisms for their origin and maintenance.

    INTRODUCTION

The recent sequencing of the complete genomes of Saccharomyces cerevisiae, Caenorhabditis elegans and Drosophila melanogaster, along with human chromosomes 21 and 22 and chromosomes 2 and 4 of Arabidopsis thaliana, provides new opportunities for studying higher eukaryote genome organization (C. elegans Sequencing Consortium 1998; Dunham et al. 1999; Lin et al. 1999; Mayer et al. 1999; Adams et al. 2000; Hattori et al. 2000). Every genome has a unique signature based on dinucleotide relative abundances (Karlin and Ladunga 1994). This genome signature is a characteristic of the genome as a whole and does not depend on knowledge of individual genes or alignment of homologous sequences. Instead, it reflects the response of the whole genome to overall selective pressures, operating through limits on compositional and/or structural variations in DNA. It is essentially constant in both coding and noncoding sequences and is independent of renaturation fraction (G + C isochores) and of base compositional fractions (Russell et al. 1976; Russell and Subak-Sharpe 1977). The mechanisms that determine and maintain the signature are not understood, but they could involve DNA replication and repair mechanisms and biases in DNA modification processes. They can operate on the whole genome through DNA structure (e.g., base-step stacking energies and DNA conformational tendencies), context dependent mutation, and DNA methylation patterns (for review, see Karlin 1998).

Dinucleotide Relative Abundances

The dinucleotide relative abundance is defined as
$$\rho^*_{XY} = \frac{f^*_{XY}}{f^*_X \, f^*_Y}$$
where f*_X is the frequency of the nucleotide X and f*_XY is the frequency of the dinucleotide XY, both calculated over a sequence concatenated with its inverted complement. (Throughout we refer to the dinucleotide pair XpY as XY.) Because of this symmetrization, a dinucleotide and its inverted complement (for example, AG and CT) share the same value, leaving 10 distinct values. ρ* measures the abundance of a dinucleotide relative to what would be expected from the component base frequencies. Hence, ρ* (strictly, ρ* − 1) can also be referred to as the dinucleotide bias.
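The definition can be made concrete with a minimal Python sketch. The function name `rho_star` and the handling of ambiguous bases are assumptions for illustration, not the authors' code. In particular, the two strands are counted separately rather than literally concatenated, so that no artificial dinucleotide is created at the junction, and any dinucleotide containing a character other than A, C, G, or T is skipped.

```python
import math
from collections import Counter

BASES = "ACGT"
DINUCS = [x + y for x in BASES for y in BASES]
_COMP = str.maketrans("ACGT", "TGCA")


def rho_star(seq):
    """Symmetrized dinucleotide relative abundances of one DNA sequence.

    Counts are pooled over the sequence and its inverted complement.
    Returns a dict mapping each of the 16 dinucleotides to rho*;
    the value is NaN when a component base is absent.
    """
    seq = seq.upper()
    mono, di = Counter(), Counter()
    for strand in (seq, seq.translate(_COMP)[::-1]):
        mono.update(b for b in strand if b in BASES)
        di.update(strand[i:i + 2] for i in range(len(strand) - 1)
                  if strand[i] in BASES and strand[i + 1] in BASES)
    n1, n2 = sum(mono.values()), sum(di.values())
    rho = {}
    for xy in DINUCS:
        # expected frequency from the component base frequencies
        expected = (mono[xy[0]] / n1) * (mono[xy[1]] / n1) if n1 else 0.0
        rho[xy] = (di[xy] / n2) / expected if expected and n2 else math.nan
    return rho
```

Pooling both strands is what makes complementary pairs such as AG and CT share one value, so only 10 of the 16 returned entries are distinct.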

The vector of ρ* values constitutes the genome signature. In practice, a given sequence is split into equal (typically 50-kb) segments and the signature is calculated for each. Distributions of ρ* values for the 50-kb segments can be compared with each other within a species or between different species. Thus, it can be judged which dinucleotide pairs are relatively over- or underrepresented in the genome. Theoretical and empirical studies indicate that if the dinucleotide XY has a mean ρ*_XY ≤ 0.78, then XY is significantly underrepresented (suppressed), whereas ρ*_XY ≥ 1.23 indicates overrepresentation. Corresponding expressions can be constructed for tri- and tetranucleotide relative abundances but add little additional information, suggesting that DNA conformational stacking arrangements are determined mainly through the dinucleotide base-step configurations.
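The windowing and thresholding described above can be sketched as follows. This is an illustrative sketch only: the helper names (`windows`, `mean_signature`, `classify`) are hypothetical, a trailing fragment shorter than one window is simply dropped (an assumption), and the cutoffs 0.78 and 1.23 are the ones quoted in the text.

```python
import math
from collections import Counter

BASES = "ACGT"
DINUCS = [x + y for x in BASES for y in BASES]
_COMP = str.maketrans("ACGT", "TGCA")
LOW, HIGH = 0.78, 1.23  # thresholds quoted in the text


def rho_star(seq):
    """Symmetrized rho* per dinucleotide (both strands pooled)."""
    seq = seq.upper()
    mono, di = Counter(), Counter()
    for strand in (seq, seq.translate(_COMP)[::-1]):
        mono.update(b for b in strand if b in BASES)
        di.update(strand[i:i + 2] for i in range(len(strand) - 1)
                  if strand[i] in BASES and strand[i + 1] in BASES)
    n1, n2 = sum(mono.values()), sum(di.values())
    rho = {}
    for xy in DINUCS:
        expected = (mono[xy[0]] / n1) * (mono[xy[1]] / n1) if n1 else 0.0
        rho[xy] = (di[xy] / n2) / expected if expected and n2 else math.nan
    return rho


def windows(seq, size=50_000):
    """Yield nonoverlapping windows of exactly `size` bases (short tail dropped)."""
    for start in range(0, len(seq) - size + 1, size):
        yield seq[start:start + size]


def mean_signature(seq, size=50_000):
    """Mean rho* per dinucleotide over all windows, ignoring undefined values."""
    per_window = [rho_star(w) for w in windows(seq, size)]
    means = {}
    for xy in DINUCS:
        vals = [r[xy] for r in per_window if not math.isnan(r[xy])]
        means[xy] = sum(vals) / len(vals) if vals else math.nan
    return means


def classify(mean_rho):
    """Label a mean rho* value using the cutoffs from the text."""
    if math.isnan(mean_rho):
        return "undefined"
    if mean_rho <= LOW:
        return "under"
    if mean_rho >= HIGH:
        return "over"
    return "normal"
```

For example, the human mean ρ*_CG values of 0.18 to 0.31 reported later in the text would all be labeled "under" by `classify`.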

The genome signature is highly invariant across the DNA of an organism and is similar for closely related species. Strong support for the invariance of the signature within species comes from both sequence analysis and experimental studies of nearest-neighbor frequencies, which have shown that the set of dinucleotide relative abundance values for 50-kb DNA contigs is a characteristic of an organism's DNA and distinguishes it from other species (Russell et al. 1976; Russell and Subak-Sharpe 1977; Karlin and Burge 1995; Karlin 1998).

ρ*_XY Distributions Across Species

Each available data set (see Methods) was divided into nonoverlapping 50-kb samples and the ρ*_XY values determined for each sample. For every organism, one obtains a list of ρ* values for each 50-kb sample for all dinucleotides XY. These are plotted as histograms of ρ* values for each dinucleotide in Figure 1, which compares the distributions for human, S. cerevisiae, D. melanogaster, C. elegans, and A. thaliana. The distributions are all homogeneous within species and distinctly different between species. Histograms are superior to simple variance statistics. Individual ρ* values do not discriminate between, for example, yeast and Arabidopsis, between mouse and human, between the protists Plasmodium falciparum and Trypanosoma brucei, or among most prokaryotes. The whole genome signature vector (10 components) does discriminate these cases.


Figure 1   Distribution of ρ* values for all 50-kb samples from human (red), Drosophila melanogaster (black), Saccharomyces cerevisiae (green), Caenorhabditis elegans (blue), and Arabidopsis thaliana (orange).

The most striking feature is the CG underrepresentation in human DNA. GC relative abundances tend to be in the normal range across eukaryotic species, except for Drosophila, which has high ρ*_GC. Human DNA has higher relative abundances of CC/GG, AG/CT, and CA/TG dinucleotides than the other species, but none of these dinucleotide pairs is significantly biased. ρ*_CA/TG is slightly high in human but normal in Drosophila, yeast, Arabidopsis, and C. elegans. TA is modestly suppressed in all organisms, with human and C. elegans showing the lowest ρ*_TA. Yeast and Arabidopsis have very similar ρ* values for all dinucleotides, with generally sharply peaked distributions and low variance, the exception being CG in Arabidopsis. In contrast, human, C. elegans, and, to a lesser extent, Drosophila all exhibit a moderate spread in ρ* values. AC/GT, AA/TT, and AT relative abundances do not differ much between species and are all in the normal (unbiased) range of ρ* values.

Human Chromosomes 21 and 22

The recent completion of human chromosomes 21 and 22 makes them particularly interesting sequences to study. Both were partitioned into contiguous 50-kb windows and the ρ*_XY values for each window are plotted across the chromosomes in Figure 2. One can see immediately that, with minor exceptions, all dinucleotide biases are clearly invariant both across and between chromosomes. This is conspicuous in the ρ* values for CG, GC, GA/TC, AC/GT, and AT. From around position 10 Mb to 25 Mb on chromosome 21, the AG/CT dinucleotide bias is slightly reduced compared to the rest of the chromosome and to chromosome 22. In addition, the chromosome-21 TA bias is slightly elevated over this region. The only other notable variation is around position 13.4 Mb of chromosome 22 in the 406-kb-long contig NT002447. Closer inspection reveals that a large portion of this contig (GenBank accession no. AP000536) is dominated by a 47-kb tandem repeat of an ~50-bp subunit.


Figure 2   Variation of ρ* across human chromosomes 21 (red) and 22 (black) for each unique dinucleotide. The horizontal scale is identical in all graphs. Each vertical scale graduation is 0.1 in ρ*.

It is noteworthy that the genome signature does not change according to the predicted gene density on either chromosome; nor does it change as one approaches the centromeric heterochromatin or the telomeres. For example, there is a 7-Mb region of chromosome 21 from position 5 Mb to 12 Mb that has a low (G + C) content, no CG islands, a few Alu repeats, and low gene numbers relative to the rest of the chromosome (Hattori et al. 2000). Yet the signature does not vary across this region or relative to distant regions of chromosome 21.

{ρ*_XY} Comparisons

Table 1 shows the mean ρ*_XY values of nonoverlapping 50-kb samples for each dinucleotide pair in several eukaryotes and for each human chromosome. Mean ρ*_XY values are strongly conserved across all human chromosomes. The ranges are CG, 0.18 to 0.31; GC, 0.96 to 1.02; TA, 0.66 to 0.75; AT, 0.84 to 0.89; CC/GG, 1.22 to 1.24; TT/AA, 1.11 to 1.13; TG/CA, 1.20 to 1.24; AG/CT, 1.15 to 1.24; AC/GT, 0.82 to 0.86; and GA/TC, 0.98 to 1.00. The largest variation is in ρ*_CG, where the highest value is 0.31 for chromosome 19, followed by chromosomes 16 and 22 at 0.28. The lowest values occur for chromosomes Y (0.18), X (0.20), and 18 (0.21). There is a positive correlation between ρ*_CG values and the CG-island densities implied by in situ fluorescence hybridization of human chromosomes during metaphase (Cross and Bird 1995). Chromosomes 19 and 22 are rich in CG islands, whereas chromosomes 18, X, and Y are CG-island poor, in agreement with the ρ*_CG values noted above.

                              
Table 1.   Mean Eukaryotic ρ* Values for All Available DNA Contigs at Least 50 kb in Size

In common with all mammalian genomes, Mus musculus and the human chromosomes exhibit extreme CG underrepresentation, with ρ*_CG = 0.21 in mouse and 0.18-0.31 for human chromosomes. CG suppression is usually explained through the methylation-deamination-mutation hypothesis, whereby methylation of the cytosine in CG to 5-methylcytosine and subsequent deamination to thymine results, if unrepaired, in conversion of CG to TG/CA. The methylation hypothesis is supported by the fact that invertebrates that do not possess a methylase, such as Drosophila and C. elegans, do not exhibit significant CG dinucleotide bias (ρ*_CG = 0.92 for Drosophila and 0.96 for C. elegans). ρ*_CG is significantly low in A. thaliana (0.72) but not in yeast (0.80), concurring with the occurrence of methylation in dicots such as Arabidopsis but with its absence from monocots. However, in human and mouse, TG/CA is only marginally overrepresented (ρ*_TG/CA = 1.20-1.24 and 1.20-1.23, respectively), in marked contrast to the extreme underrepresentation of CG. Moreover, CG is underrepresented in all animal mitochondria despite the lack of methylase activity in mitochondria. There is also no significant bias in TG/CA in animal mitochondria. This indicates that although methylation may contribute to vertebrate CG suppression, it does not fully account for it.

All of the eukaryotes except Plasmodium falciparum (0.99) show low ρ*_TA, ranging from 0.56 in Leishmania major and 0.62 in C. elegans to 0.75 in Arabidopsis and Drosophila. TA is the least stable dinucleotide stacking pair and is prominent in some regulatory signals, such as the TATA box and 3' polyadenylation signal. Avoidance of spurious signal sequences and considerations of DNA stability could both act to suppress overall levels of TA. In coding regions, TA may be low because UA is disfavored in mRNAs, where it is relatively susceptible to cleavage by ribonucleases (Beutler et al. 1989). ρ*_GC is high in Drosophila (1.27), whereas C. elegans has high ρ*_TT/AA = 1.28. Mouse shows high ρ*_AG/CT (1.25), with human (range 1.15-1.22) hardly biased. The other most biased dinucleotide abundances are in Leishmania (ρ*_TG/CA = 1.25) and Plasmodium (ρ*_CC/GG = 1.51). Yeast is unusual among these eukaryotes in having no significantly biased dinucleotide relative abundances. All yeast ρ* values are in the range 0.8-1.13 except for ρ*_TA, which qualifies as marginally underrepresented at 0.77.

Table 2 shows the unsymmetrized (single-strand) ρ values for CG and TA at different codon positions, introns, and intergenic regions in human DNA. Both CG and TA are suppressed in coding and noncoding regions, with TA being less biased in all cases. Introns and intergenic DNA exhibit stronger CG suppression than coding sequences but are less biased in TA. This is consonant with higher substitution rates in noncoding regions, which lack the constraints on amino acid and codon usage that affect coding sequences. The higher CG usage at codon positions 1,2 (compared with 2,3 or 3,1) probably reflects the fact that in human proteins, arginine is more frequently coded for by CGN (3.2% of the time) than by an AGR codon (2.2%). Paradoxically, G is highest at codon position 1 (32%) and C is highest at position 3 (29%), yet CG is highly suppressed at positions 3,1.
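A rough sketch of how single-strand ρ values by codon position might be computed for an in-frame coding sequence is given below. The text does not state how these values were normalized, so the choice made here, dividing by the base frequencies observed at the two specific codon positions involved, is an assumption, as is the function name `codon_position_rho`.

```python
import math


def codon_position_rho(cds, dinuc):
    """Single-strand rho for `dinuc` at codon position pairs 1,2; 2,3; 3,1.

    `cds` must start in frame. The pair 3,1 spans the boundary between
    consecutive codons. Normalization uses position-specific base
    frequencies (an assumption, not taken from the text).
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    x, y = dinuc.upper()
    result = {}
    for label, start in (("1,2", 0), ("2,3", 1), ("3,1", 2)):
        firsts, seconds = [], []
        for i in range(start, usable - 1, 3):
            a, b = cds[i], cds[i + 1]
            if a in "ACGT" and b in "ACGT":
                firsts.append(a)
                seconds.append(b)
        n = len(firsts)
        if n == 0:
            result[label] = math.nan
            continue
        fx = firsts.count(x) / n       # frequency of X at the first position
        fy = seconds.count(y) / n      # frequency of Y at the second position
        fxy = sum(1 for a, b in zip(firsts, seconds) if a == x and b == y) / n
        result[label] = fxy / (fx * fy) if fx and fy else math.nan
    return result
```

With this normalization, a value below 1 at positions 3,1 means that CG occurs across codon boundaries less often than the abundance of C at position 3 and G at position 1 would predict, which is the situation the paragraph above calls paradoxical.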

                              
Table 2.   CG and TA Dinucleotide Biases in Human Coding and Noncoding DNA

δ* Comparisons

It is useful to have a measure of the difference between the signatures of DNA sequences. For this purpose, we use the dinucleotide relative abundance distance, which for sequences p and q is defined as
$$\delta^*(p,q) = \frac{1}{16} \sum_{XY} \left| \rho^*_{XY}(p) - \rho^*_{XY}(q) \right|,$$
where the sum is over all dinucleotides XY. The value of δ* is quoted after multiplying by 1000. The average distance δ* between random sequences of length 50 kb is then ~10-20. In comparing DNA sequences, the mean δ* value is found for all pairwise comparisons of 50-kb contigs. This can be done within a species and between different species. Thus, a matrix of distances is built up, in which each entry is the mean δ* distance between 50-kb segments from each species or sequence. Extensive testing has shown that the δ* distance is not distorted by extreme biases in a single dinucleotide (Karlin and Ladunga 1994).
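The distance can be illustrated with a short Python sketch. The names `delta_from_signatures` and `delta_star` are hypothetical, and the ρ* helper counts the two strands separately rather than literally concatenating them (an assumption made to avoid a spurious junction dinucleotide). Values are returned already multiplied by 1000, as in the text.

```python
import math
from collections import Counter

BASES = "ACGT"
DINUCS = [x + y for x in BASES for y in BASES]
_COMP = str.maketrans("ACGT", "TGCA")


def rho_star(seq):
    """Symmetrized rho* per dinucleotide (both strands pooled)."""
    seq = seq.upper()
    mono, di = Counter(), Counter()
    for strand in (seq, seq.translate(_COMP)[::-1]):
        mono.update(b for b in strand if b in BASES)
        di.update(strand[i:i + 2] for i in range(len(strand) - 1)
                  if strand[i] in BASES and strand[i + 1] in BASES)
    n1, n2 = sum(mono.values()), sum(di.values())
    rho = {}
    for xy in DINUCS:
        expected = (mono[xy[0]] / n1) * (mono[xy[1]] / n1) if n1 else 0.0
        rho[xy] = (di[xy] / n2) / expected if expected and n2 else math.nan
    return rho


def delta_from_signatures(rho_p, rho_q):
    """delta* x 1000: mean absolute rho* difference over all 16 dinucleotides."""
    total = sum(abs(rho_p[xy] - rho_q[xy]) for xy in DINUCS)
    return 1000.0 * total / 16.0


def delta_star(seq_p, seq_q):
    """delta* x 1000 between two DNA sequences."""
    return delta_from_signatures(rho_star(seq_p), rho_star(seq_q))
```

As a worked check, two signatures that are identical except for ρ*_CG (1.0 versus 0.2) differ by 0.8 in one of 16 terms, giving 1000 × 0.8 / 16 = 50.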

δ* Comparisons within Species

Human within-chromosome δ* scores range from 30 in chromosome 7 to 48 in chromosome 11. The range of δ* between chromosomes is from 30 (chromosome 18 vs. 13) to 54 (19 vs. Y), with 35-45 being typical. The δ* distance between chromosomes, therefore, is approximately the same as within chromosomes, despite the differences in base composition, gene density, and repeat frequencies between them. The δ* distances between and within Drosophila chromosomes range from 42 to 68. As shown in Figure 3, the left (L) and right (R) arms of chromosomes 2 and 3 are a close group, with δ* between 42 and 57. The Drosophila X chromosome is slightly more variable both within itself (δ* = 68) and in comparison to the other chromosomes, with a distance of 57 from 2L, 3L, and 3R and 65 from 2R. With the exception of the X chromosome, these values are similar to human within- and between-chromosome δ* values. Finally, in C. elegans the six chromosomes exhibit a range of δ* within themselves from 49 (chromosome 4) to 70 (chromosome 2). Between-chromosome distances are from 51 (chromosome X vs. 4 and 5) to 70 (3 vs. 2 and X). The δ* values thus exhibit the same invariance within a species as the dinucleotide ρ* biases.
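Within- and between-chromosome values of this kind are means over pairwise segment comparisons. A minimal sketch, assuming each chromosome has already been reduced to a list of per-segment signatures (dicts with one ρ* value for each of the 16 dinucleotides), is shown below. The names `mean_delta` and `distance_matrix` are hypothetical, and treating "within" as all unordered pairs of distinct segments is an assumption.

```python
import math
from itertools import combinations, product


def delta_from_signatures(rho_p, rho_q):
    """delta* x 1000 between two signatures that share the same 16 keys."""
    return 1000.0 * sum(abs(rho_p[xy] - rho_q[xy]) for xy in rho_p) / 16.0


def mean_delta(sigs_a, sigs_b=None):
    """Mean pairwise delta*.

    With one argument: all unordered pairs of distinct segments (within).
    With two arguments: all pairs taking one segment from each list (between).
    Returns NaN when there is no pair to compare.
    """
    if sigs_b is None:
        pairs = list(combinations(sigs_a, 2))
    else:
        pairs = list(product(sigs_a, sigs_b))
    if not pairs:
        return math.nan
    return sum(delta_from_signatures(p, q) for p, q in pairs) / len(pairs)


def distance_matrix(groups):
    """Map (name_a, name_b) to mean delta*; diagonal entries are within-group."""
    names = list(groups)
    return {
        (a, b): mean_delta(groups[a]) if a == b else mean_delta(groups[a], groups[b])
        for a in names
        for b in names
    }
```

The number of comparisons grows quadratically with the number of 50-kb segments, so for whole chromosomes a sampled subset of pairs may be preferable; that refinement is not shown here.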


Figure 3   Drosophila melanogaster δ* distances within and between chromosomes X, 2 (right arm, R, and left arm, L), and 3 (R, L).

δ* Comparisons between Species

Figure 4 shows the mean δ* distances between the eukaryotes discussed above. P. falciparum chromosomes 2 and 3, L. major chromosome 1, and the complete E. coli genome are included for comparison.


Figure 4   δ* distances between Homo sapiens, homsa; Mus musculus, musmu; Drosophila melanogaster, drome; Caenorhabditis elegans, caeel; Saccharomyces cerevisiae, sacce; Arabidopsis thaliana, arath; Escherichia coli, ecoli; Plasmodium falciparum, plafa; and Leishmania major, leima.

Human and mouse show moderate similarity (δ* = 58), as one would expect. Arabidopsis and yeast are close (δ* = 45); surprisingly, their δ* distance from each other is nearly as low as their mean within-species distances. E. coli is very distant from human (210), mouse (241), and both protoctists (196, 174). It is also distant from C. elegans (128), Arabidopsis (148), and yeast (122). Mysteriously, however, there is moderate similarity between the signatures of E. coli and D. melanogaster (δ* = 74).

    DISCUSSION

We have confirmed, through our analysis of the current complete eukaryotic genomes and chromosomes 21 and 22 of human, the constancy and validity of the genome signature for each species. Signature comparisons have revealed a number of intriguing relations between organisms. For example, bacterial phage genome signatures are strongly correlated with the nature of the host and the extent to which the phage uses the host-cell machinery (Blaisdell et al. 1996). Both broad-range and specialized plasmids in prokaryotes share moderate to close genome signature with their host (Campbell et al. 1999). Although mammalian mitochondria are close to each other in signature and reflect relationships parallel to those derived from nuclear DNA, they are not close to their host nuclear DNA, with typical δ* differences between 140 and 200 (Karlin and Mrázek 1997).

Among bacteria, there are signature similarities between closely related species (such as E. coli vs. Salmonella typhimurium and Streptococcus pyogenes vs. Lactococcus lactis) but no groupings that can be attributed to obvious causes such as the environment in which the bacteria live. Likewise, archaea do not form a coherent clade in terms of their signature; for example, Halobacterium sp. and methanogens have extremely different genome signatures. Anomalies in the signature have been used to detect bacterial pathogenicity islands and laterally transferred operons in Helicobacter pylori and Mycobacterium tuberculosis (Karlin 1998) and in Neisseria meningitidis, Vibrio cholerae, Campylobacter jejuni, and E. coli (data not shown). Unmethylated CG shows normal dinucleotide bias in most proteobacteria and can provoke an immune response in mammals (Krieg et al. 1998). CG is also suppressed in most small (< 30 kb length) vertebrate viral genomes, except for a few togaviruses (Karlin et al. 1994). Another intriguing result is that the signature of mammalian retroviruses shows moderate similarity to the nuclear DNA into which they integrate, with a range δ* = 70-90 (data not shown). This might have resulted from the processing of the viral genetic program by the host-cell machinery or a selective shift in the viral genome toward a genome signature that is more compatible with the host.
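One simple way to look for signature anomalies of this kind is to score each window of a genome by its δ* distance from the genome-wide signature and flag the outliers. The sketch below illustrates that idea only; it is not the procedure used in the cited studies, and the function names, window size, and cutoff are all hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"
DINUCS = [x + y for x in BASES for y in BASES]
_COMP = str.maketrans("ACGT", "TGCA")


def rho_star(seq):
    """Symmetrized rho* per dinucleotide (both strands pooled)."""
    seq = seq.upper()
    mono, di = Counter(), Counter()
    for strand in (seq, seq.translate(_COMP)[::-1]):
        mono.update(b for b in strand if b in BASES)
        di.update(strand[i:i + 2] for i in range(len(strand) - 1)
                  if strand[i] in BASES and strand[i + 1] in BASES)
    n1, n2 = sum(mono.values()), sum(di.values())
    rho = {}
    for xy in DINUCS:
        expected = (mono[xy[0]] / n1) * (mono[xy[1]] / n1) if n1 else 0.0
        rho[xy] = (di[xy] / n2) / expected if expected and n2 else math.nan
    return rho


def delta_from_signatures(rho_p, rho_q):
    """delta* x 1000 between two 16-component signatures."""
    return 1000.0 * sum(abs(rho_p[xy] - rho_q[xy]) for xy in DINUCS) / 16.0


def window_deltas(genome, size):
    """delta* of each nonoverlapping window against the whole-genome signature."""
    reference = rho_star(genome)
    return [delta_from_signatures(rho_star(genome[s:s + size]), reference)
            for s in range(0, len(genome) - size + 1, size)]


def flag_anomalies(scores, cutoff):
    """Indices of windows whose delta* exceeds `cutoff` (a hypothetical threshold)."""
    return [i for i, score in enumerate(scores) if score > cutoff]
```

Any cutoff would have to be calibrated against the within-genome δ* spread of the organism in question; windows that lack one of the four bases yield an undefined (NaN) score and are never flagged.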

There are a number of unanswered questions concerning the nature of the genome signature. The homogeneity of the signature is clearly maintained by processes that operate at the scale of the whole genome. However, it is not known if the signature corresponds to a frozen event or if it is a dynamical feature of a genome that changes over time, albeit slowly. How did the signature arise for a given genome and how fast can it change? Many DNA repair enzymes recognize the shape of the DNA molecule rather than specific sequences (Echols and Goodman 1991; Kunkel 1992). Stacking energies, charge interactions, and conformational tendencies all bear on local DNA structure and thus influence the intrinsic curvature of DNA (Bolshoy 1995). In addition, the efficiency of DNA repair is affected by neighboring-base context.

    METHODS

Data

The human, mouse, A. thaliana, P. falciparum, L. major, C. elegans, S. cerevisiae, and E. coli sequences were acquired from GenBank. Except for chromosomes 21 and 22, sequence sets for human chromosomes were produced using the lists of contigs maintained by the Computational Biosciences Section at Oak Ridge National Laboratory. Only contigs ≥ 50 kb in length were used. The complete D. melanogaster genome was obtained from the Gadfly database maintained by the Berkeley Drosophila Genome Project.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.


    NOTE ADDED IN PROOF

The ρ* values for human chromosomes are essentially unchanged when calculated across the recently released draft sequence of the complete human genome (International Human Genome Sequencing Consortium 2001).


    FOOTNOTES

1 E-MAIL karlin{at}math.stanford.edu; FAX (650) 725-2040.

Article and publication are at www.genome.org/cgi/doi/10.1101/gr.163101.

    REFERENCES

  • Adams, M.D., Celniker, S.E., Holt, R.A., Evans, C.A., Gocayne, J.D., Amanatides, P.G., Scherer, S.E., Li, P.W., Hoskins, R.A., Galle, R.F., et al. 2000. The genome sequence of Drosophila melanogaster. Science 287: 2185-2195.
  • Beutler, E., Gelbart, T., Han, J.H., Koziol, J.A., and Beutler, B. 1989. Evolution of the genome and the genetic code: Selection at the dinucleotide level by methylation and polyribonucleotide cleavage. Proc. Natl. Acad. Sci. 86: 192-196.
  • Blaisdell, B.E., Campbell, A.M., and Karlin, S. 1996. Similarities and dissimilarities of phage genomes. Proc. Natl. Acad. Sci. 93: 5854-5859.
  • Bolshoy, A. 1995. Dinucleotides contribute to the bending of DNA in chromatin. Nat. Struct. Biol. 2: 446-448.
  • C. elegans Sequencing Consortium. 1998. Genome sequence of the nematode C. elegans: A platform for investigating biology. Science 282: 2012-2018.
  • Campbell, A., Mrázek, J., and Karlin, S. 1999. Genome signature comparisons among prokaryote, plasmid and mitochondrial DNA. Proc. Natl. Acad. Sci. 96: 9184-9189.
  • Cross, S.H. and Bird, A.P. 1995. CpG islands and genes. Curr. Opin. Genet. Dev. 5: 309-314.
  • Dunham, I., Shimizu, N., Roe, B.A., Chissoe, S., Hunt, A.R., Collins, J.E., Bruskiewich, R., Beare, D.M., Clamp, M., Smink, L.J., et al. 1999. The DNA sequence of human chromosome 22. Nature 402: 489-495.
  • Echols, H. and Goodman, M.F. 1991. Fidelity mechanisms in DNA replication. Annu. Rev. Biochem. 60: 477-511.
  • Hattori, M., Fujiyama, A., Taylor, T.D., Watanabe, H., Yada, T., Park, H.S., Toyoda, A., Ishii, K., Totoki, Y., Choi, D.K., et al. 2000. The DNA sequence of human chromosome 21. Nature 405: 311-319.
  • International Human Genome Sequencing Consortium. 2001. Initial sequencing and analysis of the human genome. Nature 409: 860-921.
  • Karlin, S. 1998. Global dinucleotide signatures and analysis of genomic heterogeneity. Curr. Opin. Microbiol. 1: 598-610.
  • Karlin, S. and Burge, C. 1995. Dinucleotide relative abundance extremes: A genomic signature. Trends Genet. 11: 283-290.
  • Karlin, S. and Ladunga, I. 1994. Comparisons of eukaryotic genomic sequences. Proc. Natl. Acad. Sci. 91: 12832-12836.
  • Karlin, S. and Mrázek, J. 1997. Compositional differences within and between eukaryotic genomes. Proc. Natl. Acad. Sci. 94: 10227-10232.
  • Karlin, S., Doerfler, W., and Cardon, L.R. 1994. Why is CpG suppressed in the genomes of virtually all small eukaryotic viruses but not in those of large eukaryotic viruses? J. Virol. 68: 2889-2897.
  • Krieg, A.M., Yi, A.K., Schorr, J., and Davis, H.L. 1998. The role of CpG dinucleotides in DNA vaccines. Trends Microbiol. 6: 23-27.
  • Kunkel, T.A. 1992. Biological asymmetries and the fidelity of eukaryotic DNA replication. Bioessays 14: 303-308.
  • Lin, X.Y., Kaul, S., Rounsley, S., Shea, T.P., Benito, M.I., Town, C.D., Fujii, C.Y., Mason, T., Bowman, C.L., Barnstead, M., et al. 1999. Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana. Nature 402: 761-768.
  • Mayer, K., Schuller, C., Wambutt, R., Murphy, G., Volckaert, G., Pohl, T., Dusterhoft, A., Stiekema, W., Entian, K.D., Terryn, N., et al. 1999. Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana. Nature 402: 769-777.
  • Russell, G.J. and Subak-Sharpe, J.H. 1977. Similarity of the general designs of protochordates and invertebrates. Nature 266: 533-536.
  • Russell, G.J., Walker, P.M., Elton, R.A., and Subak-Sharpe, J.H. 1976. Doublet frequency analysis of fractionated vertebrate nuclear DNA. J. Mol. Biol. 108: 1-23.

Received August 30, 2000; accepted in revised form February 5, 2001.


11:540-546 ©2001 by Cold Spring Harbor Laboratory Press  ISSN 1088-9051/01 $5.00




