Published online before print
February 6, 2007, 10.1101/gr.5823007 Genome Res. 17:311-319, 2007 ©2007 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/07 $5.00
Letter
Sequencing and analysis of chromosome 1 of Eimeria tenella reveals a unique segmental organization
1 Malaysia Genome Institute, UKM-MTDC Smart Technology Centre, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor DE, Malaysia
2 Molecular Genetics Laboratory, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor DE, Malaysia
3 The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, United Kingdom
4 Division of Microbiology, Institute for Animal Health, Compton Laboratory, Compton, Near Newbury, Berkshire, RG20 7NN, United Kingdom
5 School of Biosciences and Biotechnology, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor DE, Malaysia
6 Departamento de Parasitologia, Instituto de Ciências Biomédicas, Universidade de São Paulo, São Paulo SP, 05508-000, Brazil
7 MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, United Kingdom
8 Departamento de Ciências da Computação, Instituto de Matemática e Estatística, Universidade de São Paulo, São Paulo SP, 05508-000, Brazil
9 Department of Medical Microbiology and Parasitology, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor DE, Malaysia
Eimeria tenella is an intracellular protozoan parasite that infects the intestinal tracts of domestic fowl and causes coccidiosis, a serious and sometimes lethal enteritis. Eimeria falls in the same phylum (Apicomplexa) as several human and animal parasites such as Cryptosporidium, Toxoplasma, and the malaria parasite, Plasmodium. Here we report the sequencing and analysis of the first chromosome of E. tenella, a chromosome believed to carry loci associated with drug resistance and known to differ between virulent and attenuated strains of the parasite. The chromosome, which appears to be representative of the genome, is gene-dense and rich in simple-sequence repeats, many of which appear to give rise to repetitive amino acid tracts in the predicted proteins. Most striking is the segmentation of the chromosome into repeat-rich regions peppered with transposon-like elements and telomere-like repeats, alternating with repeat-free regions. Predicted genes differ in character between the two types of segment, and the repeat-rich regions appear to be associated with strain-to-strain variation.
Eimeria tenella is an obligate, intracellular protozoan parasite that infects epithelial cells of the intestinal tract (caeca) of the domestic fowl (Gallus gallus). Severe infections are common and give rise to coccidiosis, an enteritis that impedes growth, presents a high morbidity, and may increase the mortality rate of the affected flocks. Eimeria species are ubiquitous, and the poultry industry relies extensively on prophylactic medication with anticoccidial drugs, or on vaccination, to control infection in the 30 billion chickens reared annually worldwide. The combined cost of control and of losses caused by coccidiosis is estimated at £2 billion annually (Shirley et al. 2004). Eimeria spp. are also of wider scientific interest, falling as they do in the same phylum (Apicomplexa) as the malarial parasites (Plasmodium spp.), the zoonotic parasites Toxoplasma gondii and Cryptosporidium spp., and the cattle parasites Theileria spp. Comparative analyses of these latter organisms are already illuminating many aspects of apicomplexan biology, and the genome sequence of E. tenella, a non-cyst-forming coccidian, will add a new dimension to these analyses.
The pioneering work of Tyzzer (1929)
Complementing the genome-wide shotgun sequencing, parallel projects have been undertaken to generate the complete sequences of chromosomes 1 (
Mapping, sequencing, and assembly
The sequence was assembled primarily from reads taken from chromosome-specific small-insert plasmid libraries, prepared from pulsed-field gel electrophoresis- (PFGE-) purified chromosome 1 (Chr1) of E. tenella (Houghton strain) (Chapman and Shirley 2003). To check the assembly, assist the resolution of repetitive regions, and orientate sequence scaffolds and isolated contigs, a high-resolution HAPPY map was constructed (see Methods). A total of 173 markers (taken from the early chromosome-specific sequence reads or, later, from contigs of the assembly) were mapped to a single linkage group. Markers that failed to link to the remainder were inferred to be predominantly contaminating sequences from other chromosomes; the majority appear to lie on Chr2 (data not shown), and the inferred level of contamination was typical for that seen when purifying chromosomes by PFGE. The remaining gaps in the assembly are classified as sequence gaps (spanned by at least two small-insert plasmid clones) and physical gaps (spanned by fewer than two clones). Where possible, sequence gaps were closed by resequencing and primer walking, or subcloning and sequencing at least one bridging clone. Several sequencing and physical gaps were also closed by skimming of BAC and fosmid clones. Details of the final assembly and the comparison with the HAPPY map are given in the Supplemental Material. The strong concordance between the order of markers on the HAPPY map and that found in the sequence implies a substantially correct assembly. The absence of any linked HAPPY markers beyond the ends of the assembly suggests that the assembly (including gaps) spans almost the full length of the chromosome, as does the presence of telomeric motifs at each end of the sequence (see below). The assembly consists of 889,314 bp of sequence, but spans 1,347,714 bp including the physical and sequence gaps.
In each case, the maximum possible gap size (based on the sizes of bridging plasmid or BAC clones) is indicated in the EMBL entry, which is an overestimate in most cases. When reasonable estimates of the actual gap lengths are made (based on the actual lengths of bridging clones where known, or on the distribution of insert sizes in the clone libraries, and taking into account the overlaps between the clones and the linked sequences), the total span of the assembly is 1015 kb, in close agreement with the chromosomal size of 1050 kb measured by PFGE. In calculating the content of the chromosome (e.g., base composition or density of genes and other features), we consider only the sequenced bases of the assembly.
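The span figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text (889,314 bp sequenced, 1,347,714 bp maximum span, 1015 kb estimated span, 1050 kb by PFGE); the function and variable names are illustrative and are not taken from the authors' software.

```python
# Figures quoted in the text; names are hypothetical, chosen for illustration.
SEQUENCED_BP = 889_314          # bases actually sequenced
MAX_SPAN_BP = 1_347_714         # span when every gap is given its maximum size
ESTIMATED_SPAN_BP = 1_015_000   # span with "reasonable" gap estimates (1015 kb)
PFGE_SIZE_BP = 1_050_000        # chromosome size measured by PFGE (1050 kb)


def gap_summary(sequenced, max_span, estimated_span, pfge_size):
    """Return total gap lengths implied by the quoted spans, and the
    fraction of the PFGE-measured chromosome covered by the estimated span."""
    return {
        "max_gap_bp": max_span - sequenced,
        "estimated_gap_bp": estimated_span - sequenced,
        "estimated_span_vs_pfge": estimated_span / pfge_size,
    }


summary = gap_summary(SEQUENCED_BP, MAX_SPAN_BP, ESTIMATED_SPAN_BP, PFGE_SIZE_BP)
for key, value in summary.items():
    print(key, value)
```

On these figures the upper-bound gap total is several times larger than the estimated gap total, which is consistent with the statement that the annotated maximum gap sizes overestimate the true gaps in most cases.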
Segmental structure of the chromosome
Sequence composition and information content
Overall, the chromosome is 49.7% A+T. Although the nucleotide compositions of the P- and R-segments are similar (52.4% and 48.7% A+T, respectively), the P-segments have a fairly uniform composition, whereas the A+T content of the R-segments fluctuates widely along their length and mirrors the distribution of coding sequences (Fig. 1). The dinucleotide CpG is considerably under-represented in the R-segments: it occurs only 31% as often as its isomer GpC. Such under-representation is usually indicative of cytosine methylation at CpG sequences (since deamination of 5-methylcytosine leads to its replacement with thymine unless actively preserved by selection), although there is no other evidence for or against methylation (which can regulate gene expression) in Eimeria. In contrast, CpG is only slightly under-represented in the P-segments, being present 64% as frequently as GpC. Looking at longer sequences (trinucleotides and above), the R-segments show a strongly skewed composition, with heavy over-representation of a small subset of sequences (Fig. 2). In contrast, the sequence of the P-segments is much more similar to random sequence, with no grossly over- or under-represented short sequences. These skews are also reflected in the information content (entropy) of the sequence (Fig. 1). The entropy of the P-segments is as high as that of random sequence, while that in the R-segments is lower and variable, reflecting a mixture of highly repetitive and motif-rich sequence.
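The two statistics discussed above, the CpG/GpC ratio and the entropy of short words, can be illustrated with a minimal sketch. The authors' actual window sizes, word lengths, and entropy definition are not given in this section, so the Shannon entropy over overlapping k-mers used here is an assumption, and the function names are hypothetical.

```python
import math
from collections import Counter


def dinucleotide_ratio(seq, numerator="CG", denominator="GC"):
    """Ratio of overlapping occurrences of two dinucleotides (default CpG/GpC)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    if counts[denominator] == 0:
        raise ValueError(f"{denominator} does not occur in the sequence")
    return counts[numerator] / counts[denominator]


def kmer_entropy(seq, k=3):
    """Shannon entropy (bits) of the overlapping k-mer distribution.

    Assumption: this is one common way to express 'information content';
    the paper's exact measure may differ.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A pure CAG repeat uses only three distinct trinucleotides, so its entropy
# is far below the 6 bits possible for 64 equally frequent trinucleotides.
print(kmer_entropy("CAG" * 20, k=3))
print(dinucleotide_ratio("GCGCGC"))
```

Applied in sliding windows along a chromosome, a measure of this kind would be expected to drop in repeat-rich regions and stay near its maximum in sequence that resembles random sequence.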
Simple-sequence repeats and LINE-like elements
A striking feature of the chromosome is the abundance of long tandem repeats of the trinucleotide CAG and of the heptamer AGGGTTT. Both are present almost exclusively in the R-segments, where together they make up 14% of the sequence (Fig. 1). CAG repeats are found preferentially in the predicted exons of the R-segments, where they occur about once every 200 bp on average. The heptamer motif has been identified as a telomeric repeat unit in Plasmodium (for review, see Figueiredo and Scherf 2005).
Other simple-sequence repeats (SSRs) are also confined almost exclusively to the R-segments; indeed, the P-segments appear to have been "swept clean" of SSRs, containing only about one per 12 kb, as compared to one per 170 bp in the R-segments and one per 200 bp in Plasmodium falciparum (Gardner et al. 2002).
The chromosome also has 57 regions with significant similarity to known LINE transposons, exclusively in the R-segments. There is some indication of clustering, especially of closely related elements, with adjacent elements tending to lie in the same orientation. Most of these regions are small, and do not appear to encode functional genes.
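As a rough illustration of how perfect tandem runs of a motif such as CAG or AGGGTTT can be located, the sketch below uses a regular expression. It is not the method used in this study (the reference list cites Tandem Repeats Finder and TRAP, which also handle imperfect repeats); the function name and the `min_copies` threshold are assumptions made for the example.

```python
import re


def find_tandem_repeats(seq, motif, min_copies=3):
    """Return (start, end, copies) for each perfect tandem run of `motif`.

    Coordinates are 0-based, end-exclusive. Only the given strand and only
    perfect repeats are considered (a deliberate simplification).
    """
    pattern = re.compile(f"(?:{re.escape(motif)}){{{min_copies},}}", re.IGNORECASE)
    return [
        (m.start(), m.end(), (m.end() - m.start()) // len(motif))
        for m in pattern.finditer(seq)
    ]


# Toy sequence: four copies of CAG and three copies of the heptamer.
toy = "TT" + "CAG" * 4 + "TT" + "AGGGTTT" * 3 + "G"
print(find_tandem_repeats(toy, "CAG"))
print(find_tandem_repeats(toy, "AGGGTTT"))
```

To count repeats on both strands, the same function could be run on the reverse complement of the sequence; dividing the segment length by the number of runs found gives a density comparable in form to the "one per 170 bp" figures quoted above.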
Apicomplexan-specific palindromic octamer motif
Segmental duplications
Predicted transcripts
The chromosome encodes 216 predicted proteins, distributed approximately equally between the repeat-rich R-segments and the P-segments. However, the characteristics of the genes in these two regions differ (Table 1).
The predicted genes in the R-segments ("R-genes") typically encode long proteins (447 amino acids on average) and have several fairly small introns, and the [A+T] content of their exons is lower than the average for the chromosome. About half of these proteins have predicted transmembrane domains, and, of these, about half also have predicted signal peptides. Many candidates for surface proteins are therefore included in this group. Although only 46% of these predicted proteins have similarity to proteins in other organisms, this is not unusual in newly sequenced genomes. Matches to ESTs from Eimeria species support 24% of these predictions, and, overall, 57% of the predictions are supported by either similarity or by EST data (or by both). Most (60%) of these predictions were made by two or more of the automated prediction tools. Therefore, we are confident that the majority of R-gene predictions are essentially correct. The genes in the P-segments ("P-genes") encode shorter proteins (309 amino acids on average) and have long introns, and their exons have a high [A+T] content. (The third P-segment is atypical, but it contains several large imperfect tandem duplications, which may skew the coding content of this segment.) Fewer of the P-genes (about one-quarter) have predicted transmembrane domains than is the case for R-genes, and far fewer (only six) have predicted signal peptides. The gene predictions in the P-segments are less well supported than those in the R-segments: only 43 (34%) are supported by similarities to other species or by EST data (mostly the latter), and the majority were predicted by only one of the automated prediction programs (predominantly GlimmerHMM). However, a detailed analysis (see Supplemental Material) suggests that GlimmerHMM is the most efficient prediction tool in this genome. It is possible that the P-genes in general are more likely to be Eimeria-specific, accounting for their lower support from similarity. 
Therefore, we cannot dismiss the majority of P-gene predictions as erroneous, although we suspect that mispredictions are more common among these than among the R-genes. E. tenella genes are typically much larger than those of Theileria annulata, P. falciparum or C. parvum (Table 1), owing mainly to numerous long introns, leading to a lower overall coding density. T. gondii has an equally low protein-coding fraction (and, like Eimeria, a relatively large genome), although this is due more to large intergenic regions than to large introns. We note that there is little apparent synteny between E. tenella Chromosome 1 and the genomes of other sequenced Apicomplexans (see Supplemental Material). No tRNAs or other non-protein transcripts were identified, but, as this chromosome represents only 1/60th of the genome, we assume that they lie on other chromosomes.
Simple-sequence repeats in predicted proteins and in expressed sequences
Low-complexity amino acid tracts (e.g., LLLLQLLLLQLLLQQLLLLLLLQLLLLLQ) are also common, and are encoded by (CAG)n nucleotide repeats that have undergone frameshifting mutations or inversions. Other nucleotide simple-sequence repeats are rarely found in the predicted exons: neither the telomere-like repeat (AGGGTTT)n nor its complement occurs. These simple-sequence amino acid repeats and low-complexity tracts are confined almost exclusively to the proteins in the repeat-rich R-segments: only one P-segment protein contains such a region. Of the 90 predicted R-proteins, two-thirds (58) contain one or more repeats or low-complexity tracts, and most of these proteins contain several. We see no obvious general differences in the functions of repeat-containing and repeat-free proteins; however, the number of proteins with well-established functions is probably too small to make such a distinction.
It is important to determine whether this abundance of apparent coding repeats is reflected in transcripts. Among the R-genes, support for predictions from EST or BLAST data is about equal for the repeat-containing transcripts (16/58, or 28% supported) and the non-repeat-containing transcripts (11/32, or 34% supported). Repeats are also abundant in ESTs from all developmental stages of E. tenella, and in other species of Eimeria (see Supplemental Material). We are therefore confident that abundant amino acid repeats are a real feature of Eimeria proteins, and not an artifact of the gene prediction process.
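The link between (CAG)n repeats and mixed amino acid tracts can be made concrete by translating a CAG repeat in all six reading frames. The sketch below assumes the standard genetic code and includes only the six codons that occur in a pure CAG repeat and its reverse complement; the function names are hypothetical.

```python
# Standard-code assignments for the only codons that occur in a pure
# (CAG)n repeat or its reverse complement (CTG)n. Any other codon would
# raise KeyError, which is intentional for this narrow illustration.
CODON_TABLE = {
    "CAG": "Q", "AGC": "S", "GCA": "A",   # forward strand, frames 0, 1, 2
    "CTG": "L", "TGC": "C", "GCT": "A",   # reverse strand, frames 0, 1, 2
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate_six_frames(dna):
    """Translate `dna` in three frames on each strand, dropping partial codons."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
            frames[f"{strand}{offset}"] = "".join(CODON_TABLE[c] for c in codons)
    return frames


frames = translate_six_frames("CAG" * 4)
for name, peptide in frames.items():
    print(name, peptide)
```

Under these assumptions the in-frame repeat encodes glutamine, a frameshift gives serine or alanine, and an inversion places CTG in frame to give leucine, which is how a tract mixing L and Q, like the example above, can arise from a single underlying nucleotide repeat.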
Interstrain variation associated with R-segments
We tried to determine whether the strain-to-strain variation in Chr1 was associated with the repeat-rich R-segments or with the repeat-poor P-segments. Present genetic maps of E. tenella (Shirley and Harvey 2000)
The results are shown in Figure 3 and summarized in Table 2. None of the four P-segment probes revealed any difference between the three strains when digested with any of the five restriction enzymes tested. In marked contrast, all four R-segment probes revealed differences between the strains in one of the five digests.
Eimeria species are the most important global parasites of intensively reared livestock, causing coccidiosis in poultry, cattle, and sheep. Coccidiosis has most impact in the intensive poultry industry, where all flocks become infected with some or all of the seven species of avian Eimeria. As well as having a large economic impact, coccidiosis is a severe welfare problem causing weight loss, diarrhea, hemorrhage, anemia, and death. Resistance to all classes of anticoccidial drugs is universal, no new drugs are in the pipeline, and live-attenuated vaccines are relatively expensive. It is hoped, therefore, that genomic and post-genomic analysis will offer new insights into the biology of this parasite and reveal new targets for the development of drugs and vaccines.
The most striking feature of the chromosome is its segmented organization, which is reflected in all aspects of its content. About half the chromosome is in R-segments, which are rich in repeats of several types; the remainder is in P-segments, which are relatively featureless. An important question is whether this segmental organization is typical of the remainder of the genome or peculiar to this, the smallest chromosome. We find (data not shown) a closely similar segmentation on Chr2: a preliminary assembly reveals two R-segments, each of
We suspect that the unusual genome organization may serve to facilitate rapid evolution and diversification, a strategy of benefit to a parasite. Rearrangements within R-segments may be facilitated by the abundant CAG repeats between and within genes. E. tenella Chr1 has none of the characteristic subtelomeric regions that, in parasites such as P. falciparum and Trypanosoma brucei, harbor dynamic populations of genes involved in evading the host immune system (Gardner et al. 2002).
In support of this model, strong evidence for genome plasticity comes from known size polymorphisms between different strains of E. tenella (Shirley 1994).
Turning to the predicted proteins, a question remains over the accuracy of predictions in the P-segments, which have less support from interspecies similarity than do those in the R-segments. However, we cannot dismiss these as mispredictions: perhaps the P-genes are peculiar to Eimeria, accounting for the paucity of BLAST and EST support. We hope that this question will be resolved by analysis of the whole-genome shotgun data and by further studies of Eimeria transcripts.
The R-segment genes are more robustly predicted, and their proteins are striking in containing numerous repetitive amino acid tracts arising from CAG repeats within exons. There is little doubt that these tracts are real (and not artifacts of misprediction), as they are abundant in the ESTs of several Eimeria species (A. Gruber and A.M.B.N. Madeira, unpubl.). Although amino acid repeats are rare in vertebrate proteins (and are often associated with disease) (Gatchel and Zoghbi 2005)
In conclusion, the segmental organization and repeat-richness of this chromosome (and, apparently, of the rest of the genome) raise many questions, only some of which can be tentatively answered. Comparison with other isolates of E. tenella or with other members of the genus may reveal whether this organization is associated with a dynamic, adaptable genome as we postulate. Forthcoming genome-wide analysis of E. tenella (albeit a shotgun sequence rather than assembled chromosomes) may more clearly reveal broad differences between the genes that populate the P- and R-segments. Finally, we note that the unusual long-range organization of this chromosome presents a strong argument for the completion and assembly of genome sequences beyond the shotgun level.
HAPPY mapping
HAPPY mapping was performed on a panel of subgenomic aliquots of DNA prepared from E. tenella Houghton, essentially as described previously (Konfortov et al. 2000).
Library construction
Sequencing, assembly, and gap closure
Contigs were ordered based on at least two consistent paired reads before being subjected to BLASTN against BAC-end (ftp.sanger.ac.uk/pub/pathogens/Eimeria/tenella/BAC/) and fosmid-end (ftp.sanger.ac.uk/pub/pathogens/Eimeria/tenella/fosmid/) sequences. Relevant BAC and fosmid clones were obtained from the WTSI, and the clones were sized using PFGE. BAC-end and fosmid-end sequences and HAPPY markers were used to order the contigs into scaffolds as well as superscaffolds. Sequence gaps were closed where possible using primer walking, and difficult regions were sequenced using alternative chemistries. A variety of approaches (including subcloning and transposon-mediated sequencing of bridging BACs or fosmids, or long-range PCR) were also used to close gaps.
Prediction of transcripts, generation of gene models, and annotation
Multiple alignment between E. tenella putative open reading frames and apicomplexan counterparts was performed to identify protein motifs conserved in apicomplexan homologs but missing from the initial E. tenella gene models. Each model was edited with reference to EST matches from Eimeria spp. using Exonerate (Slater and Birney 2005).
Additional models (not originating from automated predictions) were generated from EST mapping using a "sim4" output parsed with a custom Perl script (X. Wu, unpubl.) applying the following criteria: (1) The aligned part of the EST must be a single continuous sequence. (2) This aligned sequence must be at least two-thirds the length of the EST. (3) If the aligned part of the genomic DNA is also continuous (a single-exon alignment), the identity of the alignment must be >90%, and it must be
Signal peptides and transmembrane helices were predicted using Phobius (Kall et al. 2004).
tRNAscan-SE (Lowe and Eddy 1997
Analysis of simple tandem repeats
Information content analysis
Other analyses
Restriction-fragment length polymorphism analysis
Five hundred nanograms of each genomic DNA (Houghton, Wisconsin, or Weybridge strains, purified as above) was digested using BamHI, BglII, EcoRI, HindIII, or XhoI (Invitrogen) and resolved by gel electrophoresis (1.0% [w/v] agarose in 1× TBE, 20 V, 16 h). DNA was Southern blotted and hybridized with the relevant labeled probes using standard protocols (Sambrook and Russell 2001).
The Malaysian investigators thank Nor Muhammad Mahadi for his contribution and the staff of the Malaysia Genome Institute for their support. The Sanger Institute investigators thank David Harper and Paul Mooney for assistance with sequence analysis and exchange, and Bob Plumb for assistance with sequencing. P.H.D. thanks Derek Gatherer for helpful discussions on information content analysis. This work was supported by a Top-Down Grant (IRPA 09-02-02-002-BTK/TD/003) from the Ministry of Science, Technology and Innovation (MOSTI) in Malaysia and by FAPESP and CNPq in Brazil. The investigators of this study at the Sanger Institute were supported by the Wellcome Trust through their funding of the Pathogen Sequencing Unit; the sequencing was partly funded by BBSRC grant S17754.
10 Present address: Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA.
11 These authors contributed equally to this work.
E-mail phd@mrc-lmb.cam.ac.uk; fax 44-1-223-412-178. [Supplemental material is available online at www.genome.org. Software is available at http://www.mrc-lmb.cam.ac.uk/happy/HappyGroup/happyhomepage.html. The sequence data from this study have been submitted to EMBL under accession no. AM269894.] Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.5823007
Anders, R.F. 1986. Multiple cross-reactivities amongst antigens of Plasmodium falciparum impair the development of protective immunity against malaria. Parasite Immunol. 8: 529–539.
Bankier, A.T., Spriggs, H.F., Fartmann, B., Konfortov, B.A., Madera, M., Vogel, C., Teichmann, S.A., Ivens, A., and Dear, P.H. 2003. Integrated mapping, chromosomal sequencing and sequence analysis of Cryptosporidium parvum. Genome Res. 13: 1787–1799.
Benson, G. 1999. Tandem repeats finder: A program to analyze DNA sequences. Nucleic Acids Res. 27: 573–580.
Berriman, M., Ghedin, E., Hertz-Fowler, C., Blandin, G., Renauld, H., Bartholomeu, D.C., Lennard, N.J., Caler, E., Hamlin, N.E., and Haas, B., et al. 2005. The genome of the African trypanosome Trypanosoma brucei. Science 309: 416–422.
Blake, D.P., Smith, A.L., and Shirley, M.W. 2003. Amplified fragment length polymorphism analyses of Eimeria spp.: An improved process for genetic studies on recombinant parasites. Parasitol. Res. 90: 473–475.
Bonfield, J.K., Smith, K.F., and Staden, R. 1995. A new DNA sequence assembly program. Nucleic Acids Res. 23: 4992–4999.
Chapman, H.D. and Shirley, M.W. 2003. The Houghton strain of Eimeria tenella: A review of the type strain selected for genome sequencing. Avian Pathol. 32: 115–127.
Eichinger, L., Pachebat, J.A., Glöckner, G., Rajandream, M.-A., Sucgang, R., Berriman, M., Song, J., Olsen, R., Szafranski, K., and Xu, Q., et al. 2005. The genome of the social amoeba Dictyostelium discoideum. Nature 435: 43–57.
Figueiredo, L. and Scherf, A. 2005. Plasmodium telomeres and telomerase: The usual actors in an unusual scenario. Chromosome Res. 13: 517–524.
Gardner, M.J., Hall, N., Fung, E., White, O., Berriman, M., Hyman, R.W., Carlton, J.M., Pain, A., Nelson, K.E., and Bowman, S., et al. 2002. Genome sequence of the human malaria parasite Plasmodium falciparum. Nature 419: 498–511.
Gatchel, J.R. and Zoghbi, H.Y. 2005. Diseases of unstable repeat expansion: Mechanisms and common principles. Nat. Rev. Genet. 6: 743–755.
Glöckner, G., Eichinger, E., Szafranski, K., Pachebat, J.A., Bankier, A.T., Dear, P.H., Lehmann, R., Baumgart, C., Parra, G., and Abril, J.F., et al. 2002. Sequence and analysis of chromosome 2 of Dictyostelium discoideum. Nature 418: 79–85.
Griffiths-Jones, S., Bateman, A., Marshall, M., Khanna, A., and Eddy, S.R. 2003. Rfam: An RNA family database. Nucleic Acids Res. 31: 439–441.
Hall, N., Pain, A., Berriman, M., Churcher, C., Harris, B., Harris, D., Mungall, K., Bowman, S., Atkin, R., and Baker, S., et al. 2002. Sequence of Plasmodium falciparum chromosomes 1, 3–9 and 13. Nature 419: 527–531.
International Human Genome Sequencing Consortium. 2001. Initial sequencing and analysis of the human genome. Nature 409: 860–921.
Kall, L., Krogh, A., and Sonnhammer, E.L. 2004. A combined transmembrane topology and signal peptide prediction method. J. Mol. Biol. 338: 1027–1036.
Konfortov, B.A., Cohen, H.M., Bankier, A.T., and Dear, P.H. 2000. A high-resolution HAPPY map of Dictyostelium discoideum chromosome 6. Genome Res. 10: 1737–1742.
Korf, I. 2004. Gene finding in novel genomes. BMC Bioinformatics 5: 59.
Kronegg, J. and Buloz, D. 1999. Detection/prediction of GPI cleavage site (GPI-anchor) in a protein (DGPI). Retrieved July 8, 2003 from http://129.194.185.165/dgpi/.
Li, L., Brunk, B.P., Kissinger, J.C., Pape, D., Tang, K., Cole, R.H., Martin, J., Wylie, T., Dante, M., and Fogarty, S.J., et al. 2003. Gene discovery in the Apicomplexa as revealed by EST sequencing and assembly of a comparative gene database. Genome Res. 13: 443–454.
Lowe, T.M. and Eddy, S.R. 1997. tRNAscan-SE: A program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25: 955–964.
Majoros, W.H., Pertea, M., and Salzberg, S.K. 2004. TigrScan and GlimmerHMM: Two open source ab initio eukaryotic gene-finders. Bioinformatics 20: 2878–2879.
Ng, S.-T., Jangi, M.S., Shirley, M.W., Tomley, F.M., and Wan, K.-L. 2002. Comparative EST analyses provide insights into gene expression in two asexual developmental stages of Eimeria tenella. Exp. Parasitol. 101: 168–173.
Pasechnik, A., Mylläri, A., and Salakoski, T. 2005. Dynamical visualization of the DNA sequence and its nucleotide content. In Proceedings of KRBIO 05, International Symposium on Knowledge Representation in Bioinformatics (eds. C. Bounsaythip et al.), pp. 47–50, Espoo, Finland.
Quevillon, E., Silventoinen, V., Pillai, S., Harte, N., Mulder, N., Apweiler, R., and Lopez, R. 2005. InterProScan: Protein domains identifier. Nucleic Acids Res. 33: W116–W120.
Rutherford, K., Parkhill, J., Crook, J., Horsnell, T., Rice, P., Rajandream, M.-A., and Barrell, B. 2000. Artemis: Sequence visualization and annotation. Bioinformatics 16: 944–945.
Sambrook, J. and Russell, D. 2001. Molecular cloning: A laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. 3d ed.
Schofield, L. 1991. On the function of repetitive domains in protein antigens of Plasmodium and other eukaryotic parasites. Parasitol. Today 7: 99–105.
Shirley, M.W. 1994. The genome of Eimeria tenella: Further studies on its molecular organisation. Parasitol. Res. 80: 366–373.
Shirley, M.W. and Harvey, D.A. 2000. A genetic linkage map of the apicomplexan protozoan parasite Eimeria tenella. Genome Res. 10: 1587–1593.
Shirley, M.W., Kemp, D.J., Pallister, J., and Prowse, S.J. 1990. A molecular karyotype of Eimeria tenella as revealed by contour-clamped homogenous electric field gel electrophoresis. Mol. Biochem. Parasitol. 38: 169–174.
Shirley, M.W., Ivens, A., Gruber, A., Madeira, A.M., Wan, K.L., Dear, P.H., and Tomley, F.M. 2004. The Eimeria genome projects: A sequence of events. Trends Parasitol. 20: 199–201.
Slater, G.S. and Birney, E. 2005. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics 6: 31.
Sobreira, T.J., Durham, A.M., and Gruber, A. 2006. TRAP: Automated classification, quantification, and annotation of tandemly repeated sequences. Bioinformatics 22: 361–362.
Tyzzer, E.E. 1929. Coccidiosis in gallinaceous birds. Am. J. Hyg. 10: 269–283.
Verstrepen, K.J. and Klis, F.M. 2006. Flocculation, adhesion and biofilm formation in yeasts. Mol. Microbiol. 60: 5–15.
Verstrepen, K.J., Jansen, A., Lewitter, F., and Fink, G.R. 2005. Intragenic tandem repeats generate functional variability. Nat. Genet. 37: 986–990.
Wan, K.-L., Chong, S.-P., Ng, S.-T., Shirley, M.W., Tomley, F.M., and Jangi, M.S. 1999. A survey of genes in Eimeria tenella merozoites by EST sequencing. Int. J. Parasitol. 29: 1885–1892.
Received August 1, 2006; accepted in revised format January 3, 2007.