Genome Res. 13:1552-1553, 2003
©2003 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/03 $5.00

Abstract

Exon Structure Analysis, Ortholog Identification, and SNP Candidate Screening by Mapping Mouse RIKEN Sequences to Multiple Genome Assemblies

Serge Batalov1 and Colin F. Fletcher

Genomics Institute of the Novartis Research Foundation (GNF), San Diego, California 92121, USA

Mapping RIKEN full-length cDNAs (The FANTOM Consortium and the RIKEN Genome Exploration Research Group Phase I and II Team 2002) to the genome assemblies enables a variety of analyses. First, exon structure can be determined, and coding and noncoding regions can be inferred from their different average exon lengths. In some cases, alternative exon structures can be identified at this stage. Second, chromosome position can be used to identify the correct human ortholog, for which functional data may be available. Third, intronless genes can be identified and examined carefully to determine whether they are retransposition events, pseudogenes, or genomic contamination. Finally, high-quality sequence discrepancies can be identified as potential SNPs, because RIKEN and the Mouse Genome Sequencing Consortium both sequenced the C57BL/6J mouse strain, whereas the four mouse strains sequenced by Celera were 129X1/SvJ, 129S1/SvImJ, DBA/2J, and A/J.
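The SNP screening idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes the cDNA has already been aligned column by column to both mouse assemblies, and the function name `snp_candidates`, the tuple layout, and the quality cutoff of 30 are all hypothetical. A position is reported when the cDNA base agrees with the MGSC base (both C57BL/6J) but differs from the Celera base, and the cDNA base quality is high.

```python
from typing import List, Tuple

# One aligned column: (cDNA position, cDNA base, MGSC base, Celera base, cDNA base quality).
Column = Tuple[int, str, str, str, int]

def snp_candidates(columns: List[Column], min_quality: int = 30) -> List[Tuple[int, str, str]]:
    """Return (position, C57BL/6J base, Celera base) for candidate SNPs.

    min_quality is an assumed threshold; the source only says "high-quality".
    """
    candidates = []
    for pos, cdna, mgsc, celera, quality in columns:
        # Skip ambiguous bases and gaps in any of the three sequences.
        if not {cdna, mgsc, celera} <= set("ACGT"):
            continue
        # Both C57BL/6J sources agree, and the Celera assembly disagrees.
        if quality >= min_quality and cdna == mgsc and celera != cdna:
            candidates.append((pos, cdna, celera))
    return candidates

example = [
    (1, "A", "A", "A", 40),  # all agree: not a candidate
    (2, "C", "C", "T", 40),  # C57BL/6J sources agree, Celera differs: candidate
    (3, "G", "G", "A", 10),  # low quality: skipped
    (4, "T", "C", "C", 40),  # cDNA disagrees with MGSC: likely a cDNA error
    (5, "A", "A", "N", 40),  # ambiguous Celera base: skipped
]
print(snp_candidates(example))
```

Requiring agreement between the two C57BL/6J sources is one way to separate strain differences from sequencing errors in the cDNA itself.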

We have mapped 60,770 RIKEN clones by BLAT (Kent 2002) to the MGSC (Mouse Genome Sequencing Consortium 2002) genome assembly versions 1 and 3, the Ensembl human genome assembly v.28, the Celera mouse genome assembly releases R12 and R13, and the Celera human assembly Release R26i (http://cds.celera.com/; Table 1).


Table 1. The Counts of Unambiguously Mapped RIKEN Clones

 

Single-exon clones longer than 1 kb are candidates for further investigation as bona fide intronless genes, retransposition events, or possible genomic DNA contamination. For this investigation, the expression profile of an intronless clone can be very informative (Su et al. 2002).
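As an illustration of how such candidates might be pulled from BLAT output, the sketch below reads tab-separated PSL lines and flags clones longer than 1 kb whose alignment collapses to a single exon. It relies on the standard 21-column PSL layout (qName in column 10, qSize in column 11, blockSizes in column 19, tStarts in column 21). The minimum intron size of 30 bases, used to separate introns from small alignment gaps, is an assumption and not a value from the article, and the helper names are hypothetical.

```python
from typing import List, NamedTuple

MIN_INTRON = 30    # assumed: shorter target gaps are treated as indels, not introns
MIN_LENGTH = 1000  # "longer than 1 kb", from the text

class Hit(NamedTuple):
    clone: str
    q_size: int
    block_sizes: List[int]
    t_starts: List[int]

def parse_psl_line(line: str) -> Hit:
    """Parse one data line of a PSL file (header lines must be skipped by the caller)."""
    f = line.rstrip("\n").split("\t")
    sizes = [int(x) for x in f[18].split(",") if x]
    t_starts = [int(x) for x in f[20].split(",") if x]
    return Hit(f[9], int(f[10]), sizes, t_starts)

def exon_count(hit: Hit, min_intron: int = MIN_INTRON) -> int:
    """Count exons by merging blocks separated by target gaps shorter than min_intron."""
    exons = 1
    for i in range(1, len(hit.block_sizes)):
        gap = hit.t_starts[i] - (hit.t_starts[i - 1] + hit.block_sizes[i - 1])
        if gap >= min_intron:
            exons += 1
    return exons

def is_single_exon_candidate(hit: Hit) -> bool:
    return hit.q_size > MIN_LENGTH and exon_count(hit) == 1

def psl(clone: str, q_size: int, sizes: str, t_starts: str) -> str:
    """Build a toy PSL line; fields that are not used here are filled with placeholders."""
    fields = ["0"] * 8 + ["+", clone, str(q_size), "0", str(q_size),
                          "chr1", "100000", "0", "0", "1", sizes, "0,", t_starts]
    return "\t".join(fields)

hits = [parse_psl_line(line) for line in (
    psl("cloneA", 1500, "1500,", "5000,"),          # one block
    psl("cloneB", 1500, "700,800,", "5000,7700,"),  # 2000-base gap: two exons
    psl("cloneC", 1500, "700,800,", "5000,5703,"),  # 3-base gap: merged into one exon
    psl("cloneD", 800, "800,", "5000,"),            # single exon but too short
)]
print([h.clone for h in hits if is_single_exon_candidate(h)])
```

Merging blocks across short gaps matters because BLAT splits an alignment at every insertion or deletion, so the raw block count overestimates the number of exons.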

This analysis also allows us to roughly compare the completeness of the assemblies. Of 60,172 RIKEN cDNAs containing >100 non-masked bases, >99% were mapped over >70% of their length to both of the latest assemblies, which are much more complete than the earlier assemblies.
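The completeness comparison amounts to a simple count, sketched below in Python. The thresholds (more than 100 non-masked bases, more than 70% of the clone length mapped) come from the text. The data layout, the function name `completeness`, and the definition of coverage as aligned bases divided by clone length are assumptions made for the example.

```python
from typing import Dict, Tuple

def completeness(
    clones: Dict[str, Tuple[int, int]],  # clone -> (length, non-masked bases)
    aligned_a: Dict[str, int],           # clone -> aligned bases in assembly A
    aligned_b: Dict[str, int],           # clone -> aligned bases in assembly B
    min_unmasked: int = 100,
    min_coverage: float = 0.70,
) -> Tuple[int, int]:
    """Return (eligible clones, clones mapped above min_coverage in both assemblies)."""
    eligible = mapped = 0
    for name, (length, unmasked) in clones.items():
        if unmasked <= min_unmasked:
            continue  # too little non-masked sequence to map reliably
        eligible += 1
        cov_a = aligned_a.get(name, 0) / length
        cov_b = aligned_b.get(name, 0) / length
        if cov_a > min_coverage and cov_b > min_coverage:
            mapped += 1
    return eligible, mapped

toy_clones = {"a": (1000, 900), "b": (1000, 800), "c": (1000, 500), "d": (400, 60)}
toy_a = {"a": 950, "b": 800, "c": 900, "d": 400}
toy_b = {"a": 990, "b": 600, "c": 750}

eligible, mapped = completeness(toy_clones, toy_a, toy_b)
print(f"{mapped} of {eligible} eligible clones mapped to both assemblies")
```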

Figure 1 compares the mappings to the four assemblies. For each chromosome, MGSC v.3 is shown in green as the left bar and Celera R13 in blue as the right bar. Where RIKEN clones mapped to both assemblies, a cyan line connects the mapping positions, whereas triangles mark clones mapped exclusively to one assembly. The large-scale discrepancies are marked in red. One can observe a 10-Mb contig inversion on chromosome X (later detected and corrected by MGSC) and smaller ones on chromosomes 5, 17, 18, and 19, among others. The up-to-date scalable version of the mapping comparison is available at http://www.gnf.org/RIKEN/.
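A comparison of this kind can be approximated programmatically. The Python sketch below is an illustration only, and is not how the figure was produced. It splits clones into those mapped to both assemblies and those exclusive to one, then looks for candidate inversions as runs of clones whose order in the second assembly is reversed relative to the first. The function names and the minimum run length of three clones are assumptions.

```python
from typing import Dict, List, Set, Tuple

Position = Tuple[str, int]  # (chromosome, start coordinate)

def classify(map_a: Dict[str, Position], map_b: Dict[str, Position]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return clone names mapped to both assemblies, only to A, and only to B."""
    a, b = set(map_a), set(map_b)
    return a & b, a - b, b - a

def inverted_runs(pairs: List[Tuple[int, int]], min_run: int = 3) -> List[Tuple[int, int]]:
    """Find runs of clones whose order in assembly B is reversed relative to A.

    pairs holds (position in A, position in B) for clones on one chromosome.
    Returns the (first, last) position in A of each run of at least min_run clones.
    """
    ordered = sorted(pairs)
    runs, start = [], 0
    for i in range(1, len(ordered) + 1):
        descending = i < len(ordered) and ordered[i][1] < ordered[i - 1][1]
        if not descending:
            if i - start >= min_run:
                runs.append((ordered[start][0], ordered[i - 1][0]))
            start = i
    return runs

mgsc = {"c1": ("chrX", 1), "c2": ("chrX", 2), "c3": ("chr5", 7)}
celera = {"c1": ("chrX", 10), "c2": ("chrX", 20), "c4": ("chr5", 9)}
both, only_mgsc, only_celera = classify(mgsc, celera)

# Clones 4 to 6 appear in reverse order in the second assembly.
toy_pairs = [(1, 10), (2, 20), (3, 30), (4, 60), (5, 50), (6, 40), (7, 70), (8, 80)]
print(sorted(both), sorted(only_mgsc), sorted(only_celera))
print(inverted_runs(toy_pairs))
```

A real comparison would also need to handle clones assigned to different chromosomes in the two assemblies, which this sketch leaves out.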



Figure 1 Mapping of RIKEN Clones. (Green) MGSC v.3; (blue) CMGD R13.

 

The extra bars represent superimposed syntenic regions identified by mapping to the human assemblies: NCBI v.28 to the left of MGSC v.3, and Celera R26h to the right of the mouse Celera R13. The two-digit electric color code for human chromosomes is shown at the bottom. Several cases of differing syntenic assignment deserve further investigation.

Footnotes

Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1458903.

1 Corresponding author.
E-MAIL batalov{at}gnf.org; FAX (858) 812-1570.

REFERENCES

The FANTOM Consortium and the RIKEN Genome Exploration Research Group Phase I and Phase II Team. 2002. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature 420:563-573.

Kent, W.J. 2002. BLAT—The BLAST-Like Alignment Tool. Genome Res. 12:656-664.

Mouse Genome Sequencing Consortium. 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420:520-562.

Su, A.I., Cooke, M.P., Ching, K.A., Hakak, Y., Walker, J.R., Wiltshire, T., Orth, A.P., Vega, R.G., Sapinoso, L.M., Moqrich, A., et al. 2002. Large-scale analysis of the human and mouse transcriptomes. Proc. Natl. Acad. Sci. 99:4465-4470.

