Genome Res. 13:1244-1249, 2003
©2003 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/03 $5.00
Resources
Internet Contig Explorer (iCE): A Tool for Visualizing Clone Fingerprint Maps
Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada
Fingerprinted clone physical maps have proven useful in various applications, supporting both whole-genome and region-specific DNA sequencing as well as gene cloning studies. Fingerprint maps have been generated for several genomes, including those of human, mouse, rat, the nematodes Caenorhabditis elegans and Caenorhabditis briggsae, Arabidopsis thaliana, and rice. Fingerprint maps of other genomes, including those of fungi, bacteria, poplar, and the cow, are being generated. The increasing use of fingerprint maps in genomic research has created a need in the research community for intuitive computer tools that facilitate viewing of the maps and the underlying fingerprint data. In this report we describe a new Java-based application called iCE (Internet Contig Explorer) that has been designed to provide views of fingerprint maps and associated data. Users can search for and display individual clones, contigs, clone fingerprints, clone insert sizes, and markers. Users can also load lists of particular clones of interest into the software and view their fingerprints. iCE is being used at our Genome Centre to offer the research community views of the mouse, rat, bovine, C. briggsae, and several fungal genome bacterial artificial chromosome (BAC) fingerprint maps that we have either completed or are currently constructing. We are also using iCE as part of the Rat Genome Sequencing Project to manage our provision of rat BAC clones for sequencing at the Human Genome Sequencing Center at the Baylor College of Medicine.
DNA clone fingerprint maps (Olson et al. 1986
The software most heavily used for analysis of fingerprint data and display of fingerprint maps is FPC (Soderlund 1997 There are several existing services available to view physical maps via the Internet. Web-FPC offers a limited view of physical maps similar to FPC for maps such as rice, maize, sorghum, zebrafish, and Arabidopsis thaliana (http://www.genome.arizona.edu/software/fpc/). Other sites such as Ensembl (http://www.ensembl.org/) and NCBI (http://www.ncbi.nlm.nih.gov/mapview/) also offer views of BAC maps integrated with sequence and other information. However, these existing tools did not provide the full functionality we desired and were not adaptable to the more distant future needs we foresaw. Therefore, the Internet Contig Explorer (iCE) was devised to fill this need. The aim of iCE was not to recreate the broad scope of functions available already in FPC. Instead, our goal was to provide a viewing system sufficient to satisfy most of the investigators who wished to browse the fingerprint data and the maps built from them without the requirement and overhead of downloading and updating datasets. In designing iCE we considered our previous interactions with investigators and the most frequently requested types of information. As well, we found we had novel requirements for managing our provision of rat BAC clones for sequencing at the Human Genome Sequencing Center at the Baylor College of Medicine. Here we describe the design and implementation of iCE and illustrate some of the features of the software.
Software Design
The iCE system was designed to meet the immediate needs of users to access physical map data, and to provide an easily maintained and extensible platform capable of future expansion. For this reason, the Java programming language was chosen for developing the software and an SQL database was chosen for data storage. Both of these technologies are widely used in the software development community and provide robust and well-developed tools for development of the iCE system.
The iCE system is composed of two parts: a client Java application and an SQL database. The client application runs on the user's machine, accessing data stored remotely on the database server. The SQL data originates from one or more FPC (Soderlund 1997
Features
On start-up, iCE connects via the Internet to the database server and downloads the list of clone names, contigs, and markers for the selected database and displays this information in the main viewing frame (see Fig. 1). In addition, user-defined lists are also shown: These user lists are arbitrary lists of clones not necessarily associated with a particular contig or marker. For example, these lists may represent clones selected for DNA sequencing. The main viewing frame also contains an Options tab where the user may customize the displays, for example, specifying the maximum number of clones to be displayed, allowing database comments to be shown or changing the magnification of the electrophoretic gel images.
The names of contigs, clones, markers, and user-defined lists are shown in list boxes on the main window (Fig. 1). To view an item from one of these lists, the user selects the item with the mouse or types its name in the appropriate text field. When first requested, a contig, individual clone, or user list is displayed in a new display window. If the item has already been displayed, its display is brought to the front. If a contig is requested, the contig and associated data are downloaded. When a clone is selected and the clone is already displayed in a contig, the contig window is brought to the front of the other windows and the clone is highlighted. Otherwise, data for the single clone are downloaded and the clone is displayed within a list window titled Miscellaneous. For a selected marker, a list of all contigs containing clones associated with the marker is determined from the database. The user is then prompted to select contigs from the list to display.
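The request-handling rules in the preceding paragraph can be pictured with a small sketch. This is not the iCE source: the class `DisplayManager`, its method names, and the returned status strings are hypothetical, the "download" is simulated by passing the clone names in directly, and the code is written for a current Java release rather than the JDK the client was built with.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the "open once, then bring to front" rule; not iCE code. */
final class DisplayManager {
    /** Window titles in stacking order; the last element is the frontmost window. */
    private final List<String> windows = new ArrayList<>();
    /** Clone name -> contig name, recorded only for contigs that are already displayed. */
    private final Map<String, String> cloneToDisplayedContig = new LinkedHashMap<>();

    /** Opens a contig window, or raises it if the contig was displayed earlier. */
    String requestContig(String contig, List<String> clonesFromServer) {
        if (windows.remove(contig)) {          // already displayed: just re-stack it
            windows.add(contig);
            return "raised " + contig;
        }
        windows.add(contig);                   // first request: simulate the download
        for (String clone : clonesFromServer) {
            cloneToDisplayedContig.put(clone, contig);
        }
        return "downloaded " + contig;
    }

    /** Highlights a clone in its displayed contig, else shows it under Miscellaneous. */
    String requestClone(String clone) {
        String contig = cloneToDisplayedContig.get(clone);
        if (contig != null) {
            windows.remove(contig);
            windows.add(contig);
            return "highlighted " + clone + " in " + contig;
        }
        windows.remove("Miscellaneous");       // reuse the single list window if present
        windows.add("Miscellaneous");
        return "downloaded " + clone + " into Miscellaneous";
    }

    /** Title of the frontmost window, or null when nothing is displayed. */
    String frontWindow() {
        return windows.isEmpty() ? null : windows.get(windows.size() - 1);
    }
}
```

Keeping the stacking order in one list makes "bring to front" a remove-and-append, so the same two lines serve both the contig and the clone case.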
Each contig or list of clones is displayed in a separate contig display frame, as shown in Figure 2. Each contig display is listed in the drop-down box at the top of the main frame, allowing for convenient navigation between contigs. The left side of a contig display frame is divided horizontally into three areas. The area at the top displays the clones in the contig as colored boxes with clone names. The left and right ends of the boxes indicate the position of the clone as determined by consensus band map position in the original FPC database (Soderlund et al. 1997
Pop-up menus with additional options for modifying the clone and contig displays are called up by right-clicking on a clone box or within the contig display, respectively. Buried clones may be displayed or hidden; the vertical order of the clones can be sorted by name, size, left or right position within the contig, or the position specified in a user list. Clones may also be copied to new, temporary contig displays to reduce the number of clones on an individual display and to allow a more flexible comparison of restriction fragment locations between clones that may not be located in a single contig. Detailed information for clones (Fig. 3) and contigs (Fig. 4) is also available from these pop-up menus, as described below. The contig for a particular clone can also be requested from a pop-up menu, allowing convenient navigation from clones in user lists to their respective contigs.
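As an illustration of the sort options just listed, the following sketch orders clone boxes by a chosen key. The `CloneSort` class, the `Clone` fields, the key strings, and the clone names in the usage note are all assumptions made for the example; ordering by user-list position is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of the vertical sort options; not taken from the iCE source. */
final class CloneSort {
    /** Minimal stand-in for a clone box: a name, contig coordinates, and a size. */
    static final class Clone {
        final String name;
        final int left;
        final int right;
        final int size;

        Clone(String name, int left, int right, int size) {
            this.name = name;
            this.left = left;
            this.right = right;
            this.size = size;
        }
    }

    /** Maps a menu choice to a comparator; the key strings are illustrative only. */
    static Comparator<Clone> by(String key) {
        switch (key) {
            case "name":  return Comparator.comparing((Clone c) -> c.name);
            case "size":  return Comparator.comparingInt((Clone c) -> c.size);
            case "left":  return Comparator.comparingInt((Clone c) -> c.left);
            case "right": return Comparator.comparingInt((Clone c) -> c.right);
            default: throw new IllegalArgumentException("unknown sort key: " + key);
        }
    }

    /** Returns clone names in display order without modifying the input list. */
    static List<String> sortedNames(List<Clone> clones, String key) {
        List<Clone> copy = new ArrayList<>(clones);
        copy.sort(by(key));
        List<String> names = new ArrayList<>();
        for (Clone c : copy) {
            names.add(c.name);
        }
        return names;
    }
}
```

For example, with three made-up clones at left positions 10, 0, and 5, `CloneSort.sortedNames(clones, "left")` lists the clone at position 0 first. Sorting a copy leaves the underlying contig data untouched, which suits a view-only tool.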
The right side of a contig display contains the electrophoretic gel images (Marra et al. 1997 Restriction fragments of interest to the user can be marked by clicking near the colored lines; marked restriction fragments are indicated with an "x". These marked fragments are also identified on the clone details display so the size and mobility of marked fragments can be determined. Normally, the positions of restriction fragments for different clones are only compared for clones within the same contig as described above. To determine the restriction fragments shared between clones that are not in the same contig, the user can copy these clones to a separate display and perform the comparison using all clones on the display, regardless of position of the clone in the contig or similarity between clones. This is done using the "Confirm bands using all" button at the top of the gel images. This allows the user to determine shared restriction fragments between clones in arbitrary lists (such as user lists) and to quickly determine the number and size of shared restriction fragments between clones.
To view detailed information for a clone (Fig. 3), the user selects "Show details" from the pop-up menu for a clone box. This display includes the number of restriction fragments (bands on the details display), position in the contig, markers, the parent clone (burying clone), and any underlying buried clones. The bottom of the display contains a table of the restriction fragment sizes and mobilities. The total size of the restriction fragments and the size of the fragments on selected rows are shown. Two columns of the table indicate unconfirmed fragments and fragments marked by the user on the gel image. Buttons above the table allow the user to select all fragments that are confirmed, unconfirmed, or marked. These buttons allow the user to quickly identify the sizes of marked restriction fragments and the sizes of restriction fragments that are unique to a clone and that probably represent DNA not found in neighboring clones.
To view detailed information for a contig (Fig. 4), the user selects "Show statistics" from the pop-up menu for the contig (this menu is raised when the user right-clicks the contig background). This display includes all data available for the contig and the clones it contains. The top display shows information specific to the contig as generated by FPC, and statistics on the number of clones and the sizes and number of bands. The table at the bottom of the display shows information for each clone on a separate line. This includes clone name, left and right position in the contig, buried status, any clones that are buried within it and its parent clone (burying clone), size, and number of bands (restriction fragments), as well as the mobilities of its restriction fragments. Columns of data can be suppressed to prevent displaying undesired information. These data can be printed and also written to file for use by external software in a customizable format. For example, the restriction fragment mobilities for a selection of clones can be written to a file and read by spreadsheet software such as Excel or StarOffice.
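A short sketch may help tie together fragment confirmation and the file export. It assumes that a fragment counts as confirmed when some other clone on the display has a fragment whose mobility differs by no more than a fixed tolerance; the text does not state the matching rule iCE uses, so the rule, the tolerance value, the class `FragmentExport`, and the column layout are all assumptions.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: flag fragments as confirmed or not, then emit tab-delimited text. */
final class FragmentExport {
    /** True if any mobility in {@code other} lies within {@code tolerance} of {@code mobility}. */
    static boolean matchesAny(int mobility, List<Integer> other, int tolerance) {
        for (int m : other) {
            if (Math.abs(m - mobility) <= tolerance) {
                return true;
            }
        }
        return false;
    }

    /**
     * Writes one row per fragment: clone name, mobility, and status.
     * A fragment is "confirmed" when at least one OTHER clone has a matching fragment.
     */
    static String toTabDelimited(Map<String, List<Integer>> mobilitiesByClone, int tolerance) {
        StringBuilder out = new StringBuilder("clone\tmobility\tstatus\n");
        for (Map.Entry<String, List<Integer>> clone : mobilitiesByClone.entrySet()) {
            for (int mobility : clone.getValue()) {
                boolean confirmed = false;
                for (Map.Entry<String, List<Integer>> other : mobilitiesByClone.entrySet()) {
                    if (!other.getKey().equals(clone.getKey())
                            && matchesAny(mobility, other.getValue(), tolerance)) {
                        confirmed = true;
                        break;
                    }
                }
                out.append(clone.getKey()).append('\t')
                   .append(mobility).append('\t')
                   .append(confirmed ? "confirmed" : "unconfirmed").append('\n');
            }
        }
        return out.toString();
    }
}
```

Tab-delimited text is used here because spreadsheet programs generally open it without extra configuration. Passing a `LinkedHashMap` keeps the rows in the order the clones were selected.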
We have described iCE, a new software system for viewing clone fingerprint mapping data at the British Columbia Cancer Agency Genome Sciences Centre and elsewhere. Maps for ten organisms are now available via iCE, and these are being used by the biological community. In the period from October 2001 to February 2002, more than ninety different external users accessed iCE databases in more than five hundred sessions. iCE was intended to incur low maintenance costs through two strategies: the iCE client was written in the Java programming language in an object-oriented paradigm, and the database system uses the industry-standard SQL language. Since its initial conception, the iCE system has undergone frequent changes in features and functionality without requiring significant changes to existing database structure or code. Work is underway to extend iCE in several directions. Most important are efforts to improve the speed of data access and the responsiveness of the display. It is also desirable to allow users to continue to work with the iCE client without a constant Internet connection. Features continue to be added to allow for more convenient viewing and rearranging of data, as well as better management of arbitrary lists of clones to be used, for example, in a sequencing pipeline. The iCE software and comprehensive documentation are available at http://ice.bcgsc.ca. The iCE source code and related code for database management are available from the authors under license, at no cost, for academic use.
The iCE client application is written in Java 2 using the Java Development Kit (JDK) 1.3.1 (http://java.sun.com). The Java Runtime Environment (JRE) 1.3.1 is required on the client machine. The Borland JBuilder 4.0 development environment was used for code development (http://www.borland.com). The iCE database uses MySQL DBMS Ver 8.19 (http://www.mysql.com). The iCE client has been used successfully on Linux and Microsoft Windows 2000 on computers with an Intel Pentium III processor and 512 MB RAM, and on Apple computers running MacOS X. All contig, clone, and marker data originate in FPC databases and are converted to the SQL format using a custom application (fpc_sql) written in the C programming language.
We thank the many people who contributed to testing and implementation of iCE at the Genome Sciences Centre. Thanks to Justin Muir, Kirk Schoeffel, and Martin Krzywinski for installing the iCE Web server. Thanks also to Steven Ness for useful comments on an early version of the manuscript and to Mike Holman at Washington University Genome Sequencing Center for helpful early discussions. This work was funded by the National Human Genome Research Institute (USA). We are grateful to the staff of the British Columbia Cancer Agency Genome Sciences Centre for expert technical and administrative assistance. M.A.M. is a Michael Smith Foundation for Health Research Scholar. The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.819303.
1 Corresponding author.
Coe, E., Cone, K., McMullen, M., Chen, S.-S., Davis, G., Gardiner, J., Liscum, E., Polacco, M., Paterson, A., Sanchez-Villeda, H., et al. 2002. Access to the maize genome: An integrated physical and genetic map. Plant Physiol. 128: 9-12.
Coulson, A., Sulston, J., Brenner, S., and Karn, J. 1986. Toward a physical map of the genome of the nematode Caenorhabditis elegans. Proc. Natl. Acad. Sci. 83: 7821-7825.
Coulson, A., Kozono, Y., Lutterbach, B., Shownkeen, R., Sulston, J., and Waterston, R. 1991. YACS and the C. elegans genome. BioEssays 13: 413-417.
Gregory, S.G., Sekhon, M., Schein, J., Zhao, S., Osoegawa, K., Scott, C.E., Evans, R.S., Burridge, P.W., Cox, T.V., Fox, C.A., et al. 2002. A physical map of the mouse genome. Nature 418: 743-750.
Klein, P.E., Klein, R.R., Cartinhour, S.W., Ulanch, P.E., Dong, J., Obert, J.A., Morishige, D.T., Schlueter, S.D., Childs, K.L., Ale, M., et al. 2000. A high-throughput AFLP-based method for constructing integrated genetic and physical maps: Progress toward a sorghum genome map. Genome Res. 10: 789-807.
Mao, L., Wood, T.C., Yu, Y., Budiman, M.A., Tomkins, J., Woo, S., Sasinowski, M., Presting, G., Frisch, D., Goff, S., et al. 2000. Rice transposable elements: A survey of 73,000 sequence-tagged-connectors. Genome Res. 10: 982-990.
Marra, M.A., Kucaba, T.A., Dietrich, N.L., Green, E.D., Brownstein, B., Wilson, R.K., McDonald, K.M., Hillier, L.W., McPherson, J.D., and Waterston, R.H. 1997. High-throughput fingerprint analysis of large-insert clones. Genome Res. 7: 1072-1084.
Marra, M.A., Kucaba, T., Sekhon, M., Hillier, L., Martienssen, R., Chinwalla, A., Crockett, J., Fedele, J., Grover, H., Gund, C., et al. 1999. A map for sequence analysis of the Arabidopsis thaliana genome. Nat. Genet. 22: 265-270.
McPherson, J.D., Marra, M., Hillier, L., Waterston, R.H., Chinwalla, A., Wallis, J., Sekhon, M., Wylie, K., Mardis, E.R., Wilson, R.K., et al. 2001. A physical map of the human genome. Nature 409: 934-941.
Olson, M.V., Dutchik, J.E., Graham, M.Y., Brodeur, G.M., Helms, C., Frank, M., MacCollin, M., Scheinman, R., and Frank, T. 1986. Random clone strategy for genomic restriction mapping in yeast. Proc. Natl. Acad. Sci. 83: 7826-7830.
Schein, J.E., Tanger, K.L., Chiu, R., Shin, H., Lengeler, K.B., MacDonald, W.K., Bosdet, I., Heitman, J., Jones, S.J., Marra, M.A., et al. 2002. Physical maps for genome analysis of serotype A and D strains of the fungal pathogen Cryptococcus neoformans. Genome Res. 9: 1445-1453.
Soderlund, C., Longden, I., and Mott, R. 1997. FPC: A system for building contigs from restriction fingerprinted clones. CABIOS 13: 523-535.
Soderlund, C., Humphray, S., Dunham, I., and French, L. 2000. Contigs built with fingerprints, markers, and FPC V4.7. Genome Res. 11: 934-941.
Sulston, J., Mallett, F., Staden, R., Durbin, R., Horsnell, T., and Coulson, A. 1988. Software for genome mapping by fingerprinting techniques. CABIOS 4: 125-132.
Zhu, H., Blackmon, B.P., Sasinowski, M., and Dean, R.A. 1999. Physical map and organization of chromosome 7 in the rice blast fungus, Magnaporthe grisea. Genome Res. 9: 739-750.
http://www.genome.clemson.edu/fpc/; Web-based FPC physical maps.
http://www.genome.arizona.edu/software/fpc/; FPC home page.
http://www.ensembl.org/; Ensembl genome browser.
http://www.ncbi.nlm.nih.gov/mapview/; NCBI Map Viewer.
http://ice.bcgsc.ca; iCE home page.
http://java.sun.com; Java home page, Sun Microsystems Inc.
http://www.borland.com; Borland Software Corp. (JBuilder vendor).
http://www.mysql.com; MySQL open source SQL database home page.
Received September 17, 2002; accepted in revised form March 14, 2003.