Vol. 12, Issue 10, 1619-1623, October 2002
RESOURCES
ABSTRACT
The Conserved Domain Architecture Retrieval Tool (CDART) performs similarity searches of the NCBI Entrez Protein Database based on domain architecture, defined as the sequential order of conserved domains in proteins. The algorithm finds protein similarities across significant evolutionary distances using sensitive protein domain profiles rather than by direct sequence similarity. Proteins similar to a query protein are grouped and scored by architecture. Relying on domain profiles allows CDART to be fast, and, because it relies on annotated functional domains, informative. Domain profiles are derived from several collections of domain definitions that include functional annotation. Searches can be further refined by taxonomy and by selecting domains of interest. CDART is available at http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi.
INTRODUCTION
The public release of multiple genomes has led to
a large amount of sequence data that requires
increasing expertise to query and understand. As of this writing, the
NCBI Entrez Protein Database (Wheeler et al. 2002) contains more than
800,000 nonredundant protein sequences. This data growth is further
complicated by the fact that experimental evidence about proteins has
lagged behind the rapid growth of sequence data, leading to incorrect or
insufficiently precise annotation. Because many of these new sequences
are predicted, they are often labeled solely by sequence similarity.
This can lead to an incorrect inference if the annotator does not take into account factors such as the extent of the sequence similarity and its relationship to functional domains and residues (Ponting and Dickens 2001) or if the similar protein is itself incorrectly annotated. Additionally, sequence similarity search algorithms, such as BLAST and PSI-BLAST (Altschul et al. 1997), implicitly deal with
functional domains, whereas explicit domain annotation can be of great
use in understanding homology, especially when searching iteratively.
One potential solution to these problems is to create new search
algorithms that allow scientists to efficiently and accurately
comprehend similarity based on functional domains.
Most proteins are composed of a finite lexicon of evolutionarily conserved functional domains. Several efforts are under way to create comprehensive databases of protein domains, including Pfam (Bateman et al. 2002), SMART (Letunic et al. 2002), and CDD (Marchler-Bauer et al. 2002). Figure 1 displays the number of proteins discovered each year and the number of presently known nonredundant domains found in these proteins. Although the number of
proteins discovered grows at increasingly higher rates, the number of
domains found appears to be asymptotically reaching a limit. Even with
a finite number of domains, however, efficiently searching for domains
in a large number of sequences can be computationally expensive.
Fortunately, a fast algorithm to find domains in sequences has been developed, RPS-BLAST (reverse-position-specific BLAST; Marchler-Bauer et al. 2002).
CDART is a Web-based tool that uses the domain definitions and annotations from the CDD database (which imports alignments from SMART and Pfam) and RPS-BLAST to allow users to rapidly query the Entrez Protein database by domain. To use the tool, a user enters a protein of interest, and RPS-BLAST is run on the protein to deduce its domain architecture. This domain architecture is then used to query CDART's database to find proteins with similar domain architectures, and the resulting proteins are displayed in a compact list. To show how distant homologies can be easily found, we use the tumor suppressor BRCA1 as an example.
RESULTS AND DISCUSSION
An Example
The CDART query page can be found on the Internet at http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi. On this page, one may enter a sequence accession or FASTA formatted sequence. For this example we enter the accession for human BRCA1, NP_009225. Pressing the search button runs RPS-BLAST on the sequence, comparing the sequence to the domain definitions in the CDD database. The search completes in a few seconds, and the results are displayed in Figure 2.
The top section of the results Web page displayed in Figure 2 shows the domains found in BRCA1 using a beads-on-a-string style. The domains are an N-terminal zinc finger and two C-terminal BRCT domains. This domain annotation, which is taken from CDD, immediately indicates that this protein binds to DNA and that it interacts with other proteins via the BRCT domain, information particularly useful if the function of BRCA1 had not been known.
The middle section of the Web page lists domain architectures of proteins found in Entrez that contain at least one domain found in the query sequence. These architectures are defined by the sequence of unique domains, where sequentially repeated domains are collapsed into a single occurrence of the domain. This culling of repeated domains is done for several reasons: repeats may be duplicated more easily than other types of domain insertions; the choice of the beginning residue of a repeat can be arbitrary and affects the number of repeats found; and the number of repeats included in the definition of a domain can cause variation in the number of repeats found in a hit. The domain architectures are ranked by the total number of domain clusters in common with the query. A better ranking would take into account the evolution of domain architectures. Unfortunately, the evolution of domain architectures remains a subject of research at present, and, to our knowledge, no reconstruction has yet been encoded in a way that CDART could use for ranking.
If there is more than one protein with a given architecture, the group is represented by an example sequence and description. Clicking on the description results in a Web page that lists all proteins in the group. The "more>" link next to each protein runs RPS-BLAST on the protein to give a detailed sequence alignment to the domain definitions.
The list of similar proteins extends over several pages and can be examined by clicking on the page numbers at the bottom of the page. In this case, more than 800 proteins are found to be homologous to the query protein. Running BLASTP using the same query sequence returns 340 results. The increase in neighbors is due to the double comparison done in CDART (protein to domain, then domain to protein) and the high sensitivity of the RPS-BLAST algorithm and domain profiles.
Subsetting by Domain
At the bottom of the results page shown in Figure 2 is a form to subset the results by domain. Similar or redundant domains are grouped together and are represented by the same symbol in the display. The grouping together of domains is accomplished by examining overlapping hits of the domains to proteins in nr, using the algorithm described in the Methods section.
For example, one can select the BRCT domain and click the subset button
to retrieve all proteins that contain the BRCT domain. Using PSI-BLAST,
this type of subsetting requires advance knowledge of where domains
exist on the query protein and restricting the search to the part of
the protein that contains the domain of interest. During each iteration
of PSI-BLAST, the user has to manually select sequences that construct
a Position Specific Scoring Matrix (the statistical profile of a
protein motif) with the desired sensitivity, a task made more difficult
if the sequences have been incorrectly annotated. In domain databases
like Pfam, SMART, and CDD, the sequences chosen and the extent of the
domain have been screened by a knowledgeable curator who edited the
multiple sequence alignments used to create the corresponding domain
profile. To illustrate this difference, querying PSI-BLAST with BRCA1
returns a large number of similar proteins, but the results are largely dictated by a large, uncharacterized domain in the center of the protein. To concentrate on a smaller domain like BRCT, the user would
have to know to manually limit the query to the BRCT domain. For
example, using PSI-BLAST without limiting the query fails to find PARP,
an NAD+ ADP-ribosyltransferase involved in a variety of
biological processes such as DNA repair, cell cycle, transformation,
carcinogenesis, and apoptosis (Hanai et al. 1998). Because it contains
zinc-finger and BRCT domains and participates in similar cellular
functions, PARP shares similarities with BRCA1 that may be of interest
to an investigator. The CDART query for BRCA1 lists PARP proteins in
the first few pages of results.
It is important to note, however, that CDART is limited to known domain definitions and that these domain definitions may not span the phylogenetic clade of interest. This coverage problem will be significantly reduced as more domains are discovered or defined and existing domain definitions are expanded. Presently 67% of the proteins in Entrez have one or more domain hits annotated by CDART.
Taxonomic Restriction
The list of similar proteins in CDART can be restricted taxonomically by clicking on the "Subset by Taxonomy" button at the bottom of the results page. Figure 3 displays the form used to select the parts of the taxonomic tree that are of interest. The number of proteins found under each taxonomic node is displayed next to the taxonomic common name, and a checkbox allows selection of the node, which also selects all underlying taxonomic nodes. The user has two choices at this point: either selecting one or more of the general taxonomic nodes and clicking on the "Go Back" button to select all organisms under the selected nodes, or clicking on the "Choose" button, which allows the user to prune the taxonomic tree at the species level.
Using the example of BRCA1, the user can select the taxonomic node
"Eubacteria," and CDART will return a list of bacterial proteins
containing BRCA1 domains (see Fig. 3). Interestingly, there is a large
family of DNA ligases that contain a single copy of the BRCT domain.
Many of these ligases are involved in cell cycle checkpoint functions
responsive to DNA damage (Bork et al. 1997).
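The node-selection behavior described above (choosing a node also chooses everything beneath it, and each node shows a protein count) can be illustrated with a small sketch. This is not CDART's implementation; the tree layout, the `(protein, taxon)` pairs, and the function names `descendants`, `restrict_by_taxonomy`, and `count_under` are assumptions made purely for illustration.

```python
def descendants(tree, node):
    """Return the set of nodes in the subtree rooted at `node`, including `node`.

    `tree` maps a taxonomic node name to a list of its child node names
    (a hypothetical in-memory stand-in for the taxonomy tree).
    """
    selected = {node}
    stack = [node]
    while stack:
        for child in tree.get(stack.pop(), []):
            if child not in selected:
                selected.add(child)
                stack.append(child)
    return selected


def restrict_by_taxonomy(proteins, tree, chosen_nodes):
    """Keep proteins whose taxon lies under any chosen node.

    `proteins` is a list of (protein_id, taxon) pairs.
    """
    allowed = set()
    for node in chosen_nodes:
        allowed |= descendants(tree, node)
    return [protein for protein, taxon in proteins if taxon in allowed]


def count_under(proteins, tree, node):
    """Number of proteins found under `node`, as shown next to each node name."""
    allowed = descendants(tree, node)
    return sum(1 for _, taxon in proteins if taxon in allowed)
```

Selecting a high-level node such as "Eubacteria" in this sketch keeps every protein assigned to any taxon below it, which mirrors the checkbox behavior in the form.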
Querying by Domain
The incorporation of CDD into NCBI's Entrez database allows the user to retrieve proteins that contain a particular domain. To do this, the user can go to the Entrez Domains database at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=cdd and search for a domain of interest. Once this domain is found, clicking on the "Proteins" link in the domain record launches CDART, which lists all of the proteins in nr that contain that domain.
Comparison With Other Resources
Because several groups have made domain-based resources available on the Internet, it is instructive to show how CDART compares with and improves on these resources.
The Karolinska Institutet version of Pfam at
http://www.cgr.ki.se/Pfam/ provides a search for proteins
with similar domain architectures appended to the standard
Pfam protein search. This tool is unique in that it includes domain
definitions from Pfam B, which is a computationally generated,
uncurated set of domain definitions, and also classifies proteins by
specific variations from the domain architecture of the query protein.
This search is limited to the SWISS-PROT protein database (Bairoch and Apweiler 2000) and does not explicitly search SMART. In comparison with CDART, this search does not group proteins by domain architecture, and
the number of domain architectures returned is limited. For example,
the PARP protein found in the CDART search detailed above does not
appear in the results of this search tool. Subsetting by taxonomy or
domain is not supported, although the user can manually query Pfam for
a particular domain combination.
The SMART database at http://smart.embl-heidelberg.de/ is a resource that allows the identification and annotation of genetically mobile domains and the analysis of domain architectures. SMART includes a sequence analysis tool that, given a query sequence, displays SMART and Pfam domains, signal peptides, and protein homologs. This tool allows the user to search for proteins with identical domain architectures and organizations and show their taxonomy. Unlike CDART, the tool does not search for similar domain architectures, nor does it allow subsetting by taxonomy or by domain, although, like Pfam, the user can manually query for a particular domain combination.
The clustering algorithm used in CDART is necessary to reduce the
redundancy of domain definitions in CDD. The Interpro database (Apweiler et al. 2001) at http://www.ebi.ac.uk/interpro/ clusters together and annotates a variety of protein signature databases, including the relationships between domains in Pfam and SMART. The relationships between these protein signature databases are complex (for example, the relationship between a motif and a domain), so in Interpro they are manually curated, although the curation is based in part on automatically generated data. In contrast, because CDART only requires
similarity between the domain definitions in CDD, it is able to create
the relationships algorithmically.
METHODS
Creation of the CDART Database
An essential step to finding similar proteins is to calculate the
domain architectures of all available proteins. Proteins are extracted
from the NCBI nonredundant protein database (nr). RPS-BLAST is used to
apply the domain definitions from the Conserved Domain Database (CDD;
Marchler-Bauer et al. 2002) to the nr protein set, and hits with an
E-value <0.01 are recorded. To filter out low-complexity
sequence, the seg filter is applied. To reduce false positives, each
hit must be at least 40% of the length of the domain definition. Hits
to sequences are then sorted by domain.
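The hit-filtering step above can be sketched as follows. This is a minimal illustration, not the CDART code: the `DomainHit` record and its field names are hypothetical, the seg filter is not modeled, and the 40% criterion is assumed here to compare the aligned span on the protein with the length of the domain definition.

```python
from dataclasses import dataclass

E_VALUE_CUTOFF = 0.01   # hits must have an E-value below this (from the text)
MIN_COVERAGE = 0.40     # hit must span at least 40% of the domain definition


@dataclass(frozen=True)
class DomainHit:
    protein: str        # protein identifier (hypothetical field)
    domain: str         # domain definition identifier
    start: int          # first aligned residue on the protein, inclusive
    end: int            # last aligned residue on the protein, inclusive
    evalue: float       # E-value reported for the hit
    domain_length: int  # length of the domain definition


def keep_hit(hit):
    """Apply the E-value and length-coverage thresholds to one hit."""
    aligned = hit.end - hit.start + 1
    return hit.evalue < E_VALUE_CUTOFF and aligned >= MIN_COVERAGE * hit.domain_length


def hits_by_domain(hits):
    """Filter the hits, then group the survivors by domain."""
    grouped = {}
    for hit in hits:
        if keep_hit(hit):
            grouped.setdefault(hit.domain, []).append(hit)
    return grouped
```

Grouping by domain at this stage makes the later all-against-all comparison of domains straightforward, since each domain's hits are already collected in one place.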
The CDD database contains redundant domain definitions and domain definitions that are closely related; reducing this redundancy simplifies the CDART results. To do this, we perform an all-against-all comparison of each domain's sequence hits to the sequence hits of all other domains. The comparison looks for overlapping sequence hits, defined as a >50% overlap of the length of either domain's hit to the sequence. If the number of overlapping hits exceeds 15% of the total number of hits to nr for either domain being compared, then both domains are recorded as being similar. At the end of this comparison, all similar domains are clustered together using single linkage clustering.
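One way to read the clustering procedure is sketched below. It is an interpretation under stated assumptions rather than the authors' code: a hit of one domain is counted as overlapping if it overlaps any hit of the other domain on the same protein, the 15% threshold is checked against each domain's own hit total, and single linkage clustering is implemented as connected components with a union-find structure. The input layout (`domain -> protein -> list of (start, end)`) and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def overlaps(a, b, min_fraction=0.5):
    """True if inclusive intervals a and b share >50% of the length of either."""
    shared = min(a[1], b[1]) - max(a[0], b[0]) + 1
    if shared <= 0:
        return False
    shorter = min(a[1] - a[0] + 1, b[1] - b[0] + 1)
    # Exceeding 50% of either hit is the same as exceeding 50% of the shorter one.
    return shared > min_fraction * shorter


def similar(hits_a, hits_b, min_share=0.15):
    """Compare two domains; each argument maps protein -> list of (start, end)."""
    def n_overlapping(src, other):
        return sum(
            1
            for protein, spans in src.items()
            for span in spans
            if any(overlaps(span, o) for o in other.get(protein, []))
        )

    total_a = sum(len(spans) for spans in hits_a.values())
    total_b = sum(len(spans) for spans in hits_b.values())
    return (n_overlapping(hits_a, hits_b) > min_share * total_a
            or n_overlapping(hits_b, hits_a) > min_share * total_b)


def cluster_domains(domain_hits):
    """Single linkage clustering: similar domains end up in the same cluster."""
    parent = {domain: domain for domain in domain_hits}

    def find(domain):
        while parent[domain] != domain:
            parent[domain] = parent[parent[domain]]  # path halving
            domain = parent[domain]
        return domain

    for a, b in combinations(domain_hits, 2):
        if similar(domain_hits[a], domain_hits[b]):
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for domain in domain_hits:
        clusters[find(domain)].add(domain)
    return list(clusters.values())
```

Because linkage is single, two domains can share a cluster without being directly similar, as long as a chain of pairwise-similar domains connects them.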
Using the sequence hits and the clusters of similar domains, domain
architectures are calculated for each protein in nr. These architectures are an N-terminus to C-terminus listing of the domain clusters found in each protein with consecutive repeats of a domain cluster collapsed to a single repeat. The architectures are recorded in
the CDART database along with taxonomic information taken from the NCBI
Entrez Taxonomy Database (Wheeler et al. 2002).
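The architecture calculation described above (an N-terminus to C-terminus listing of domain clusters, with consecutive repeats collapsed) can be sketched in a few lines. The input format, the `cluster_of` mapping, and the function name are assumptions for illustration only.

```python
from itertools import groupby


def domain_architecture(hits, cluster_of):
    """Return the architecture of one protein as a tuple of cluster labels.

    `hits` is an iterable of (start, domain) pairs for the protein;
    `cluster_of` maps each domain to the label of its domain cluster.
    """
    ordered = sorted(hits)  # sort by start position: N-terminus to C-terminus
    labels = [cluster_of[domain] for _, domain in ordered]
    # groupby merges runs of identical adjacent labels, so consecutive
    # repeats of a cluster collapse to one entry; non-adjacent repeats remain.
    return tuple(label for label, _ in groupby(labels))
```

For a protein with one zinc-finger hit followed by two adjacent BRCT hits, this yields a two-element architecture, which matches the collapsing rule described in the text.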
Querying CDART
When given a query sequence by the user, CDART runs RPS-BLAST to find domain hits. Each of these domain hits is assigned to a domain cluster. CDART then retrieves all domain architectures that contain any of the domain clusters in the query. These domain architectures are then ranked and listed by the total number of domain clusters in common with the query.
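The retrieval and ranking step can be sketched as below, assuming architectures are stored as tuples of domain-cluster labels. The function name and data layout are hypothetical, and ties are simply left in input order because the text does not specify a tie-breaking rule.

```python
def rank_architectures(query_arch, architectures):
    """Rank stored architectures by the number of domain clusters shared with the query.

    Architectures sharing no cluster with the query are not retrieved.
    Returns a list of (shared_count, architecture), highest count first.
    """
    query_clusters = set(query_arch)
    scored = []
    for arch in architectures:
        shared = len(query_clusters & set(arch))
        if shared > 0:
            scored.append((shared, arch))
    # list.sort is stable, so architectures with equal counts keep input order.
    scored.sort(key=lambda item: -item[0])
    return scored
```

Counting distinct shared clusters, rather than shared positions, means the order of domains does not affect the score in this sketch.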
For display purposes, overlapping hits in both the query and similar sequences are eliminated using the following algorithm: The highest scoring hit in a sequence is selected. Any overlapping hits are discarded, where overlapping is defined as >50% overlap of either hit. Then the next highest scoring hit is selected and the process is repeated until there are no remaining overlaps.
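The display-time overlap removal is a greedy selection, which can be sketched as follows. Hits are represented here as hypothetical `(score, start, end)` tuples, with a higher score assumed to be better; the function names are illustrative.

```python
def overlaps(a, b, min_fraction=0.5):
    """True if inclusive intervals a and b share >50% of the length of either."""
    shared = min(a[1], b[1]) - max(a[0], b[0]) + 1
    if shared <= 0:
        return False
    shorter = min(a[1] - a[0] + 1, b[1] - b[0] + 1)
    return shared > min_fraction * shorter


def remove_overlaps(hits):
    """Greedily keep the best-scoring hits that do not overlap a kept hit.

    `hits` is a list of (score, start, end) tuples for one sequence.
    The result is returned in N-terminus to C-terminus order.
    """
    kept = []
    for hit in sorted(hits, key=lambda h: h[0], reverse=True):
        span = (hit[1], hit[2])
        if not any(overlaps(span, (k[1], k[2])) for k in kept):
            kept.append(hit)
    return sorted(kept, key=lambda h: h[1])
```

Visiting hits in descending score order and discarding any hit that overlaps an already kept one is equivalent to repeatedly selecting the highest-scoring remaining hit and removing its overlaps.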
WEB SITE REFERENCES
http://smart.embl-heidelberg.de/; SMART program.
http://www.cgr.ki.se/Pfam/; Pfam program.
http://www.ebi.ac.uk/interpro/; Interpro program.
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=cdd; Entrez CDD.
http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi; CDART.
ACKNOWLEDGMENTS
We greatly appreciate the programming assistance of the NCBI Information Engineering Branch. In particular, we thank Jim Ostell for technical help in creating Figure 1, the BLAST group for providing RPS-BLAST, and the Taxonomy group for a variety of useful resources, including the common tree program used to do the taxonomic subsetting. The NCBI Structure group provided many helpful comments in the creation of CDART. We are also grateful to the NIH intramural research program for support.
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
FOOTNOTES
1 Corresponding author.
E-MAIL lewisg{at}mail.nih.gov; FAX (301) 435-7794.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.278202.
Received March 13, 2002; accepted in revised form August 7, 2002.