Genome Research

Vol. 10, Issue 11, 1796-1806, November 2000

METHODS
In Silico Cloning of Novel Endothelial-Specific Genes

Lukasz Huminiecki and Roy Bicknell¹

Molecular Angiogenesis Laboratory, Imperial Cancer Research Fund, Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford OX3 9DS, UK

    ABSTRACT

The endothelium plays a pivotal role in many physiological and pathological processes and is known to be an exceptionally active transcriptional site. To advance our understanding of endothelial cell biology and to elucidate potential pharmaceutical targets, we developed a new database screening approach to permit identification of novel endothelial-specific genes. The UniGene gene index was screened using high stringency BLAST against a pool of endothelial expressed sequence tags (ESTs) and a pool of nonendothelial ESTs constructed from cell-type-specific dbEST libraries. UniGene clusters with matches in the endothelial pool and no matches in the nonendothelial pool were selected. The UniGene/EST approach was then combined with serial analysis of gene expression (SAGE) library subtraction and reverse transcription polymerase chain reaction to further examine interesting clusters. Four novel genes were identified and labeled: endothelial cell-specific molecules (ECSM) 1-3 and magic roundabout (similar to the axon guidance protein roundabout). In summary, we present a powerful novel approach for comparative expression analysis combining two datamining strategies followed by experimental verification.

    INTRODUCTION

In the postgenomic era, data analysis rather than data collection will present the biggest challenge to biologists. Efforts to ascribe biological meaning to genomic data, whether by identification of function, structure, or expression pattern, are lagging behind sequencing efforts (Boguski 1999). Here, we describe the use of two independent strategies for differential expression analysis combined with experimental verification to identify genes specifically or preferentially expressed in vascular endothelium.

The first strategy was based on an EST cluster expression analysis in the human UniGene gene index (Schuler et al. 1997). Recurrent gapped BLAST searches (Altschul et al. 1997) were performed at very high stringency against expressed sequence tags (ESTs) grouped into two pools. The two pools comprised endothelial cell and nonendothelial cell libraries derived from dbEST (Boguski et al. 1995). The second strategy used another datamining tool: SAGEmap xProfiler. xProfiler is a freely available online tool, which is a part of the NCBI's Cancer Genome Anatomy Project (CGAP) (Cole et al. 1995; Strausberg et al. 1997).

These two approaches alone produced a discouragingly high number of false positives. However, when both strategies were combined, predictions proved exceptionally reliable and four novel candidate endothelial-specific genes have been identified. For two of these genes, full-length cDNAs have been identified in sequence databases. Another gene (EST cluster) corresponds to a partial cDNA sequence from a large-scale cDNA sequencing project and contains a region of similarity to the intracellular domain of human roundabout homolog 1 (ROBO1).

    RESULTS

UniGene/EST Gene Index Screen

A pool of endothelial ESTs and a pool of nonendothelial ESTs were extracted using the Sequence Retrieval System (SRS) from dbEST. The endothelial pool consisted of 11,117 ESTs from nine human endothelial libraries (Table 1). The nonendothelial pool included 173,137 ESTs from 108 human cell lines and microdissected tumor libraries (Table 2). ESTs were extracted from dbEST, release April 2000. Multiple-FASTA files were transformed into BLAST searchable databases using the pressdb program. Table 3 shows the expression status of five known endothelial cell-specific genes in these two pools: von Willebrand factor (vWF; Ginsburg et al. 1985); two vascular endothelial growth factor receptors, fms-like tyrosine kinase 1 (FLT1; Shibuya et al. 1990) and kinase insert domain receptor (KDR; Matthews et al. 1991); tyrosine kinase receptor type tie (TIE1; Partanen et al. 1992); and tyrosine kinase receptor type tek (TIE2/TEK; Vikkula et al. 1996).

                              
Table 1.   Nine Human Endothelial Libraries from dbEST



                              
Table 2.   Nonendothelial dbEST Libraries



                              
Table 3.   Five Known Endothelial-Specific Genes in the dbEST Pools

Optimizing the BLAST E-value was crucial for the success of BLAST identity-level searches. Too high (permissive) an E-value would result in gene paralogs being reported. In contrast, too low (stringent) an E-value would result in many false negatives, i.e., true matches would not be reported because of sequencing errors in EST data; ESTs are large-scale, low-cost, single-pass sequences and have a high error rate (Aaronson et al. 1996). In this work, an E-value of 10e-20 was used in searches against the nonendothelial EST pool and a more stringent value of 10e-30 was used in searches against the smaller endothelial pool. These values were deemed optimal after a series of test BLAST searches.
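The pool-specific cutoffs can be illustrated with a minimal Python sketch. It assumes BLAST output has already been parsed into (subject identifier, E-value) pairs; the function name, the data layout, and the example hits are hypothetical and are not taken from the authors' PERL scripts. The thresholds are written exactly as quoted in the text.

```python
# Pool-specific E-value cutoffs, written as quoted in the text.
# Note: as a Python literal, 10e-20 equals 1e-19 and 10e-30 equals 1e-29.
THRESHOLDS = {
    "nonendothelial": 10e-20,
    "endothelial": 10e-30,
}


def significant_hits(hits, pool):
    """Keep hits whose E-value is at or below the cutoff for the given pool.

    hits: iterable of (subject_id, e_value) tuples (hypothetical parsed output).
    pool: "endothelial" or "nonendothelial".
    """
    cutoff = THRESHOLDS[pool]
    return [(sid, e) for sid, e in hits if e <= cutoff]


# Made-up hits for one query sequence.
example = [("EST_A", 1e-45), ("EST_B", 1e-25), ("EST_C", 1e-5)]
endo = significant_hits(example, "endothelial")
nonendo = significant_hits(example, "nonendothelial")
```

In this toy case the same hit (EST_B) passes the nonendothelial cutoff but fails the stricter endothelial one, which is the asymmetry the two thresholds are meant to create.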

SAGE Data and SAGEmap xProfiler Differential Analysis

Internet-based SAGE library subtraction (SAGEmap xProfiler) was used as the second datamining strategy for the identification of novel endothelial-specific or preferentially endothelial genes. Two endothelial SAGE libraries (SAGE_Duke_HMVEC and SAGE_Duke_HMVEC + VEGF with a total of 110,790 sequences) were compared with 24 nonendothelial cell line libraries (full list in Table 4, total of 733,461 sequences). Table 5 shows the status of expression of the five reference endothelial-specific genes in these two SAGE pools.

                              
Table 4.   Twenty-four Nonendothelial Cell Serial Analysis Gene Expression (SAGE)-Cancer Genome Anatomy Project



                              
Table 5.   Five Known Endothelial-Specific Genes in the CGAP SAGE Pools

Combined Data Gives Highly Accurate Predictions

Twenty known genes were selected in the UniGene/EST screen (Table 6). These genes had no matches in the nonendothelial pool and at least one match in the endothelial pool. The list contained four endothelial-specific genes: TIE1 (Partanen et al. 1992), TIE2/TEK (Vikkula et al. 1996), LYVE1 (Banerji et al. 1999), and multimerin (Hayward et al. 1998), indicating ~20% accuracy of prediction (4 of 20). Other genes on the list, although certainly preferentially expressed in endothelial cells, may not be endothelial specific. To improve the prediction accuracy, we decided to combine the UniGene/EST screen with the xProfiler SAGE analysis. Table 7 shows how data from the two approaches were combined. Identity-level BLAST searches were performed on mRNAs (known genes) or phrap-computed contigs (EST clusters representing novel genes) to investigate how these genes were represented in the endothelial and nonendothelial pools. Subsequent experimental verification by reverse transcription polymerase chain reaction (RT-PCR; Fig. 1) proved that the combined approach was 100% accurate, i.e., genes on the xProfiler list that had no matches in the nonendothelial EST pool and at least one match in the endothelial pool were indeed endothelial specific.
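The combination rule and the accuracy figure can be written out as a short Python sketch. The gene identifiers and match counts below are placeholders, not data from Tables 6 and 7, and the function names are illustrative assumptions.

```python
def prediction_accuracy(n_confirmed, n_predicted):
    """Percentage of predicted genes that are confirmed as endothelial specific."""
    return 100.0 * n_confirmed / n_predicted


def combine_screens(xprofiler_genes, endo_matches, nonendo_matches):
    """Keep xProfiler candidates with at least one endothelial EST match
    and no nonendothelial EST match.

    endo_matches / nonendo_matches: dict mapping gene -> number of EST matches
    (genes absent from a dict are treated as having zero matches).
    """
    return [
        gene
        for gene in xprofiler_genes
        if endo_matches.get(gene, 0) >= 1 and nonendo_matches.get(gene, 0) == 0
    ]


# UniGene/EST screen alone: 4 endothelial-specific genes among 20 known genes.
est_only_accuracy = prediction_accuracy(4, 20)

# Placeholder candidates to show the combined filter.
combined = combine_screens(
    ["g1", "g2", "g3"],
    endo_matches={"g1": 2, "g2": 1},
    nonendo_matches={"g2": 3},
)
```

Only the candidate supported by both data sources (here the placeholder g1) survives the combined filter.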

                              
Table 6.   Results of the UniGene/EST Screen



                              
Table 7.   xProfiler Differential Analysis was Combined with Data from the UniGene/EST Screen Achieving 100% Certainty of Prediction


Figure 1   Experimental verification by reverse transcription polymerase chain reaction (RT-PCR). Candidate endothelial-specific genes predicted by the combination of the UniGene/EST screen and xProfiler serial analysis of gene expression (SAGE) differential analysis (Table 8) were checked for expression in three endothelial and nine nonendothelial cell cultures. Endothelial cultures were as follows: HMVEC (human microvascular endothelial cells), HUVEC (human umbilical vein endothelial cells) confluent culture, and HUVEC proliferating culture. Nonendothelial cultures were as follows: normal endometrial stromal (NES) cells grown in normoxia and NES grown in hypoxia, MDA 453 and MDA 468 breast carcinoma cell lines, HeLa, FEK4 fibroblasts cultured in normoxia and FEK4 fibroblasts cultured in hypoxia, SW480, and HCT116, the last two being colorectal epithelium cell lines. ECSM1 and ECSM2 showed complete endothelial specificity, whereas ECSM3 and magic roundabout were very strongly preferentially expressed in the endothelium. Interestingly, all these novel genes appear more specific than the benchmark endothelial-specific gene, von Willebrand factor.

    DISCUSSION

There have been several reports of computer analysis of tissue transcriptomes. Usually an expression profile is constructed, based on the number of tags assigned to a given gene or a class of genes (Bernstein et al. 1996; Welle et al. 1999; Bortoluzzi et al. 2000). An attempt can be made to identify tissue-specific transcripts: for example, Vasmatzis et al. (1997) described three novel genes expressed exclusively in the prostate by in silico subtraction of libraries from the dbEST collection. Purpose-made cDNA libraries may also be used. Ten candidate granulocyte-specific genes have been identified by extensive sequence analysis of cDNA libraries derived from granulocytes and eleven other tissue samples, namely a hepatocyte cell line, fetal liver, infant liver, adult liver, subcutaneous fat, visceral fat, lung, colonic mucosa, keratinocytes, cornea, and retina (Itoh et al. 1998).

An analysis similar to the dbEST-based approach taken by Vasmatzis et al. (1997) is complicated by the fact that endothelial cells are present in all tissues of the body, so endothelial ESTs contaminate all bulk tissue libraries. To validate this, we used three well-known endothelial-specific genes (KDR, FLT1, and TIE2/TEK) as queries for BLAST searches against dbEST. Transcripts were present in a wide range of tissues, with multiple hits in well-vascularized tissues (e.g., placenta, retina), embryonic tissues (liver, spleen), or infant tissues (brain). In addition, we found that simple subtraction of endothelial EST libraries against all other dbEST libraries failed to identify any specific genes (data not shown).

Two very different types of expression data resources were used in our datamining efforts. The UniGene/EST screen was based on EST libraries from dbEST. There are nine human endothelial libraries in the current release of dbEST, with a relatively small total number of ESTs (11,117). Some well-known endothelial-specific genes are not represented in this data set (Table 3). This limitation raised our concerns that genes with low levels of expression would be overlooked in our analysis. Therefore, we used another type of computable expression data: CGAP SAGE libraries. SAGE tags are sometimes called small ESTs (usually 10-11 bp in length). Their major advantage is that they can be unambiguously located within the cDNA: they are immediately adjacent to the most 3' NlaIII restriction site. Although there are only two endothelial CGAP SAGE libraries available at the moment, they contain an impressive total of ~111,000 tags, a data set approximately ten times bigger than the 11,117 sequences in the endothelial EST pool. The combined approach proved very accurate (Fig. 1; Table 8) when verified by RT-PCR. We report here identification of four novel highly endothelial-specific genes: endothelial cell-specific molecule 1 (ECSM1; UniGene entry Hs.13957), endothelial cell-specific molecule 2 (ECSM2; UniGene entry Hs.30089), endothelial cell-specific molecule 3 (ECSM3; UniGene entry Hs.8135), and magic roundabout (UniGene entry Hs.111518). For a comprehensive summary of data available on these genes, see Table 8.

                              
Table 8.   Summary of Available Information on ECSM1-3 and Magic Roundabout

ECSM1 has no protein or nucleotide homologs. It codes for a small protein of ~103 aa, the longest and most upstream open reading frame (ORF) identified in the contig sequence.
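How the ORF was located is not described in the text. As an illustration only, the following Python sketch scans the three forward reading frames of a contig for ATG-to-stop ORFs and returns the longest one, preferring the most upstream start when lengths tie. The function name, the forward-strand restriction, and the toy sequence are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, length_aa) of the longest ATG-to-stop ORF on the
    forward strand, or None if no complete ORF exists.

    start and end are 0-based, end-exclusive nucleotide coordinates; end
    includes the stop codon. length_aa counts codons from ATG up to, but not
    including, the stop codon. Ties go to the most upstream start.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                length_aa = (i - start) // 3
                is_better = (
                    best is None
                    or length_aa > best[2]
                    or (length_aa == best[2] and start < best[0])
                )
                if is_better:
                    best = (start, i + 3, length_aa)
                start = None  # look for the next ORF in this frame
    return best


# Toy contig: ATG AAA TTT GGG TAA embedded in flanking bases.
toy_orf = longest_orf("CCATGAAATTTGGGTAACC")
```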

BLAST searches against the EMBL patent database revealed that ECSM2 corresponds to the cDNA from the patent "cDNA encoding novel polypeptide from human umbilical vein endothelial cell" (Shibayama et al. 1997), EMBL acc. E10591. A 205-aa polypeptide coded by this cDNA is a transmembrane protein with a suggested role in cell adhesion in that it is serine and proline rich, although no exact function has yet been identified.

ECSM3 was found to be identical with the matrix remodeling-associated gene 4 (MXRA4, cDNA sequence acc. AW888224) recently identified in a screen of 40,000 genes from 552 human cDNA libraries (M. Walker, pers. comm.). The strategy was based on the assumption that coexpression implies similar function (guilt by association). In total, eight novel genes coexpressed with 21 known matrix remodeling-associated genes were identified. In the human genome, ECSM3 is very closely associated with another endothelial-specific gene, AA4 (C1q/MBL/SPA receptor, C1qRp, Ly68). AA4 is a transmembrane protein expressed in vascular endothelial cells, aortic hematopoietic clusters, and fetal liver hematopoietic progenitors during fetal development, with a proposed role in the development of vascular and hematopoietic systems, especially cell-to-cell adhesion and/or signaling (Petrenko et al. 1999). By analyzing the 123,832-bp genomic clone AL118508, we found that ECSM3 is an immediate genomic neighbor of AA4. Both genes were contained within an 8000-bp sequence: MXRA4 has one exon, and AA4 has two exons and a small intron. MXRA4 contains a 402-bp region of strong homology (64.4% identity, E = 1.3e-24) to the 3' untranslated region (UTR) of the mouse (and not human) AA4 mRNA (acc. AF081789). Such an endothelial-specific gene cluster suggests the existence of a functional gene expression domain (for a review on expression domains, see Dillon et al. 2000). It is also possible that MXRA4 is a recent evolutionary insertion into the AA4 locus, and it now exploits a part of the AA4 regulatory sequence located in the former 3' UTR of the AA4 gene. Because the mouse AA4 genomic structure is not available, it is impossible to say whether a gene similar to MXRA4 is located in the vicinity. A BLAST search of the full-length MXRA4 cDNA against the mouse EST database reveals only two similar ESTs, both of which belong to the mouse AA4 transcript, suggesting that MXRA4 is not present at all in the mouse genome.

BLAST searches for the Hs.111518 contig identified a cDNA clone (GenBank acc. AK000805) with a long ORF of 417 aa (accession no. BAA91382). This sequence is rich in prolines and has several regions of low amino-acid complexity. BLAST PRODOM search (protein families database at Human Genome Project Resource Centre) identified a 120-bp region of homology to the conserved cytoplasmic domain of a family of transmembrane receptors involved in repulsive axon guidance (ROBO1 DUTT1 protein family; E = 4e-07). Homology was extended to 468 aa (E = 1.3e-09) when a more rigorous analysis was performed using ssearch (Smith and Waterman 1981), but the region of similarity was still restricted to the cytoplasmic domain. The ROBO1 DUTT1 family comprises the human roundabout homolog 1 (ROBO1), the mouse gene DUTT1, and the rat ROBO1 (Kidd et al. 1998; Brose et al. 1999). Because of this region of homology, we called the gene represented by Hs.111518 magic roundabout. In addition, BLAST SBASE (protein domain database at Human Genome Project Resource Centre) suggested a region of similarity to the domain of the intracellular neural cell adhesion molecule long domain form precursor (E = 2e-11). It should be noted that the true protein product for magic roundabout is likely to be larger than the 417 aa coded in the AK000805 clone because the ORF has no apparent upstream limit, and size comparison to human roundabout 1 (1651 aa) suggests a much bigger protein.

Recently, intriguing associations between neuronal differentiation genes and endothelial cells have been discovered. For example, a neuronal receptor for vascular endothelial growth factor (VEGF), neuropilin 1 (Soker et al. 1998), was identified. VEGF was traditionally regarded as an exclusively endothelial growth factor. Processes similar to neuronal axon guidance are now being implicated in guiding migration of endothelial cells during angiogenic capillary sprouting. Thus, ephrinB ligands and EphB receptors are involved in demarcation of arterial and venous domains (Adams et al. 1999). It is possible that magic roundabout may be an endothelial-specific homolog of the human roundabout 1 involved in endothelial-cell repulsive guidance, presumably with a different ligand because similarity is contained within the cytoplasmic (i.e., effector) region and guidance receptors are known to have highly modular architecture (Bashaw and Goodman 1999).

It should be noted that expression of endothelial-specific genes is not usually 100% restricted to the endothelial cell. KDR and FLT1 are both expressed in the male and female reproductive tract: on spermatogenic cells (Obermair et al. 1999), on trophoblasts, and in decidua (Clark et al. 1996). KDR has been shown to define hematopoietic stem cells (Ziegler et al. 1999). FLT1 is also present on monocytes. In addition to endothelial cells, vWF is strongly expressed in megakaryocytes (Nichols et al. 1985; Sporn et al. 1985) and, in consequence, is present on platelets. Similarly, multimerin is present both in endothelial cells (Hayward et al. 1993) and platelets (Hayward et al. 1998). Generally speaking, endothelial and hematopoietic cells descend from the same embryonic precursors, hemangioblasts, and many cellular markers are shared between these two cell lineages (for review, see Suda et al. 2000). A surprising result of our RT-PCR analysis was that the genes identified here (ECSM1-3 and magic roundabout) appear to show greater endothelial specificity (Fig. 1) than does the classic endothelial marker von Willebrand factor.

As stated before, vascular endothelium plays a central role in many physiological and pathological processes and it is known to be an exceptionally active transcriptional site. Approximately 1000 distinct genes are expressed in an endothelial cell. In contrast, red blood cells were found to express 8 separate genes, platelets to express 22, and smooth muscle to express 127 (Adams et al. 1995). Known endothelial-specific genes attract much attention from both basic research and the clinical community. For example, the endothelial-specific tyrosine kinases TIE1, TIE2/TEK, KDR, and FLT1 are crucial players in the regulation of vascular integrity and angiogenesis (Sato et al. 1993, 1995; Aiello et al. 1995; Fong et al. 1995; Shalaby et al. 1995). Angiogenesis is now widely recognized as a rate-limiting process for the growth of solid tumors. It is also implicated in the formation of atherosclerotic plaques and restenosis. Finally, the endothelium plays a central role in the complex and dynamic system regulating coagulation and hemostasis.

Our combined datamining approach, together with experimental verification, is a powerful functional genomics tool. This type of analysis can be applied to many cell types, not just endothelial cells. The challenge of identifying the function of discovered genes remains, but bioinformatics tools such as structural genomics or homology and motif searches can offer insights that can then be verified experimentally.

    METHODS

Database Sequence Retrieval

Locally stored UniGene files (Build #111, release date May 2000) were used in the preparation of the final version of this paper. The UniGene Web site can be accessed at http://www.ncbi.nlm.nih.gov/UniGene/. UniGene files can be downloaded from the ftp repository at ftp://ncbi.nlm.nih.gov/repository/unigene/. Representative sequences for the human subset of UniGene (the longest EST within the cluster) are stored in the file Hs.seq.uniq, whereas all ESTs belonging to the cluster are stored in a separate file called Hs.seq.

Sequences were extracted from the dbEST database accessed locally at the Human Genome Project Resource Centre using the SRS (SRS version 5) getz command. This was performed repeatedly using a PERL script for all the libraries in the endothelial and nonendothelial subsets, and sequences were merged into two multiple-FASTA files.
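The merging step can be sketched in Python as a stand-in for the authors' PERL script, which is not reproduced in the paper. The sketch assumes each library has already been retrieved as FASTA text; the SRS getz retrieval itself is not modeled, and all function names and example records are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def merge_libraries(library_texts):
    """Concatenate the records of several libraries into one pool."""
    pool = []
    for text in library_texts:
        pool.extend(parse_fasta(text))
    return pool


def to_fasta(records, width=60):
    """Serialize records as a multiple-FASTA string with wrapped sequence lines."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Two made-up libraries merged into one pool.
lib_a = ">est1 lib=A\nACGT\nAC\n"
lib_b = ">est2 lib=B\nGGCC\n"
pool = merge_libraries([lib_a, lib_b])
pool_fasta = to_fasta(pool)
```

Running this once over the endothelial libraries and once over the nonendothelial libraries would give the two multiple-FASTA files described above.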

Selection Criteria for Nonendothelial EST Libraries

Selection of 108 nonendothelial dbEST libraries was largely manual. Initially, the list of all available dbEST libraries (http://www.ncbi.nlm.nih.gov/dbEST/libs_byorg.html) was searched using the keyword "cells" and the phrase "cell line". Although this search identified most of the libraries, additional keywords had to be added for the list to be full: "melanocyte," "macrophage," "HeLa," and "fibroblast." In some cases, the detailed library description was consulted to confirm that the library is derived from a cell line/primary culture. We also added a number of CGAP microdissected-tumor libraries. For that, Library Browser (available at http://www.ncbi.nlm.nih.gov/CGAP/hTGI/lbrow/cgaplb.cgi) was used to search for the keyword "microdissected."
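The keyword pass can be approximated with a small Python filter. This is a sketch only: the library names and descriptions are invented, the keyword list is taken from the paragraph above, and the manual review of detailed library descriptions that the authors describe is not replaced by it.

```python
# Keywords and phrases quoted in the text, lowercased for matching.
KEYWORDS = ("cells", "cell line", "melanocyte", "macrophage", "hela", "fibroblast")


def select_libraries(descriptions):
    """Return the sorted names of libraries whose description contains a keyword.

    descriptions: dict mapping library name -> free-text description
    (hypothetical stand-in for the dbEST library list).
    """
    return sorted(
        name
        for name, text in descriptions.items()
        if any(keyword in text.lower() for keyword in KEYWORDS)
    )


# Invented descriptions for illustration.
candidates = select_libraries(
    {
        "lib1": "HeLa cervical carcinoma",
        "lib2": "bulk placenta tissue",
        "lib3": "Melanocyte primary culture",
        "lib4": "Colon carcinoma cell line",
    }
)
```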

UniGene Gene Index Screen

The UniGene gene transcript index was screened against the EST division of GenBank, dbEST. Both UniGene and dbEST were developed at the National Center for Biotechnology Information (NCBI). UniGene is a collection of EST clusters corresponding to putative unique genes. It currently consists of four data sets: human, mouse, rat, and zebrafish. The human data set comprises approximately 90,000 clusters (UniGene Build #111, May 2000). By means of very high stringency BLAST identity searches, we aimed to identify those UniGene genes that have transcripts in the endothelial and not in the nonendothelial cell-type dbEST libraries. University of Washington BLAST2, which is a gapped version, was used as the BLAST implementation. The E-value was set to 10e-20 in searches against the nonendothelial EST pool and to 10e-30 in searches against the smaller endothelial pool.

Although UniGene does not provide consensus sequences for its clusters, the longest sequence within the cluster is identified. Thus, this longest representative sequence (multiple-FASTA file Hs.seq.uniq) was searched using very high stringency BLAST against the endothelial and nonendothelial EST pools. If the representative sequence reported no matches, the rest of the sequences belonging to the cluster (UniGene multiple-FASTA file Hs.seq) followed as BLAST queries. Finally, clusters with no matches in the nonendothelial pool and at least one match in the endothelial pool were selected using PERL scripts analyzing BLAST textual output.
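The selection logic can be outlined in Python, under one stated interpretation: for each pool, a cluster counts as matched if its representative sequence has a hit or, failing that, if any other member has a hit. The BLAST runs are assumed to have been reduced beforehand to sets of query identifiers with at least one hit per pool; the identifiers, data layout, and function names are hypothetical and do not reproduce the authors' PERL scripts.

```python
def cluster_has_match(representative, members, queries_with_hits):
    """True if the representative, or failing that any other member, has a hit.

    queries_with_hits: set of query IDs that reported at least one hit in a pool.
    """
    if representative in queries_with_hits:
        return True
    return any(m in queries_with_hits for m in members if m != representative)


def select_clusters(clusters, endo_hits, nonendo_hits):
    """Select clusters with at least one endothelial match and no nonendothelial match.

    clusters: dict mapping cluster ID -> (representative ID, list of member IDs).
    """
    selected = []
    for cluster_id, (rep, members) in clusters.items():
        in_endo = cluster_has_match(rep, members, endo_hits)
        in_nonendo = cluster_has_match(rep, members, nonendo_hits)
        if in_endo and not in_nonendo:
            selected.append(cluster_id)
    return sorted(selected)


# Invented clusters and hit sets.
clusters = {
    "c1": ("e1", ["e1", "e2"]),  # endothelial hit via a non-representative member
    "c2": ("e3", ["e3", "e4"]),  # excluded: member e4 hits the nonendothelial pool
    "c3": ("e5", ["e5"]),        # excluded: no endothelial hit
}
selected = select_clusters(clusters, endo_hits={"e2", "e3"}, nonendo_hits={"e4"})
```

Checking the representative first mirrors the order described above and avoids examining the remaining members whenever the longest sequence already answers the question.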

xProfiler SAGE Subtraction

xProfiler enables an online user to perform a differential comparison of any combination of 47 SAGE libraries with a total of ~2,300,000 SAGE tags using a dedicated statistical algorithm (Chen et al. 1998). xProfiler can be accessed at http://www.ncbi.nlm.nih.gov/SAGE/sagexpsetup.cgi. SAGE itself is a quantitative expression technology in which genes are identified by typically a 10- or 11-bp sequence tag adjacent to the cDNA's most 3' NlaIII restriction site (Velculescu et al. 1995).
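The tag definition can be made concrete with a short Python sketch. NlaIII recognizes the sequence CATG, so the tag is taken as the 10 bp immediately downstream of the last CATG in the cDNA. The function name and the toy sequence are illustrative, and practical details such as poly(A) tail handling are ignored.

```python
def sage_tag(cdna, tag_len=10, site="CATG"):
    """Return the tag immediately 3' of the most 3' NlaIII site (CATG).

    Returns None if the sequence has no site or fewer than tag_len bases
    follow the last site.
    """
    cdna = cdna.upper()
    pos = cdna.rfind(site)  # most 3' occurrence of the recognition site
    if pos == -1:
        return None
    start = pos + len(site)
    tag = cdna[start:start + tag_len]
    return tag if len(tag) == tag_len else None


# Toy cDNA with two CATG sites; the tag follows the second one.
toy_tag = sage_tag("AACATGTTTTCATGACGTACGTACGGAAAA")
```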

The two available endothelial cell libraries (SAGE_Duke_HMVEC and SAGE_Duke_HMVEC + VEGF) defined pool A, and 24 (see Table 4 for list) nonendothelial libraries together built pool B. The approach was verified by establishing the status of expression of the five reference endothelial-specific genes in the two SAGE pools (Table 5) using Gene to Tag Mapping (http://www.ncbi.nlm.nih.gov/SAGE/SAGEcid.cgi). Subsequently, xProfiler was used to select genes differentially expressed between the pools A and B. The xProfiler output consisted of a list of genes with a 10-fold difference in the number of tags in the endothelial compared with the nonendothelial pool sorted according to the certainty of prediction. A 90% certainty threshold was applied to this list.
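The two filters applied to the xProfiler output can be sketched in Python. This is not the xProfiler algorithm: the certainty value is treated as an input supplied by the tool, and computing the fold difference from tag frequencies normalized by pool size is an assumption made for illustration. Pool totals are the figures quoted in the Results; the gene names and counts are invented.

```python
POOL_A_TOTAL = 110_790  # endothelial SAGE tags (pool A)
POOL_B_TOTAL = 733_461  # nonendothelial SAGE tags (pool B)


def fold_difference(tags_a, tags_b):
    """Ratio of tag frequency in pool A to tag frequency in pool B.

    Normalizing by pool size is an assumption of this sketch.
    """
    freq_a = tags_a / POOL_A_TOTAL
    freq_b = tags_b / POOL_B_TOTAL
    if freq_b == 0:
        return float("inf") if freq_a > 0 else 0.0
    return freq_a / freq_b


def filter_candidates(rows, min_fold=10.0, min_certainty=0.90):
    """Keep rows meeting both thresholds, sorted by decreasing certainty.

    rows: list of dicts with keys "gene", "tags_a", "tags_b", "certainty".
    """
    kept = [
        row
        for row in rows
        if fold_difference(row["tags_a"], row["tags_b"]) >= min_fold
        and row["certainty"] >= min_certainty
    ]
    return sorted(kept, key=lambda row: row["certainty"], reverse=True)


# Invented rows for illustration.
rows = [
    {"gene": "g1", "tags_a": 30, "tags_b": 0, "certainty": 0.99},
    {"gene": "g2", "tags_a": 20, "tags_b": 10, "certainty": 0.95},
    {"gene": "g3", "tags_a": 5, "tags_b": 100, "certainty": 0.99},
    {"gene": "g4", "tags_a": 40, "tags_b": 1, "certainty": 0.80},
]
kept_genes = [row["gene"] for row in filter_candidates(rows)]
```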

The other CGAP online differential expression analysis tool, Digital Differential Display (DDD), relies on EST expression data (source library information) instead of using SAGE tags. We attempted to use this tool similarly to SAGEmap xProfiler but were unable to obtain useful results. Five out of nine endothelial and 64 out of 108 nonendothelial cell libraries used in our BLAST-oriented approach were available for online analysis using DDD (http://www.ncbi.nlm.nih.gov/CGAP/info/ddd.cgi). When such analysis was performed, the following were the 15 top scoring genes: annexin A2, actin gamma 1, ribosomal protein large P0, plasminogen activator inhibitor type I, thymosin beta 4, peptidylprolyl isomerase A, ribosomal protein L13a, laminin receptor 1 (ribosomal protein SA), eukaryotic translation elongation factor 1 alpha 1, vimentin, ferritin heavy polypeptide, ribosomal protein L3, ribosomal protein S18, ribosomal protein L19, and tumor protein translationally controlled 1. This list was rather surprising as it did not include any well-known endothelial-specific genes, did not have any overlap with SAGE results (Table 8), and contained many genes that in the literature are reported to be ubiquitously expressed (e.g., ribosomal proteins, actin, vimentin, ferritin). A major advantage of our UniGene/EST screen is that instead of relying on source library data and fallible EST clustering algorithms, it actually performs identity-level BLAST comparisons in search of transcripts corresponding to a gene.

Mining Data on UniGene Clusters

To quickly access information about UniGene entries (e.g., literature references, sequence tagged sites, homologs, references to function), online resources were routinely used: NCBI's UniGene and LocusLink interfaces and Online Mendelian Inheritance in Man.

ESTs in UniGene clusters are not assembled into contigs, so before any sequence analysis, contigs were created using phrap assembler (for documentation on phrap, see http://bozeman.mbt.washington.edu/phrap.docs/phrap.html).

To analyze genomic contigs AC005795 (44,399 bp) and AL118508 (123,832 bp) containing ECSM1 and ECSM3, respectively, NIX Internet interface for multiapplication analysis of large unknown nucleotide sequences was used. For further information on NIX, see http://www.hgmp.mrc.ac.uk/NIX/. Alignments of ECSM1 and ECSM3 against AC005795 and AL118508 were obtained using the NCBI interface to the Human Genome: The NCBI Map Viewer. For further information on the NCBI Map Viewer, see http://www.ncbi.nlm.nih.gov/genome/guide/.

To search for possible transmembrane domains and signal sequences in translated nucleotide sequences, three Internet-based applications were used: DAS, http://www.biokemi.su.se/~server/DAS/ (Cserzo et al. 1997); TopPred2, http://www.biokemi.su.se/~server/toppred2/ (Heijne 1992); and SignalP, http://www.cbs.dtu.dk/services/SignalP/ (Nielsen et al. 1997).

Computing Resources

Computing resources of the Oxford University Bioinformatics Centre (http://www.molbiol.ox.ac.uk) and the Human Genome Project Resource Centre (http://www.hgmp.mrc.ac.uk) were used.

Detailed information on the PERL scripts used in this work may be obtained from L.H. (lucash{at}icrf.icnet.uk).

Experimental Verification

To experimentally verify specificity of expression, we used RT-PCR. RNA was extracted from three endothelial and seven nonendothelial cell types cultured in vitro. Endothelial cultures were as follows: HMVEC (human microvascular endothelial cells), HUVEC (human umbilical vein endothelial cells) confluent culture, and HUVEC proliferating culture. Nonendothelial cultures were as follows: normal endometrial stromal (NES) cells grown in normoxia and NES grown in hypoxia, MDA 453 and MDA 468 breast carcinoma cell lines, HeLa, FEK4 fibroblasts cultured in normoxia and FEK4 fibroblasts cultured in hypoxia, SW480, and HCT116, the last two listed being colorectal epithelium cell lines.

If a sequence tagged site was available, dbSTS PCR primers were used and cycle conditions suggested in the dbSTS entry followed. Otherwise, primers were designed using the Primer3 program. Primers are listed in Table 9.

                              
Table 9.   List of Primers Used in Reverse Transcription Polymerase Chain Reactions

Tissue Culture Media, RNA Extraction, and cDNA Synthesis

Cell lines were cultured in vitro according to standard tissue culture protocols. In particular, endothelial media were supplemented with endothelial-cell growth supplement (ECGS; Sigma) and heparin (Sigma) to promote growth. Total RNA was extracted using the RNeasy Minikit (Qiagen) and cDNA synthesized using the Reverse-IT 1st Strand Synthesis Kit (ABgene).


    ACKNOWLEDGMENTS

We received extensive and patient help from many people in the British bioinformatics community, especially Drs. Sarah Butcher and John Peden from the Oxford University Bioinformatics Centre. We also thank Drs. Michael Göern and Ken Smith from the Imperial Cancer Research Fund laboratories for generous help with tissue culture techniques and preparation of RNAs for RT-PCR, and Prof. Adrian Harris and Dr. Chris Norbury for stimulating discussions.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.


    FOOTNOTES

1 Corresponding author.

E-MAIL bicknelr{at}icrf.icnet.uk; FAX 44 (0)-1865-222431.

Article and publication are at www.genome.org/cgi/doi/10.1101/gr.150700.

    REFERENCES

  • Aaronson, J.S., Eckman, B., Blevins, R.A., Borkowski, J.A., Myerson, J., Imran, S., and Elliston, K.O. 1996. Toward the development of a gene index to the human genome: An assessment of the nature of high-throughput EST sequence data. Genome Res. 6: 829-845.
  • Adams, M.D., Kerlavage, A.R., Fleischmann, R.D., Fuldner, R.A., Bult, C.J., Lee, N.H., Kirkness, E.F., Weinstock, K.G., Gocayne, J.D., White, O. 1995. Initial assessment of human gene diversity and expression patterns based on 83 million nucleotides of cDNA sequence. Nature 377: 3-174.
  • Adams, R.H., Wilkinson, G.A., Weiss, C., Diella, F., Gale, N.W., Deutsch, U., Risau, W., and Klein, R. 1999. Roles of ephrinB ligands and EphB receptors in cardiovascular development: Demarcation of arterial/venous domains, vascular morphogenesis, and sprouting angiogenesis. Genes & Dev. 13: 295-306.
  • Aiello, L.P., Pierce, E.A., Foley, E.D., Takagi, H., Chen, H., Riddle, L., Ferrara, N., King, G.L., and Smith, L.E.H. 1995. Suppression of retinal neovascularization in vivo by inhibition of vascular endothelial growth factor (VEGF) using soluble VEGF-receptor chimeric proteins. Proc. Natl. Acad. Sci. 92: 10457-10461.
  • Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25: 3389-3402.
  • Banerji, S., Ni, J., Wang, S.X., Clasper, S., Su, J., Tammi, R., Jones, M., and Jackson, D.G. 1999. LYVE-1, a new homologue of the CD44 glycoprotein, is a lymph-specific receptor for hyaluronan. J. Cell. Biol. 144: 789-801.
  • Bashaw, G.J. and Goodman, C.S. 1999. Chimeric axon guidance receptors: The cytoplasmic domains of slit and netrin receptors specify attraction versus repulsion. Cell 97: 917-926.
  • Bates, E.E., Ravel, O., Dieu, M.C., Ho, S., Guret, C., Bridon, J.M., Ait-Yahia, S., Briere, F., Caux, C., Banchereau, J. 1997. Identification and analysis of a novel member of the ubiquitin family expressed in dendritic cells and mature B cells. Eur. J. Immunol. 27: 2471-2477.
  • Bernstein, S.L., Borst, D.E., Neuder, M.E., and Wong, P. 1996. Characterization of the human fovea cDNA library and regional differential gene expression in the human retina. Genomics 32: 301-308.
  • Boguski, M.S. 1999. Biosequence exegesis. Science 286: 453-455.
  • Boguski, M.S. and Schuler, G.D. 1995. ESTablishing a human transcript map. Nat. Genet. 10: 369-371.
  • Bortoluzzi, S., d'Alessi, F., Romualdi, C., and Danieli, G.A. 2000. The human adult skeletal muscle transcriptional profile reconstructed by a novel computational approach. Genome Res. 10: 344-349.
  • Brose, K., Bland, K.S., Wang, K.H., Arnott, D., Henzel, W., Goodman, C.S., Tessier-Lavigne, M., and Kidd, T. 1999. Slit proteins bind Robo receptors and have an evolutionarily conserved role in repulsive axon guidance. Cell 96: 795-806.
  • Chen, H., Centola, M., Altschul, S.F., and Metzger, H. 1998. Characterization of gene expression in resting and activated mast cells. J. Exp. Med. 188: 1657-1668.
  • Clark, D.E., Smith, S.K., Sharkey, A.M., and Charnock-Jones, D.S. 1996. Localization of VEGF and expression of its receptors flt and KDR in human placenta throughout pregnancy. Hum. Reprod. 11: 1090-1098.
  • Cole, K.A., Krizman, D.B., and Emmert-Buck, M.R. 1999. The genetics of cancer---a 3D model. Nat. Genet. 21: 38-41.
  • Cserzo, M., Wallin, E., Simon, I., von Heijne, G., and Elofsson, A. 1997. Prediction of transmembrane alpha-helices in prokaryotic membrane proteins: The dense alignment surface method. Protein Eng. 6: 673-676.
  • Dillon, N. and Sabbattini, P. 2000. Functional gene expression domains: Defining the functional unit of eukaryotic gene regulation. Bioessays 7: 657-665.
  • Felbor, U., Gehrig, A., Sauer, C.G., Marquardt, A., Kohler, M., Schmid, M., and Weber, B.H.F. 1998. Genomic organization and chromosomal localization of the interphotoreceptor matrix proteoglycan-1 (IMPG1) gene: A candidate for 6q-linked retinopathies. Cytogenet. Cell. Genet. 81: 12-17.
  • Fong, G.H., Rossant, J., and Breitman, M.L. 1995. Role of the Flt-1 receptor tyrosine kinase in regulating the assembly of vascular endothelium. Nature 376: 65-69.
  • Gerhold, D. and Caskey, C.T. 1996. It's the genes! EST access to human genome content. Bioessays 18: 973-981.
  • Ginsburg, D., Handin, R.I., Bonthron, D.T., Donlon, T.A., Bruns, G.A., Latt, S.A., and Orkin, S.H. 1985. Human von Willebrand factor (vWF): Isolation of complementary DNA (cDNA) clones and chromosomal localization. Science 228: 1401-1406.
  • Hayward, C.P., Bainton, D.F., Smith, J.W., Horsewood, P., Stead, R.H., Podor, T.J., Warkentin, T.E., and Kelton, J.G. 1993. Multimerin is found in the alpha-granules of resting platelets and is synthesized by a megakaryocytic cell line. J. Clin. Invest. 91: 2630-2639.
  • Hayward, C.P., Cramer, E.M., Song, Z., Zheng, S., Fung, R., Masse, J.M., Stead, R.H., and Podor, T.J. 1998. Studies of multimerin in human endothelial cells. Blood 91: 1304-1317.
  • Hayward, C.P.M., Rivard, G.E., Kane, W.H., Drouin, J., Zheng, S., Moore, J.C., and Kelton, J.G. 1996. An autosomal dominant, qualitative platelet disorder associated with multimerin deficiency, abnormalities in platelet factor V, thrombospondin, von Willebrand factor, and fibrinogen and an epinephrine aggregation defect. Blood 87: 4967-4978.
  • Heijne, G. 1992. Membrane Protein Structure Prediction, Hydrophobicity Analysis and the Positive-inside Rule. J. Mol. Biol. 225: 487-494.
  • Itoh, K., Okubo, K., Utiyama, H., Hirano, T., Yoshii, J., and Matsubara, K. 1998. Expression profile of active genes in granulocytes. Blood 15: 1432-1441.
  • Kidd, T., Brose, K., Mitchell, K.J., Fetter, R.D., Tessier-Lavigne, M., Goodman, C.S., and Tear, G. 1998. Roundabout controls axon crossing of the CNS midline and defines a novel subfamily of evolutionarily conserved guidance receptors. Cell 92: 205-215.
  • Matthews, W., Jordan, C.T., Gavin, M., Jenkins, N.A., Copeland, N.G., and Lemischka, I.R. 1991. A receptor tyrosine kinase cDNA isolated from a population of enriched primitive hematopoietic cells and exhibiting close genetic linkage to c-kit. Proc. Natl. Acad. Sci. 88: 9026-9030.
  • Nichols, W.L., Gastineau, D.A., Solberg, L.A., and Mann, K.G. 1985. Identification of human megakaryocyte coagulation factor V. Blood 65: 1396-1406.
  • Nielsen, H., Engelbrecht, J., Brunak, S., and Heijne, G. 1997. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10: 1-6.
  • Obermair, A., Obruca, A., Pohl, M., Kaider, A., Vales, A., Leodolter, S., Wojta, J., and Feichtinger, W. 1999. Vascular endothelial growth factor and its receptors in male fertility. Fertil. Steril. 72: 269-275.
  • Partanen, J., Armstrong, E., Makela, T.P., Korhonen, J., Sandberg, M., Renkonen, R., Knuutila, S., Huebner, K., and Alitalo, K. 1992. A novel endothelial cell surface receptor tyrosine kinase with extracellular epidermal growth factor homology domains. Mol. Cell Biol. 12: 1698-1707.
  • Petrenko, O., Beavis, A., Klaine, M., Kittappa, R., Godin, I., and Lemischka, I.R. 1999. The molecular characterization of the fetal stem cell marker AA4. Immunity 10: 691-700.
  • Sato, T.N., Qin, Y., Kozak, C.A., and Audus, K.L. 1993. Tie-1 and tie-2 define another class of putative receptor tyrosine kinase genes expressed in early embryonic vascular system. Proc. Natl. Acad. Sci. 90: 9355-9358.
  • Sato, T.N., Tozawa, Y., Deutsch, U., Wolburg-Buchholz, K., Fujiwara, Y., Gendron-Maguire, M., Gridley, T., Wolburg, H., Risau, W., and Qin, Y. 1995. Distinct roles of the receptor tyrosine kinases Tie-1 and Tie-2 in blood vessel formation. Nature 376: 70-74.
  • Schuler, G.D. 1997. Pieces of the puzzle: Expressed sequence tags and the catalog of human genes. J. Mol. Med. 75: 694-698.
  • Shalaby, F., Rossant, J., Yamaguchi, T.P., Gertsenstein, M., Wu, X.F., Breitman, M.L., and Schuh, A.C. 1995. Failure of blood-island formation and vasculogenesis in Flk-1-deficient mice. Nature 376: 62-65.
  • Shibayama, S., Hirano, J., and Ono, H. 1997. cDNA encoding novel polypeptide from human umbilical vein endothelial cell. European Patent Office. Publication number: 0 682 113 A2.
  • Shibuya, M., Yamaguchi, S., Yamane, A., Ikeda, T., Tojo, A., Matsushime, H., and Sato, M. 1990. Nucleotide sequence and expression of a novel human receptor-type tyrosine kinase gene (flt) closely related to the fms family. Oncogene 5: 519-524.
  • Smith, T.F. and Waterman, M.S. 1981. Identification of common molecular subsequences. J. Mol. Biol. 147: 195-197.
  • Soker, S., Takashima, S., Miao, H.Q., Neufeld, G., and Klagsbrun, M. 1998. Neuropilin-1 is expressed by endothelial and tumor cells as an isoform-specific receptor for vascular endothelial growth factor. Cell 92: 735-745.
  • Sporn, L.A., Chavin, S.I., Marder, V.J., and Wagner, D.D. 1985. Biosynthesis of von Willebrand protein by human megakaryocytes. J. Clin. Invest. 76: 1102-1106.
  • Strausberg, R.L., Dahl, C.A., and Klausner, R.D. 1997. New opportunities for uncovering the molecular basis of cancer. Nat. Genet. 15: 415-416.
  • Suda, T., Takakura, N., and Oike, Y. 2000. Hematopoiesis and angiogenesis. Int. J. Hematol. 71: 99-107.
  • Tamura, N., Itoh, H., Ogawa, Y., Nakagawa, O., Harada, M., Chun, T.H., Suga, T., Yoshimasa, T., and Nakao, K. 1996. cDNA cloning and gene expression of human type I-alpha cGMP-dependent protein kinase. Hypertension 27: 552-557.
  • Vasmatzis, G., Essand, M., Brinkmann, U., Byungkook, L., and Pastan, I. 1997. Discovery of three genes specifically expressed in human prostate by expressed sequence tag database analysis. Proc. Natl. Acad. Sci. 95: 300-304.
  • Velculescu, V.E., Zhang, L., Vogelstein, B., and Kinzler, K.W. 1995. Serial analysis of gene expression. Science 270: 484-487.
  • Vikkula, M., Boon, L.M., Carraway, K.L., 3rd, Calvert, J.T., Diamonti, A.J., Goumnerov, B., Pasyk, K.A., Marchuk, D.A., Warman, M.L., Cantley, L.C. 1996. Vascular dysmorphogenesis caused by an activating mutation in the receptor tyrosine kinase TIE2. Cell 87: 1181-1190.
  • Welle, S., Bhatt, K., and Thornton, C.A. 1999. Inventory of high-abundance mRNAs in skeletal muscle of normal men. Genome Res. 9: 506-513.
  • Ziegler, B.L., Valtieri, M., Porada, G.A., De Maria, R., Muller, R., Masella, B., Gabbianelli, M., Casella, I., Pelosi, E., Bock, T. 1999. KDR receptor: A key marker defining hematopoietic stem cells. Science 285: 1553-1558.

Received June 1, 2000; accepted in revised form August 29, 2000.


10:1796-1806 ©2000 by Cold Spring Harbor Laboratory Press  ISSN 1088-9051/00 $5.00


