June 2006
Volume 5

Center for Cancer Research: Frontiers in Science

 

 

National Cancer Institute

 


Molecular Biology

Fusion Gene Transcripts in Expressed Sequence Tags Database

Hahn Y, Bera TK, Gehlhaus K, Kirsch IR, Pastan IH, and Lee B. Finding fusion genes resulting from chromosome rearrangement by analyzing the expressed sequence databases. Proc Natl Acad Sci U S A 101: 13257–61, 2004.

The creation of fusion genes by chromosome translocation is a common feature of human cancer cells. The gene fusion often disrupts the normal regulation of the genes involved. It may result in overexpression of an oncogene, inactivation of a tumor suppressor gene, or production of an altered protein with modified function. Several specific fusion genes are known to be responsible for hematologic disorders. The BCR/ABL1 fusion gene, for instance, is found in more than 90% of patients with chronic myelogenous leukemia. Evidence is emerging that fusion genes are also important in epithelial carcinogenesis.

Chromosome translocations can be discovered by cytogenetic experiments, but it is difficult to tell whether a translocation has created a fusion gene and, if so, to identify it. Here we describe a procedure for identifying fusion genes by analyzing the expressed sequence tags (EST) database. ESTs are short (~500 bp) sequences of randomly selected cDNAs prepared from a variety of tissues. The current database holds more than 6 million human ESTs, about half of which are from cancer tissues or cancer-derived cell lines. ESTs from fusion genes can be identified in this database because they map to two different locations in the human genome. A complicating factor is that many such chimeric transcripts in the EST database are cloning artifacts generated during cDNA library construction. However, these artifacts can be separated from genuine fusion gene transcripts because the fusion point of an artifact usually falls within an exon, whereas that of a genuine fusion transcript usually falls at an exon-exon boundary.
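The exon-boundary criterion described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Segment` structure, the `tolerance` parameter, the gene names, and all coordinates are hypothetical and serve only to show the idea of checking whether the fusion point coincides with an annotated exon boundary on both sides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned half of a chimeric transcript (hypothetical structure)."""
    gene: str                   # gene the segment maps to
    chrom: str                  # chromosome of the alignment
    junction_pos: int           # genomic coordinate next to the fusion point
    exon_boundaries: frozenset  # annotated exon start/end coordinates of the gene


def junction_at_exon_boundary(seg, tolerance=2):
    """True if the fusion point lies within `tolerance` bases of an exon boundary.

    The tolerance is an assumption to absorb small alignment ambiguities.
    """
    return any(abs(seg.junction_pos - b) <= tolerance for b in seg.exon_boundaries)


def classify_chimera(left, right, tolerance=2):
    """Classify a transcript whose two halves align to different places."""
    if left.gene == right.gene:
        return "not_chimeric"
    if (junction_at_exon_boundary(left, tolerance)
            and junction_at_exon_boundary(right, tolerance)):
        # Fusion point sits at an exon-exon boundary on both sides.
        return "candidate_fusion"
    # Fusion point falls inside an exon on at least one side.
    return "likely_artifact"


# Made-up coordinates, for illustration only.
five_prime = Segment("GENE_A", "chr3", 1000, frozenset({1000, 1500}))
three_prime = Segment("GENE_B", "chr6", 5000, frozenset({5000, 5600}))
mid_exon = Segment("GENE_B", "chr6", 5300, frozenset({5000, 5600}))

print(classify_chimera(five_prime, three_prime))
print(classify_chimera(five_prime, mid_exon))
```

In this sketch a transcript is kept as a candidate only when both halves end at an exon boundary. Because the article describes the distinction as holding "usually", a real procedure would still need manual review of borderline cases.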

We developed a semi-automatic procedure, based on these principles, for the systematic identification of fusion gene transcripts in the mRNA and EST databases. Using this procedure, we identified 118 mRNAs and 196 ESTs as fusion gene transcript sequences, representing a total of 237 putative fusion genes. Among the mRNA sequences, 96 were previously annotated as fusion transcripts, including most of the BCR/ABL1 fusion transcript sequences.
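Because several transcripts can derive from the same fusion gene, the transcript counts and the fusion gene count differ. One way to picture the bookkeeping is to group transcripts by their ordered gene pair and count the distinct sequences supporting each pair. The following Python sketch assumes a hypothetical input of `(accession, 5' gene, 3' gene)` tuples with placeholder accessions; the `min_support` threshold is also an assumption and is not stated in the article.

```python
from collections import defaultdict


def group_by_gene_pair(transcripts):
    """Map each (5' gene, 3' gene) pair to the set of supporting accessions.

    `transcripts` is an iterable of (accession, five_prime_gene, three_prime_gene)
    tuples; this input layout is hypothetical.
    """
    support = defaultdict(set)
    for accession, gene5, gene3 in transcripts:
        support[(gene5, gene3)].add(accession)  # a set ignores duplicate accessions
    return dict(support)


def candidates_with_min_support(support, min_support=2):
    """Return gene pairs backed by at least `min_support` distinct transcripts."""
    return sorted(pair for pair, accs in support.items() if len(accs) >= min_support)


# Placeholder accessions, for illustration only.
records = [
    ("EST_1", "IRA1", "RGS17"),
    ("EST_2", "IRA1", "RGS17"),
    ("EST_3", "IRA1", "RGS17"),
    ("MRNA_1", "BCR", "ABL1"),
]
support = group_by_gene_pair(records)
for pair in candidates_with_min_support(support, min_support=2):
    print(pair, len(support[pair]))
```

Keeping the pair ordered (5' partner first) distinguishes reciprocal fusions, which a translocation can also produce.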

The procedure also identified 177 novel fusion gene candidates. We experimentally verified one of these, the IRA1/RGS17 fusion, which was supported by three independent EST clones (Figure 1). A reverse transcriptase (RT)-PCR experiment using an mRNA sample from the MCF7 breast cancer cell line yielded a clear band of the expected size. A fluorescence in situ hybridization (FISH) experiment using two BAC clones containing the IRA1 and RGS17 genes, respectively, detected a derivative chromosome, most likely the previously identified t(3;6)(q26;q25)del(3)(p14). The 5′-UTR exon 1 of IRA1 on 3q26.32 is fused with the start codon–bearing exon 2 of RGS17 on 6q25.2. The RGS17 protein is a member of the GTPase-activating proteins that act as regulators of G-protein signaling. Components of the G-protein–coupled receptor-signaling pathways, including RGS proteins, are known to be involved in many cancers and are considered potential therapeutic targets.


Figure 1. Prediction and verification of the IRA1/RGS17 fusion resulting from a chromosome translocation. A) Schematic representation of the IRA1/RGS17 fusion. Boxes represent the exons, and broken lines the introns. The fusion event is indicated by an arc. Arrows indicate the transcription start sites. Exons are numbered as they occur in the original genes. Primers for the reverse transcriptase (RT)–PCR reaction are indicated (T530 and T531). ORFs (open reading frames) are marked with grey boxes. B) RT-PCR detection of the fusion transcripts in MCF7 cells. The fusion gene transcripts for the previously known BCAS4/BCAS3 and the predicted IRA1/RGS17 fusions were detected in the cells. β-actin (ACTB) was used as the positive control. The product sizes of ACTB, BCAS4/BCAS3, and IRA1/RGS17 are 600, 328, and 367 bp, respectively. C) Detection of the 3;6 translocation in MCF7 cells by a fluorescence in situ hybridization (FISH) experiment. A representative result is presented. The IRA1 gene (red) and the RGS17 gene (green) are on chromosomes 3 and 6, respectively. Besides two copies each of chromosomes 3 and 6, a 3;6 translocation was detected (white arrow).

We expect to collect more fusion gene candidates in the future as the EST database continues to expand. A large collection of cancer-related gene fusions, attained through a combination of computational prediction and experimental verification, should present a new opportunity to uncover novel molecular mechanisms of carcinogenesis.

Yoonsoo Hahn, PhD
Visiting Fellow
hahny@mail.nih.gov

Byungkook Lee, PhD
Principal Investigator
Laboratory of Molecular Biology
NCI-Bethesda, Bldg. 37/Rm. 5120A
Tel: 301-496-6580
Fax: 301-480-4659
bk@nih.gov
