Interdisziplinäres Zentrum
für Bioinformatik



Transcription Factor Target Detection in Comparative Genomics


Peter F. Stadler
Bioinformatics Group
Department of Computer Science
University of Leipzig

Friedemann Horn
Institute of Clinical Immunology and Transfusion Medicine
University Hospital Leipzig

Claudia Fried, Toralf Kirsten, André Schuetzenmeister
Bioinformatics Group
Department of Computer Science
University of Leipzig


Cellular signaling pathways induce gene expression by activating specific transcription factors. Errors in the activation of transcription factors are common events in cancer, as several of these factors act on genes involved in cellular proliferation, survival, and differentiation. Identifying the targets of the tumorigenic transcription factors that cause these changes is an important task in cancer research.

One way to identify these target genes is to detect transcription factor binding sites. Binding sites can be predicted by searching for recurring motifs in the regulatory regions of co-expressed genes. This has been shown to be feasible in yeast, where the intergenic regions are very small. Intergenic regions in vertebrate genomes, by contrast, can be very large. A simple genome-wide search for exact string matches of experimentally verified binding sites in vertebrates therefore yields a high number of false positives. To overcome this problem, the search can be restricted to evolutionarily conserved regions. Conserved regions can be detected by phylogenetic footprinting with the program tracker (1), which compares the non-coding sequences surrounding a set of orthologous genes.
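
The effect of restricting a motif search to conserved regions can be illustrated with a minimal sketch. It is not the pipeline used in this project and does not reproduce tracker: the footprint intervals are simply given as input, the pattern TTCNNNGAA is a placeholder IUPAC motif chosen for illustration, and the sequence and function names are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(motif):
    """Reverse complement of an IUPAC motif string."""
    return motif.translate(COMPLEMENT)[::-1]


def find_sites(sequence, motif):
    """Return sorted 0-based start positions of motif matches on either strand."""
    sequence = sequence.upper()
    starts = set()
    for pattern in {motif, reverse_complement(motif)}:
        # A lookahead is used so that overlapping matches are reported too.
        regex = re.compile("(?=" + "".join(IUPAC[c] for c in pattern) + ")")
        starts.update(m.start() for m in regex.finditer(sequence))
    return sorted(starts)


def in_conserved(starts, motif_len, footprints):
    """Keep hits that lie entirely inside a conserved half-open interval [lo, hi)."""
    return [s for s in starts
            if any(lo <= s and s + motif_len <= hi for lo, hi in footprints)]


# Toy input: placeholder motif, made-up sequence, made-up footprint interval.
MOTIF = "TTCNNNGAA"
SEQ = "ACGT" + "TTCCCGGAA" + "ACGTACGT" + "TTCTAAGAA" + "CC"
FOOTPRINTS = [(0, 15)]

all_hits = find_sites(SEQ, MOTIF)
kept = in_conserved(all_hits, len(MOTIF), FOOTPRINTS)
print(all_hits)  # [4, 21]
print(kept)      # [4]
```

In this toy case the plain search reports two hits, and only the one inside the assumed conserved interval survives the filter.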

The aim of our study is to find target genes of the transcription factor Stat3 (signal transducer and activator of transcription 3), a member of the Stat family of transcription factors, which act as signal transducers of cytokines, hormones, and growth factors. Stat proteins are among the best-studied oncogenic transcription factors. Stat3 itself is involved in the regulation of cell growth, survival, and differentiation. Constitutively active Stat3 promotes uncontrolled growth and survival through dysregulation of gene expression.

We plan to apply two different approaches to the problem of target gene identification. The first approach identifies putative target genes by detecting Stat3 motifs in the regulatory regions of genes, using two consensus binding sites derived from several experimentally known binding sites. These data will be combined with data from phylogenetic footprinting. Additional criteria will be the distance of the detected binding site from the transcription start site of the gene and the occurrence of more than one Stat3 binding site, since strong binding of Stat3 is mediated by clusters of Stat3 binding sites.

Occurrence of Stat3 binding sites in the region of junB (ENSG00000171223), a known target gene of Stat3.
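
The two additional criteria of the first approach, distance to the transcription start site and clustering of binding sites, could be combined roughly as follows. This is a sketch under stated assumptions: the thresholds (5000 bp maximum distance, 200 bp cluster window, at least two sites) and the function name are hypothetical and are not values taken from the study.

```python
def candidate_clusters(site_positions, tss, max_distance=5000,
                       window=200, min_sites=2):
    """Group predicted binding sites near a TSS into clusters.

    site_positions: start coordinates of predicted sites (same coordinate
                    system as tss).
    max_distance, window, min_sites: illustrative thresholds (assumptions).
    Returns a list of clusters, each a sorted list of site positions.
    """
    # Criterion 1: keep only sites within max_distance of the TSS.
    near = sorted(p for p in site_positions if abs(p - tss) <= max_distance)

    # Criterion 2: greedily group sites that start within `window` bp of the
    # first site of the current group; keep groups with enough sites.
    clusters, current = [], []
    for p in near:
        if current and p - current[0] > window:
            if len(current) >= min_sites:
                clusters.append(current)
            current = []
        current.append(p)
    if len(current) >= min_sites:
        clusters.append(current)
    return clusters


# Made-up coordinates for illustration only.
sites = [100, 180, 900, 4000, 4050, 4120, 12000]
print(candidate_clusters(sites, tss=1000))
# [[100, 180], [4000, 4050, 4120]]
```

Isolated sites (900) and sites far from the transcription start site (12000) are dropped, while groups of neighbouring sites are retained as candidates.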


The second approach is to determine a candidate set of Stat3 target genes by correlation analysis of human gene expression data. The gene expression data are held in the Gene Expression Warehouse and analysis platform (GeWare) (2). Co-regulated genes, whose expression levels change in a correlated fashion across different microarray experiments, are often associated with a common set of transcription factors. For the resulting set of Stat3 candidate genes we will retrieve the known orthologous genes from the other vertebrate genomes for our database. Both the upstream and the downstream regions (e.g. within a 5 kb range of the translational start and stop signals) as well as all introns will be subjected to phylogenetic footprinting to identify the transcriptional regulatory sites in the co-expressed genes. Analysis of the Stat3 targets found by this study can provide new insight into the mechanisms of cancer and may shed light on strategies for targeted therapy.

Phylogenetic footprints in the flanking region of junB.
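
The correlation step of the second approach can be sketched with a plain Pearson correlation between expression profiles. This is only an illustration: it does not model the GeWare interface, the choice of junB as the reference profile and the 0.9 threshold are assumptions, and the expression values and the names GENE_A to GENE_C are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def coexpressed(profiles, seed, threshold=0.9):
    """Genes whose profile correlates with the seed gene at or above threshold."""
    ref = profiles[seed]
    return sorted(gene for gene, values in profiles.items()
                  if gene != seed and pearson(ref, values) >= threshold)


# Toy expression matrix: one value per microarray experiment (made-up data).
profiles = {
    "JUNB":   [1.0, 2.0, 3.0, 4.0],
    "GENE_A": [2.1, 3.9, 6.2, 8.0],   # rises with the seed
    "GENE_B": [4.0, 3.0, 2.0, 1.0],   # anti-correlated
    "GENE_C": [1.0, 1.0, 1.0, 1.0],   # flat
}
print(coexpressed(profiles, "JUNB"))  # ['GENE_A']
```

The genes returned by such a filter would form the candidate set whose flanking regions and introns are then passed on to phylogenetic footprinting.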


Publications:
(1) S. Prohaska, C. Fried, C. Flamm, G. P. Wagner, and P. F. Stadler
Surveying Phylogenetic Footprints in Large Gene Clusters: Applications to Hox Cluster Duplications.
Mol. Phylogenet. Evol. 31: 581-604 (2004)
(2) T. Kirsten, H. H. Do, and E. Rahm
A Multidimensional Data Warehouse for Gene Expression Databases.
Proc. German Conference on Bioinformatics (GCB) 2003, Munich.
