Clinical Medicine and Molecular Genetics: Data Management and Analysis
Jörg Lange, Toralf Kirsten, Maciej Rosolowski
Interdisciplinary Centre for Bioinformatics
University of Leipzig
Department of Computer Sciences
University of Leipzig and IZBI
Hilmar Berger, Markus Löffler
Institute for Medical Informatics, Statistics and Epidemiology
University of Leipzig
In recent years, the availability of high-throughput methods for the simultaneous analysis of thousands of molecular parameters, such as gene expression, genotype or DNA copy number, has revolutionised the way researchers look for cellular changes in tumours.
The management, storage and analysis of this type of data require new software tools designed to handle very large data sets. Software for managing microarray data in a data warehouse, together with new analysis methods, was developed at the IZBI Leipzig. These solutions lend themselves to the handling of microarray data of various types and to their analysis in an integrated environment.
Two joint research projects currently benefit from these resources. The project “Molecular mechanisms in malignant lymphoma” was established to expand knowledge of the molecular mechanisms that lead to malignant lymphoma, which may in turn yield better diagnostic parameters and treatment targets. In this project, tumour samples are analysed by gene expression profiling and array comparative genomic hybridization (array-CGH). All array data are correlated with clinical data and with the results of a standardized fluorescence in situ hybridization (FISH) and immunohistochemical panel.
The “German glioma network” is a second joint project, involving six German university clinics, in which brain tumour samples and clinical data are collected and a standardized genetic analysis is performed on each sample. Here too, clinical data and molecular features of the tumours will be correlated in order to find prognostic and diagnostic markers.
Both projects produce different types of data: low-dimensional clinical, cytogenetic and histopathological/immunohistochemical data on the one hand, and high-dimensional array data on the other. These data have to be stored in a way that respects the special characteristics of each data type while still facilitating access to, and analysis of, the complete dataset.
We therefore chose an approach that uses specialised databases for low- and high-dimensional data. The array data are managed by the data warehouse GeWare, which was developed at the IZBI. Clinical, genetic and histopathological data are captured in a commercial study management system.
For analysis, the data from both systems will be merged in the analysis platform by means of defined mappings. Several methods for statistical analysis and data visualization have been developed. GeWare was extended to support new array types and the automatic import of low-dimensional data sets.
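The idea of merging by defined mappings can be illustrated with a minimal sketch. It assumes that each chip in the array database can be linked to one patient record in the study management system through an explicit chip-to-patient table. All identifiers, field names and values below are hypothetical and are not taken from GeWare or from the study management system.

```python
from typing import Dict, List, Tuple

# Hypothetical low-dimensional records, keyed by patient identifier.
clinical_records: Dict[str, dict] = {
    "PAT-001": {"group": "A", "age": 61},
    "PAT-002": {"group": "B", "age": 34},
}

# Hypothetical high-dimensional records: expression values per chip.
array_data: Dict[str, Dict[str, float]] = {
    "CHIP-A": {"probe_1": 7.2, "probe_2": 3.1},
    "CHIP-B": {"probe_1": 5.9, "probe_2": 8.4},
    "CHIP-C": {"probe_1": 6.5, "probe_2": 4.0},
}

# The defined mapping: which chip belongs to which patient (assumed 1:1 here).
chip_to_patient: Dict[str, str] = {
    "CHIP-A": "PAT-001",
    "CHIP-B": "PAT-002",
    "CHIP-C": "PAT-999",  # no clinical record exists for this patient
}


def merge_by_mapping(
    arrays: Dict[str, Dict[str, float]],
    clinical: Dict[str, dict],
    mapping: Dict[str, str],
) -> Tuple[List[dict], List[str]]:
    """Join array and clinical data through the mapping.

    Returns the merged rows and the chip ids that could not be matched,
    so that gaps in the mapping are reported rather than silently dropped.
    """
    merged: List[dict] = []
    unmatched: List[str] = []
    for chip_id in sorted(arrays):
        patient_id = mapping.get(chip_id)
        record = clinical.get(patient_id)
        if record is None:
            unmatched.append(chip_id)
            continue
        row = {"chip_id": chip_id, "patient_id": patient_id}
        row.update(record)            # low-dimensional annotations
        row.update(arrays[chip_id])   # high-dimensional measurements
        merged.append(row)
    return merged, unmatched


merged_rows, unmatched_chips = merge_by_mapping(
    array_data, clinical_records, chip_to_patient
)
```

Keeping the mapping as a separate table leaves both source systems unchanged, and returning the unmatched chips makes missing links visible before any statistical analysis is run.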
New statistical methods for the analysis of gene expression and array-CGH data are currently under development and will be integrated into the analysis platform.
Figure 1: Gene expression matrix. The expression values for a selection of chips from the lymphoma research project and a selection of probesets (genes) are displayed as a heatmap. The data are analysed by hierarchical clustering along each dimension. In addition, an annotation-based classification of the chips is visualized to support combined analyses of genetic and clinical data.
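How clustering along both dimensions yields the row and column order of such a heatmap can be sketched as follows. The caption does not state the distance measure or linkage rule used, so the Euclidean distance and average linkage below are assumptions, and the small matrix is invented for illustration. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_order(vectors: Sequence[Sequence[float]]) -> List[int]:
    """Agglomerative clustering (average linkage, an assumed choice).

    Returns the indices of the input vectors in the order in which
    they would appear as leaves along one axis of the heatmap.
    """
    # Each cluster is the list of original indices it contains.
    clusters: List[List[int]] = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dists = [
                    euclidean(vectors[p], vectors[q])
                    for p in clusters[i]
                    for q in clusters[j]
                ]
                avg = sum(dists) / len(dists)
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i remains valid
        clusters[i] = merged
    return clusters[0] if clusters else []


def reorder_for_heatmap(
    matrix: List[List[float]],
) -> Tuple[List[int], List[int], List[List[float]]]:
    """Cluster rows (probesets) and columns (chips) independently."""
    columns = [list(col) for col in zip(*matrix)]
    row_order = cluster_order(matrix)
    col_order = cluster_order(columns)
    reordered = [[matrix[r][c] for c in col_order] for r in row_order]
    return row_order, col_order, reordered


# Invented toy data: rows are probesets, columns are chips.
expression = [
    [1.0, 9.0, 1.2, 9.1],
    [8.0, 2.0, 8.1, 2.2],
    [1.1, 9.2, 1.0, 9.0],
    [8.2, 2.1, 8.0, 2.0],
]
row_order, col_order, reordered = reorder_for_heatmap(expression)
```

After reordering, similar probesets and similar chips sit next to each other, which is what makes block patterns visible in a heatmap. This naive version recomputes all pairwise distances in every step and is only suitable for small selections.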