Interdisciplinary Centre 


Multivariate procedures to increase power of microarray-based gene expression

Ingo Röder, Markus Löffler, Ernst Schuster
Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig

Cooperation:
Hans Binder, Toralf Kirsten (IZBI, University of Leipzig)
Knut Krohn (IZKF, University of Leipzig)
Markus Eszlinger (Med. Klinik u. Poliklinik II, Medical Faculty of the University of Leipzig)

Background and problem:
Microarray technology allows the simultaneous analysis of tens of thousands of genes. In most cases, however, the analysis is based on only a few replicates. This causes problems when applying classical multivariate tests, which require sample sizes exceeding the number of observed variables. Moreover, simultaneously testing tens of thousands of genes for differential expression raises the "multiple testing problem", i.e. the probability of obtaining at least one false positive result increases with the number of tests performed. To overcome these problems, a class of stable multivariate procedures based on the theory of spherical distributions has been proposed by Läuter, Glimm, and Kropf, 1996, 1998 [5,6]. These methods make it possible to use the multivariate information of many genes when testing for differential gene expression. Furthermore, multiple testing procedures based on these principles have been constructed (e.g., Kropf, Läuter, 2002 [4]), which strictly control the familywise type I error rate (FWE).

Results:
In this project, the above-mentioned methods have been generalized to allow the use of the full multivariate information on the expression intensities of individual genes analysed with the Affymetrix GeneChip technology. The usual strategy constructs an expression score for each gene by averaging the information from the different oligonucleotides (perfect match and mismatch) and then performs a test on these summarized expression values. In contrast, we developed a test procedure based on the complete multivariate perfect match information.
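The multiple testing problem described under "Background and problem" can be made concrete with a small calculation. The following Python sketch is an illustration only: it assumes m independent tests, each carried out at the same unadjusted level alpha, with all null hypotheses true, which real gene expression data will not satisfy exactly. The function names are hypothetical. It computes the familywise error rate 1 - (1 - alpha)^m and, for comparison, the per-test level of a simple Bonferroni correction, which is a standard generic adjustment and not the procedure developed in this project.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one false positive among m independent
    level-alpha tests when every null hypothesis is true (assumption)."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_level(alpha: float, m: int) -> float:
    """Per-test level that bounds the familywise error rate by alpha
    (generic Bonferroni adjustment, shown only for comparison)."""
    return alpha / m


if __name__ == "__main__":
    # Print how quickly the unadjusted familywise error rate grows with m.
    for m in (1, 10, 100, 10_000):
        print(m, familywise_error_rate(0.05, m), bonferroni_level(0.05, m))
```

With tens of thousands of genes, the unadjusted familywise error rate is practically 1, while the Bonferroni level per test becomes very small. This is the motivation for FWE-controlling procedures that retain more power.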
It is shown that a multiple FWE-controlling procedure for normally distributed data proposed by Westfall, Kropf, and Finos, 2004 [8], can be generalised to a more powerful procedure (WKF procedure) based on left-spherically distributed scores derived from the perfect match information, without losing the FWE-controlling property. Different variants of the WKF procedure are considered: the non-standardized principal component test (NPC), the standardized principal component test (SPC), the covariance sum test (CS), and the standardized sum test (SS).

To illustrate the proposed test procedures, which have been implemented in the statistical programming environment R, we analyse two previously published data sets: one compares gene expression of tumour and healthy tissue within the same patients, the other compares two groups of different patients. Using these examples, we demonstrate that the use of multivariate scores identifies differentially expressed genes more efficiently than the widely used MAS5 approach provided by the Affymetrix software tools (Affymetrix Microarray Suite 5 or GeneChip Operating Software), and even more efficiently than the robust version of a two-way analysis of variance (MDP) used to estimate the expression value for each individual gene, as suggested by Irizarry et al., 2003. Incorporating the multivariate perfect match information is superior to classical methods based on expression scores with respect to the number of identifiable differentially expressed genes (Fig. 1). For a detailed description of the results we refer to Schuster et al., 2004 and Krohn et al., 2005.
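As a rough illustration of how a score-based test can use the perfect match intensities of all probes of one gene, the Python sketch below follows the general idea of a standardized sum test for two groups of samples. Each probe is weighted by the inverse square root of its total sum of squares (computed around the overall mean, without using the group labels), the weighted intensities are summed to one score per sample, and an ordinary two-sample t statistic is computed on these scores. This is a simplified sketch under stated assumptions, not the implementation used in the project, which was written in R. The function name, the data layout and the toy numbers are hypothetical, and the p-value computation and the multiple testing step are omitted.

```python
import math
from statistics import mean


def standardized_sum_t(group_a, group_b):
    """Two-sample t statistic on standardized-sum scores for one gene.

    group_a, group_b: lists of samples; each sample is a list of probe
    intensities (hypothetical layout: one value per perfect match probe).
    """
    samples = group_a + group_b
    n_probes = len(samples[0])

    # Weights use the pooled samples only, never the group labels.
    weights = []
    for j in range(n_probes):
        column = [s[j] for s in samples]
        centre = mean(column)
        total_ss = sum((x - centre) ** 2 for x in column)
        if total_ss == 0.0:
            raise ValueError(f"probe {j} is constant; no weight defined")
        weights.append(1.0 / math.sqrt(total_ss))

    def score(sample):
        # One scalar score per sample: weighted sum over all probes.
        return sum(w * x for w, x in zip(weights, sample))

    za = [score(s) for s in group_a]
    zb = [score(s) for s in group_b]
    na, nb = len(za), len(zb)
    ma, mb = mean(za), mean(zb)

    # Pooled-variance two-sample t statistic with na + nb - 2 degrees of freedom.
    pooled_var = (sum((z - ma) ** 2 for z in za)
                  + sum((z - mb) ** 2 for z in zb)) / (na + nb - 2)
    return (ma - mb) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))


if __name__ == "__main__":
    # Toy data (invented): 3 samples per group, 3 probes for one gene.
    tumour = [[8.1, 7.9, 8.4], [8.3, 8.0, 8.6], [7.9, 7.7, 8.2]]
    healthy = [[9.0, 8.8, 9.5], [9.2, 9.1, 9.4], [8.9, 8.7, 9.6]]
    print(standardized_sum_t(tumour, healthy))
```

Because each weight is the inverse root of a probe's own total sum of squares, rescaling a single probe leaves the statistic unchanged, so no probe dominates the score merely through its measurement scale.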