Background

Genome-wide analyses, especially gene expression profiling using microarrays, have been used extensively in medical research and have led to the identification of several molecular signatures involved in various aspects of human disease pathogenesis. Individual studies have typically investigated relatively small numbers of samples, making cross-study validation a crucial step for the scientific community. Combined use of gene expression data from public repositories has proved difficult because of inherent differences in microarray platforms, protocols used in independent laboratories, experimental designs, and annotations for both genes and samples. Several methodologies have been proposed to address these issues, depending on the experimental strategies and on the biological and clinical questions. When sample phenotypes are known, statistical methods that handle data sets separately and then apply gene-wise meta-analytic approaches have proven successful, allowing the identification of statistically relevant intersections of molecular signatures from different studies (Rhodes et al., 2002; Ghosh et al., 2003; Rhodes et al., 2004; Wang et al., 2004). Advanced multilevel models are now available for this task (Conlon et al., 2007; Scharpf et al., 2009). As an alternative, the assimilation of gene expression measurements, achieved by merging the data sets, has also been used to evaluate molecular signatures obtained from different studies (Sorlie et al., 2003; Hu et al., 2005; Kapp et al., 2006; Hayes et al., 2006). Finally, we previously developed a method to evaluate cross-platform consistency of expression patterns, using integrative correlation (ICOR).

Abstract

The probability of expression (POE) scale was developed to achieve two main goals. (1) Microarray data are generated using a variety of measurement techniques that are not directly comparable. We sought to develop a common scale to which microarray data generated using differing methods could be converted and then compared. (2) In many cases, we are interested in determining whether genes and/or samples fall into one of three categories: overexpressed, underexpressed, or normally expressed. However, gene expression values are usually generated on a continuous scale. The scale that we have developed is categorical: it assigns each continuous expression value a probability of falling into each of these three categories. We describe the POE scale and several of its practical uses, and we demonstrate it on a lung cancer microarray data set.

POE: A Latent Variable Mixture Model

The Motivation and Practicality of POE

One reason for developing the probability of expression (POE) scale is to transform continuous expression values to a three-component categorical scale. Often the continuous expression values are displayed by image plots that use a red–green scale for over- and underexpression, respectively. The commonly used red and green image plots effectively display microarray data and generally have three main colors: red (overexpression), green (underexpression), and black (normal expression). The visual impact is that we see red, green, and black and not the “smooth-scale” that would blend from red to black to green as implied by continuous data. Our goal is to assign probabilities to each observed expression value where the probabilities correspond to the chance that an expression value should be called “red,” “green,” or “black.”
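To make the idea of assigning "red," "green," and "black" probabilities concrete, the following is a minimal sketch of how a three-component mixture can turn one continuous expression value into three category probabilities via Bayes' rule. It is an illustration under stated assumptions, not the chapter's fitted model: the choice of a normal density for normal expression and uniform densities for under- and overexpression, the function name `category_probabilities`, and all parameter values (`mean`, `sd`, `spread`, `weights`) are hypothetical and fixed by hand here, whereas in practice such parameters would be estimated from the data.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def uniform_pdf(x, low, high):
    """Density of a uniform distribution on [low, high] at x."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def category_probabilities(x, mean=0.0, sd=0.5, spread=4.0,
                           weights=(0.15, 0.70, 0.15)):
    """Return (p_under, p_normal, p_over) for one expression value x.

    Assumed (hypothetical) mixture: underexpression is uniform on
    [mean - spread, mean], normal expression is Gaussian around mean,
    and overexpression is uniform on [mean, mean + spread].
    """
    w_under, w_normal, w_over = weights
    under = w_under * uniform_pdf(x, mean - spread, mean)
    normal = w_normal * normal_pdf(x, mean, sd)
    over = w_over * uniform_pdf(x, mean, mean + spread)
    total = under + normal + over
    if total == 0.0:
        # x lies outside every component's numerical support.
        raise ValueError("expression value outside the modeled range")
    # Bayes' rule: each posterior is the weighted density over the total.
    return under / total, normal / total, over / total


if __name__ == "__main__":
    for value in (-3.0, -0.5, 0.0, 0.5, 3.0):
        p_green, p_black, p_red = category_probabilities(value)
        print(f"x={value:+.1f}  green={p_green:.3f}  "
              f"black={p_black:.3f}  red={p_red:.3f}")
```

With these illustrative settings, values near the center receive most of their probability in the "black" category, while values far above or below it are assigned almost entirely to "red" or "green." Because the output is a set of probabilities rather than raw intensities, values measured on different platforms can in principle be placed on the same footing once each platform's mixture has been fitted.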

20 - Optimized Cross-Study Analysis of Microarray-Based Predictors

7 - Models for Probability of Under- and Overexpression: The POE Scale
