A Bayesian Framework for Integrating Copy Number and Gene Expression Data

Yuan Ji; Filippo Trentini; Peter Mueller

doi:10.1017/CBO9781139226448.017

16 - A Bayesian Framework for Integrating Copy Number and Gene Expression Data

Published online by Cambridge University Press: 05 June 2013

Edited by Kim-Anh Do, Zhaohui Steve Qin and Marina Vannucci

Affiliations:
Yuan Ji: NorthShore University Health-System
Filippo Trentini: Bocconi University
Peter Mueller: University of Texas
Kim-Anh Do: University of Texas, MD Anderson Cancer Center
Zhaohui Steve Qin: Emory University, Atlanta
Marina Vannucci: Rice University, Houston

Summary

Introduction

Overview

Diverse types of cancer genomics data are being collected widely and rapidly with the aim of systematically examining the origin and dynamics of different diseases. An important premise is that by integrating different types of genomics data, such as DNA copy number and RNA expression data, we will gain more knowledge about the underlying biological process. For example, high versus low correlation between a copy number aberration (CNA) for a gene marker and its abnormal RNA expression would indicate different disease mechanisms and would therefore require different treatment strategies.

We propose a Bayesian model-based framework for the integration of different types of genomics data. We employ a mixture model (Parmigiani et al., 2002) for the observed expression data that defines latent indicators representing the differential expression status of each gene. By operating on the latent indicators rather than on the raw measurements, we effectively mitigate the high noise level in the observed expression data. We integrate diverse types of genomics data through a regression of the latent variables across different data types. The regression model agrees naturally with biological knowledge and allows for the easy incorporation of other covariates.
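To make the idea of latent expression indicators concrete, the following is a minimal sketch, not the chapter's actual model. It assumes a three-component mixture for a single gene: a normal density for baseline expression and uniform densities for under- and over-expression. The function name `latent_indicator_probs` and all parameter values (`mu`, `sigma`, `kappa_minus`, `kappa_plus`, `weights`) are hypothetical and held fixed here, whereas a full Bayesian analysis would place priors on them and estimate them from the data.

```python
import numpy as np

def latent_indicator_probs(y, mu=0.0, sigma=1.0, kappa_minus=6.0, kappa_plus=6.0,
                           weights=(0.1, 0.8, 0.1)):
    """Posterior probabilities of the latent status (under, normal, over).

    Assumed (hypothetical) mixture for one gene:
      under-expressed: Uniform(mu - kappa_minus, mu)
      normal:          Normal(mu, sigma^2)
      over-expressed:  Uniform(mu, mu + kappa_plus)
    All parameters are fixed illustrative values, not estimates.
    """
    y = np.asarray(y, dtype=float)
    w_under, w_normal, w_over = weights

    # Component densities evaluated at each observation.
    normal_pdf = np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    under_pdf = np.where((y >= mu - kappa_minus) & (y < mu), 1.0 / kappa_minus, 0.0)
    over_pdf = np.where((y > mu) & (y <= mu + kappa_plus), 1.0 / kappa_plus, 0.0)

    # Bayes' rule: weight each density by its mixture weight, then normalize.
    joint = np.stack([w_under * under_pdf, w_normal * normal_pdf, w_over * over_pdf],
                     axis=-1)
    return joint / joint.sum(axis=-1, keepdims=True)

# Columns are P(under), P(normal), P(over) for each expression value.
probs = latent_indicator_probs([-4.0, 0.0, 4.0])
status = probs.argmax(axis=1) - 1  # map column index to -1, 0, +1
```

In an integrative model of the kind described above, indicators like `status` (or their probabilities) would play the role of the response in a regression on the corresponding copy number status, rather than the noisy raw expression values themselves.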

By definition, integration models must be able to borrow information from multiple genomic platforms, measured on the same patients and genes. For illustration purposes, we consider two of the most widely discussed genomic platforms: array comparative genomic hybridization (arrayCGH or aCGH), which measures DNA copy numbers, and expression microarrays, which measure RNA expression.

Type: Chapter
Information: Advances in Statistical Bioinformatics: Models and Integrative Inference for High-Throughput Data, pp. 331-349

DOI: https://doi.org/10.1017/CBO9781139226448.017

Publisher: Cambridge University Press

Print publication year: 2013
