
8 - Penalized Integrative Analysis of High-Dimensional Omics Data

from Part B - Vertical Integrative Analysis (General Methods)

Published online by Cambridge University Press:  05 September 2015

George Tseng
Affiliation:
University of Pittsburgh
Debashis Ghosh
Affiliation:
Pennsylvania State University
Xianghong Jasmine Zhou
Affiliation:
University of Southern California
Jin Liu
Affiliation:
Duke-NUS Graduate Medical School
Xingjie Shi
Affiliation:
Shanghai University of Finance and Economics, China
Jian Huang
Affiliation:
University of Iowa
Shuangge Ma
Affiliation:
Capital University of Economics and Business, China

Summary

Abstract

With omics data, results generated from single-dataset analysis are often unsatisfactory. Integrative analysis methods conduct the joint analysis of data from multiple independent studies or on multiple correlated responses, can effectively increase power, and outperform single-dataset analysis and meta-analysis. In this chapter, we review the penalized integrative analysis methods under both the homogeneity and heterogeneity models. Computation using the coordinate descent approach is described. We also discuss several important extensions. The analysis of a genome-wide association study demonstrates the applicability of reviewed methods.

Introduction

In the study of complex diseases such as cancer, cardiovascular diseases, and autoimmune diseases, profiling studies are now routinely conducted, generating “large d, small n” data, where the number of omics features profiled (genes, SNPs, methylation loci, etc.) d is much larger than the sample size n. Many different types of analyses can be conducted. For example, Chapters 3 and 4 were focused on identifying meaningful networks. In this chapter, our analysis goal is to identify a small subset of omics measurements that are associated with disease outcomes or phenotypes. Such measurements are also referred to as “markers” in the literature and in this chapter. Statistically, this is a variable selection problem. The development of integrative analysis methods has been partly motivated by the following examples.

8.1.1 Example 1

Consider the analysis of data generated in multiple independent studies with comparable designs. For example, in Ma et al. (2011), four pancreatic cancer data sets are collected and analyzed. The four data sets were generated in four independent studies, all having a case-control design, collecting mRNA gene expression measurements and searching for genes associated with the risk of pancreatic cancer. In high-dimensional omics studies, it has been recognized that the results generated in single-data-set analysis often have unsatisfactory properties such as low reproducibility. Among many possible contributing factors, the most important one is perhaps the small n. Multi-data-set analysis can effectively increase sample size and outperform single-data-set analysis (Guerra and Goldstein, 2009). This perspective has been explained in multiple chapters of this book. When the designs of multiple studies are “close enough”, it can be reasonable to expect that they identify the same set of markers.

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2015
