Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- 2 Experimental Approaches to Generation of PPI Data
- 3 Computational Methods for the Prediction of PPIs
- 4 Basic Properties and Measurements of Protein Interaction Networks
- 5 Modularity Analysis of Protein Interaction Networks
- 6 Topological Analysis of Protein Interaction Networks
- 7 Distance-Based Modularity Analysis
- 8 Graph-Theoretic Approaches to Modularity Analysis
- 9 Flow-Based Analysis of Protein Interaction Networks
- 10 Statistics and Machine Learning Based Analysis of Protein Interaction Networks
- 11 Integration of GO into the Analysis of Protein Interaction Networks
- 12 Data Fusion in the Analysis of Protein Interaction Networks
- 13 Conclusion
- Bibliography
- Index
7 - Distance-Based Modularity Analysis
Published online by Cambridge University Press: 28 January 2010
Summary
INTRODUCTION
The classic approaches to clustering follow a protocol termed “pattern proximity after feature selection” [158]. Pattern proximity is usually measured by a distance function defined on pairs of patterns. A distance measure captures the dissimilarity between two patterns, while a similarity measure characterizes how conceptually alike they are. In protein-protein interaction (PPI) networks, proteins are represented as nodes and interactions as edges. The relationship between two proteins is therefore a simple binary value: 1 if they interact, 0 if they do not. This lack of nuance makes it difficult to define a distance between two proteins. Reliable clustering of PPI networks is further complicated by the high rate of false positives and the sheer volume of data, as discussed in Chapter 2.
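The binary nature of this representation can be illustrated with a minimal sketch. The protein names and interactions below are hypothetical placeholders rather than real data, and `build_adjacency` is a helper introduced only for this illustration.

```python
# Hypothetical toy interactions; the protein names are placeholders, not real data.
interactions = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]


def build_adjacency(edges):
    """Return (sorted protein list, symmetric 0/1 adjacency matrix)."""
    proteins = sorted({p for edge in edges for p in edge})
    index = {p: i for i, p in enumerate(proteins)}
    n = len(proteins)
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        # An undirected interaction sets both symmetric entries to 1.
        adj[index[u]][index[v]] = 1
        adj[index[v]][index[u]] = 1
    return proteins, adj


proteins, adj = build_adjacency(interactions)
for name, row in zip(proteins, adj):
    print(name, row)
```

In this toy network the entries for the pairs (A, D) and (A, E) are both 0, even though A and D share the neighbor C while A and E share none. The raw adjacency value cannot distinguish the two cases, which is the gap a distance measure is meant to fill.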
Distance-based clustering employs these classic techniques and focuses on the definition of the topological or biological distance between proteins. These clustering approaches begin by defining the distance or similarity between every pair of proteins in the network. The resulting distance or similarity matrix can then be passed to traditional clustering algorithms. In this chapter, we will discuss a variety of approaches to distance-based clustering, all of which are grounded in these classic techniques.
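The two-step pipeline described here (define a pairwise distance, then hand the matrix to a traditional clustering algorithm) might be sketched as follows. Everything specific in this sketch is an assumption made for illustration: the toy edge list is hypothetical, the Jaccard distance over closed neighborhoods is just one possible topological distance, and single-linkage agglomerative clustering with a cutoff of 0.5 stands in for whichever traditional algorithm is actually used.

```python
# Hypothetical toy interactions; the protein names are placeholders, not real data.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]


def neighborhoods(edge_list):
    """Map each protein to its closed neighborhood (its partners plus itself)."""
    nbrs = {}
    for u, v in edge_list:
        nbrs.setdefault(u, {u}).add(v)
        nbrs.setdefault(v, {v}).add(u)
    return nbrs


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; chosen here only as an illustrative distance."""
    return 1.0 - len(a & b) / len(a | b)


def distance_matrix(edge_list):
    """Step 1: compute a distance for every ordered pair of proteins."""
    nbrs = neighborhoods(edge_list)
    proteins = sorted(nbrs)
    dist = {(p, q): jaccard_distance(nbrs[p], nbrs[q])
            for p in proteins for q in proteins}
    return proteins, dist


def single_linkage(proteins, dist, threshold):
    """Step 2: merge the closest clusters until the gap exceeds the threshold."""
    clusters = [{p} for p in proteins]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: cluster distance is the closest member pair.
                d = min(dist[p, q] for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


proteins, dist = distance_matrix(edges)
clusters = single_linkage(proteins, dist, threshold=0.5)
print([sorted(c) for c in clusters])
```

On this toy network the densely connected triangle A, B, C ends up in one cluster and the chain D, E in another. Swapping in a different distance function or clustering algorithm changes only one of the two steps, which is the main appeal of the distance-based formulation.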
TOPOLOGICAL DISTANCE MEASUREMENT BASED ON COEFFICIENTS
The simplest of these approaches use classic distance measurement methods and their various coefficient formulas to compute the distance between proteins in PPI networks. As discussed in [123], the distance between two nodes (proteins) in a PPI network can be defined as follows.
- Type: Chapter
- Information: Protein Interaction Networks: Computational Analysis, pp. 109-129. Publisher: Cambridge University Press. Print publication year: 2009