One major goal in biological research is to understand how genes are regulated through transcriptional regulatory networks. Recent advances in biotechnology have generated enormous amounts of data that can be used to pursue this goal. In this chapter, we develop a general statistical framework that integrates different data sources for transcriptional regulatory network reconstruction. More specifically, we apply measurement error models to network reconstruction using both gene expression data and protein–DNA binding data. A linear misclassification model is used to describe the relationship between the expression level of a specific gene and the binding activities of the proteins (transcription factors) that regulate this gene. We propose a Markov chain Monte Carlo method for statistical inference based on this model. Extensive simulations are conducted to evaluate the performance of this model and to assess how sensitive that performance is to misspecification of the model parameters. Our simulation results suggest that our approach can effectively integrate gene expression data and protein–DNA binding data to infer transcriptional regulatory networks. Lastly, we apply our model to jointly analyze gene expression data and protein–DNA binding data to infer transcriptional regulatory networks in the yeast cell cycle.
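To make the idea of a misclassification model concrete, the following is a minimal simulation sketch, not the chapter's actual model or inference procedure. It assumes a single transcription factor with a binary true binding state, an observed binding call that is wrong with fixed false-negative and false-positive rates, and a linear effect of true binding on expression. All names (`simulate`, `ols_slope`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate(n_genes=5000, beta0=0.5, beta1=2.0, prevalence=0.3,
             false_neg=0.3, false_pos=0.1, sigma=1.0, seed=0):
    """Simulate true binding x, observed binding w, and expression y.

    Every parameter value here is an illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    # True (unobserved) binding state of the transcription factor per gene.
    x = rng.random(n_genes) < prevalence
    # Observed binding call: flipped with a rate that depends on the truth.
    flip_prob = np.where(x, false_neg, false_pos)
    flip = rng.random(n_genes) < flip_prob
    w = np.where(flip, ~x, x)
    # Expression depends linearly on the TRUE binding state, plus noise.
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, n_genes)
    return x.astype(float), w.astype(float), y


def ols_slope(predictor, y):
    """Least-squares slope of y on a single predictor with an intercept."""
    design = np.column_stack([np.ones_like(predictor), predictor])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


if __name__ == "__main__":
    x, w, y = simulate()
    print("slope using true binding    :", ols_slope(x, y))
    print("slope using observed binding:", ols_slope(w, y))
```

Under these assumptions, regressing expression on the error-prone binding calls yields a slope that is attenuated toward zero relative to the slope obtained from the true binding state. This bias is the reason for modeling the misclassification explicitly rather than treating the binding data as exact.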
Understanding gene regulation through the underlying transcriptional regulatory networks (referred to as TRNs hereafter) is a central topic in biology. A TRN can be thought of as a set of proteins, genes, and small modules, together with their mutual regulatory interactions. The potentially large number of components, the high connectivity among them, and the transient nature of stimulation in the network make TRNs highly complex.