Book contents
- Frontmatter
- Dedication
- Contents
- Preface
- Preface to the second edition
- 1 Preliminaries
- 2 From cause to correlation and back
- 3 Sewall Wright, path analysis and d-separation
- 4 Path analysis and maximum likelihood
- 5 Measurement error and latent variables
- 6 The structural equation model
- 7 Multigroup models, multilevel models and corrections for the non-independence of observations
- 8 Exploration, discovery and equivalence
- Appendix A cheat-sheet of useful R functions
- References
- Index
7 - Multigroup models, multilevel models and corrections for the non-independence of observations
Published online by Cambridge University Press: 05 April 2016
Summary
Like successful politicians, good statistical models must be able to lie without getting caught. For instance, no series of observations from nature is really normally distributed. The normal distribution is just a useful abstraction – a myth – that makes life bearable. In constructing statistical models we pretend that the normal distribution is real and then check to ensure that our data do not deviate from it so much that the myth becomes a fairy tale. In the last chapter we saw how far we could stretch the truth about the distributional properties of our data before our data called us a liar. The goal of this chapter is to describe how SEM can deal with two other statistical myths that people often tell with respect to their data. These two myths are (a) that the observations in our data sets are generated by the same causal process (causal homogeneity) and (b) that these observations are independent draws from this single causal process.
Consider first the myth of causal homogeneity. It is easy to imagine cases in which different groups of observations might be generated by partially different causal processes. For instance, a behavioural ecologist studying a series of variables related to aggression and social dominance in primates would not necessarily want to combine the observations from males and females, since it is possible that the behavioural responses of males and females are generated by different causal stimuli. When we sample from populations with different causal processes, either in terms of the causal structure or of the quantitative strengths of the relationships between the variables, and we wish to compare the causal relationships across the different groups, we require a model that can explicitly take into account these differences between groups. Such modelling is called multigroup SEM, and this, in turn, requires the notion of nested models.
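The idea of nested models can be illustrated with a minimal sketch. It is not a structural equation model and is not taken from the chapter: it is a single-equation regression analogy, written in Python rather than the R used in the book, with simulated data. The group labels, sample sizes and slopes are hypothetical, and the sketch assumes Gaussian errors with a common residual variance in both groups. A constrained model forces both groups to share one intercept and one slope; a free model lets each group have its own. Because the constrained model is a special case of the free one, its fit can never be better, and the loss of fit can be judged with a likelihood ratio statistic.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, rss)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, rss


def simulate_group(rng, n, slope):
    """Simulate y = slope * x + noise with standard normal x and noise."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [slope * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


rng = random.Random(1)

# Hypothetical groups: same variables, different strength of the effect.
x_m, y_m = simulate_group(rng, 200, slope=1.0)
x_f, y_f = simulate_group(rng, 200, slope=-0.5)

# Constrained model: one intercept and one slope shared by both groups.
_, _, rss_constrained = fit_line(x_m + x_f, y_m + y_f)

# Free model: each group has its own intercept and slope.
a_m, b_m, rss_m = fit_line(x_m, y_m)
a_f, b_f, rss_f = fit_line(x_f, y_f)
rss_free = rss_m + rss_f

# Likelihood ratio statistic under Gaussian errors with a common variance.
n_total = len(x_m) + len(x_f)
lr_statistic = n_total * math.log(rss_constrained / rss_free)

# The free model estimates two extra parameters (an intercept and a slope).
df_difference = 2

print(f"slope, group 1: {b_m:.2f}; slope, group 2: {b_f:.2f}")
print(f"likelihood ratio statistic: {lr_statistic:.1f} on {df_difference} df")
```

A statistic that is large relative to a chi-squared distribution with 2 degrees of freedom suggests that the equality constraints across groups are not tenable, which is the same logic that multigroup SEM applies to whole sets of path coefficients.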
The assumption of the independence of observations can often be violated as well, because the observations are nested in space or time. The process of speciation itself suggests one way in which we can get non-independence of observations (Felsenstein 1985; Harvey and Pagel 1991). The attributes of organisms, if they have a genetic component, will often tend to be more similar to those of close relatives than to those of genetic strangers.
- Type: Chapter
- Information: Cause and Correlation in Biology: A User's Guide to Path Analysis, Structural Equations and Causal Inference with R, pp. 188-220. Publisher: Cambridge University Press. Print publication year: 2016