Cause and Correlation in Biology
  • Cited by 91
  • 2nd edition
  • Bill Shipley, Université de Sherbrooke, Canada

Book description

Many problems in biology require an understanding of the relationships among variables in a multivariate causal context. Exploring such cause-effect relationships through a series of statistical methods, this book explains how to test causal hypotheses when randomised experiments cannot be performed. This completely revised and updated edition features detailed explanations for carrying out statistical methods using the popular and freely available R statistical language. Sections on d-sep tests, latent constructs that are common in biology, missing values, phylogenetic constraints, and multilevel models are also important features of this new edition. Written for biologists and using a minimum of statistical jargon, the book demystifies the concept of testing multivariate causal hypotheses using structural equations and path analysis. Assuming only a basic understanding of statistical analysis, this new edition is a valuable resource for both students and practising biologists.


Review of previous edition: '… the perfect introduction to SEM. This book can be used as the primary text in a SEM course given within any discipline, and can be used by scholars and researchers from any area of science.'

Source: Structural Equation Modeling

Review of previous edition: 'Addressing students and practising biologists, Shipley does a terrific job of making mathematical ideas accessible … Cause and Correlation in Biology is a nontechnical and honest introduction to statistical methods for testing causal hypotheses.'

Johan Paulsson - Source: Nature Cell Biology

Review of previous edition: 'I highly recommend the book for those interested in multivariate approaches to biology.'

Source: Annals of Botany

'Bill Shipley has done an excellent job in tackling the fundamental issue of testing causality in biology and making it accessible to any biology student or scholar. This book is about statistics, but the storytelling is for biologists. When the first edition for this book came out, in 2000, path analyses were not a common tool for biologists. Although the first edition convinced us to use structural equation modelling, this second edition supplies the essential toolbox. This book is the best route to take if you want to master structural equation modelling in biology, and the very good news is that this second edition not only provides updates and extensions, it also offers R codes to run your analyses.'

Anne Charmantier - Centre d’Écologie Fonctionnelle et Évolutive (CEFE), Montpellier

'For a long time biologists have inferred causation only from carefully designed experiments. Shipley's book broadens horizons by showing how to use observational data to infer whether a causal model is plausible, and to estimate the variation in response due to competing causes.'

David Warton - University of New South Wales, Sydney



