Correlation analysis

Lorena Madrigal

doi:10.1017/CBO9781139022699.010

Correlation analysis is a very popular statistical technique whose purpose is to determine whether two variables co-vary. What is peculiar about correlation analysis is that neither variable is designated as dependent or independent. What we want to determine is whether there is a statistically significant relation between the two variables. As you will learn in the regression chapters, in regression analysis we designate the independent variable with an X and the dependent variable with a Y. To make clear that correlation analysis has no such distinction, we will designate one variable as Y1 and the other as Y2. With this notation we imply that both variables are free to vary and are outside the control of the researcher.
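One way to see why neither variable needs to be called dependent or independent is to look at the usual product-moment form of the correlation coefficient. The subscripted notation below (i indexing the n paired observations, bars for sample means) is an assumption made here for illustration and may differ from the notation used later in the chapter.

```latex
r = \frac{\sum_{i=1}^{n} (Y_{1i} - \bar{Y}_1)(Y_{2i} - \bar{Y}_2)}
         {\sqrt{\sum_{i=1}^{n} (Y_{1i} - \bar{Y}_1)^2 \; \sum_{i=1}^{n} (Y_{2i} - \bar{Y}_2)^2}}
```

Swapping Y1 and Y2 leaves both the numerator and the denominator unchanged, so the coefficient is the same whichever variable is listed first.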

When we apply the parametric correlation test, our variables must be continuous or discontinuous numeric. If the variables are discontinuous numeric, they should cover a broad enough range rather than take only a few values (such as 0 and 1, or 1–5). When we apply the non-parametric correlation tests, we have more freedom in the type of data we can analyze. For example, we can use data that are approximations (estimates) or that are not precisely measured (“five or more”). Whatever type of data we are analyzing, it must be clear that finding a significant correlation between two variables in no way says that one variable causes the other. I am sure you have heard the saying “Correlation does not mean causation.” This is so true and so frequently forgotten! The natural and social world is full of spurious correlations: correlations which arise only by chance and which have no meaning or importance.
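The contrast between the parametric and non-parametric approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the parametric coefficient to be Pearson's r and the non-parametric one to be Spearman's rank coefficient (the text above does not name them), the function names are hypothetical, and the data are invented. It computes the coefficients only, not their significance tests.

```python
import math

def pearson_r(y1, y2):
    """Pearson product-moment correlation of two equal-length samples."""
    n = len(y1)
    if n != len(y2) or n < 2:
        raise ValueError("need two samples of equal length, n >= 2")
    mean1 = sum(y1) / n
    mean2 = sum(y2) / n
    dev1 = [a - mean1 for a in y1]
    dev2 = [b - mean2 for b in y2]
    sum_products = sum(a * b for a, b in zip(dev1, dev2))
    ss1 = sum(a * a for a in dev1)
    ss2 = sum(b * b for b in dev2)
    if ss1 == 0 or ss2 == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sum_products / math.sqrt(ss1 * ss2)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rs(y1, y2):
    """Spearman rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(y1), average_ranks(y2))

# Invented data: Y2 always increases with Y1, but not in a straight line.
y1 = [1, 2, 3, 4, 5]
y2 = [1, 4, 9, 16, 25]

r = pearson_r(y1, y2)     # high, but below 1 because the relation is curved
rs = spearman_rs(y1, y2)  # 1, because the rank orders agree perfectly
```

Because the rank-based coefficient uses only the ordering of the observations, it can accommodate roughly measured or estimated values, which is the extra freedom the non-parametric tests offer. Neither coefficient, however large, says anything about which variable causes the other.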
