  • Print publication year: 2014
  • Online publication date: June 2014

Chapter four - Causal Inference and Generalization in Field Settings

from Part one - Design and Inference Considerations


Aiken, L. S., West, S. G., Schwalm, D. E., Carroll, J. L., & Hsiung, S. (1998). Comparison of a randomized and two quasi-experimental designs in a single outcome evaluation: Efficacy of a university-level remedial writing program. Evaluation Review, 22, 207–244.
Aiken, L. S., West, S. G., Woodward, C. K., & Reno, R. R. (1994). Health beliefs and compliance with mammography screening recommendations in asymptomatic women. Health Psychology, 13, 122–129.
Angrist, J. D., Imbens, G. W., & Rubin, D. B. (1996). Identification of causal effects using instrumental variables (with commentary). Journal of the American Statistical Association, 91, 444–472.
Baker, S. G. (1998). Analysis of survival data from a randomized trial with all-or-none compliance: Estimating the cost-effectiveness of a cancer screening program. Journal of the American Statistical Association, 93, 929–934.
Barcikowski, R. S. (1981). Statistical power with group mean as the unit of analysis. Journal of Educational Statistics, 6, 267–285.
Barnow, L. S. (1973). The effects of Head Start and socioeconomic status on cognitive development of disadvantaged students (Doctoral dissertation, University of Wisconsin). Dissertation Abstracts International, 1974, 34, 6191A.
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator distinction in social psychological research: Conceptual, strategic and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.
Bentler, P. M., & Woodward, J. A. (1978). A Head Start re-evaluation: Positive effects are not yet demonstrable. Evaluation Quarterly, 2, 493–510.
Berk, R. A. (1988). Causal inference for sociological data. In N. J. Smelser (Ed.), Handbook of sociology (pp. 155–172). Newbury Park, CA: Sage.
Berkowitz, L. (1993). Aggression: Its causes, consequences, and control. New York: McGraw-Hill.
Bickman, L., & Henchy, T. (Eds.). (1972). Beyond the laboratory: Field research in social psychology. New York: McGraw-Hill.
Biglan, A., Hood, D., Brozovsky, P., Ochs, L., Ary, D., & Black, C. (1991). Subject attrition in prevention research. In C. G. Leukefeld & W. Bukoski (Eds.), Drug abuse prevention intervention research: Methodological issues (pp. 213–234). Washington, DC: NIDA Research Monograph #107.
Boruch, R. F. (1997). Randomized experiments for planning and evaluation. Thousand Oaks, CA: Sage.
Boruch, R. F., McSweeny, A. J., & Soderstrom, E. J. (1978). Randomized field experiments for program planning, development, and evaluation. Evaluation Quarterly, 2, 655–695.
Box, G. E. P., & Draper, N. R. (1987). Empirical model building and response surfaces. New York: Wiley.
Box, G. E. P., Jenkins, G. M., & Reinsel, G. C. (2008). Time series analysis: Forecasting and control (4th Ed.). Hoboken, NJ: Wiley.
Braucht, G. N., & Reichardt, C. S. (1993). A computerized approach to trickle-process, random assignment. Evaluation Review, 17, 79–90.
Campbell, D. T. (1957). Factors relevant to the validity of experiments in social settings. Psychological Bulletin, 54, 297–312.
Campbell, D. T. (1986). Relabeling internal and external validity for applied social scientists. In W. M. K. Trochim (Ed.), Advances in quasi-experimental design and analysis (Vol. 31, pp. 67–78). San Francisco: Jossey-Bass.
Campbell, D. T., & Stanley, J. C. (1966). Experimental and quasi-experimental designs for research. Chicago: Rand McNally.
Chatfield, C. (2004). The analysis of time series: An introduction (6th Ed.). Boca Raton, FL: Chapman & Hall.
Cicirelli, V. G., Cooper, W. H., & Granger, R. L. (1969). The impact of Head Start: An evaluation of the effects of Head Start on children's cognitive and affective development. Athens: Ohio University and Westinghouse Learning Corporation.
Cleveland, W. S. (1993). Visualizing data. Summit, NJ: Hobart Press.
Cochran, W. G. (1965). The planning of observational studies of human populations (with discussion). Journal of the Royal Statistical Society, Series A, 128, 134–155.
Cochran, W. G., & Cox, G. M. (1957). Experimental designs (2nd Ed.). New York: Wiley.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd Ed.). Hillsdale, NJ: Erlbaum.
Collins, L. M., Murphy, S. A., Nair, V. N., & Strecher, V. J. (2005). A strategy for optimizing and evaluating behavioral interventions. Annals of Behavioral Medicine, 30, 65–73.
Conner, R. F. (1977). Selecting a control group: An analysis of the randomization process in twelve social reform programs. Evaluation Quarterly, 1, 195–244.
Cook, T. D. (1993). A quasi-sampling theory of the generalization of causal relationships. New Directions for Program Evaluation, 37, 39–81.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Boston: Houghton-Mifflin.
Cook, T. D., Shadish, W. R., & Wong, V. C. (2008). Three conditions under which observational studies produce the same results as experiments. Journal of Policy Analysis and Management, 27, 724–750.
Cook, T. D., & Steiner, P. M. (2010). Case matching and the reduction of selection bias in quasi-experiments: The relative importance of covariate choice, unreliable measurement and mode of data analysis. Psychological Methods, 15, 56–68.
Cook, T. D., & Wong, V. C. (2008). Empirical tests of the validity of the regression discontinuity design. Annals of Economics and Statistics, 91/92, 127–150.
Cronbach, L. J. (1982). Designing evaluations of social and educational programs. San Francisco: Jossey-Bass.
Donner, A., & Klar, N. (2000). Design and analysis of cluster randomization trials in health research. London: Arnold.
Draper, D. (1995). Inference and hierarchical modeling in the social sciences. Journal of Educational and Behavioral Statistics, 20, 115–147.
Efron, B., & Feldman, D. (1991). Compliance as an explanatory variable in clinical trials (with discussion). Journal of the American Statistical Association, 86, 9–26.
Enders, C. K. (2010). Applied missing data analysis. New York: Guilford.
Enders, C. K. (2011). Missing not at random models for latent growth curve analyses. Psychological Methods, 16, 1–16.
Evans, R. I., Rozelle, R. M., Maxwell, S. E., Raines, B. E., Dill, C. A., Guthrie, T. J., Henderson, A. H., & Hill, P. C. (1981). Social modeling films to deter smoking in adolescents: Results of a three-year field investigation. Journal of Applied Psychology, 66, 399–414.
Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175–191.
Fisher, R. A. (1935). The design of experiments. London: Oliver & Boyd.
Flay, B. R. (1986). Psychosocial approaches to smoking prevention: A review of findings. Health Psychology, 4, 449–488.
Frangakis, C. E., & Rubin, D. B. (1999). Addressing complications of intention-to-treat analysis in the combined presence of all-or-none treatment-noncompliance and subsequent missing outcomes. Biometrika, 86, 365–379.
Fraley, R. C., & Marks, M. J. (2007). The null hypothesis significance testing debate and its implications for personality research. In R. W. Robins, R. C. Fraley, & R. F. Krueger (Eds.), Handbook of research methods in personality psychology (pp. 149–169). New York: Guilford.
Franklin, R. D., Allison, D. B., & Gorman, B. S. (Eds.) (1996). Design and analysis of single case research. Mahwah, NJ: Erlbaum.
Funder, D. C., Levine, J. M., Mackie, D. M., Morf, C. C., Vazire, S., & West, S. G. (in press). Improving the dependability of research in personality and social psychology: Recommendations for research and educational practice. Personality and Social Psychology Bulletin.
Goldberger, A. S. (1972, April). Selection bias in evaluating treatment effects: Some formal illustrations (Discussion paper 123–72). Madison: University of Wisconsin, Institute for Research on Poverty.
Gonzales, N. A., Dumka, L. E., Millsap, R. E., Gottschall, A., McClain, D. B., Wong, J. J., Mauricio, A. M., Wheeler, L., Germán, M., & Carpentier, F. D. (2012). Randomized trial of a broad preventive intervention for Mexican American adolescents. Journal of Consulting and Clinical Psychology, 80, 1–16.
Graham, J. W. (2012). Missing data: Analysis and design. New York: Springer.
Green, P. J., & Silverman, B. W. (1994). Nonparametric regression and generalized linear models: A roughness penalty approach. Boca Raton, FL: Chapman & Hall.
Hahn, J., Todd, P., & Van der Klaauw, W. (2001). Identification and estimation of treatment effects with a regression-discontinuity design. Econometrica, 69, 201–209.
Haviland, A., Nagin, D. S., & Rosenbaum, P. R. (2007). Combining propensity score matching and group-based trajectory analysis in an observational study. Psychological Methods, 12, 247–267.
Hennigan, K. M., del Rosario, M. L., Heath, L., Cook, T. D., Wharton, J. L., & Calder, B. J. (1982). Impact of the introduction of television on crime in the United States: Empirical findings and theoretical implications. Journal of Personality and Social Psychology, 42, 461–477.
Henry, P. J. (2008). College sophomores in the laboratory redux: Influences of a narrow data base on social psychology's view of the nature of prejudice. Psychological Inquiry, 19, 49–71.
Hirano, K., Imbens, G. W., Rubin, D. B., & Zhou, X. H. (2000). Assessing the effect of an influenza vaccine in an encouragement design. Biostatistics, 1, 69–88.
Ho, D. E., Imai, K., King, G., & Stuart, E. A. (2011). MatchIt: Nonparametric preprocessing for parametric causal inference. Journal of Statistical Software, 42, 1–28.
Hoeppner, B. B., Goodwin, M. S., Velicer, W. F., Mooney, M. E., & Hatsukami, D. K. (2008). Detecting longitudinal patterns of daily smoking following drastic cigarette reduction. Addictive Behaviors, 33, 623–639.
Holland, P. W. (1986). Statistics and causal inference (with discussion). Journal of the American Statistical Association, 81, 945–970.
Holland, P. W. (1988). Causal inference, path analysis, and recursive structural equation models (with discussion). In C. Clogg (Ed.), Sociological methodology 1988 (pp. 449–493). Washington, DC: American Sociological Association.
Hong, G., & Raudenbush, S. W. (2006). Evaluating kindergarten retention policy. Journal of the American Statistical Association, 101, 901–910.
Houts, A. C., Cook, T. D., & Shadish, W. R. (1986). The person-situation debate: A critical multiplist perspective. Journal of Personality, 54, 52–105.
Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd Ed.). New York: Routledge.
Hunter, J. E. (1996, August). Needed: A ban on the significance test. In P. E. Shrout (chair), Symposium: Significance tests-should they be banned from APA journals? American Psychological Association, Toronto, Canada.
Imai, K., Keele, L., & Tingley, D. (2010). A general approach to causal mediation analysis. Psychological Methods, 15, 309–334.
Imbens, G. W. (2010). An economist's perspective on Shadish (2010) and West and Thoemmes (2010). Psychological Methods, 15, 47–55.
Imbens, G. W., & Lemieux, T. (2008). Regression discontinuity designs: A guide to practice. Journal of Econometrics, 142, 615–635.
Imbens, G. W., & Rubin, D. B. (in press). Causal inference: Statistical methods for estimating causal effects in biomedical, social, and behavioral sciences. New York: Cambridge University Press [in preparation, Department of Economics, Harvard University, Cambridge, MA].
Jo, B. (2002). Statistical power in randomized intervention studies with noncompliance. Psychological Methods, 7, 178–193.
Jo, B., Ginexi, E. M., & Ialongo, N. S. (2010). Handling missing data in randomized experiments with noncompliance. Prevention Science, 11, 384–396.
Judd, C. M., & Kenny, D. A. (1981). Estimating the effects of social interventions. New York: Cambridge University Press.
Judd, C. M., Westfall, J., & Kenny, D. A. (2012). Treating stimuli as a random factor in social psychology: A new and comprehensive solution to a pervasive but largely ignored problem. Journal of Personality and Social Psychology, 103, 54–69.
Kazdin, A. E. (2011). Single-case research designs: Methods for clinical and applied settings (2nd Ed.). New York: Oxford University Press.
Kenny, D. A., & Judd, C. M. (1986). The consequences of violating the independence assumption in analysis of variance. Psychological Bulletin, 99, 422–431.
Khuder, S. A., Milz, S., Jordan, T., Price, J., Silvestri, K., & Butler, P. (2007). The impact of a smoking ban on hospital admissions for coronary heart disease. Preventive Medicine, 45, 3–8.
King, G., Nielsen, R., Coberley, C., Pope, J. E., & Wells, A. (2011). Avoiding randomization failure in program evaluation, with application to the Medicare Health Support program. Population Health Management, 14, S11–S22.
Kish, L. (1987). Statistical designs for research. New York: Wiley.
Kopans, D. B. (1994). Screening for breast cancer and mortality reduction among women 40–49 years of age. Cancer, 74, 311–322.
Kratochwill, T. R., & Levin, J. R. (2010). Enhancing the scientific credibility of single-case intervention research: Randomization to the rescue. Psychological Methods, 15, 122–144.
Kreft, I. G. G., & de Leeuw, J. (1998). Introducing multilevel modeling. Thousand Oaks, CA: Sage.
Lahey, B. B., & D’Onofrio, B. M. (2010). All in the family: Comparing siblings to test causal hypotheses regarding environmental influences on behavior. Current Directions in Psychological Science, 19, 319–323.
Larsen, R. J. (1989). A process approach to personality psychology: Utilizing time as a facet of data. In D. M. Buss & N. Cantor (Eds.), Personality psychology: Recent trends and emerging directions (pp. 177–193). New York: Springer-Verlag.
Lee, Y., Ellenberg, J., Hirtz, D., & Nelson, K. (1991). Analysis of clinical trials by treatment actually received: Is it really an option? Statistics in Medicine, 10, 1595–1605.
Lehman, D. R., Lempert, R. O., & Nisbett, R. E. (1988). The effects of graduate training on reasoning: Formal discipline and thinking about everyday-life events. American Psychologist, 43, 431–442.
Little, R. J. A., & Rubin, D. B. (2002). Statistical analysis with missing data (2nd Ed.). Hoboken, NJ: Wiley.
Lohr, S. (2010). Sampling: Design and analysis (2nd Ed.). Boston: Brooks/Cole.
Ludwig, J., & Miller, D. L. (2007). Does Head Start improve children's life chances? Evidence from a regression discontinuity design. Quarterly Journal of Economics, 122, 159–208.
MacKinnon, D. P. (2008). Introduction to statistical mediation analysis. Mahwah, NJ: Lawrence Erlbaum.
Magidson, J. (1977). Toward a causal modeling approach to adjusting for pre-existing differences in the nonequivalent group situation: A general alternative to ANCOVA. Evaluation Quarterly, 1, 399–420.
Mark, M. M., & Mellor, S. (1991). Effect of self-relevance of an event on hindsight bias: The foreseeability of a layoff. Journal of Applied Psychology, 76, 569–577.
Matt, G. E. (2003). Will it work in Münster? Meta-analysis and the empirical generalization of causal relationships. In R. Schulze, H. Holling, & V. Böhning (Eds.), Meta-analysis: New developments and applications in medical and social sciences (pp. 113–128). Cambridge, MA: Hogrefe & Huber.
Matt, G. E., & Cook, T. D. (2009). Threats to the validity of generalized inferences. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), Handbook of research synthesis and meta-analysis (2nd Ed., pp. 537–560). New York: Russell Sage.
Mayer, A., Thoemmes, F., Rose, N., Steyer, R., & West, S. G. (2013). Theory and analysis of total, direct, and indirect causal effects. Unpublished manuscript, Psychologisches Institute, Universität Jena, Jena, Germany.
Mazur-Hart, S. F., & Berman, J. J. (1977). Changing from fault to no-fault divorce: An interrupted time series analysis. Journal of Applied Social Psychology, 7, 300–312.
Ming, K., & Rosenbaum, P. R. (2000). Substantial gains in bias reduction from matching with a variable number of controls. Biometrics, 56, 118–124.
Morgan, S. L., & Winship, C. (2007). Counterfactuals and causal inferences: Methods and principles for social research. New York: Cambridge University Press.
Moser, S. E., West, S. G., & Hughes, J. N. (2012). Trajectories of math and reading achievement in low achieving children in elementary school: Effects of early and later retention in grade. Journal of Educational Psychology, 104, 603–621.
Murray, D. M. (1998). Design and analysis of group-randomized trials. New York: Oxford University Press.
Murray, D. M., Varnell, S. P., & Blitstein, J. L. (2004). Design and analysis of group-randomized trials: A review of recent methodological developments. American Journal of Public Health, 94, 423–432.
Pearl, J. (2009). Causality: Models, reasoning, and inference (2nd Ed.). New York: Cambridge University Press.
Pohl, S., Steiner, P. M., Eisermann, J., Soellner, R., & Cook, T. D. (2009). Unbiased causal inference from an observational study: Results of a within-study comparison. Educational Evaluation and Policy Analysis, 31, 463–479.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis. Thousand Oaks, CA: Sage.
Reichardt, C. S. (2006). The principle of parallelism in the design of studies to estimate treatment effects. Psychological Methods, 11, 1–18.
Reichardt, C. S., & Gollob, H. F. (1997). When confidence intervals should be used instead of statistical tests, and vice versa. In L. L. Harlow, S. A. Mulaik, & J. H. Steiger (Eds.), What if there were no significance tests? Mahwah, NJ: Erlbaum.
Reichardt, C. S., & Mark, M. M. (2004). Quasi-experimentation. In J. S. Wholey, H. P. Hatry, & K. E. Newcomer (Eds.), Handbook of practical program evaluation (2nd Ed., pp. 126–149). San Francisco: Jossey-Bass.
Reis, H. T., & Gosling, S. D. (2010). Social psychological methods outside the laboratory. In S. Fiske, D. Gilbert, & G. Lindzey (Eds.), Handbook of social psychology (5th ed., Vol. 1, pp. 82–114). New York: Wiley.
Reynolds, K. D., & West, S. G. (1987). A multiplist strategy for strengthening nonequivalent control group designs. Evaluation Review, 11, 691–714.
Ribisl, K. M., Walton, M. A., Mowbray, C. T., Luke, D. A., Davidson, W. A., & Bootsmiller, B. J. (1996). Minimizing participant attrition in panel studies through the use of effective retention and tracking strategies: Review and recommendations. Evaluation and Program Planning, 19, 1–25.
Richard, F. D., Bond, C. F., Jr., & Stokes-Zoota, J. J. (2003). One hundred years of social psychology quantitatively described. Review of General Psychology, 7, 331–363.
Roos, Jr., L. L., Roos, N. P., & Henteleff, P. D. (1978). Assessing the impact of tonsillectomies. Medical Care, 16, 502–518.
Rosenbaum, P. R. (1986). Dropping out of high school in the United States: An observational study. Journal of Educational Statistics, 11, 207–224.
Rosenbaum, P. R. (1987). The role of a second control group in an observational study (with discussion). Statistical Science, 2, 292–316.
Rosenbaum, P. R. (2002). Observational studies (2nd Ed.). New York: Springer-Verlag.
Rosenbaum, P. R. (2007). Interference between units in randomized experiments. Journal of the American Statistical Association, 102, 191–200.
Rosenbaum, P. R. (2010). Design of observational studies. New York: Springer-Verlag.
Rosenbaum, P. R., & Rubin, D. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41–55.
Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66, 688–701.
Rubin, D. B. (1977). Assignment to treatment group on the basis of a covariate. Journal of Educational Statistics, 2, 1–26.
Rubin, D. B. (1978). Bayesian inference for causal effects. The Annals of Statistics, 6, 34–58.
Rubin, D. B. (1986). What ifs have causal answers. Journal of the American Statistical Association, 81, 961–962.
Rubin, D. B. (2005). Causal inference using potential outcomes. Journal of the American Statistical Association, 100, 322–331.
Rubin, D. B. (2011). Statistical inference for causal effects, with emphasis on applications in psychometrics and education. In M. Williams & P. Vogt (Eds.), Handbook of innovation in social research methods (pp. 524–542). Thousand Oaks, CA: Sage.
Sagarin, B. J., Ratnikov, A., Homan, W. K., Ritchie, T. D., & Hansen, E. J. (in press). Treatment non-compliance in randomized experiments: Statistical approaches and design issues. Psychological Methods.
Schafer, J. L., & Kang, J. (2008). Average causal effects from nonrandomized studies: A practical guide and simulated example. Psychological Methods, 13, 279–313.
Schwarz, N. B., & Hippler, H. J. (1995). Subsequent questions may influence answers to preceding questions in mail surveys. Public Opinion Quarterly, 59, 93–97.
Sears, D. O. (1986). College sophomores in the laboratory: Influences of a narrow data base on social psychology's view of human nature. Journal of Personality and Social Psychology, 51, 515–530.
Seaver, W. B., & Quarton, R. J. (1976). Regression discontinuity analysis of the dean's list effects. Journal of Educational Psychology, 68, 459–465.
Sechrest, L., West, S. G., Phillips, M. A., Redner, R., & Yeaton, W. (1979). Some neglected problems in evaluation research: Strength and integrity of treatments. In L. Sechrest, S. G. West, M. Phillips, R. Redner, & W. Yeaton (Eds.), Evaluation studies review annual (Vol. 4, pp. 15–35). Beverly Hills, CA: Sage.
Shadish, W. R. (2010). Campbell and Rubin: A primer and comparison of their approaches to causal inference in field settings. Psychological Methods, 15, 3–17.
Shadish, W. R. (2013). Propensity score analysis: Promise, reality, and irrational exuberance. Journal of Experimental Criminology, 9, 129–144.
Shadish, W. R., Clark, M. H., & Steiner, P. M. (2008). Can nonrandomized experiments yield accurate answers? A randomized experiment comparing random to nonrandom assignment. Journal of the American Statistical Association, 103, 1334–1343.
Shadish, W. R., & Cook, T. D. (1999). Design rules: More steps toward a complete theory of quasi-experimentation. Statistical Science, 14, 294–300.
Shadish, W. R., & Cook, T. D. (2009). The renaissance of field experimentation in evaluating interventions. Annual Review of Psychology, 60, 607–629.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental design for generalized causal inference. Boston: Houghton-Mifflin.
Shadish, W. R., Galindo, R., Wong, V. C., Steiner, P. M., & Cook, T. D. (2011). A randomized experiment comparing random to cutoff-based assignment. Psychological Methods, 16, 179–191.
Shadish, W. R., Hu, X., Glaser, R. R., Kownacki, R., & Wong, S. (1998). A method for exploring the effects of attrition in randomized experiments with dichotomous outcomes. Psychological Methods, 3, 3–22.
Shadish, W. R., & Sullivan, K. J. (2012). Theories of causation in psychological science. In H. Cooper (Ed.), APA Handbook of research methods in psychology (Vol. 1, pp. 23–52). Washington, DC: American Psychological Association.
Shadish, W. R., Hedges, L. V., Pustejovsky, J., Rindskopf, D. M., Boyajian, J. G., & Sullivan, K. J. (in press). Analyzing single-case designs: d, G, multilevel models, Bayesian estimators, generalized additive models, and the hopes and fears of researchers about analyses. In T. R. Kratochwill & J. R. Levin (Eds.), Single-case intervention research: Methodological and data-analysis advances. Washington, DC: American Psychological Association.
Snijders, T. A. B., & Bosker, R. J. (2011). Multilevel analysis: An introduction to basic and advanced multilevel modeling (2nd Ed.). Thousand Oaks, CA: Sage.
Sobel, M. E. (2006). What do randomized studies of housing mobility demonstrate? Journal of the American Statistical Association, 101, 1398–1407.
Steyer, R., Partchev, I., Kroehne, U., Nagengast, B., & Fiege, C. (in press). Probability and causality: Theory. Heidelberg, Germany: Springer.
Stuart, E. A., Cole, S. R., Bradshaw, C. P., & Leaf, P. J. (2011). The use of propensity scores to assess the generalizability of results from randomized trials. The Journal of the Royal Statistical Society, Series A, 174, 369–386.
Stuart, E. A., & Green, K. M. (2008). Using full matching to estimate causal effects in nonexperimental studies: Examining the relationship between adolescent marijuana use and adult outcomes. Developmental Psychology, 44, 395–406.
Thoemmes, F. J., & West, S. G. (2011). The use of propensity scores for nonrandomized designs with clustered data. Multivariate Behavioral Research, 46, 514–543.
Trochim, W. M. K. (1984). Research design for program evaluation: The regression-discontinuity approach. Beverly Hills, CA: Sage.
Trochim, W. M. K., Cappelleri, J. C., & Reichardt, C. S. (1991). Random measurement error does not bias the treatment effect estimate in the regression-discontinuity design: II. When an interaction effect is present. Evaluation Review, 15, 571–604.
Velicer, W. F., & Molenaar, P. C. (2013). Time series analysis for psychological research. In J. A. Schinka & W. F. Velicer (Eds.), Handbook of psychology, Vol. 2: Research methods in psychology (2nd Ed., pp. 628–660). Hoboken, NJ: Wiley.
Vinokur, A. D., Price, R. H., & Caplan, R. D. (1991). From field experiments to program implementation: Assessing the potential outcomes of an experimental intervention program for unemployed persons. American Journal of Community Psychology, 19, 543–562.
Warner, R. M. (1998). Spectral analysis of time-series data. New York: Guilford.
West, S. G. (2008, July). Observational studies: Towards improving design and analysis. In Symposium on causal effects – design and analysis. Altes Schloss Dornburg, Germany. Video available from
West, S. G. (2009). Alternatives to randomized experiments. Current Directions in Psychological Science, 18, 299–304.
West, S. G., & Aiken, L. S. (1997). Towards understanding individual effects in multiple component prevention programs: Design and analysis strategies. In K. Bryant, M. Windle, & S. G. West (Eds.), The science of prevention: Methodological advances from alcohol and substance abuse research (pp. 167–210). Washington, DC: American Psychological Association.
West, S. G., Aiken, L. S., & Todd, M. (1993). Probing the effects of individual components in multiple component prevention programs. American Journal of Community Psychology, 21, 571–605.
West, S. G., Cham, H., Thoemmes, F., Renneberg, B., Schultz, J., & Weiler, M. (in press). Propensity scores as a basis for equating groups: Basic principles and application in clinical outcome research. Journal of Consulting and Clinical Psychology.
West, S. G., Duan, N., Pequegnat, W., Gaist, P., DesJarlais, D., Holtgrave, D., Szapocznik, J., Fishbein, M., Rapkin, B., Clatts, M., & Mullen, P. (2008). Alternatives to the randomized controlled trial. American Journal of Public Health, 98, 1359–1366.
West, S. G., & Graziano, W. G. (2012). Basic, applied, and full-cycle social psychology: Enhancing causal generalization and impact. In D. T. Kenrick, N. J. Goldstein, & S. L. Braver (Eds.), Six degrees of social influence: Science, application, and the psychology of Bob Cialdini (pp. 119–133). New York: Oxford University Press.
West, S. G., & Hepworth, J. T. (1991). Statistical issues in the study of temporal data: Daily experiences. Journal of Personality, 59, 611–662.
West, S. G., Hepworth, J. T., McCall, M. A., & Reich, J. W. (1989). An evaluation of Arizona's July 1982 drunk driving law: Effects on the city of Phoenix. Journal of Applied Social Psychology, 19, 1212–1237.
West, S. G., Newsom, J. T., & Fenaughty, A. M. (1992). Publication trends in JPSP: Stability and change in the topics, methods, and theories across two decades. Personality and Social Psychology Bulletin, 18, 473–484.
West, S. G., Ryu, E., Kwok, O-M., & Cham, H. (2011). Multilevel modeling: Current applications in personality research. Journal of Personality, 79, 2–49.
West, S. G., & Sagarin, B. J. (2000). Participant selection and loss in randomized experiments. In L. Bickman (Ed.), Research design: Donald Campbell's legacy (pp. 117–154). Thousand Oaks, CA: Sage.
West, S. G., & Thoemmes, F. (2008). Equating groups. In J. Brannen, P. Alasuutari, & L. Bickman (Eds.), Handbook of social research methods (pp. 414–430). Thousand Oaks, CA: Sage.
West, S. G., & Thoemmes, F. (2010). Campbell's and Rubin's perspectives on causal inference. Psychological Methods, 15, 18–37.
Willett, J. B., & Singer, J. D. (2013). Applied multilevel data analysis. Book in preparation. Graduate School of Education, Harvard University, Cambridge, MA.
Wilson, T. D., Aronson, E., & Carlsmith, K. (2010). The art of laboratory experimentation. In S. T. Fiske, D. T. Gilbert, & G. Lindzey (Eds.), Handbook of social psychology (5th Ed., Vol. 1, pp. 51–81). Hoboken, NJ: Wiley.
Wong, V. C., Steiner, P. M., & Cook, T. D. (2013). Analyzing regression-discontinuity designs with multiple assignment variables: A comparative study of four estimation methods. Journal of Educational and Behavioral Statistics, 38, 117–141.