References
Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561–573.
Bauer, D. J., & Curran, P. J. (2003). Distributional assumptions of growth mixture models: Implications for overextraction of latent trajectory classes. Psychological Methods, 8, 338–363.
Bechtoldt, H. P. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26, 405–432.
Bechtoldt, H. P. (1974). A confirmatory analysis of the factor stability hypothesis. Psychometrika, 39, 319–326.
Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238–246.
Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29–51.
Bock, R. D., Gibbons, R., & Muraki, E. (1988). Full-information item factor analysis. Applied Psychological Measurement, 12, 261–280.
Bollen, K. A. (1989). Structural equations with latent variables. Oxford: Wiley.
Brown, T. A. (2006). Confirmatory factor analysis for applied research. New York: Guilford.
Browne, M. W. (1984). Asymptotically distribution-free methods for the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 62–83.
Browne, M. W., & Arminger, G. (1995). Specification and estimation of mean and covariance structure models. In G. Arminger, C. C. Clogg, & M. E. Sobel (Eds.), Handbook of statistical modeling for the social and behavioral sciences (pp. 185–249). New York: Plenum.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136–162). Newbury Park, CA: Sage.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. New York: Cambridge University Press.
Cattell, R. B. (1956a). A shortened “basic English” version (Form C) of the 16 PF Questionnaire. Journal of Social Psychology, 44, 257–278.
Cattell, R. B. (1956b). Validation and intensification of the Sixteen Personality Factor Questionnaire. Journal of Clinical Psychology, 12, 205–214.
Cattell, R. B. (1963). Theory of fluid and crystallized intelligence: A critical experiment. Journal of Educational Psychology, 54, 1–22.
Cattell, R. B. (1971). Abilities: Their structure, growth, and action. Boston: Houghton Mifflin.
Cattell, R. B., & Tsujioka, B. (1964). The importance of factor-trueness and validity, versus homogeneity and orthogonality, in test scales. Educational and Psychological Measurement, 24, 3–30.
Chen, W.-H., & Thissen, D. (1997). Local dependence indexes for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22, 265–289.
Cheung, G. W., & Rensvold, R. B. (1999). Testing factorial invariance across groups: A reconceptualization and proposed new method. Journal of Management, 25, 1–27.
Clark, L. A., & Watson, D. (1991). Tripartite model of anxiety and depression: Psychometric evidence and taxonomic implications. Journal of Abnormal Psychology, 100, 316–336.
Cleary, T. A. (1968). Test bias: Prediction of grades of Negro and white students in integrated colleges. Journal of Educational Measurement, 5, 115–124.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Edwards, M. C., Houts, C. R., & Cai, L. (2012). A diagnostic procedure to detect departures from local independence in item response models. Manuscript under review.
Eid, M. (2000). A multitrait-multimethod model with minimal assumptions. Psychometrika, 65, 241–261.
Eid, M., & Diener, E. (2006). Handbook of multimethod measurement in psychology. Washington, DC: American Psychological Association.
Eid, M., Lischetzke, T., & Nussbeck, F. W. (2006). Structural equation models for multitrait-multimethod data. In M. Eid & E. Diener (Eds.), Handbook of multimethod measurement in psychology (pp. 283–299). Washington, DC: American Psychological Association.
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Erlbaum.
Fabrigar, L. R., & Wegener, D. T. (2012). Exploratory factor analysis. New York: Oxford University Press.
Ferguson, G. A. (1941). The factorial interpretation of test difficulty. Psychometrika, 6, 323–329.
Ferrer, E., Balluerka, N., & Widaman, K. F. (2008). Factorial invariance and the specification of second-order growth models. Methodology, 4, 22–36.
Fiske, D. W. (1949). Consistency of the factorial structures of personality ratings from different sources. Journal of Abnormal and Social Psychology, 44, 329–344.
Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological Assessment, 7, 286–299.
Goldberg, L. R. (1993). The structure of phenotypic personality traits. American Psychologist, 48, 26–34.
Goldberg, L. R., & Velicer, W. F. (2006). Principles of exploratory factor analysis. In S. Strack (Ed.), Differentiating normal and abnormal personality (2nd ed., pp. 209–237). New York: Springer.
Gorsuch, R. L. (1988). Exploratory factor analysis. In J. R. Nesselroade & R. B. Cattell (Eds.), Handbook of multivariate experimental psychology (2nd ed., pp. 231–258). New York: Plenum.
Guilford, J. P. (1941). The difficulty of a test and its factor composition. Psychometrika, 6, 67–77.
Guilford, J. P. (1964). Zero correlations among tests of intellectual abilities. Psychological Bulletin, 61, 401–404.
Hancock, G. R., Kuo, W.-L., & Lawrence, F. R. (2001). An illustration of second-order latent growth models. Structural Equation Modeling, 8, 470–489.
Horn, J. L., & Hofer, S. M. (1992). Major abilities and development in the adult period. In R. J. Sternberg & C. A. Berg (Eds.), Intellectual development (pp. 44–99). New York: Cambridge University Press.
Horn, J. L., & McArdle, J. J. (2007). Understanding human intelligence since Spearman. In R. Cudeck & R. C. MacCallum (Eds.), Factor analysis at 100: Historical developments and future directions (pp. 205–247). Mahwah, NJ: Erlbaum.
Horn, J. L., McArdle, J. J., & Mason, R. (1983). When is invariance not invariant: A practical scientist's look at the ethereal concept of factor invariance. Southern Psychologist, 1, 179–188.
Horn, J. L., & Noll, J. (1997). Human cognitive capabilities: Gf-Gc theory. In D. P. Flanagan, J. L. Genshaft, & P. L. Harrison (Eds.), Contemporary intellectual assessment: Theories, tests, and issues (pp. 53–91). New York: Guilford.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55.
John, O. P., Donahue, E. M., & Kentle, R. L. (1991). The Big Five Inventory – Versions 4a and 54. Berkeley, CA: University of California, Berkeley, Institute of Personality and Social Research.
Jöreskog, K. G. (1967). Some contributions to maximum likelihood factor analysis. Psychometrika, 32, 443–482.
Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34, 183–202.
Jöreskog, K. G. (1971a). Simultaneous factor analysis in several populations. Psychometrika, 36, 409–426.
Jöreskog, K. G. (1971b). Statistical analysis of sets of congeneric tests. Psychometrika, 36, 109–133.
Kenny, D. A. (1976). An empirical application of confirmatory factor analysis to the multitrait-multimethod matrix. Journal of Experimental Social Psychology, 12, 247–252.
Levy, R., & Hancock, G. R. (2007). A framework of statistical tests for comparing mean and covariance structure models. Multivariate Behavioral Research, 42, 33–66.
Linacre, J. M. (1994). Sample size and item calibration stability. Rasch Measurement Transactions, 7, 328.
Linacre, J. M., & Wright, B. D. (1999). A user's guide to Bigsteps/Winsteps. Chicago: Mesa.
Little, T. D. (1997). Mean and covariance structures (MACS) analyses of cross-cultural data: Practical and theoretical issues. Multivariate Behavioral Research, 32, 53–76.
Little, T. D., Cunningham, W. A., Shahar, G., & Widaman, K. F. (2002). To parcel or not to parcel: Exploring the question, weighing the merits. Structural Equation Modeling, 9, 151–173.
Little, T. D., Lindenberger, U., & Nesselroade, J. R. (1999). On selecting indicators for multivariate measurement and modeling with latent variables: When “good” indicators are bad and “bad” indicators are good. Psychological Methods, 4, 192–211.
Little, T. D., Rhemtulla, M., Gibson, K., & Schoemann, A. M. (2013). Why the items versus parcels controversy needn't be one. Psychological Methods, 18, 285–300.
Lord, F. M. (1952). A theory of test scores. Psychometric Monographs, No. 7.
Lunn, D. J., Thomas, A., Best, N., & Spiegelhalter, D. (2000). WinBUGS – a Bayesian modelling framework: Concepts, structure, and extensibility. Statistics and Computing, 10, 325–337.
MacCallum, R. (1986). Specification searches in covariance structure modeling. Psychological Bulletin, 100, 107–120.
MacCallum, R. C., Widaman, K. F., Preacher, K. J., & Hong, S. (2001). Sample size in factor analysis: The role of model error. Multivariate Behavioral Research, 36, 611–637.
MacCallum, R. C., Widaman, K. F., Zhang, S., & Hong, S. (1999). Sample size in factor analysis. Psychological Methods, 4, 84–99.
Marsh, H. W. (1989). Confirmatory factor analyses of multitrait-multimethod data: Many problems and a few solutions. Applied Psychological Measurement, 13, 335–361.
Marsh, H. W., Lüdtke, O., Nagengast, B., Morin, A. J. S., & von Davier, M. (2013). Why item parcels are (almost) never appropriate: Two wrongs do not make a right – Camouflaging misspecification with item parcels in CFA models. Psychological Methods, 18, 257–284.
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149–174.
McArdle, J. J. (1988). Dynamic but structural equation modeling of repeated measures data. In J. R. Nesselroade & R. B. Cattell (Eds.), Handbook of multivariate experimental psychology (2nd ed., pp. 561–614). New York: Plenum.
McArdle, J. J. (1996). Current directions in structural factor analysis. Current Directions in Psychological Science, 5, 11–18.
McArdle, J. J. (2007). Five steps in the structural factor analysis of longitudinal data. In R. Cudeck & R. C. MacCallum (Eds.), Factor analysis at 100: Historical developments and future directions (pp. 99–130). Mahwah, NJ: Erlbaum.
McArdle, J. J., & Cattell, R. B. (1994). Structural equation models of factorial invariance in parallel proportional profiles and oblique confactor problems. Multivariate Behavioral Research, 29, 63–113.
McCrae, R. R., & Costa, P. T., Jr. (1987). Validation of the five-factor model of personality across instruments and observers. Journal of Personality and Social Psychology, 52, 81–90.
McDonald, R. P. (1967). Nonlinear factor analysis (Psychometric Monograph No. 15). Richmond, VA: Psychometric Corporation.
McDonald, R. P. (1999). Test theory. Mahwah, NJ: Erlbaum.
McDonald, R. P., & Ahlawat, K. S. (1974). Difficulty factors in binary data. British Journal of Mathematical and Statistical Psychology, 27, 82–99.
McDonald, R. P., & Marsh, H. W. (1990). Choosing a multivariate model: Noncentrality and goodness of fit. Psychological Bulletin, 107, 247–255.
Meredith, W. (1964a). Notes on factorial invariance. Psychometrika, 29, 177–185.
Meredith, W. (1964b). Rotation to achieve factorial invariance. Psychometrika, 29, 187–206.
Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58, 525–543.
Meredith, W., & Horn, J. (2001). The role of factorial invariance in modeling growth and change. In L. M. Collins & A. G. Sayer (Eds.), New methods for the analysis of change (pp. 203–240). Washington, DC: American Psychological Association.
Millsap, R. E. (2011). Statistical approaches to measurement invariance. New York: Routledge/Taylor & Francis.
Millsap, R. E., & Meredith, W. (2007). Factorial invariance: Historical perspectives and new problems. In R. Cudeck & R. C. MacCallum (Eds.), Factor analysis at 100: Historical developments and future directions (pp. 131–152). Mahwah, NJ: Erlbaum.
Millsap, R. E., & Yun-Tein, J. (2004). Assessing factorial invariance in ordered-categorical measures. Multivariate Behavioral Research, 39, 479–515.
Muraki, E. (1990). Fitting a polytomous item response model to Likert-type data. Applied Psychological Measurement, 14, 59–71.
Muraki, E. (1993). Information functions of the generalized partial credit model. Applied Psychological Measurement, 17, 351–363.
Muthén, B. O. (1978). Contributions to factor analysis of dichotomous variables. Psychometrika, 43, 551–560.
Muthén, B. O. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 49, 115–132.
Muthén, B. O. (1985). A method for studying the homogeneity of test items with respect to other relevant variables. Journal of Educational Statistics, 10, 121–132.
Muthén, B. O., & Lehman, J. (1985). Multiple-group IRT modeling: Applications to item bias analysis. Journal of Educational Statistics, 10, 133–142.
Muthén, L. K., & Muthén, B. O. (1998–2012). Mplus User's Guide (7th ed.) [Computer software]. Los Angeles, CA: Muthén & Muthén.
Orlando, M. (2004). Critical issues to address when applying item response theory (IRT) models. Paper presented at the Drug Information Association meeting, Bethesda, MD.
Program Committee of the Institute of Objective Measurement. (2000, December). Definition of objective measurement. Retrieved March 4, 2012 from http://www.rasch.org/define.htm.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research.
Rasch, G. (1966). An item analysis that takes individual differences into account. British Journal of Mathematical and Statistical Psychology, 19, 49–57.
Reckase, M. D. (1997). A linear logistic multidimensional model for dichotomous item response data. In W. J. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 271–286). New York: Springer-Verlag.
Reeve, B. B., & Fayers, P. (2005). Applying item response theory modeling for evaluating questionnaire item and scale properties. In P. Fayers & R. D. Hays (Eds.), Assessing quality of life in clinical trials: Methods and practice (pp. 55–73). New York: Oxford University Press.
Rizopoulos, D. (2006). ltm: An R package for latent variable modeling and item response theory analysis. Journal of Statistical Software, 17, 1–25.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometric Monographs, No. 17.
Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. In A. von Eye & C. C. Clogg (Eds.), Latent variables analysis: Applications for developmental research (pp. 399–419). Thousand Oaks, CA: Sage.
Savalei, V. (2010). Small sample statistics for incomplete nonnormal data: Extensions of complete data formulae and a Monte Carlo comparison. Structural Equation Modeling, 17, 241–264.
Sheu, C.-F., Chen, C.-T., Su, Y.-H., & Wang, W.-C. (2005). Using SAS PROC NLMIXED to fit item response theory models. Behavior Research Methods, 37, 202–218.
Sörbom, D. (1989). Model modification. Psychometrika, 54, 371–384.
Spearman, C. (1904). “General intelligence,” objectively determined and measured. American Journal of Psychology, 15, 201–293.
Spearman, C. (1927). The abilities of man. Oxford: Macmillan.
Steiger, J. H., & Lind, J. C. (1980, May). Statistically based tests for the number of common factors. Paper presented at the annual meeting of the Psychometric Society, Iowa City, Iowa.
Steinberg, L., & Thissen, D. (1996). Uses of item response theory and the testlet concept in the measurement of psychopathology. Psychological Methods, 1, 81–97.
Sterba, S. K., & MacCallum, R. C. (2010). Variability in parameter estimates and model fit across repeated allocations of items to parcels. Multivariate Behavioral Research, 45, 322–358.
Tellegen, A. (1982). Brief manual for the Multidimensional Personality Questionnaire. Unpublished manuscript, University of Minnesota, Minneapolis.
Thissen, D. (1991). MULTILOG user's guide: Multiple, categorical item analysis and test scoring using item response theory. Chicago: Scientific Software.
Thissen, D. (2003). Estimation in Multilog. In M. du Toit (Ed.), IRT from SSI: Bilog-MG, Multilog, Parscale, Testfact. Lincolnwood, IL: Scientific Software International.
Thissen, D., & Steinberg, L. (1986). A taxonomy of item response models. Psychometrika, 51, 567–577.
Thissen, D., Steinberg, L., & Gerrard, M. (1986). Beyond group-mean differences: The concept of item bias. Psychological Bulletin, 99, 118–128.
Thurstone, L. L. (1931). Multiple factor analysis. Psychological Review, 38, 406–427.
Thurstone, L. L. (1934). The vectors of mind. Psychological Review, 41, 1–32.
Thurstone, L. L. (1935). The vectors of mind. Chicago: University of Chicago Press.
Thurstone, L. L. (1938). Primary mental abilities. Psychometric Monographs, No. 1.
Thurstone, L. L. (1947). Multiple factor analysis. Chicago: University of Chicago Press.
Thurstone, L. L., & Thurstone, T. G. (1941). Factorial studies of intelligence. Psychometric Monographs, No. 2.
Tisak, J., & Meredith, W. (1989). Exploratory longitudinal factor analysis in multiple populations. Psychometrika, 54, 261–281.
Tsutakawa, R. K., & Johnson, J. C. (1990). The effect of uncertainty of item parameter estimation on ability estimates. Psychometrika, 55, 371–390.
Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1–10.
Velicer, W. F., & Jackson, D. N. (1990). Component analysis versus common factor analysis: Some issues in selecting an appropriate procedure. Multivariate Behavioral Research, 25, 1–28.
Watson, D., Clark, L. A., Weber, K., Assenheimer, J. S., Strauss, M. E., & McCormick, R. A. (1995). Testing a tripartite model: II. Exploring the symptom structure of anxiety and depression in student, adult, and patient samples. Journal of Abnormal Psychology, 104, 15–25.
Watson, D., Weber, K., Assenheimer, J. S., Clark, L. A., Strauss, M. E., & McCormick, R. A. (1995). Testing a tripartite model: I. Evaluating the convergent and discriminant validity of anxiety and depression symptom scales. Journal of Abnormal Psychology, 104, 3–14.
Wherry, R. J., & Gaylord, R. H. (1944). Factor pattern of test items and tests as a function of the correlation coefficient: Content, difficulty, and constant error factors. Psychometrika, 9, 237–244.
Widaman, K. F. (1985). Hierarchically nested covariance structure models for multitrait-multimethod data. Applied Psychological Measurement, 9, 1–26.
Widaman, K. F. (1993). Common factor analysis versus principal component analysis: Differential bias in representing model parameters? Multivariate Behavioral Research, 28, 263–311.
Widaman, K. F. (2007). Common factors versus components: Principals and principles, errors and misconceptions. In R. Cudeck & R. C. MacCallum (Eds.), Factor analysis at 100: Historical developments and future directions (pp. 177–203). Mahwah, NJ: Erlbaum.
Widaman, K. F., Ferrer, E., & Conger, R. D. (2010). Factorial invariance within longitudinal structural equation models: Measuring the same construct across time. Child Development Perspectives, 4, 10–18.
Widaman, K. F., & Reise, S. P. (1997). Exploring the measurement invariance of psychological instruments: Applications in the substance use domain. In K. J. Bryant, M. Windle, & S. G. West (Eds.), The science of prevention: Methodological advances from alcohol and substance abuse research (pp. 281–324). Washington, DC: American Psychological Association.
Wilson, D. T., Wood, R., & Gibbons, R. D. (1991). Testfact: Test scoring, item statistics, and item factor analysis. Chicago: Scientific Software International.
Wirth, R. J., & Edwards, M. C. (2007). Item factor analysis: Current approaches and future directions. Psychological Methods, 12, 58–79.
Woods, C. M. (2009). Evaluation of MIMIC-model methods for DIF testing with comparison to two-group analysis. Multivariate Behavioral Research, 44, 1–27.
Yen, W. M. (1984). Effects of local item dependence on the fit and equating performance of the three-parameter logistic model. Applied Psychological Measurement, 8, 125–145.
Yen, W. M. (1993). Scaling performance assessments: Strategies for managing local item dependence. Journal of Educational Measurement, 30, 187–213.
Zimowski, M. F., Muraki, E., Mislevy, R. J., & Bock, R. D. (1996). BILOG-MG: Multiple-group IRT analysis and test maintenance for binary items. Chicago: Scientific Software.