Twin studies of complex traits, such as behavior or psychiatric diagnoses, frequently involve univariate analysis of a sum score derived from multiple items. In this article, we show that absence of measurement invariance across zygosity can bias estimates of genetic and environmental components of variance. Specifically, if the item responses are considered as multiple indicators of a latent factor, and the aim is to partition the variance in the latent factor, then the factor loadings relating the items to the factor should be equal for monozygotic (MZ) and dizygotic (DZ) twins. While it seems unlikely, a priori, that these loadings should differ as a function of zygosity, certain special measurement situations are cause for concern. Ratings by parents, or self-ratings of phenotypes which are more easily observed in others than via introspection, may be tainted by the co-twin's phenotype to a greater extent in MZ than DZ pairs. We also show that the analysis of sum scores typically biases both MZ and DZ correlations compared to the true latent trait correlation. These two sources of bias are quantified for a range of values and are shown to be especially acute for sum scores based on binary items. Solutions to these problems include formal tests for measurement invariance across zygosity prior to analysis of the sum or scale scores, and multivariate genetic analysis at the individual item or symptom level.
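
The second source of bias, attenuation of twin correlations when binary items are summed, can be illustrated with a small simulation. The sketch below is not the procedure used in the article. It assumes a single standard-normal latent factor per twin, equal loadings across items and across zygosity (so measurement invariance holds), residuals that are independent across items and co-twins, and items dichotomized at a common threshold. The function name, the loading of 0.7, the five items, and the latent correlations of 0.8 and 0.4 are all illustrative assumptions.

```python
import numpy as np


def simulate_sum_score_correlation(latent_r, n_pairs=20000, n_items=5,
                                   loading=0.7, threshold=0.0, seed=0):
    """Return (latent correlation, sum-score correlation) for simulated twin pairs.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)

    # Latent factor scores for twin 1 and twin 2, correlated at latent_r.
    cov = np.array([[1.0, latent_r], [latent_r, 1.0]])
    factors = rng.multivariate_normal(np.zeros(2), cov, size=n_pairs)

    # Continuous item liabilities: loading * factor + independent residual,
    # scaled so that each liability has unit variance.
    resid_sd = np.sqrt(1.0 - loading ** 2)
    noise = rng.normal(0.0, resid_sd, size=(n_pairs, 2, n_items))
    liability = loading * factors[:, :, None] + noise

    # Dichotomize each item at the threshold and sum within each twin.
    items = (liability > threshold).astype(int)
    sums = items.sum(axis=2)

    latent_corr = np.corrcoef(factors[:, 0], factors[:, 1])[0, 1]
    sum_corr = np.corrcoef(sums[:, 0], sums[:, 1])[0, 1]
    return latent_corr, sum_corr


if __name__ == "__main__":
    for label, r in [("MZ", 0.8), ("DZ", 0.4)]:
        latent_corr, sum_corr = simulate_sum_score_correlation(r)
        print(f"{label}: latent r = {latent_corr:.3f}, sum-score r = {sum_corr:.3f}")
```

Under these assumptions the sum-score correlation falls below the latent correlation in both zygosity groups, even though the loadings are identical for MZ and DZ pairs. Letting the loadings or the cross-twin residual correlations differ by zygosity would be one way to explore the first source of bias in the same framework.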