The dominant contemporary paradigm for developing and refining diagnoses relies heavily on assessing reliability with kappa coefficients and virtually ignores a core component of psychometric practice: the theory of latent structures. This article describes a psychometric approach to psychiatric nosology that emphasizes the diagnostic accuracy and confusability of diagnostic categories. We apply these methods to the Diagnostic Interview for Genetic Studies (DIGS), a structured psychiatric interview designed by the NIMH Genetics Initiative for genetic studies of schizophrenia and bipolar disorder. Our results show that sensitivity and specificity were excellent for both DSM-III-R and RDC diagnoses of major depression, bipolar disorder, and schizophrenia. In contrast, diagnostic accuracy was substantially lower for subtypes of schizoaffective disorder, especially for the DSM-III-R definitions. Both the bipolar and depressed subtypes of DSM-III-R schizoaffective disorder had excellent specificity but poor sensitivity. The RDC definitions also had excellent specificity but were more sensitive than the DSM-III-R schizoaffective diagnoses. The source of low sensitivity for schizoaffective subtypes differed between the two diagnostic systems. Under RDC criteria, the schizoaffective subtypes were frequently confused with one another and less frequently with other diagnoses. The DSM-III-R subtypes, by contrast, were often confused with schizophrenia but not with each other.