Psychiatric symptoms in early life are associated with poor cognition [Reference Calkins, Merikangas, Moore, Burstein, Behr and Satterthwaite1]. For example, psychotic symptoms in childhood and adolescence are associated with cognitive impairment [Reference Bora, Lin, Wood, Yung, McGorry and Pantelis2,Reference Brewer, Wood, Phillips, Francey, Pantelis and Yung3]. There is also evidence for IQ deficits in children with conduct problems [Reference Deater-Deckard, Mullineaux, Beekman, Petrill, Schatschneider and Thompson4], vocabulary deficits in children with aggression [Reference Dionne, Tremblay, Boivin, Laplante and Pérusse5], visuospatial deficits in children with hyperactivity [Reference Seguin, Parent, Tremblay and Zelazo6], and social deficits in adolescents with externalizing problems [Reference Gur, Moore, Calkins, Ruparel and Gur7]. Small, generalized deficits have also been reported in children with symptoms of anxiety and depression [Reference Lundy, Silva, Kaemingk, Goodwin and Quan8].
Early life psychopathology is also underpinned by genes with evidence of substantial genetic effects on childhood and adolescent psychopathology [Reference Wesseldijk, Fedko, Bartels, Nivard, van Beijsterveldt and Boomsma9,Reference Polderman, Benyamin, de Leeuw, Sullivan, van Bochoven and Visscher10], as well as specific psychotic [Reference Polanczyk, Moffitt, Arseneault, Cannon, Ambler and Keefe11,Reference Zavos, Freeman, Haworth, McGuire, Plomin and Cardno12], externalizing [Reference Hannigan, Walaker, Waszczuk, McAdams and Eley13–Reference Huizink, Van Den Berg, van der Ende and Verhulst15], internalizing [Reference Ormel, Oldehinkel, Ferdinand, Hartman, De Winter and Veenstra16], and anxiety/depression symptoms [Reference Gregory and Eley17–Reference Nivard, Dolan, Kendler, Kan, Willemsen and van Beijsterveldt21]. Since cognition in early life is also influenced by genes [Reference Haworth, Wright, Luciano, Martin, de Geus and van Beijsterveldt22,Reference Mollon, Knowles, Mathias, Gur, Peralta and Weiner23], recent studies have examined whether there is a genetic component to cognitive correlates of early life psychopathology. Evidence from twin and family studies shows genetic overlap between psychopathology and cognitive ability [Reference Jacobs, Rijsdijk, Derom, Danckaerts, Thiery and Derom24], psychopathology and executive functions [Reference Harden, Engelhardt, Mann, Patterson, Grotzinger and Savicki25], attention deficit hyperactivity disorder (ADHD) and IQ [Reference Kuntsi, Eley, Taylor, Hughes, Asherson and Caspi26], ADHD and executive functions [Reference Bidwell, Willcutt, Defries and Pennington27,Reference Doyle, Biederman, Seidman, Reske-Nielsen and Faraone28], and inattention and attention regulation [Reference Wang, Deater-Deckard, Petrill and Thompson29]. GWAS evidence shows genetic overlap between ADHD and intelligence [Reference Demontis, Walters, Martin, Mattheisen, Als and Agerbo30-Reference Savage, Jansen, Stringer, Watanabe, Bryois and de Leeuw32], academic underperformance [Reference O’Connell, Shadrin, Smeland, Bahrami, Frei and Bettella31,Reference Liu, Li, Viding, Asherson and Pingault33,Reference Johnson, McGue and Iacono34], and executive functions [Reference Donati, Dumontheil and Meaburn35]. However, a comprehensive examination of genetic underpinnings of cognitive impairment in early life psychopathology, that is, comprising multiple dimensions of psychopathology and cognition, is lacking.
Gaps in knowledge also exist regarding whether genetic influences on early life psychopathology show developmental variation. A recent review reported static and dynamic effects on internalizing and externalizing symptoms between infancy and early adulthood [Reference Hannigan, Walaker, Waszczuk, McAdams and Eley13], with certain genetic effects remaining influential throughout development, and novel genetic factors also emerging. However, previous studies focused on developmental periods rather than examining age continuously. Another way to test for developmental variation in genetic influences is Gene × Age (G × A) interactions. G × A interactions can be tested using a cross-sectional design that models differences in psychopathology as a function of relatedness and similarity in age between individuals [Reference Glahn, Kent, Sprooten, Diego, Winkler and Curran36–Reference Almasy, Towne, Peterson and Blangero38]. Thus, G × A analysis tests for fluctuations in action of genetic factors, as well as variation in genetic factors at different ages [Reference Mollon, Knowles, Mathias, Gur, Peralta and Weiner23].
We used data from the Philadelphia Neurodevelopmental Cohort (PNC) [Reference Calkins, Merikangas, Moore, Burstein, Behr and Satterthwaite1], a large population-based sample aged 8–21, to derive factors of psychopathology using a bifactor model of item-level data from a psychiatric interview [Reference Shanmugan, Wolf, Calkins, Moore, Ruparel and Hopson39,Reference Moore, Calkins, Satterthwaite, Roalf, Rosen and Gur40]. In bifactor models, items load on two factors simultaneously: (a) a general factor that accounts for commonality of all items (here general psychopathology) and (b) specific factors that account for unique influence of specific domains over and above the general factor (specific dimensions of psychopathology) [Reference Knowles, Mathias, Pearlson, Barrett, Mollon and Denbow41,Reference Chen, West and Sousa42]. Thus, bifactor models estimate the contribution of items to the general factor after controlling for specific factors, and vice versa [Reference Shanmugan, Wolf, Calkins, Moore, Ruparel and Hopson39]. Therefore, the utility of bifactor models lies in their ability to parse out this shared variance between general and specific factors, such that there is no contamination between factors, meaning that the general factor cannot account for findings in the specific factors, or vice versa. In other words, bifactor models allow examination of the unique contribution of the general and specific factors to prediction of external factors, or of the unique contribution of external factors (here genetic factors) to the general and specific factors [Reference Reise, Moore and Haviland43,Reference Reise44]. Importantly, bifactor models accommodate orthogonal factor scores despite correlated latent factors [Reference Shanmugan, Wolf, Calkins, Moore, Ruparel and Hopson39,Reference Reise, Moore and Haviland43]. We then used a genetic relatedness matrix to establish whether these psychopathology factors (a) were heritable, (b) showed genetic overlap with cognition, and (b) showed G × A interactions. In line with previous evidence, we hypothesized that psychopathology factors, would be (a) heritable, (b) show negative genetic correlations with cognitive functioning, and (c) be influenced by developmentally dynamic genetic factors, that is, show G × A interactions.
PNC is a population-based sample from the greater Philadelphia area, comprising 9,421 individuals aged 8–21. The study has been described in detail [Reference Calkins, Merikangas, Moore, Burstein, Behr and Satterthwaite1]. Briefly, between 2006 and 2012, 50,293 adults were recruited by the Center for Applied Genomics at Children’s Hospital of Philadelphia and provided access to Electronic Medical Records (EMRs). EMRs were screened for eligibility for PNC participation, yielding 19,161 individuals, released to the recruitment team in weekly waves between 2009 and 2011. Potential probands (ages 18–21) or caregivers/legal guardians (ages 8–17) were sent letters introducing the study, and then contacted by phone to explain the study, verify eligibility, and schedule appointments. Participants provided written consent for genomic studies upon providing blood samples during the clinical visit. Inclusion criteria were: (a) ability to provide signed informed consent (parental consent for participants <18), (b) English language proficiency, and (c) physical and cognitive ability to participate in cognitive testing. Data are in dbGaP (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000607.v3.p2).
Genetic analyses were limited to participants who identified as white non-Hispanic (European American), leaving 4,662 subjects with genetic, cognitive, and psychiatric data. Mean age was 13.8 (standard deviation [SD] = 3.6), 50.3% were male (n = 2,346).
Participants completed the Penn computerized neurocognitive battery [Reference Gur, Richard, Hughett, Calkins, Macy and Bilker45,Reference Moore, Reise, Gur, Hakonarson and Gur46], which consists of 14 tests that capture functioning in five domains: (a) executive function (abstraction and mental flexibility, attention, working memory), (b) episodic memory (verbal, facial, spatial), (c) complex cognition (verbal reasoning, nonverbal reasoning, spatial processing), (d) social cognition (emotion identification, emotion differentiation, age differentiation), and (e) speed (motor, sensorimotor). Accuracy and reaction times are recorded for each test. All tests show moderate to very high reliability [Reference Gur, Ragland, Moberg, Turner, Bilker and Kohler47]. The battery also included the reading subtest of the Wide Range Achievement Test (WRAT), a measure of general verbal knowledge. As in our prior work [Reference Mollon, Knowles, Mathias, Gur, Peralta and Weiner23], we derived a general composite score (g) as the first component of principal component analysis (PCA) of accuracy scores. We also derived a general composite score for speed (gs) as the first component of PCA of reaction times. To minimize the impact of missing data, multivariate imputation by chained equation (MICE) [Reference White, Royston and Wood48] was used to impute missing values using the mice package in R [Reference van Buuren and Groothuis-Oudshoorn49]. Imputation was based on age, sex and available cognitive data (participants missing <50% cognitive data) [Reference Mollon, Knowles, Mathias, Gur, Peralta and Weiner23]. Subsequent analyses were conducted on imputed data.
Psychiatric symptoms were ascertained using a computerized, structured interview (GOASSESS) [Reference Calkins, Merikangas, Moore, Burstein, Behr and Satterthwaite1,Reference Shanmugan, Wolf, Calkins, Moore, Ruparel and Hopson39,Reference Satterthwaite, Connolly, Ruparel, Calkins, Jackson and Elliott50], a modified version of the Kiddie-Schedule for Affective Disorders and Schizophrenia [Reference Kaufman, Birmaher, Brent, Rao, Flynn and Moreci51]. GOASSESS was administered to caregivers/legal guardians (ages 8–10), participants and caregivers/legal guardians (ages 11–17), and participants (ages 18–21). Bachelor- and Master-level assessors underwent a 25-h training protocol comprising didactic sessions, assigned readings, and supervised pairwise practice. Assessors were certified through standardized procedures requiring observation by a certified clinical observer who rated proficiency on a 60-item checklist of interview procedures. Responses coded by the assessor were required to correspond to responses coded by a certified clinical observer. Assessors underwent repeat observation until meeting passing criteria [Reference Calkins, Merikangas, Moore, Burstein, Behr and Satterthwaite1].
Factor analysis to create factors of psychopathology
We applied a confirmatory bifactor model [Reference Reise, Moore and Haviland43] in Mplus [Reference Muthén and Muthén52] to 112 items from the GOASSESS [Reference Calkins, Merikangas, Moore, Burstein, Behr and Satterthwaite1,Reference Shanmugan, Wolf, Calkins, Moore, Ruparel and Hopson39,Reference Moore, Calkins, Satterthwaite, Roalf, Rosen and Gur40](Figure 1a) using mean- and variance-adjusted weighted least squared estimator. Five orthogonal factors of psychopathology were generated for 9,421 individuals with GOASSESS data: (a) anxious-misery (mood and anxiety), (b) externalizing (ADHD and conduct disorder), (c) fear (phobias), (d) psychosis-spectrum, and (e) a general factor of overall psychopathology. Since bifactor models estimate the contribution of items to an overall dimension (general psychopathology) after controlling for specific factors, and vice versa, all factors (both general and specific) are orthogonal. Thus, bifactor models parse out the shared variance between general and specific factors, such that there is no contamination between factors. Therefore, bifactor models allow examination of the unique contribution of external factors (here genetic factors) to the general and specific factors [Reference Reise, Moore and Haviland43,Reference Reise44]. Table S1 shows factor loadings, Table S2 shows correlations between factors (and with cognition), Figure S1 shows test information plots.
Samples were genotyped on one of four Illumina arrays: HumanHap550, HumanHap610, OmniExpress, or Human1M. Genotyped data were imputed at the Broad Institute [Reference Germine, Robinson, Smoller, Calkins, Moore and Hakonarson53] using IMPUTE2 and reference haplotypes in Phase I of the 1000 genomes data (June 2011 release) that included ~37,138,905 variants from 1,094 individuals from Africa, Asia, Europe, and the United States. Imputed genotype data were used in subsequent analyses.
Estimation of genetic relatedness matrix
Empirical relatedness quantifies the proportion of alleles that are identical by descent between individuals. Empirical relatedness was calculated for all pairs of individuals using genotype data [Reference Mollon, Knowles, Mathias, Gur, Peralta and Weiner23]. Briefly, 50k common autosomal single nucleotide proteins (SNPs) in approximate linkage equilibrium were selected from all available SNP variants after linkage disequilibrium (LD) pruning (r 2 > 0.1) using PLINK [Reference Chang, Chow, Tellier, Vattikuti, Purcell and Lee54]. Relatedness was estimated from these SNPs using IBDLD [Reference Han and Abney55](up to 50 SNPs within a 2-cm span). The matrix was inspected to ensure correct properties (trace equal to number of genotyped subjects, symmetry, positive semi-definiteness, range of diagonal, and off-diagonal elements). Distribution of estimated relatedness values has been presented previously [Reference Mollon, Knowles, Mathias, Gur, Peralta and Weiner23]. Empirical genetic relatedness matrices are advantageous because quantitative genetic analyses (described below) that are traditionally applied to family data using genetic relatedness matrices calculated from pedigree information can, in turn, be applied to cohorts of related and/or unrelated individuals (such as the PNC) using empirical genetic relatedness instead.
Statistical and quantitative genetic analyses
R  was used to generate descriptive statistics and graphics. Genetic analyses were conducted in Sequential Oligogenetic Linkage Analysis Routines (SOLAR) [Reference Blangero, Goring, Kent, Williams, Peterson and Almasy57] on 4,662 participants who identified as white non-Hispanic (European American), regardless of relatedness. While analyses in SOLAR performed on family data are robust to population stratification, the PNC sample comprises both related and unrelated individuals. Therefore, we only included individuals of European American ancestry (the most common ancestry group in the PNC sample) in our analyses and excluded individuals of non-European ancestry. Relatedly, since rare variants that may explain a substantial proportion of phenotypic variance are not well captured by common SNPs, using related individuals is more powerful than using unrelated individuals when estimating heritability, but the combination of related and unrelated individuals, as in the PNC, is optimal. Moreover, related individuals (even distantly) are critical for detecting G × A interactions (described below). When using only unrelated individuals it is not possible to detect changes in genetic correlation over time since related pairs serve as a pseudo-longitudinal design where the same polygenotypes are observed at different ages.
Univariate and bivariate polygenic models
SOLAR implements linear mixed-effects models, which decompose the overall variance of a quantitative trait [Reference Almasy and Blangero58,Reference Almasy and Blangero59]. Traditionally, these analyses are performed on family data using matrices calculated from pedigree information, but can be applied to cohorts of related and unrelated individuals using relatedness estimated from genotype data [Reference Speed, Hemani, Johnson and Balding60]. Under a univariate polygenic model, the phenotypic variance (σ2p) of a trait is decomposed into genetic (σ2g) and environmental (σ2e) components. Environmental variance incapsulates all variance that is not genetic, including error. Narrow-sense heritability (h2) is the proportion of phenotypic variance accounted for by additive genetic variance (h2 = σ2g/σ2p). To determine whether heritability (h2) was significantly greater than 0, likelihood of the polygenic model was compared to that of a model with h2 constrained to 0. Under a bivariate polygenic model, phenotypic covariance between two traits is decomposed into genetic and environmental components to determine the extent to which traits are influenced by shared genetic effects. Since genetic correlations between traits are only meaningful if traits are heritable, we estimated heritability of all traits. Bivariate polygenic analyses were then applied to significantly heritable pairs of traits to estimate genetic (ρg), environmental (ρe), and phenotypic (ρp) correlations. The genetic correlation (ρg) denotes the correlation between latent additive genetic factors influencing both traits. The environmental correlation (ρe) denotes the correlation between nongenetic factors influencing both traits. To determine whether genetic (ρg) and environmental (ρe) correlations were significantly different from 0, likelihood of the bivariate polygenic model was compared to that of a model where the parameter of interest was constrained to 0.
Gene × Age interaction models
A polygenic model can be extended to examine Gene × Environment (G × E) interactions [Reference Glahn, Kent, Sprooten, Diego, Winkler and Curran36–Reference Almasy, Towne, Peterson and Blangero38]. One consequence of G × E is that additive genetic variance is greater under certain environments than others. To test for this effect with a continuous environmental variable (age), the polygenic model is modified to include a linear function on the logarithm of σ2g. This linear function contains a free parameter, γ, reflecting change in σ2g unit of the environmental variable (age in years). A nonzero value of γ implies a heritable response to the environment, and therefore, a G × E interaction. This G × E interaction tests for fluctuations (with age) in action of genetic factors and a significant G × E interaction suggests a change in magnitude of effect of specific genetic factors (with age). A second consequence of G × E is that the trait exhibits imperfect pleiotropy with itself, that is, the genetic correlation between the trait measured under one environment and the trait measured under another environment is less than 1. This phenomenon can be examined in cross-sectional studies where individuals are tested under a single environment (timepoint), provided relatedness between individuals is known [Reference Blangero37]. To test for this effect, the polygenic model is modified to include another free parameter, λ, reflecting the rate of decay in genetic correlation (ρg) as difference in the environmental variable increases. A nonzero value of λ implies imperfect pleiotropy, and therefore, a G × E interaction. This G × E interaction tests for variation in genetic factors influencing the trait (at different ages) and a significant G × E interaction suggests a change (with age) in the genetic factors themselves. G × E interaction models were fitted to heritable traits, with age in years as the continuous environmental variable that is, Gene × Age interactions. See Table S3 for more information.
All models included age, age2, sex, and their interactions as covariates. To adjust for multiple testing, false discovery rate (FDR) was set at 5% [Reference Benjamini and Yekutieli61]. Rank-based inverse normal transformations were applied to all traits to ensure normality.
In addition to the bifactor model described above, we applied a confirmatory hierarchical model (Figure 1b) in Mplus [Reference Muthén and Muthén52] to generate four correlated factors of psychopathology and a general factor. We generated the same factors as the bifactor model: specific factors of (a) anxious-misery, (b) externalizing, (c) fear, and (d) psychosis-spectrum, and (e) a general factor. In bifactor models, general and specific factors are orthogonal, whereas in hierarchical models, the general factor is defined by the specific factors and thus general and specific factors are explicitly correlated. Univariate, bivariate, and G × A analyses, as described above, were repeated on these factors to examine the contaminating effect, that is shared variance of general and specific factors. Table S4 shows factor loadings, Table S5 shows correlations between factors (and with cognition), Figure S2 shows test information plots.
Externalizing psychopathology is heritable
Significant heritability estimates were observed for general (h2 = 0.21, p = 0.040) and externalizing psychopathology (h2 = 0.46, p = 2 × 10−6), but only externalizing survived FDR correction (Figure 2 and Table 1). As reported previously [Reference Mollon, Knowles, Mathias, Gur, Peralta and Weiner23], most accuracy (h2 range = 0.21–0.72) and reaction time (h2 range = 0.23–0.38) measures were also heritable. Subsequent bivariate and G × A analyses were run on significantly heritable traits (after FDR correction).
Bolded estimates significant after correction for multiple testing (FDR = 0.05).
Abbreviations: AWRAT, Wide Range Achievement Test; SE, standard error.
Overlapping genetic factors on externalizing and cognition suggest pleiotropic effects
Externalizing showed significant negative phenotypic correlations with most accuracy measures (range ρp = −0.042 to −0.127) and some reaction time measures (range ρp = −0.036 to −0.66; Table 1) after FDR correction, such that greater psychopathology was associated with poorer accuracy and slower reaction times. Significant negative genetic correlations were observed after FDR correction between externalizing and accuracy measures of face memory (ρg = −0.412, p = 0.004), verbal reasoning (ρg = −0.485, p = 0.001), spatial processing (ρg = −0.426, p = 0.010), general verbal knowledge (ρg = −0.314, p = 0.002), g (ρg = −0.44, p = 0.002), and sensorimotor speed (ρg = −0.659, p = 1 × 10−4) suggesting that genetic factors underlying externalizing overlap with lower accuracy and slower reaction times.
Genetic variance on externalizing decreases with age
Significant decrease in genetic variance was observed on externalizing (γg = −0.146, p = 0.004; Figure 3), suggesting that specific genetic factors influence externalizing psychopathology between childhood and early adulthood, but also that the magnitude of effect of these genetic factors decreases with age. Significant increase in environmental variance was also observed (γe = 0.059, p = 0.009). Decay in genetic correlation did not reach statistical significance (λ = 0.027, p = 0.423), such that we did not find evidence for change in genetic factors, that is, novel genetic influences. Data presented in Figure 3 were generated using Formula 5 in Table S3.
Results of genetic analyses are robust to factor analytic approach
Univariate, bivariate, and G × A analyses conducted on psychopathology factors derived from the hierarchical model generated similar results. Externalizing (h2 = 0.58, p = 1x10−7) remained heritable, but general psychopathology (h2 = 0.37, p = 0.001) anxious-misery (h2 = 0.35, p = 0.002) and fear (h2 = 0.25, p = 0.011) were also significantly heritable, although fear did not survive FDR correction (Table S6 and Figure S3). General psychopathology and anxious-misery showed significant phenotypic correlations with cognition (Table S6), but only externalizing showed significant genetic correlations with verbal reasoning (ρg = −0.43, p = 0.001), general verbal knowledge (ρg = −0.25, p = 0.005), and g (ρg = −0.37, p = 3x10−6), after FDR correction. Again, significant decrease in genetic variance (γg = −0.112, p = 0.029) and increase in environmental variance (γe = 0.089, p = 0.008) were observed for externalizing (Figure S4) Change in genetic variance on general psychopathology (γg = 0.017, p = 0.279) and anxious-misery (γg = 0.023, p = 0.269) was not significant. Data presented in Figure S4 were generated using Formula 5 in Table S3.
Using a large, population-based cohort of individuals aged 8–21, we showed that externalizing psychopathology in the first two decades of life is under considerable genetic influence. Externalizing showed genetic overlap with lower performance on face memory, verbal reasoning, spatial processing, motor speed, verbal knowledge, and general cognitive ability. We did not find evidence for novel genetic factors on externalizing throughout this developmental period, rather we found a decrease in genetic variance, and increase in environmental variance. These findings have several implications for our understanding of early life psychopathology.
First, to the best of our knowledge, this is the first study to use a large, population-based cohort and genetic relatedness matrix to estimate heritability of data-driven factors of psychopathology (both orthogonal and correlated) throughout childhood and early adulthood. Our finding of heritable general psychopathology, anxious-misery, externalizing, and fear are in line with previous evidence [Reference Polanczyk, Moffitt, Arseneault, Cannon, Ambler and Keefe11–Reference Huizink, Van Den Berg, van der Ende and Verhulst15]. We did not find evidence for genetic influences on the psychosis-spectrum factor, in contrast to prior evidence [Reference Polanczyk, Moffitt, Arseneault, Cannon, Ambler and Keefe11,Reference Zavos, Freeman, Haworth, McGuire, Plomin and Cardno12]. However, SNP heritability estimates of psychotic symptoms are more modest, with a report of nonsignificant and zero estimates for paranoia and hallucinations, respectively [Reference Sieradzka, Power, Freeman, Cardno, Dudbridge and Ronald62]. Moreover, while there has been progress in delineating molecular genetic underpinnings of adult schizophrenia [Reference Ripke, O’Dushlaine, Chambert, Moran, Kähler and Akterin63], the same cannot be said of childhood psychotic symptoms [Reference Zammit, Hamshere, Dwyer, Georgiva, Timpson and Moskvina64]. Notably, subjects were less likely to endorse items that loaded on the psychosis-spectrum factor (average 11% endorsing), than on externalizing (21%), anxious-misery (15%), and fear (16%). Similarly, clinical diagnostic rates of attention deficit, oppositional defiant, and conduct disorders in PNC are 16, 33, and 7%, respectively [Reference Moore, Calkins, Satterthwaite, Roalf, Rosen and Gur40], while rate of threshold psychotic symptoms is 4% [Reference Calkins, Moore, Merikangas, Burstein, Satterthwaite and Bilker65]. Psychotic symptoms also emerge later during development than symptoms relating to externalizing, anxious-misery, and fear [Reference Häfner, Maurer, Löffler and Riecher-Rössler66,Reference Kessler, Berglund, Demler, Jin, Merikangas and Walters67].
In contrast, we found externalizing psychopathology to be significantly and moderately heritable, with genetic factors explaining 46–58% of the variance. Equally substantial genetic influences on externalizing between ages 8 and 26 have been reported in twin and adoption studies [Reference Hannigan, Walaker, Waszczuk, McAdams and Eley13–Reference Huizink, Van Den Berg, van der Ende and Verhulst15]. A very similar SNP heritability of 0.44 has also been reported for externalizing [Reference Pappa, Fedko, Mileva-Seitz, Hottenga, Bakermans-Kranenburg and Bartels68], but another study found SNP heritability estimates of zero for a range of externalizing problems [Reference Trzaskowski, Dale and Plomin69]. Differences between SNP and pedigree heritability estimates are likely due to rare variants not well captured by common SNPs, with SNP heritability representing the lower bound. Thus, using related individuals is more powerful than using unrelated individuals when estimating heritability, and the combination of all possible relationships, as in our sample, results in something of a hybrid between SNP and pedigree estimates. However, shared environment may contribute to heritability inflation when using related individuals. Nevertheless, rare variants account for a significant proportion of total heritability [Reference Zaitlen, Kraft, Patterson, Pasaniuc, Bhatia and Pollack70], and our heritability estimates are generally in line with previous studies.
Importantly, we found significant heritability estimates for general psychopathology and anxious-misery factors generated from a hierarchical, but not bifactor, model. This finding highlights the utility of bifactor models, in which all factors (both general and specific) are orthogonal [Reference Moore, Calkins, Satterthwaite, Roalf, Rosen and Gur40,Reference Knowles, Carless, de Almeida, Curran, McKay and Sprooten71], thereby allowing us to examine the unique contribution of genetic factors to general psychopathology and specific factors of anxious-misery, externalizing, fear, and psychosis-spectrum [Reference Reise, Moore and Haviland43,Reference Reise44]. This finding, as well as the finding of stronger genetic influences on externalizing than general psychopathology, is also in line with previous evidence of genetic signal on specific cognitive factors emerging only when variance associated with general cognitive ability is parsed out, that is when applying a bifactor model [Reference Knowles, Carless, de Almeida, Curran, McKay and Sprooten71]. In hierarchical models, on the other hand, variance associated with the general factor may account for significant findings in the specific factors, and vice versa. Thus, our findings suggest that genetic effects underlying early-life anxiety and depression may underlie more general psychopathology. Estimates of genetic influence on childhood anxiety [Reference Gregory and Eley17,Reference Thapar and McGuffin18], depression [Reference Rice, Harold and Thapar19,Reference Birmaher, Ryan, Williamson, Brent, Kaufman and Dahl20], and phobias [Reference Lichtenstein and Annas72,Reference Kendler, Gardner, Annas, Neale, Eaves and Lichtenstein73] are heterogeneous, ranging from small to large (h 2 = 0.15–0.77) for anxiety and depression [Reference Gregory and Eley17,Reference Rice, Harold and Thapar19], and for phobias (h 2 = 0.23–0.72) [Reference Lichtenstein and Annas72,Reference Kendler, Gardner, Annas, Neale, Eaves and Lichtenstein73].
Second, we found negative genetic correlations between externalizing and face memory, verbal reasoning, spatial processing, sensorimotor speed, verbal knowledge, and general cognitive ability (g). These findings are in line with previous evidence of genetic overlap between ADHD and general and executive cognitive functions [Reference Bidwell, Willcutt, Defries and Pennington27–Reference Savage, Jansen, Stringer, Watanabe, Bryois and de Leeuw32,Reference Donati, Dumontheil and Meaburn35], but expand knowledge by showing that genetic overlap extends to externalizing psychopathology more generally, as well as to memory, complex reasoning, and speed functions. Similarly, in the same sample, genetic overlap was reported between inattention and memory, social cognition, executive function, and complex reasoning [Reference Micalizzi, Brick, Marraccini, Benca-Bachman, Palmer and Knopik74]. Genetic overlap between externalizing and cognitive functioning may be due to the same genetic factors influencing both traits. A recent genome-wide meta-analyses identified shared risk loci for ADHD and intelligence [Reference O’Connell, Shadrin, Smeland, Bahrami, Frei and Bettella31]. Rare genetic variants have also been identified for cognition [Reference Huguet, Schramm, Douard, Tamer, Main and Monin75,Reference Kendall, Bracher-Smith, Fitzpatrick, Lynham, Rees and Escott-Price76] and ADHD [Reference Harich, van der Voet, Klein, Cizek, Fenckova and Schenck77], with overlapping genes implicated in ADHD studies of common and rare variants [Reference Martin, O’Donovan, Thapar, Langley and Williams78]. However, while there is evidence of common and rare variants disrupting similar biological pathways in ADHD [Reference Stergiakouli, Hamshere, Holmans, Langley, Zaharieva and Hawi79], the neurobiological mechanisms underlying shared genetic influences on ADHD and intelligence remain unclear [Reference O’Connell, Shadrin, Smeland, Bahrami, Frei and Bettella31]. Future studies that utilize animal models are needed to elucidate the causal and biological pathways underlying these shared genetic influences. Alternatively, cognitive impairments may lead to externalizing, and/or vice versa. Importantly, different cognitive measures show different psychometric properties and associations with specific functions require replication. Future longitudinal studies incorporating behavioral, neuroimaging, and genetic data can further disentangle these associations. Nevertheless, our findings suggest that cognitive impairment may be a useful endophenotype [Reference Glahn, Knowles, McKay, Sprooten, Raventós and Blangero80,Reference Gottesman and Gould81] of externalizing psychopathology. Interestingly, the WRAT, a measure of verbal knowledge showed the strongest phenotypic correlation with externalizing, but the weakest (statistically significant) genetic correlation. This finding highlights the importance of elucidating genetic underpinnings of phenotypic associations to delineate biological etiology.
Finally, we found static and dynamic genetic influences on externalizing psychopathology between childhood and early adulthood, in line with previous evidence [Reference Hannigan, Walaker, Waszczuk, McAdams and Eley13–Reference Huizink, Van Den Berg, van der Ende and Verhulst15,Reference Kan, Dolan, Nivard, Middeldorp, van Beijsterveldt and Willemsen82]. While we did not find evidence for novel genetic influences, we found a decrease in genetic variance, and increase in environmental variance. Our findings are in line with those of Huizink [Reference Huizink, Van Den Berg, van der Ende and Verhulst15], who found a decrease in genetic influences on externalizing from 43 to 29% between age 12 and 26, as well as an increase in environmental influences from 39 to 52% between these ages. Similarly, Wichers [Reference Wichers, Gardner, Maes, Lichtenstein, Larsson and Kendler14] found a decrease in genetic effects from 78 to 73% and an increase in environmental effects from 20 to 26% between age 8 and 20, but also reported novel genetic influences throughout adolescence [Reference Wichers, Gardner, Maes, Lichtenstein, Larsson and Kendler14]. Huizink, on the other hand, reported novel environmental, but not genetic, influences [Reference Huizink, Van Den Berg, van der Ende and Verhulst15]. Several phenomena may underlie these G × A interactions. Genes may become less expressed due to maturational processes involving hormonal and physical development [Reference Hannigan, Walaker, Waszczuk, McAdams and Eley13]. Increasing environmental influences likely reflect growing peer influences and substance use [Reference Miettunen, Murray, Jones, Maki, Ebeling and Taanila83]. Of note, environmental variance in this study incapsulates all variance that is not genetic. However, measurement error is unlikely to account for our findings since reliability of our factors is high (Figures S1 and S2). Moreover, method of symptom reporting differed by age, and genetic effects may differ by reporting method. For example, Scourfield et al. [Reference Scourfield, Van den Bree, Martin and McGuffin84] found heritability estimates of 54 and 35% for parent- and self-reported conduct problems, respectively. Nevertheless, G × A interaction analyses adjusting additionally for reporting method generated similar findings, with significant decrease in genetic variance (γg = −0.151, p = 0.002; γg = −0.112, p = 0.026), and increase in environmental variance (γe = 0.058, p = 0.006; γe = 0.092, p = 0.007) on externalizing factors derived from bifactor and hierarchical models, respectively. Future studies that are able to combine longitudinal, self- and parent-report symptom data will help elucidate these age-associated effects further. Conversely, we previously reported increasing genetic and environmental variance on general cognitive ability in this sample [Reference Mollon, Knowles, Mathias, Gur, Peralta and Weiner23]. Thus, while a proportion of genetic factors underlying externalizing and cognition overlap, other, nonoverlapping genetic influences may show diverging developmental trajectories. Advanced quantitative genetic methods may shed light on trajectories of shared genetic influences.
This study has limitations. First, our analyses were restricted to European American individuals and future studies should include other populations. Second, our data were cross-sectional and longitudinal studies with repeated assessments of identical measures and individuals are needed to fully establish age-related changes in genetic factors. Third, lower heritability estimates for general psychopathology and anxious-misery meant less power to detect genetic correlations with cognition and G × A interactions. Finally, although we used a large sample, and comprehensive assessments of cognition and psychopathology, our findings require replication.
The authors thank all study participants.
Conflicts of Interest
The authors declare no conflicts of interest.
This research was supported by National Institute of Mental Health grants R01 MH107248 and MH107235.
To view supplementary material for this article, please visit http://dx.doi.org/10.1192/j.eurpsy.2021.21.
Data Availability Statement
The data that support the findings of this study are openly available in the database of Genotypes and Phenotypes (dbGaP): https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000607.v3.p2.