Trust connects individuals, but the obverse – mistrust – disconnects. Our view is that excessive mistrust, paranoia, is corrosive for mental health, relationships, and societal well-being. Many people have a few paranoid thoughts, a few people have many (Freeman et al., Reference Freeman, Garety, Bebbington, Smith, Rollinson, Fowler, Kuipers, Ray and Dunn2005). Excessive mistrust is common in adolescents (Wong et al., Reference Wong, Freeman and Hughes2014; Bird et al., Reference Bird, Evans, Waite, Loe and Freeman2019), adults (Bebbington et al., Reference Bebbington, McBride, Steel, Kuipers, Radovanovic, Brugha, Jenkins, Meltzer and Freeman2013; Elahi et al., Reference Elahi, Perez Algorta, Varese, McIntyre and Bentall2017), and older adults (Östling and Skoog, Reference Östling and Skoog2002; Cohen et al., Reference Cohen, Magai, Yaffee and Walcott-Brown2004). Paranoia in its severest form, persecutory delusion, is seen clinically in conditions such as schizophrenia. Over the past 20 years, paranoia has become an increasing focus of research. As part of this research endeavor, the Green et al., Paranoid Thoughts Scale (GPTS; Green et al., Reference Green, Freeman, Kuipers, Bebbington, Fowler, Dunn and Garety2008) was developed to assess paranoia across the spectrum of severity. Published in this journal, the GPTS is recommended as the best current measure of paranoia (Statham et al., Reference Statham, Emerson and Rowse2019). In this paper, we use data collected over the past decade to rigorously assess the psychometric properties of the GPTS and provide score ranges with clinical cut-offs to enable interpretation of scale scores.
The central focus of the GPTS is on the occurrence of recent persecutory ideation (Part B), since this is the content of persecutory delusions. Scale items were generated based on a definition that persecutory ideation consists of believing that harm is going to occur and that the perpetrator has the deliberate intention to cause this harm (Freeman and Garety, Reference Freeman and Garety2000). It is the strength of the GPTS Part B questionnaire that all item content is clearly of a persecutory nature (e.g. ‘Certain individuals have had it in for me.’ ‘People wanted me to feel threatened, so they stared at me.’), whereas older scales such as the Paranoia Scale (Fenigstein and Vanable, Reference Fenigstein and Vanable1992) predominately contain items that do not meet the definition of persecutory ideation (e.g. ‘No one really cares much what happens to you’, ‘I am sure I get a raw deal from life’). The development of the GPTS was also influenced by a theoretical perspective that there is a hierarchy of paranoia (Freeman et al., Reference Freeman, Garety, Bebbington, Smith, Rollinson, Fowler, Kuipers, Ray and Dunn2005): persecutory ideation typically builds upon ideas of reference and other social-evaluative concerns. The results of factor analytic studies support this coupling of ideas of reference and persecution (e.g. Paolini et al., Reference Paolini, Moretti and Compton2016). Hence the GPTS also included a scale assessing the related but less severe phenomenon of social ideas of reference (Part A). Item content for Part A was developed in line with criteria building on the work of Startup and Startup (Reference Startup and Startup2005): ‘The person holds the belief that some neutral event has special personal significance/refers to them personally by means of observation or communication by another.’ The items were also designed to capture the dimensions of conviction, preoccupation, and distress. In the initial validation, principal components analysis of a 93-item pool for the GPTS completed by 353 university students indicated two components: persecution and social ideas of reference. The item pool was also completed by 50 patients with current persecutory delusions in the context of psychosis (Green et al., Reference Green, Freeman, Kuipers, Bebbington, Fowler, Dunn and Garety2008). Items for the two 16-item Part A (reference) and Part B (persecution) scales were selected by considering: factor loadings, item-scale correlation, item variance, level of item endorsement, ability to discriminate between the non-clinical and clinical group, and overall face validity of the scales. In the original scale paper, the GPTS demonstrated good psychometric properties, including test-retest reliability, convergent validity, and sensitivity to change. Further confirming the construct validity of the scale, GPTS scores are associated with the occurrence of unfounded paranoia in virtual reality simulations of neutral social situations (e.g. Freeman et al., Reference Freeman, Pugh, Antley, Slater, Bebbington, Gittins, Dunn, Kuipers, Fowler and Garety2008; Freeman et al., Reference Freeman, Pugh, Vorontsova, Antley and Slater2010).
The GPTS is now the most commonly used measure of paranoia in research studies (e.g. Scott et al., Reference Scott, Rowse and Webb2017; Raihani and Bell, Reference Raihani and Bell2018; Rehman et al., Reference Rehman, Gumley and Biello2018) and clinical trials (e.g. Freeman et al., Reference Freeman, Dunn, Startup, Pugh, Cordwell, Mander, Černis, Wingham, Shirvell and Kingdon2015a, Reference Freeman, Evans, Černis, Lister and Dunn2015b, Reference Freeman, Waite, Startup, Myers, Lister, McInerney, Harvey, Geddes, Zaiwalla, Luengo-Fernandez, Foster, Clifton and Yu2015c; van den Berg et al., Reference van den Berg, de Bont, van der Vleugel, de Roos, de Jongh, van Minnen and van der Gaag2016; Garety et al., Reference Garety, Ward, Freeman, Fowler, Emsley, Dunn, Kuipers, Bebbington, Waller, Greenwood, Rus-Calafell, McGourty and Hardy2017). In a recent review of self-report measures of paranoia, Statham et al. (Reference Statham, Emerson and Rowse2019) conclude: ‘on the basis of current evidence, the GPTS offers the most valid and informative assessment of paranoia. However some psychometric findings (e.g. internal consistency, structural validity) require replication with a larger sample.’ In this paper, we pool our data over the past decade from both clinical studies of psychosis and of non-clinical paranoia in the general population to evaluate the psychometric properties of the GPTS. We had three objectives. First, we wanted to assess the validity of the basic two-factor structure of the GPTS with a much larger sample size. Second, we wanted to use item response theory (IRT) (Reise and Henson, Reference Reise and Henson2003) to evaluate item properties, test reliability, and measurement invariance between different ages and genders. Third, we wanted to enable the interpretation of scale scores by specifying severity ranges. In future research, it would be valuable to know the level of the paranoia spectrum that is being studied, since this is opaque in many non-clinical studies that either do not select for paranoia score or use median splits to categorize purportedly ‘high’ and ‘low’ paranoia groups. In clinical studies and settings, it would also be valuable to have a screening tool for whether an individual is likely to have a persecutory delusion.
There were a total of 10 551 participants with GPTS data from 16 studies. In total, 1228 participants completed both GPTS Part A (ideas of reference) and Part B (ideas of persecution). Of these participants, 1212 had complete data (with no missing items) for both Parts A and B, 1218 had complete data (with no missing items) for Part A, and 1221 had complete data (with no missing items) for Part B. This included 806 participants recruited from the general population from four studies (Freeman et al., Reference Freeman, Pugh, Antley, Slater, Bebbington, Gittins, Dunn, Kuipers, Fowler and Garety2008, Reference Freeman, Thompson, Vorontsova, Dunn, Carter, Garety, Kuipers, Slater, Antley, Glucksman and Ehlers2013b, Reference Freeman, Evans, Černis, Lister and Dunn2015b; and one yet unpublished) and 422 patients with psychosis from eight studies (Freeman et al., Reference Freeman, Pugh, Dunn, Evans, Sheaves, Waite, Cernis, Lister and Fowler2014, Reference Freeman, Dunn, Startup, Pugh, Cordwell, Mander, Černis, Wingham, Shirvell and Kingdon2015a, Reference Freeman, Waite, Startup, Myers, Lister, McInerney, Harvey, Geddes, Zaiwalla, Luengo-Fernandez, Foster, Clifton and Yu2015c, Reference Freeman, Waite, Emsley, Kingdon, Davies, Fitzpatrick and Dunn2016, Reference Freeman, Bradley, Antley, Bourke, DeWeever, Evans, Černis, Sheaves, Waite, Dunn, Slater and Clark2016a, Reference Freeman, Bradley, Waite, Sheaves, DeWeever, Bourke, McInerney, Evans, Černis, Lister, Garety and Dunn2016b, Reference Freeman, Lister, Waite, Yu, Slater, Dunn and Clark2019a; Bradley et al., Reference Bradley, Freeman, Chadwick, Harvey, Mullins, Johns, Sheaves, Lennox, Broome and Waite2018). An additional 9324 participants provided complete GPTS data with no missing items for Part B only, including 3826 individuals from two general population studies (Freeman et al., Reference Freeman, Morrison, Murray, Evans, Lister and Dunn2013a; Brown et al., Reference Brown, Waite, Rovira, Nickless and Freemansubmitted), 3755 university students with insomnia (Freeman et al., Reference Freeman, Sheaves, Goodwin, Yu, Nickless, Harrison, Emsley, Luik, Foster, Wadekar, Hinds, Gumley, Jones, Lightman, Jones, Bentall, Kinderman, Rowse, Brugha, Blagrove, Gregory, Fleming, Walklet, Glazebrook Davies, Hollis, Haddock, John, Coulson, Fowler, Pugh, Cape, Mosely, Brown, Hughes, Obonsawin, Coker, Watkins, Schwannauer, MacMahon, Siriwaardena and Espie2017), and 1743 patients with non-affective psychosis (Freeman et al., Reference Freeman, Taylor, Molodynski and Waite2019b). This provided a total of 10 545 participants with complete data for the GPTS Part B.
Participant subgroups, based on clinical information, were created for descriptive reports and the ROC analysis. Nine hundred and thirty-seven participants from three general population studies who reported non-psychotic mental health disorders were included in a mental health problems subgroup. This included 32 people who reported being treated for anxiety and/or depression (Freeman et al., Reference Freeman, Thompson, Vorontsova, Dunn, Carter, Garety, Kuipers, Slater, Antley, Glucksman and Ehlers2013b), 236 participants who reported current (non-psychotic) mental health problems (Freeman et al., Reference Freeman, Morrison, Murray, Evans, Lister and Dunn2013a), and 669 participants reporting current contact with mental health services for (non-psychotic) mental health problems (Freeman et al., Reference Freeman, Sheaves, Goodwin, Yu, Nickless, Harrison, Emsley, Luik, Foster, Wadekar, Hinds, Gumley, Jones, Lightman, Jones, Bentall, Kinderman, Rowse, Brugha, Blagrove, Gregory, Fleming, Walklet, Glazebrook Davies, Hollis, Haddock, John, Coulson, Fowler, Pugh, Cape, Mosely, Brown, Hughes, Obonsawin, Coker, Watkins, Schwannauer, MacMahon, Siriwaardena and Espie2017). Participants who reported not having any mental health problems and participants from the remaining general population studies with no mental health information formed a non-clinical subgroup (n = 7297). The psychosis subgroup consisted of 1804 patients from three clinical studies who had been recruited based on a diagnosis of psychotic disorder (Freeman et al., Reference Freeman, Waite, Startup, Myers, Lister, McInerney, Harvey, Geddes, Zaiwalla, Luengo-Fernandez, Foster, Clifton and Yu2015c, Reference Freeman, Lister, Waite, Yu, Slater, Dunn and Clark2019a; Bradley et al., Reference Bradley, Freeman, Chadwick, Harvey, Mullins, Johns, Sheaves, Lennox, Broome and Waite2018). The persecutory delusion group consisted of 360 patients with psychosis, from six clinical trials, recruited for the presence of a persecutory delusion (Freeman et al., Reference Freeman, Pugh, Dunn, Evans, Sheaves, Waite, Cernis, Lister and Fowler2014, Reference Freeman, Dunn, Startup, Pugh, Cordwell, Mander, Černis, Wingham, Shirvell and Kingdon2015a, Reference Freeman, Waite, Emsley, Kingdon, Davies, Fitzpatrick and Dunn2016, Reference Freeman, Bradley, Antley, Bourke, DeWeever, Evans, Černis, Sheaves, Waite, Dunn, Slater and Clark2016a, Reference Freeman, Bradley, Waite, Sheaves, DeWeever, Bourke, McInerney, Evans, Černis, Lister, Garety and Dunn2016b, Reference Freeman, Lister, Waite, Yu, Slater, Dunn and Clark2019a). One hundred and forty-seven participants from the general population studies were not included in any of the subgroups due to a self-reported personal or family history of psychosis. This included 45 participants with a diagnosis of a severe mental illness such as bipolar disorder or schizophrenia (Brown et al., Reference Brown, Waite, Rovira, Nickless and Freemansubmitted), nine participants with a diagnosis of a psychotic disorder (Freeman et al., Reference Freeman, Sheaves, Goodwin, Yu, Nickless, Harrison, Emsley, Luik, Foster, Wadekar, Hinds, Gumley, Jones, Lightman, Jones, Bentall, Kinderman, Rowse, Brugha, Blagrove, Gregory, Fleming, Walklet, Glazebrook Davies, Hollis, Haddock, John, Coulson, Fowler, Pugh, Cape, Mosely, Brown, Hughes, Obonsawin, Coker, Watkins, Schwannauer, MacMahon, Siriwaardena and Espie2017), and 93 participants with a reported family history of psychosis (Freeman et al., Reference Freeman, Morrison, Murray, Evans, Lister and Dunn2013a).
Green et al. Paranoid Thoughts Scale
The GPTS is a thirty-two item self-report measure of paranoia, designed for both clinical and non-clinical populations (Green et al., Reference Green, Freeman, Kuipers, Bebbington, Fowler, Dunn and Garety2008). Part A assesses ideas of reference (e.g. ‘It was hard to stop thinking about people talking about me behind my back’) and Part B assesses ideas of persecution (e.g. ‘I was convinced there was a conspiracy against me’). Each item is rated on a five-point scale (1–5). Scores on each scale can range from 16 to 80. Higher scores indicate greater levels of paranoid thinking.
All analyses were conducted in R, version 3.5 (R Core Team, 2013). Packages used included ‘psych’ (Revelle, Reference Revelle2018), ‘mirt’ (Chalmers, Reference Chalmers2012), ‘pROC’ (Robin et al., Reference Robin, Turck, Hainard, Tiberti, Lisacek, Sanchez and Müller2011), and ‘optimumCutpoints’ (Lopez-Raton et al., Reference Lopez-Raton, Xose Rodriguez-Alvarez, Cadarso Suarez and Gude Sampedro2014).
The factor structure of the GPTS was assessed in the 1212 participants with complete GPTS data from both Parts A and B. Factor analysis was appropriate as Bartlett's test of Sphericity was significant (χ2 = 46 249.4, df = 496, p < 0.001) and the Kaiser–Meyer–Olkin test of sampling adequacy was excellent (KMO = 0.98). Confirmatory factor analysis (CFA) using the MLR robust maximum likelihood estimator was first conducted to examine the model fit of the two-factor structure identified in the initial GPTS validation study (Green et al., Reference Green, Freeman, Kuipers, Bebbington, Fowler, Dunn and Garety2008). Model fit was assessed using a Comparative Fit Index (CFI) and Tucker–Lewis index (TLI) of >0.95, a Root Mean Square Error of Approximation (RMSEA) of <0.06, and a Standardized Root Mean Square Residual (SRMR) of <0.08 (Hu and Bentler, Reference Hu and Bentler1999). Based on the outcome of the CFA, exploratory factor analysis (EFA) was then conducted using principal axis factoring and oblique rotation. For the revised GPTS, items were considered for deletion by assessing the factor loadings, residuals, and content of items.
There are a number of helpful introductory and detailed descriptions of IRT techniques available (e.g. Reise and Waller, Reference Reise and Waller2009; Embretson and Reise, Reference Embretson and Reise2013; van der Linden and Hambleton, Reference van der Linden and Hambleton2013). IRT analyses were conducted using all available data for each subscale of the GPTS (Part A = 1218, Part B = 10 545). Where appropriate, unidimensional IRT analyses were conducted to examine the item and test properties of the individual factors of the GPTS. IRT was only conducted if the assumption of unidimensionality was met. The EFA and Mokken analysis were used to evaluate whether items conform to a single scale, with Loevinger's H above 0.3 indicating unidimensionality (Stochl et al., Reference Stochl, Jones and Croudace2012). A two-parameter graded response model (GRM) was fitted to the items (Samejima, Reference Samejima1969). Person fit statistics were calculated to detect outliers where the pattern of responses across the items was atypical and therefore likely guided by other response mechanisms (e.g. random responding). Participants with atypical response patterns, determined by extreme person fit statistic scores (z < −3 or >3), were excluded (Felt et al., Reference Felt, Castaneda, Tiemensma and Depaoli2017).
The item and test parameters derived from the IRT analysis are expressed as a function of θ, representing the continuum of the latent trait (i.e. paranoia) where values denote standard deviations from the average level (θ = 0). As such, higher values of θ represent more severe paranoia. The ability of each item to discriminate different levels of paranoia is denoted by the discrimination parameter (a), with higher values indicating small shifts in severity lead to increases in the probability that an item will be endorsed. Discrimination parameters above 1 are highly discriminative, whilst those below 0.5 are considered unacceptable (Baker and Kim, Reference Baker and Kim2017). The difficulty parameters (b) describe the level of severity that the item measures, with the four difficulty parameters for each item denoting the 50% probability of responding at the boundary between each response option. Higher difficulty parameters indicate that the item responses typically measure more severe levels of paranoia.
The reliability of the GPTS was evaluated using the test information (TI) function, representing the precision of the measure at different points along the θ spectrum. To aid interpretation, the TI at specific values of θ were converted to an equivalent α reliability on a 0–1 scale with the formula 1/√TI(θ) (O'Connor, Reference O'Connor2018). To evaluate measurement invariance, we conducted differential item functioning (DIF) analysis for age and gender, with the criteria of a β change above 10% and a pseudo R 2 above 0.13 indicating significant item variance (Crane et al., Reference Crane, Gibbons, Ocepek-Welikson, Cook, Cella, Narasimhalu, Hays and Teresi2007; Choi et al., Reference Choi, Gibbons and Crane2011). The presence of DIF reflects a measurement bias where demographic factors influence the way participants respond to the items (Holland and Wainer, Reference Holland and Wainer2012).
Determining score ranges and clinical cut-offs
The expected score function from the IRT analysis was used to examine score ranges, providing the expected total score at different points of the θ spectrum. To assess the accuracy of the expected score function, we examined the model fit of the GRM for the data and the correlation between θ scores and raw total scores. Receiver operating characteristic (ROC) analyses were conducted using the 360 patients with a confirmed persecutory delusion as the discrimination group and non-clinical participants from the general population (n = 7297) as the control group. The area under the curve (AUC) was used to evaluate the ability of the GPTS to discriminate people with persecutory delusions from the control group, with values above 0.70 considered fair, over 0.80 good, and over 0.90 excellent (Egan, Reference Egan1975). The cut-off score providing the optimal balance of sensitivity and specificity was then calculated based on Youden's J statistic (Youden, Reference Youden1950). Cut-off scores were incorporated with the expected score function to determine the score ranges.
The initial CFA in the 1212 participants with complete data for the full GPTS demonstrated the original two-factor structure of Part A and B had an inadequate model fit (χ2 = 2599.2, df = 463, CFI = 0.91, TLI = 0.90, RMSEA = 0.087, SRMR = 0.038). An EFA was therefore conducted on the 32 items to explore the factor structure (see online Supplementary Materials). Although a parallel analysis suggested a four-factor model, none of the 32 items loaded uniquely on a third or fourth factor when these solutions were extracted. As a result, a two-factor model was still considered the best solution. This identified that although all 16 persecution items strongly loaded on the same factor with no cross-loadings, only 10 of the social reference items loaded onto a unique factor. Within the social reference scale, four items loaded on both factors (‘I was convinced that people were singling me out’, ‘People have been checking up on me’, ‘I was stressed by people watching me’, and ‘I was worried by people's undue interest in me’), and two items loaded only on the persecution factor (‘I was certain that people have followed me’ and ‘Certain people were hostile towards me personally’).
These findings suggest that the 16 social reference items do not have a coherent factor structure and therefore cannot be considered a unidimensional scale. However, as all 16 persecution items loaded strongly on one factor that can be treated as a unidimensional subscale to measure ideas of persecution. Mokken analysis confirmed all 16 persecution items were within a single factor, with Loevinger's H coefficients above 0.3 for all items and an overall coefficient of homogeneity of 0.699 (s.e. = 0.005). Mean GPTS persecution scores for each of the four participant subgroups are shown in Table 1.
We report the properties of the GPTS Persecution scale (Part B), which can inform the understanding of previous studies that have used this scale. Although the 16 items had a coherent unidimensional factor structure, several items had correlated residuals (Yen's Q3>0.2), suggesting local dependence within the items. The IRT analysis should therefore be interpreted with caution. Following the removal of participants with atypical response patterns (n = 190), a GRM with the remaining 10 355 participants demonstrated a good fit to the data (CFI = 0.99, TLI = 0.98, SRMSR = 0.037, RMSEA = 0.068).
The item parameters for the GPTS Persecution scale are provided in the online Supplementary Materials. All 16 items were highly discriminative of shifts in paranoia (a = 2.45–5.37). The most discriminating items were ‘I was distressed by being persecuted’ (a = 5.37, s.e. = 0.12) and ‘The thought that people were persecuting me played on my mind’ (a = 5.07, s.e. = 0.11). High difficulty parameters for a response of 0–1 (b 1) on the items ‘I was convinced there was a conspiracy against me’ (b 1 = 0.79, s.e. = 0.02), ‘I was sure someone wanted to hurt me’ (b 1 = 0.78, s.e. = 0.01), ‘People wanted me to feel threatened, so they stared at me’ (b 1 = 0.76, s.e. = 0.02), and ‘I was distressed by being persecuted’ (b 1 = 0.76, s.e. = 0.01) suggest any endorsement of these items, even at a low level, are indicative of high paranoia severity (>0.75 s.d. above average). In contrast, low-level endorsement on the items ‘I was certain people did things in order to annoy me’ (b 1 = −0.09, s.e. = 0.02) and ‘Certain individuals have had it in for me’ (b 1 = 0.22, s.e. = 0.01) is in line with average levels of paranoia in the population. For each of the 16 items, full endorsement (b 4; response of 4) indicates a severe level of persecutory ideation (>1.50 s.d. above average).
Overall reliability was high across the spectrum of paranoia severity, with α values >0.90 within the θ range of 0.27 below and 2.51 s.d. above average levels of paranoia, and α >0.95 between 0.015 below and 2.29 s.d. above average. This shows the persecution scale is most reliable at heightened levels of severity, with a maximum α of 0.99 (TI = 78.3, s.e. = 0.11) at 1.14 s.d. above average paranoia. All 16 persecution items were invariant between men (n = 3677) and women (n = 4830), and between age groups (13–21 years, n = 2636; 22–29 years, n = 3047; 30–44 years, n = 2440; 45+ years, n = 2232), in the DIF analysis (pseudo R 2 change <0.13 and β change <10%).
The total score from the 16 original persecution items was highly correlated with the participant θ scores (r = 0.90), indicating the total score has a high level of precision. The expected score function (supplementary materials) showed most people are unlikely to endorse many persecution items, with an expected score of 19.1 (minimum 16) at the average level of paranoia in the population. Expected scores increase as the level of trait paranoia increases, with expected scores of 26.7 at 0.5 s.d. above average, 42.6 at 1.0 s.d. above average, 60.9 at 1.5 s.d. above average, and 74.1 at 2.0 s.d. above average.
The GPTS Persecution score ranges are shown in Table 2. Our recommended cut-off for identifying moderately severe persecutory ideation is a score of 35 or above, representing 0.80 s.d. above the average level of paranoia in the population. ROC analysis identified 35 as the optimal cut-off point (sensitivity = 0.931, 95% CI 0.903–0.955; specificity = 0.878, 95% CI 0.870–0.885) to discriminate patients with persecutory delusions (n = 360) from the non-clinical group (n = 7297), with an overall AUC of 0.959 (95% CI 0.950–0.969).
Note. The θ values represent standard deviations above the average (θ = 0) population level.
Although a score of 35 most accurately discriminates the delusion group from a non-clinical sample, individuals with persecutory delusions typically score well above this level with a mean score of 58.7 (s.d. = 14.8) and a lower quartile of 49. Our recommended cut-off to identify severe persecutory ideation and the likely presence of a persecutory delusion is 45, representing 1.10 s.d. above the average level of paranoia in the population. The ROC analysis demonstrates that a cut-off of 45 is unlikely to incorrectly identify an individual as having a persecutory delusion when they do not (specificity = 0.94, 95% CI 0.93–0.95), while still being able to identify the majority of patients with confirmed persecutory delusions (sensitivity = 0.81, 95% CI 0.77–0.85). As shown in Table 3, scores above this level were present in 81% (n = 293) of the patients with persecutory delusions.
The Revised GPTS
Due to the problematic factor structure of the GPTS Part A and the local dependence in items in Part B, we derived a Revised-Green et al., Paranoid Thoughts Scale (R-GPTS). The six items from Part A that loaded on the persecution factor in the initial EFA were deleted, providing a clean two-factor structure. Five Part B items were deleted due to highly correlated residuals with other items (‘I have definitely been persecuted’, ‘People have intended me harm’, ‘I was distressed by people wanting to harm me in some way’, ‘I was annoyed because others wanted to deliberately upset me’, ‘The thought that people were persecuting me played on my mind’). One further item was deleted from Part B due to potentially confusing wording (‘I was preoccupied with thoughts of people trying to upset me deliberately’). Two additional social reference items were deleted due to loading on the persecution factor in the revised EFA (‘I was frustrated by people laughing at me’ and ‘It was hard to stop thinking about people talking about me behind my back’).
Parallel analysis of the remaining 18 items suggested a two-factor model was now the best solution (see online Supplementary Materials). The final model with an eight-item Reference scale and a 10-item Persecution scale provided a clean factor structure explaining 69% of the variance with a good model fit (χ2 = 535.3, df = 134, p < 0.001, CFI = 0.96, TLI = 0.96, RMSEA = 0.068, SRMR = 0.031). None of the items in either revised subscale had correlated residuals above 0.20. Mokken analysis confirmed the subscales can be treated as unidimensional constructs, with Loevinger's H coefficients above 0.3 for all items and high overall coefficients of homogeneity (Reference: H = 0.637, s.e. = 0.013; Persecution: H = 0.675, s.e. = 0.005). The two factors were highly correlated (r = 0.79). The scaling of the individual items was changed to 0–4 to enable easier interpretation of total scores.
R-GPTS reference scale
IRT was conducted with the 1224 participants with complete data for the eight R-GPTS Reference items. Following removal of participants with atypical response patterns (n = 4), a GRM demonstrated a good model fit to the items (CFI = 0.99, TLI = 0.99, SRMSR = 0.028, RMSEA = 0.064).
All eight items were highly discriminative of ideas of social reference, with parameters ranging from 2.10 to 3.69 (Table 4). The most highly discriminating item was ‘People definitely laughed at me behind my back’ (a = 3.69, s.e. = 0.21). Unlike the persecution scale, the difficulty parameters show for all eight items, low-level endorsement (response 0–1) likely represents average levels of ideas of reference within the population (b 1 = −0.50 to 0.30). The items where moderate endorsement (b 2 and b 3) most strongly represents heightened severity were ‘People have been dropping hints for me’ (b 2 = 0.83, s.e. = 0.05; b 3 = 1.31, s.e. = 0.06) and ‘I have been thinking a lot about people avoiding me’ (b 2 = 0.66, s.e. = 0.05; b 3 = 1.26, s.e. = 0.06). For each of the eight items, full endorsement (response 3–4) represents more severe ideas of reference (>1.20 s.d. above average). The R-GPTS Reference scale has good reliability across a wide range of the spectrum of ideas of reference, with α values above 0.90 within 0.47 s.d. below and 2.03 s.d. above average levels of social reference (Fig. 1). The maximum reliability was 0.95 (TI = 19.4, s.e. = 0.23) at 0.82 s.d. above average.
Note: a = discrimination, b = difficulty parameters at the category thresholds between 0–1 (b 1), 1–2 (b 2), 2–3 (b 3), and 3–4 (b 4).
All R-GPTS Reference items were invariant between men (n = 644) and women (n = 584), and between age groups (15–28 years, n = 309; 29–39 years, n = 315; 40–50 years, n = 303; 51+ years, n = 301), in the DIF analysis (pseudo R 2 change <0.13 and β change <10%).
R-GPTS persecution scale
IRT analysis was conducted with the 10 revised persecution items. Following removal of participants with atypical response patterns (n = 54), a GRM with the remaining 10 491 participants demonstrated a good fit to the data (CFI = 0.99, TLI = 0.99, SRMSR = 0.030, RMSEA = 0.062).
The item parameters are shown in Table 4. All 10 persecution items were highly discriminative of shifts in paranoia, with parameters ranging from (a = 2.43–4.45). As with the original GPTS Persecution scale, the most discriminating item was still ‘I was distressed by being persecuted’ (a = 4.45, s.e. = 0.10), followed by ‘I was sure someone wanted to hurt me’ (a = 4.25, s.e. = 0.10). The difficulty parameters for a response of 0–1 (b 1) identified the same four items from the original measure as the most indicative of heightened severity at low levels of endorsement (>0.75 s.d. above average). This included ‘I was convinced there was a conspiracy against me’ (b 1 = 0.79, s.e. = 0.02), ‘I was sure someone wanted to hurt me’ (b 1 = 0.78, s.e. = 0.01), ‘I was distressed by being persecuted’ (b 1 = 0.77, s.e. = 0.01), and ‘People wanted me to feel threatened, so they stared at me’ (b 1 = 0.75, s.e. = 0.02). Similarly, for all 10 items, full endorsement (b 4: response 3–4) indicated a severe level of persecutory ideation (>1.50 s.d. above average).
The revised persecution scale retained excellent reliability across the spectrum of paranoia severity, with equivalent α values above 0.90 between 0.12 s.d. below and 2.38 s.d. above average levels of paranoia and values above 0.95 between 0.23 and 2.10 s.d. above average. Similar to the original scale, the revised persecution scale demonstrated the highest reliability at elevated levels of paranoia, with a maximum α of 0.97 (TI = 39.1, s.e. = 0.16) at 1.21 s.d. above average (see Fig. 1).
The total score from the revised 10-item persecution scale has increased precision compared to the original 16-item scale, with a correlation with the participant θ scores of r = 0.92. As with the original GPTS, the majority of people are unlikely to endorse the persecutory items with an expected score of 2.53 (range 0–40) at the average level of trait paranoia (θ = 0). The expected score was 7.46 at 0.5 s.d. above average, 16.8 at 1.0 s.d. above average, 27.7 at 1.5 s.d. above average, and 35.7 at 2 s.d. above average (see Fig. 1).
As shown in Table 2, our recommended cut-off for moderately severe levels of persecutory ideation on the revised persecution scale was 11 (>0.80 s.d. above average). ROC analysis identified 11 as the optimal cut-off (sensitivity = 0.928, 95% CI 0.900–0.953; specificity = 0.852, 95% CI 0.844–0.861) to discriminate patients with a persecutory delusion (n = 360) from the non-clinical group (n = 7297), with an overall AUC of 0.953 (95% CI 0.943–0.963).
The recommended cut-off for severe persecutory ideation and the likely presence of a persecutory delusion is 18, representing >1.10 s.d. above the average level of paranoia in the population. The ROC analysis demonstrates this cut-off is unlikely to identify incorrectly an individual as having a persecutory delusion when they do not (specificity = 0.93, 95% CI 0.93–0.94), while still correctly identifying the majority of patients with confirmed persecutory delusions (sensitivity = 0.81, 95% CI 0.77–0.85). As shown in Table 3, scores above this level were present in 81% (n = 293) of the patients with persecutory delusions (mean = 26.1, s.d. = 9.46).
Empirical research makes it apparent that within the diagnosis of schizophrenia are multiple distinct psychotic experiences, such as paranoia, grandiosity, hallucinations, anhedonia, and thought disorder (e.g. Peralta and Cuesta, Reference Peralta and Cuesta1999; Wigman et al., Reference Wigman, Vollebergh, Raaijmakers, Iedema, van Dorsselaer, Ormel, Verhulst and van Os2011; Peralta et al., Reference Peralta, Moreno-Izco, Calvo-Barrena and Cuesta2013; Paolini et al., Reference Paolini, Moretti and Compton2016). Each of these distinct experiences is on quantitative dimensions in the general population (e.g. Zavos et al., Reference Zavos, Freeman, Haworth, McGuire, Plomin, Cardno and Ronald2014; Elahi et al., Reference Elahi, Perez Algorta, Varese, McIntyre and Bentall2017), just as has been found for common emotional disorders (Plomin et al., Reference Plomin, Haworth and Davis2009). Precision in the measurement of each psychotic experience is needed. Our particular focus has been on paranoid thinking. Based upon a definition that persecutory ideation concerns unfounded thoughts that others deliberately intend you harm (Freeman and Garety, Reference Freeman and Garety2000), the GPTS-Part B scale was developed (Green et al., Reference Green, Freeman, Kuipers, Bebbington, Fowler, Dunn and Garety2008). This was accompanied by the GPTS-Part A, which assesses the related, but less severe phenomena, of ideas of reference. With data from 10 000 people, including over 2000 patients with psychosis, we provide a comprehensive examination of the psychometric properties of the GPTS.
We show that the original GPTS-Persecution scale (Part B) is an adequate assessment of persecutory ideation, with good reliability across the spectrum of paranoia. However, there is a potential for measurement error due to the covariance between several of the items. The original GPTS Reference scale (Part A) stands up less well to the testing. It contains problematic items that are not fully separable from the persecutory ideation scale. We therefore do not recommend this as a stand-alone scale. To overcome these problems, we created a Revised GPTS with stand-alone assessments of persecution ideation and ideas of social reference. Both revised scales have excellent psychometric properties, with high reliability across both non-clinical and clinical levels of paranoia. Importantly, the R-GPTS Persecution scale is most reliable at the severe end of the paranoia spectrum, making it a helpful clinical tool. Future use of the Revised GPTS will produce more precise estimates of the presence of paranoia.
Although we conceive paranoia as having a spectrum of severity in the general population, it is still valuable to ask: what are high and low paranoia levels? When studying paranoia in analogue non-clinical populations, it will be very informative for researchers to specify the level of severity of the phenomenon that is being examined. It will also be beneficial in clinical research to identify the potential presence of persecutory delusions. Our interpretative score ranges will allow this to happen for the first time. Our patient group included several hundred people who were selected for studies on the basis of having a persecutory delusion, enabling precise cut-offs to be identified. Use of the R-GPTS will not only provide more precise estimates of paranoia but will enable better interpretation of the scores.
Where are the weaknesses in the current evaluation? Although our extensive sample includes participants likely representing the full spectrum of paranoia severity, it is important to note that our general population sample was not an epidemiologically representative cohort, which could skew the severity ranges. There is clearly scope for future improvement in the understanding of the total scores for the measure. We also do not report the test-retest reliability of the R-GPTS, although in all likelihood this will remain as high as the original measure. Further, it remains an issue that we cannot know how much of the item endorsement may reflect genuine hostility rather than unfounded paranoia. Nevertheless, we believe the R-GPTS will provide further stimulus for the successful study of paranoia.
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291719003155.
Daniel Freeman is supported by an NIHR Research Professorship (NIHR-RP-2014-05-003). The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, or the Department of Health. The data were collected in studies funded by: the NIHR Research Professorship; the Efficacy and Mechanism Evaluation (EME) Programme (WIT project; 09/160/06), which is funded by the UK Medical Research Council (MRC) and managed by the UK NHS National Institute for Health Research (NIHR) on behalf of the MRC-NIHR partnership; the Medical Research Council Developmental Pathway Funding Scheme (Immersive Virtual Reality Cognitive Treatment (VRCT) for persecutory delusions; MR/P02629X/1); a strategic award from the Wellcome Trust (University of Oxford Sleep and Circadian Neuroscience Institute; 098461/Z/12/Z); and a UK Medical Research Council (MRC) Senior Clinical Fellowship (G0902308) awarded to DF. A number of studies have also been supported by the NIHR Oxford Health Biomedical Research Centre.
Appendix: The Revised Green et al., Paranoid Thoughts Scale (R-GPTS)
Please read each of the statements carefully.
They refer to thoughts and feelings you may have had about others over the last month.