Neural correlates of the misattribution of speech in schizophrenia



The neurocognitive basis of auditory verbal hallucinations is unclear.


To investigate whether people with a history of such hallucinations would misattribute their own speech as external and show differential activation in brain areas implicated in hallucinations compared with people without such hallucinations.


Participants underwent functional magnetic resonance imaging (fMRI) while listening to pre-recorded words. The source (self/non-self) and acoustic quality (undistorted/distorted) were varied across trials. Participants indicated whether the speech they heard was their own or that of another person. Twenty people with schizophrenia (auditory verbal hallucinations n=10, no hallucinations n=10) and healthy controls (n=11) were tested.


The hallucinator group made more external misattributions and showed altered activation in the superior temporal gyrus and anterior cingulate compared with both other groups.


The misidentification of self-generated speech in patients with auditory verbal hallucinations is associated with functional abnormalities in the anterior cingulate and left temporal cortex. This may be related to impairment in the explicit evaluation of ambiguous auditory verbal stimuli.


Declaration of interest


Auditory verbal hallucinations are a cardinal feature of schizophrenia but their neurocognitive basis is unclear. Theoretical accounts proposed that such hallucinations result from a breakdown in the monitoring of the intention to generate inner speech, through a loss of the ‘efference copy’ associated with the generation of verbal material. This efference copy serves to inform an internal monitor of forthcoming action and may thus help to distinguish self-generated from externally generated verbal material (Blakemore et al, 2002). In the absence of this signal, inner speech may thus be misidentified as ‘alien’ and perceived as externally generated voices (Feinberg, 1978; Frith & Done, 1988). Hallucinations have therefore been conceptualised as resulting from a breakdown in the systems monitoring the current intention to make actions (Frith & Done, 1988).

However, monitoring can also occur at the level of the conscious evaluation of the verbal output (Levelt, 1983) when speakers hear their own voice. Impairment at this level may also lead to the misattribution of self-generated speech. When patients with schizophrenia who are prone to auditory verbal hallucinations speak and hear an acoustically distorted version of their own voice, they tend to misidentify their own speech as being that of somebody else (Johns & McGuire, 1999; Fu et al, 2001; Johns et al, 2001). Although this impairment is consistent with a loss of efference copy, it could equally result from a problem with the conscious evaluation of auditory verbal feedback (Allen et al, 2004).

The purpose of our study was to use functional magnetic resonance imaging (fMRI) to examine the brain regions involved in the conscious appraisal of speech in people with schizophrenia who were and were not prone to auditory verbal hallucinations. The subjective experience of these hallucinations in schizophrenia is associated with activation in the inferior frontal, anterior cingulate and temporal cortex (McGuire et al, 1993; Shergill et al, 2000b). Furthermore, the processing of verbal material in people who are prone to such hallucinations has been associated with differential engagement of these regions relative to people with schizophrenia who do not experience hallucinations and controls (McGuire et al, 1995; Shergill et al, 2003), particularly in the temporal cortex (Fu et al, 2001). We tested the hypothesis that in people with auditory verbal hallucinations the appraisal of speech would be associated with the differential engagement of temporal, prefrontal and anterior cingulate cortices. More specifically, we tested the prediction that external misattributions in people with these hallucinations would be associated with altered activation of the temporal cortices.



All participants were right-handed men who spoke English as their first language and had no history of hearing problems. The study had local research ethics committee approval and all participants gave informed consent.

Control group

A control group of 11 healthy volunteers was recruited from the local community through advertisements. Applicants with a history of medical or psychiatric disorder, a drug or alcohol use problem, a family history of psychiatric disorder, or who were receiving medication were excluded. Their mean age was 28 years and their mean IQ, estimated with the National Adult Reading Test (NART; Nelson & O'Connell, 1978), was 115 (see Table 1).

Table 1 Group demographic and clinical characteristics.

Variable | Control group (n=11), mean (s.d.) | Non-hallucinator group (n=10), mean (s.d.) | Hallucinator group (n=10), mean (s.d.) | Group comparisons
Age, years | 29.21 (4.26) | 34.78 (11.4) | 34.83 (6.88) | NS
Education, years | 14.34 (3.2) | 12.3 (1.64) | 11.7 (1.41) | NS
Premorbid IQ score | 115 (5.78) | 99 (8.56) | 100 (7.42) | F=16.9, P<0.001
Age at first onset, years | | 21.31 (5.63) | 22.5 (5.13) | NS
Duration of illness, years | | 16.32 (12.42) | 12.33 (9.35) | NS
SAPS scores
    AVH | | 0 | 4.47 (0.74) | U=0, P<0.001
    Other hallucinations | | 0 | 0.82 (0.32) | NS
    Delusions | | 4.15 (1.37) | 4.41 (0.78) | NS
    Formal thought disorder | | 1.57 (1.15) | 0.95 (0.42) | NS
    Bizarre behaviour | | 0.73 | 0.55 | NS
    Total score1 | | 6.38 (2.82) | 10.21 (1.40) | U=10.5, P=0.004
SANS scores
    Total score2 | | 6.75 (5.51) | 6.70 (3.82) | NS
    Attentional problems | | 1.83 (1.25) | 1.5 (1.05) | NS
Antipsychotic medication
    Typical:atypical, n:n | | 3:7 | 4:6 | χ2=0.11, P=0.73
Depression (CDS score) | | 5.51 (6.77) | 8.00 (7.22) | NS

Patient groups

All patients met DSM–IV criteria for schizophrenia (American Psychiatric Association, 1994) and were recruited through the South London and Maudsley National Health Service Trust. Clinical teams were systematically contacted with a request to identify patients with schizophrenia who either had prominent and current auditory verbal hallucinations, or had no current or previous history of such hallucinations. This information was corroborated by careful review of the patients’ clinical records. Potentially eligible patients were then approached by the investigators and assessed using the Scale for the Assessment of Positive Symptoms (SAPS; Andreasen, 1984a), the Scale for the Assessment of Negative Symptoms (SANS; Andreasen, 1984b), the Calgary Depression Scale (Addington et al, 1990) and the NART.

The hallucinator group (n=10) comprised patients who scored ≥3 on the SAPS auditory hallucination item (clear evidence of voices and that they had occurred in the past week). All of these patients had a documented history of auditory verbal hallucinations. Patients in this group were also experiencing other positive symptoms, particularly delusions, and had low levels of negative symptoms (see Table 1). Nine of this group were in hospital at the time of testing and one was receiving out-patient treatment. None reported hallucinations during the fMRI scanning procedure.

The non-hallucinator group (n=10) was composed of patients who were not experiencing auditory verbal hallucinations at the time of testing and had no previous history of such hallucinations. This was assessed by detailed inspection of the patients’ notes, and consultation with clinical staff. Patients with any history of such hallucinations were excluded. Patients in this group had positive symptoms other than hallucinations – particularly delusions (see Table 1). Eight of these patients were in hospital at the time of testing and two were receiving out-patient treatment.

Exclusion criteria for both patient groups included the presence of an Axis II DSM–IV diagnosis or another Axis I diagnosis, a neurological disorder or a history of substance or alcohol misuse. Patients with an IQ below 80 were also excluded. All patients had been receiving regular doses of antipsychotic medication for at least 1 month prior to testing. Potential participants who reported a history of hearing problems were excluded. The healthy volunteers had a higher premorbid IQ than either patient group; the IQ score was therefore included as a covariate in the between-group analyses.


Word lists

Eighty adjectives applicable to people were used (e.g. ‘perfect’, ‘tall’). All the words were monosyllabic or bisyllabic with a Thorndike–Lorge frequency greater than 50 (Gilhooly & Logie, 1980), and were selected from lists used in a previous study (McGuire et al, 1996). The emotional valence of these words had previously been rated by 40 healthy volunteers as either negative, positive or neutral (Johns et al, 2001). Thus the 80 words used consisted of 27 positive, 27 negative and 26 neutral words. The sets of words presented in each condition were balanced for the number of syllables (i.e. equal amounts of one and two syllable words), word frequency and valence (equal amounts of positive, negative and neutral words).

Auditory stimuli

The participants’ speech was recorded on Cool Edit 2000 for Windows, which allowed the recordings to be normalised, pitch-shifted and edited into 80 individual wave files. A pitch shift of –4 semitones was used because it made the speaker's voice more difficult to recognise without making the speech incomprehensible. A male researcher who was unknown to the participants recorded the words for the non-self condition (40 words in total). A researcher was chosen who used English received pronunciation.
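As a point of reference, a pitch shift expressed in semitones corresponds to a fixed frequency ratio. The sketch below assumes twelve-tone equal temperament, in which each semitone is a factor of 2^(1/12); the function name and the 120 Hz example value are illustrative and are not taken from the study or from Cool Edit 2000.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift of `semitones` (12-tone equal temperament)."""
    return 2.0 ** (semitones / 12.0)


# Ratio for the -4 semitone shift used for the distorted conditions (roughly 0.79).
ratio = semitone_ratio(-4)

# Hypothetical example: a 120 Hz fundamental lowered by 4 semitones.
shifted_hz = 120.0 * ratio
```

Under this assumption the distorted recordings have every frequency component lowered to about four-fifths of its original value.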


A factorial design was used, with two levels of speech source (self, alien) and two levels of distortion (0, –4 semitones). Twenty words were presented in each of the four resulting speech conditions in the fMRI experiment (self undistorted, self distorted, alien undistorted, alien distorted). Words were presented in a non-self (alien) voice as well as in the participant's voice, to test whether any response bias was specific to self-generated words.
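The 2 × 2 design described above can be enumerated as a simple trial list. This is a minimal sketch for illustration only: the dictionary keys and function name are assumptions, and the actual allocation of words to conditions (matched for length, frequency and valence) is not reproduced here.

```python
from itertools import product

SOURCES = ("self", "alien")
DISTORTIONS = (0, -4)        # pitch shift in semitones
WORDS_PER_CONDITION = 20


def build_design():
    """Return one trial record per word slot in the 2 x 2 factorial design."""
    trials = []
    for source, shift in product(SOURCES, DISTORTIONS):
        for slot in range(WORDS_PER_CONDITION):
            trials.append({"source": source, "shift": shift, "slot": slot})
    return trials


design = build_design()
```

The list contains 80 trials in total, 40 per level of each factor.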


Patients underwent symptom assessment using the SAPS and SANS either the day before or on the day of the fMRI scan. Approximately 1 hour before scanning all participants were presented with a list of 80 words on a piece of paper and asked to read them aloud in a clear voice at a rate of approximately one word per second. Participants read all 80 words, even though half would subsequently be presented to them in another person's voice; this was to ensure that participants could not make judgements based on source information during the task. They were not asked to remember the words. Their speech was recorded by a computer. The experimenter then edited the recordings so that 40 of the words were replaced by a recording of the same word spoken in another person's voice, and 40 were pitch-shifted. The subsets of words that were replaced and pitch-shifted respectively were pre-designated (allocated so that the subsets were matched for word length, frequency and valence). The same subsets of words were used for all participants. Once participants had been placed in the scanner a standardised instruction script was read out to them. Participants were told to listen carefully to each word and make a decision regarding the source of the speech; they were able to register a response of either ‘self’, ‘unsure’ or ‘other’ by means of a button box. The option to register an unsure response was included to avoid participants having to make a forced choice between a self or alien source even when they were unsure.

Image acquisition

Images were acquired in a 1.5 T Magnet (Signa LX; GE, Milwaukee, Wisconsin, USA) using a compressed gradient echo (Edmister et al, 1999), echoplanar image acquisition (Hall et al, 1999), with a time to repetition (TR) of 1.2 s (0.8 s of silence), flip angle 80°, time to echo (TE) 40 ms, 64 × 64 pixels, field of view 200 mm, slice thickness 7 mm and interslice gap 0.7 mm (voxel size 3.125 mm × 3.125 mm × 7 mm); 482 image volumes were acquired in two runs of 6 min each. Of the 482 images 80 were experimental events (20 in each speech condition) and the remainder were rest (i.e. no auditory stimulus was presented). Each whole-brain volume consisted of 14 axial slices parallel to the anterior–posterior intercommissural line.
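The in-plane voxel size quoted above follows from the field of view and the acquisition matrix, and the number of rest volumes follows from the volume counts. The short sketch below only restates that arithmetic; the function name is illustrative.

```python
def in_plane_voxel_mm(field_of_view_mm: float, matrix_size: int) -> float:
    """In-plane voxel edge length: field of view divided by matrix size."""
    return field_of_view_mm / matrix_size


voxel_mm = in_plane_voxel_mm(200.0, 64)   # 200 mm over 64 pixels

# Volumes with no auditory stimulus: total volumes minus experimental events.
rest_volumes = 482 - 80
```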

Stimuli were presented in random order in an event-related design, with a variable interstimulus interval (4–12 s) following a non-gaussian random distribution (Poisson function peaking at 7 s) individually set for each condition (Dale, 1999). Image acquisition and stimulus presentation were synchronised by a transistor–transistor logic (TTL) pulse from the scanner to the computer used to present the stimuli and record the behaviour. The compressed acquisition permitted presentation of each word in the absence of acoustic scanner noise. Each response time was locked to the beginning of the word presentation.
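One way to generate jittered intervals of the kind described above is sketched below. It is an assumption-laden illustration, not the procedure used in the study: it draws Poisson variates with a mean of 7 s and redraws any value outside 4–12 s, whereas the text does not state how the range was enforced. The function name, seed and rejection step are all hypothetical choices.

```python
import math
import random


def sample_isi(rng: random.Random, mean_s: float = 7.0,
               low_s: int = 4, high_s: int = 12) -> int:
    """Draw a Poisson-distributed interval, redrawing until it lies in [low_s, high_s]."""
    threshold = math.exp(-mean_s)
    while True:
        # Knuth's multiplication method for a Poisson variate.
        count, running = 0, rng.random()
        while running > threshold:
            count += 1
            running *= rng.random()
        if low_s <= count <= high_s:
            return count


rng = random.Random(0)                              # fixed seed so the sketch is reproducible
intervals = [sample_isi(rng) for _ in range(80)]    # one interval per experimental event
```

Rejection sampling keeps the shape of the distribution within the allowed range at the cost of discarding out-of-range draws.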

Image analysis

Data were analysed with software developed at the Institute of Psychiatry, using a non-parametric approach. Data were first processed (Bullmore et al, 1999a) to minimise motion-related artefacts. Responses to the experimental paradigms were then detected by first convolving each component of the experimental design with each of two gamma variate functions (peak responses at 4 s and 8 s respectively). The best fit between the weighted sum of these convolutions and the time series at each voxel was computed using the constrained blood oxygen level dependent (BOLD) effect model suggested by Friman et al (2003). Following computation of the model fit, a goodness-of-fit statistic was computed. This consisted of the ratio of the sum of squares of deviations from the mean image intensity (over the whole time series) due to the model to the sum of squares of deviations due to the residuals (SSQ ratio). Following computation of the observed SSQ ratio at each voxel, the data were permuted by the wavelet-based method described and extensively characterised by Bullmore et al (2001). Using the resulting permutation distribution it is possible to calculate the critical value of SSQ ratio needed to threshold the maps at any desired type I error rate. The detection of activated voxels was extended from voxel to cluster level using the method described in detail by Bullmore et al (1999b). Events in the four experimental conditions (self, self distorted, alien and alien distorted speech) were contrasted against rest volumes for all participants.
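The goodness-of-fit statistic defined above can be written compactly. The sketch below computes only the SSQ ratio from an observed time series and an already fitted model; it does not implement the gamma variate convolution, the constrained fit of Friman et al (2003) or the wavelet-based permutation, and the function name and toy values are illustrative.

```python
def ssq_ratio(observed, fitted):
    """Model sum of squares divided by residual sum of squares for one voxel.

    Deviations due to the model are taken from the mean of the observed
    time series; residuals are observed minus fitted values.
    """
    mean = sum(observed) / len(observed)
    ssq_model = sum((f - mean) ** 2 for f in fitted)
    ssq_resid = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    return ssq_model / ssq_resid


# Toy time series (hypothetical values, for illustration only).
observed = [1.0, 3.0, 2.0, 4.0]
fitted = [1.5, 2.5, 2.5, 3.5]
ratio = ssq_ratio(observed, fitted)
```

A perfect fit gives a residual sum of squares of zero, so a practical implementation would need to guard against division by zero.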

Group mapping

The observed and permuted SSQ ratio maps for each individual, as well as the BOLD effect size maps, were transformed into the standard space of Talairach & Tournoux (1988) using the two-stage warping procedure described in detail by Brammer et al (1997). Group activation maps were computed by determining the median SSQ ratio at each voxel (over all individuals) in the observed and permuted data maps (medians are used to minimise outlier effects). Cluster-level maps were thresholded at less than one expected type I error cluster per brain. The computation of a standardised measure of effect SSQ ratio at the individual level, followed by analysis of the median SSQ ratio maps over all individuals, treats intra- and inter-individual variations in effect separately, constituting a mixed-effect approach to analysis which is deemed desirable in fMRI.
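The voxelwise median described above can be sketched as follows. Each individual map is represented here as a flat list of SSQ ratios in a common space; this representation, the function name and the numbers are assumptions made for illustration.

```python
from statistics import median


def median_map(individual_maps):
    """Voxelwise median across participants; each map is a flat list of SSQ ratios."""
    return [median(values) for values in zip(*individual_maps)]


# Three hypothetical participants, three voxels each.
maps = [
    [0.8, 1.2, 0.1],
    [1.0, 1.1, 0.2],
    [9.0, 1.3, 0.3],   # outlying value at the first voxel
]
group = median_map(maps)
```

The outlying value of 9.0 at the first voxel leaves the group median at 1.0, which illustrates why medians are preferred here to means.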

Repeated-measures contrasts

The analysis was performed using the brain activation data from each participant under each condition. The permutation-based analysis was performed by first determining the median change across all participants and between participant treatments. The treatment labels were then permuted and the median change computed. The use of median statistics renders this analysis robust to outlier data in individual cases. The data were then analysed using a non-parametric repeated-measures analysis of covariance (Bullmore et al, 1999b). The experimental conditions were defined according to the source of the speech (self or alien) and the level of distortion (undistorted or distorted). The data were analysed using a series of non-parametric factorial analyses of variance (ANOVAs). We examined the main effects of speech source and distortion and their interactions with group. The effect of the emotional valence of the words on the fMRI data was not examined because it had no significant effect on behavioural results. To test for the interaction between the source of speech, level of distortion and group we examined the main effect of distortion on self speech and its interaction with group, and the main effect of distortion on alien speech and its interaction with group. To examine the neural correlates of the misattribution of speech, we analysed the main effect of the accuracy of attribution (correct responses or misattribution errors). Events were categorised as correct or misattributions according to each participant's behavioural response. Trials associated with unsure responses were excluded from this analysis. Maps of the difference in the effect size of the BOLD response associated with correct and incorrect attributions were generated. In this particular analysis the effect size statistic was used because the numbers of trials associated with correct and incorrect responses were not equal across conditions. The effect size statistic is relatively insensitive to differences in the number of responses per condition. Use of the effect size statistic also avoids the possibility that differences in BOLD response could reflect changes in the denominator of the statistic (noise) rather than signal, as can occur when using standardised statistics such as F, t or SSQ ratio. All between-group contrasts were covaried for NART premorbid IQ scores (using XBAM version 3.4;


The demographic and clinical characteristics of the participants are shown in Table 1.

Behavioural data

Analysis of variance was conducted for misattribution errors, defined as misidentifications of the source of the speech (i.e. an ‘other’ response when hearing their own speech or a ‘self’ response when hearing alien speech), excluding ‘unsure’ responses (Fig. 1). The data were analysed using an ANOVA for repeated measures.
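The scoring rule for misattribution errors stated above can be expressed directly. The sketch below is illustrative: the tuple layout and function name are assumptions, and the example responses are invented.

```python
def count_misattributions(trials):
    """Count misattribution errors in (source, response) pairs.

    An error is an 'other' response to self speech or a 'self' response to
    alien speech; 'unsure' responses are never counted as errors.
    """
    wrong = {("self", "other"), ("alien", "self")}
    return sum(1 for source, response in trials if (source, response) in wrong)


# Hypothetical responses from one participant.
trials = [
    ("self", "self"), ("self", "other"), ("alien", "self"),
    ("alien", "unsure"), ("self", "unsure"), ("alien", "other"),
]
errors = count_misattributions(trials)
```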

Analysis of variance

For misattribution errors the main effects for source (F=6.00, d.f.=1,28, P=0.02), distortion (F=12.36, d.f.=1,28, P=0.002) and group (F=6.18, d.f.=2,28, P=0.006) were all significant. As there was a significant between-group difference in NART scores this variable was used as a covariate. After the inclusion of this covariate the between-subjects effect for group remained significant (F=4.67, d.f.=2,28, P=0.02). There was a significant interaction between the effects of source of speech and group (F=3.50, d.f.=2,28, P=0.04). A post hoc one-way ANOVA revealed a significant group difference in the self speech condition (F=11.24, d.f.=2,30, P<0.001). A Bonferroni t-test showed that those in the hallucinator group made significantly more misattribution errors than the participants in both the non-hallucinator (P=0.001) and control groups (P=0.001). There was no significant group difference in either of the alien speech conditions (for alien undistorted speech, F=0.09, d.f.=2,29, P=0.91; for alien distorted speech, F=0.21, d.f.=2,29, P=0.13). The interaction between source, distortion and group was non-significant (F=1.16, d.f.=2,28, P=0.32). All main effects and interactions involving valence were also non-significant.

Imaging data: task-related activation independent of condition

Performance of the task across all conditions and all groups (independent of performance) was associated with bilateral activation in the inferior frontal, anterior cingulate and superior temporal gyri, the brain-stem and the cerebellum.

Source of speech and group interaction

The main effect of source of speech is presented in Table 2. There was a significant interaction between the source of speech and group in the left superior temporal gyrus (Fig. 2a,b). Examination of the SSQ ratios from this region revealed that both the control group and the non-hallucinator group showed greater activation when processing alien speech compared with self speech. However, in the hallucinator group the response in this area was similar for alien and for self speech.

Table 2 Main effects and group interactions for source of speech and level of distortion; all contrasts are reported at a clusterwise threshold of P=0.01 (less than one false positive cluster).

Cerebral region | Side | Coordinates1 (x, y, z) | Cluster size | BA
Source of speech
    Self > alien
        Inferior frontal gyrus | L | -29, 26, -2 | 50 | 47
        Anterior cingulate | R | 4, 26, 15 | 28 | 24
        Insula | L | -36, 19, 4 | 9 | 13
    Alien > self
        Lingual gyrus | R | 0, -78, -13 | 36 | 18
        Middle frontal gyrus | R | 43, 26, 15 | 13 | 46
        Cuneus | L | -11, -78, 15 | 10 | 18
        Fusiform gyrus | L | -33, -13, 8 | 9 | 20
        Superior temporal gyrus | R | -30, -7, 9 | 7 | 21
    Undistorted > distorted
        Middle temporal gyrus | L | -50, -29, -2 | 84 | 21
        Lingual gyrus | R | 1, -78, -12 | 83 | 18
        Middle frontal gyrus | R | 43, 15, 20 | 16 | 46
    Distorted > undistorted
        Inferior frontal gyrus | R | 26, -2, 10 | 89 | 47
        Cingulate gyrus | R | 11, 15, 31 | 37 | 32
        Insula | R | 32, 22, 4 | 15 | 13
        Inferior frontal gyrus | L | -43, 15, -7 | 35 | 47
    Source × group
        Superior temporal gyrus | L | -44, -22, -2 | 35 | 22
    Distortion × group
        Cingulate gyrus | L | -4, 26, 31 | 66 | 32

Fig. 1 Mean number of misattribution error trials according to condition and group.

Distortion and group interaction

The main effect of distortion is shown in Table 2. There was an interaction between the effects of distortion and group (Fig. 2a,c). In both the control group and the non-hallucinator group processing distorted relative to undistorted speech was associated with activation in the cingulate gyrus. In the hallucinator group the response in this region was unaffected by acoustic distortion (Table 2).

Effects of distortion on self and alien speech and group interactions

There were significant interactions between the effect of distortion on self speech and group in the left anterior cingulate and the right superior temporal gyrus (Fig. 3a,b; Table 3). In the cingulate gyrus both the control group and the non-hallucinator group showed greater activation when processing distorted v. undistorted self speech, whereas the opposite was true in the hallucinator group. In the right superior temporal gyrus the hallucinator group showed greater activation for distorted v. undistorted self speech; the converse was evident in the non-hallucinator group, and distortion had little effect on activation in the control group. The group interaction for the effect of distortion on alien speech was restricted to the right anterior cingulate gyrus (Table 3). In this region both the control group and the non-hallucinator group showed greater activation when processing alien speech that was distorted as opposed to undistorted. However, in the hallucinator group distortion had no effect on the level of activation in this region.

Table 3 Main effects and group interactions for the effects of distortion on both self and alien speech and analysis of response accuracy; all contrasts are reported at a clusterwise threshold of P=0.01 (less than one false positive cluster)

Cerebral region | Side | Coordinates (x, y, z) | Cluster size | BA
Effect of distortion on self speech × group
    Cingulate gyrus | L | -4, 22, 26 | 37 | 32
    Superior temporal gyrus | R | 51, -18, 4 | 54 | 22
Effect of distortion on alien speech × group
    Cingulate gyrus | R | 4, 30, 26 | 58 | 32
Response analysis
    Correct > misattribution
        Middle temporal gyrus | L | -50, -30, -7 | 175 | 21
        Middle temporal gyrus | R | -51, -13, 0 | 124 | 21
    Misattributions > correct | Null result
    Group interaction (all speech)
        Middle temporal gyrus | L | -50, -30, -2 | 123 | 21
    Group interaction in the self speech condition
        Middle temporal gyrus | L | -50, -30, -2 | 133 | 21
    Group interaction in the alien speech condition | Null result

Main effect and group interaction for correct v. misattributed responses

For all participants correct responses (regardless of speech source or the level of distortion) were associated with greater activation in the middle temporal gyrus bilaterally relative to misattributions. No area was more activated in association with misattributions than with correct responses. There was an interaction between response accuracy (correct/misattribution) and group in the left middle temporal gyrus. In both the control and non-hallucinator groups there was greater activation for correct responses (correct identification of either self or alien speech) than for misattributions, whereas there was no difference in the hallucinator group. In order to test our specific hypothesis about activation being associated with external (self to alien) misattributions, the analysis was then restricted to the self speech condition (i.e. the correct identification of self speech v. its misattribution to an external source). Again there was an interaction with group in the left middle temporal gyrus, with the same patterns of activation as described above (Fig. 3c, Table 3). When the effect of response accuracy was examined in the alien speech condition alone there was no significant interaction with group.


Our study used fMRI to study the neural correlates of making self/non-self judgements about the source of pre-recorded speech in the presence and absence of acoustic distortion. We examined the effects of speech source and of distortion in patients with auditory verbal hallucinations, patients without such hallucinations and controls. In addition, by using event-related fMRI we were able to categorise the neural response to each word according to the accuracy of the self/non-self attribution and thus examine the correlates of external misattributions.

A tendency for patients with hallucinations to misattribute their own distorted speech to an alien source was first demonstrated using a paradigm in which participants overtly articulated single words and heard what they said in real time (Johns & McGuire, 1999). We used the same paradigm, except that participants heard the words but did not speak. As in a recent study using this modified version of the task, we found that patients with auditory verbal hallucinations also made more external misattributions than both the non-hallucinator group and the control group (Allen et al, 2004), particularly when their speech was distorted (although this did not achieve statistical significance in our study). This may reflect a lack of power, as the number of trials per condition was limited by the practicalities of the fMRI experiment.

Overall, the task activated a network of inferior frontal, temporal and cingulate regions as well as areas in the brain-stem and cerebellum. This is consistent with data from previous studies of voice processing (Binder et al, 2000) and a study of the same task in healthy volunteers (Allen et al, 2005). Within this network, across all three groups there were regions that were more activated when participants processed self-generated speech compared with alien speech and vice versa. However, the hallucinator group differed from both controls and the non-hallucinator group in the effect of the source of the speech on activation in the left superior temporal gyrus. In this region both the reference groups showed increased activation when listening to alien speech compared with self speech, whereas the activation in the hallucinator group was relatively unaffected by the source of the speech. Activation during the task was also influenced by the acoustic distortion of the stimuli. Again, there were significant differences in the effects of distortion between the hallucinators and the other two groups. In the control and non-hallucinator groups distortion was associated with the engagement of the anterior cingulate gyrus, but this effect was absent in the hallucinator group.

The above data suggest that when patients who were prone to hallucinations evaluated speech, the left temporal cortex and the anterior cingulate were differentially responsive to its source and its acoustic quality respectively relative to the reference groups. These findings are consistent with our hypothesis and with data from previous studies that have implicated these regions in schizophrenia (Shapleske et al, 1999; Carter et al, 2001) and the pathophysiology of auditory verbal hallucinations (Suzuki et al, 1993; Shergill et al, 2000a).

The group differences in the effects of source on the left superior temporal activation suggest that this region is normally sensitive to whether speech has been self or externally generated, but that this sensitivity might be impaired in patients who are prone to auditory verbal hallucinations. Interestingly, a difference in BOLD signal for the perception of one's own actions, compared with the perception of the actions of another, has been reported in pre-motor areas (Grezes et al, 2004). This may be due to a closer match between simulated and perceived action for self-generated actions. Although our study involved the auditory modality it is possible that a similar mechanism applies to the perception of self speech and the speech of another. Functional differences in processing in the secondary auditory cortex are of particular interest, because an impairment in the ability to distinguish self-generated from external speech is fundamental to most cognitive models of auditory hallucinations (Frith & Done, 1988; Seal et al, 2004).

Fig. 2 Brain activation maps (a) and SSQ plots for (b) the interaction between the effects of source of speech and group in the left superior temporal gyrus and (c) the interaction between the effect of distortion and group in the left ACC (P=0.01, <1 false positive cluster). ACC, anterior cingulate cortex; SSQ, sum of squares; STG, superior temporal gyrus.

The group differences in the effects of distortion on activation in the dorsal part of the anterior cingulate cortex occurred regardless of the source of speech. The caudal portion of the anterior cingulate is implicated in directed attention, response monitoring and selection (Corbetta et al, 1991; Carter et al, 1998). Its activation in association with distortion may thus have reflected increased engagement of these processes in response to stimuli that become more difficult to perceive as a result of the pitch shift. The failure of patients with hallucinations to activate the anterior cingulate in the presence of distortion may thus reflect impairments in these cognitive processes. However, when the effect of distortion was restricted to self-generated speech an interaction with group was observed in the right superior temporal gyrus. In this region patients with hallucinations showed increased activation to distorted self-generated speech. The basis of the increased activation is unclear, but it could reflect altered modulation from other regions that are themselves differentially engaged in this group during this condition, such as the anterior cingulate. Furthermore, several studies have reported that patients with schizophrenia demonstrated relatively greater activation of the right temporal cortex (compared with the left) when listening to normal speech, and this may reflect a disruption in the left lateralisation of language function seen in right-handed individuals (Woodruff et al, 1997).

Information on the neural correlates of misattributions themselves was obtained by comparing activity associated with misattributions and correct responses. When participants in the hallucinator group made external misattributions (when processing their own speech) these were associated with activation in the left middle temporal gyrus, whereas in the control and non-hallucinator groups there was a greater left temporal response when participants correctly identified their own speech. This distinction between the groups was specific to external misattributions, as there was no group difference in activation when participants misidentified alien speech as their own (internal misattributions).

Fig. 3 (a) Brain activation map for the interaction between the effects of distortion on self speech and group (P=0.01, <1 false positive cluster). (b) SSQ plots for group interactions in the superior temporal gyrus and anterior cingulate gyrus; (c) brain activation map for group interactions with accuracy of response in the self speech condition in the left middle temporal gyrus (P=0.01; <1 false positive cluster); in the control and non-hallucinator groups misattributions were associated with less activation than correct responses, but the converse was true in the hallucinator group; (d) percentage signal change plots for group × accuracy interaction in the left superior temporal gyrus (SSQ, sum of squares).

Both the behavioural and neuroimaging results of our study are similar to those reported using a version of the task that involved participants articulating the words aloud (McGuire et al, 1996; Fu et al, 2001). Thus, in both cases, patients with hallucinations tended to make external misattributions when processing their own distorted speech, and this misattribution was associated with activation of the temporal cortex relative to the correct recognition of self-generated speech. The overall similarity of the results despite the absence of an efference copy component in this study suggests that the differences between the hallucinator group and the other groups might be related to an impairment in the evaluation of auditory verbal material, rather than defective corollary discharge. For example, patients with auditory verbal hallucinations usually have delusions, and delusions are associated with abnormalities of reasoning manifested as a tendency to ‘jump to conclusions’ (Garety et al, 1991). Indeed, recent behavioural work suggests that misattribution errors on verbal self-monitoring tasks may be related to delusions rather than to hallucinations (Johns et al, 2006). However, this finding was not replicated in our study.

The study has some limitations. First, although it focused on how biased judgements might contribute to the experience of externality, it does not explain how the events that are being judged occur in the first place. Contemporary models of hallucinations propose that they arise through the combination of the generation of anomalous experiences and problems in the appraisal of these experiences (Seal et al, 2004; Ditman & Kuperberg, 2005). The biased judgement of sensory material could also contribute to other symptoms, such as delusions: in this case faulty judgements might lead to the misinterpretation of external events such as other people's behaviour. The coincidence of auditory hallucinations and delusions in schizophrenia is consistent with these symptoms sharing cognitive mechanisms. Second, it is possible that attentional problems may contribute to the tendency to make misattribution errors. The patient groups did not differ on the SANS measure of attentional problems; however, a more rigorous assessment of attentional impairments would have helped to exclude this possibility. The attenuated anterior cingulate response observed in the hallucinator group may reflect problems in these domains. Furthermore, there are strong reciprocal connections between the anterior cingulate and temporal cortex (Petrides & Pandya, 1988). It is possible that the superior temporal gyrus response seen in the hallucinator group is associated with altered ‘top down’ modulation of this region by the anterior cingulate (Fletcher et al, 1999). Although the causation is speculative, it is possible that impaired anterior cingulate modulation of the temporal cortices is associated with making faulty source judgements about perceived speech. The functional integration between the cingulate and temporal cortices could be tested in future work examining the effective connectivity between these regions and how it is altered in patients with hallucinations.

In summary, external misattributions of speech in patients with hallucinations can occur independently of any self-monitoring deficit, suggesting that hallucinations may be related to problems with the conscious evaluation of verbal material rather than the breakdown of an ‘efferent copy’. This impairment was associated with the abnormal engagement of the temporal cortex along with the anterior cingulate. Although the study involved the evaluation of external rather than inner speech (which is more relevant to verbal hallucinations), it is possible that the same mechanisms are used to appraise internal and external speech.


Addington, D., Addington, J. & Schissel, B. (1990) A depression rating scale for schizophrenics. Schizophrenia Research, 3, 247–251.
Allen, P. P., Johns, L. C., Fu, C. H., et al (2004) Misattribution of external speech in patients with hallucinations and delusions. Schizophrenia Research, 69, 277–287.
Allen, P. P., Amaro, E., Fu, C. H., et al (2005) Neural correlates of the misattribution of self-generated speech. Human Brain Mapping, 26, 44–53.
American Psychiatric Association (1994) Diagnostic and Statistical Manual of Mental Disorders (4th edn) (DSM-IV). APA.
Andreasen, N. C. (1984a) Scale for the Assessment of Positive Symptoms (SAPS). University of Iowa.
Andreasen, N. C. (1984b) Scale for the Assessment of Negative Symptoms (SANS). University of Iowa.
Binder, J. R., Frost, J. A., Hammeke, T. A., et al (2000) Human temporal lobe activation by speech and nonspeech sounds. Cerebral Cortex, 10, 512–528.
Blakemore, S. J., Wolpert, D. M. & Frith, C. D. (2002) Abnormalities in the awareness of action. Trends in Cognitive Sciences, 6, 237–242.
Brammer, M. J., Bullmore, E. T., Simmons, A., et al (1997) Generic brain activation mapping in functional magnetic resonance imaging: a nonparametric approach. Magnetic Resonance Imaging, 15, 763–770.
Bullmore, E. T., Brammer, M. J., Rabe-Hesketh, S., et al (1999a) Methods for diagnosis and treatment of stimulus-correlated motion in generic brain activation studies using fMRI. Human Brain Mapping, 7, 38–48.
Bullmore, E. T., Suckling, J., Overmeyer, S., et al (1999b) Global, voxel, and cluster tests, by theory and permutation, for a difference between two groups of structural MR images of the brain. IEEE Transactions on Medical Imaging, 18, 32–42.
Bullmore, E., Long, C., Suckling, J., et al (2001) Colored noise and computational inference in neurophysiological (fMRI) time series analysis: resampling methods in time and wavelet domains. Human Brain Mapping, 12, 61–78.
Carter, C. S., Braver, T. S., Barch, D. M., et al (1998) Anterior cingulate cortex, error detection, and the online monitoring of performance. Science, 280, 747–749.
Carter, C. S., MacDonald, A. W., Ross, L. L., et al (2001) Anterior cingulate cortex activity and impaired self-monitoring of performance in patients with schizophrenia: an event-related fMRI study. American Journal of Psychiatry, 158, 1423–1428.
Corbetta, M., Miezin, F. M., Dobmeyer, S., et al (1991) Selective and divided attention during visual discriminations of shape, color, and speed: functional anatomy by positron emission tomography. Journal of Neuroscience, 11, 2383–2402.
Dale, A. M. (1999) Optimal experimental design for event-related fMRI. Human Brain Mapping, 8, 109–114.
Ditman, T. & Kuperberg, G. R. (2005) A source-monitoring account of auditory verbal hallucinations in patients with schizophrenia. Harvard Review of Psychiatry, 13, 280–299.
Edmister, W. B., Talavage, T. M., Ledden, P. J., et al (1999) Improved auditory cortex imaging using clustered volume acquisitions. Human Brain Mapping, 7, 89–97.
Feinberg, I. (1978) Efference copy and corollary discharge: implications for thinking and its disorders. Schizophrenia Bulletin, 4, 636–640.
Fletcher, P., McKenna, P. J., Friston, K. J., et al (1999) Abnormal cingulate modulation of fronto-temporal connectivity in schizophrenia. NeuroImage, 9, 337–342.
Friman, O., Borga, M., Lundberg, P., et al (2003) Adaptive analysis of fMRI data. NeuroImage, 19, 837–845.
Frith, C. D. & Done, D. J. (1988) Towards a neuropsychology of schizophrenia. British Journal of Psychiatry, 153, 437–443.
Fu, C. H. Y., Vythelingum, N., Andrew, C., et al (2001) Alien voices … who said that? Neural correlates of impaired verbal self-monitoring in schizophrenia. NeuroImage, 13, S1052.
Garety, P., Hemsley, D. & Wessely, S. (1991) Reasoning in deluded schizophrenics and paranoid patients: biases in performance on a probabilistic inference task. Journal of Nervous and Mental Disease, 179, 194–201.
Gilhooly, K. J. & Logie, R. H. (1980) Age of acquisition, imagery, concreteness, familiarity and ambiguity measures for 1,944 words. Behavior Research Methods, Instruments and Computers, 12, 365–377.
Grezes, J., Frith, C. D. & Passingham, R. E. (2004) Inferring false beliefs from the actions of oneself and others: an fMRI study. NeuroImage, 21, 744–750.
Hall, D. A., Haggard, M. P., Akeroyd, M. A., et al (1999) ‘Sparse’ temporal sampling in auditory fMRI. Human Brain Mapping, 7, 213–223.
Johns, L. C. & McGuire, P. K. (1999) Verbal self-monitoring and auditory hallucinations in schizophrenia. Lancet, 353, 469–470.
Johns, L. C., Rossell, S., Frith, C., et al (2001) Verbal self-monitoring and auditory verbal hallucinations in patients with schizophrenia. Psychological Medicine, 31, 705–715.
Johns, L. C., Gregg, L., Allen, P., et al (2006) Verbal self-monitoring and auditory verbal hallucinations in psychosis: symptom or syndrome specific? Psychological Medicine, 36, 465–474.
Levelt, W. J. (1983) Monitoring and self-repair in speech. Cognition, 14, 41–104.
McGuire, P. K., Shah, G. M. & Murray, R. M. (1993) Increased blood flow in Broca's area during auditory hallucinations in schizophrenia. Lancet, 342, 703–706.
McGuire, P. K., Silbersweig, D. A., Wright, I., et al (1995) Abnormal monitoring of inner speech: a physiological basis for auditory hallucinations. Lancet, 346, 596–600.
McGuire, P. K., Silbersweig, D. A. & Frith, C. D. (1996) Functional neuroanatomy of verbal self-monitoring. Brain, 119, 907–917.
Nelson, H. E. & O'Connell, A. (1978) Dementia: the estimation of premorbid intelligence levels using the New Adult Reading Test. Cortex, 14, 234–244.
Petrides, M. & Pandya, D. N. (1988) Association fiber pathways to the frontal cortex from the superior temporal region in the rhesus monkey. Journal of Comparative Neurology, 273, 52–66.
Seal, M. L., Aleman, A. & McGuire, P. K. (2004) Compelling imagery, unanticipated speech and deceptive memory: neurocognitive models of auditory verbal hallucinations in schizophrenia. Cognitive Neuropsychiatry, 9, 43–72.
Shapleske, J., Rossell, S. L., Woodruff, P. W., et al (1999) The planum temporale: a systematic, quantitative review of its structural, functional and clinical significance. Brain Research Reviews, 29, 26–49.
Shergill, S. S., Brammer, M. J., Williams, S. C., et al (2000a) Mapping auditory hallucinations in schizophrenia using functional magnetic resonance imaging. Archives of General Psychiatry, 57, 1033–1038.
Shergill, S. S., Bullmore, E., Simmons, A., et al (2000b) Functional anatomy of auditory verbal imagery in schizophrenic patients with auditory hallucinations. American Journal of Psychiatry, 157, 1691–1693.
Shergill, S. S., Brammer, M. J., Fukuda, R., et al (2003) Engagement of brain areas implicated in processing inner speech in people with auditory hallucinations. British Journal of Psychiatry, 182, 525–531.
Suzuki, M., Yuasa, S., Minabe, Y., et al (1993) Left superior temporal blood flow increases in schizophrenic and schizophreniform patients with auditory hallucination: a longitudinal case study using 123I-IMP SPECT. European Archives of Psychiatry and Clinical Neuroscience, 242, 257–261.
Talairach, J. & Tournoux, P. A. (1988) Co-Planar Stereotaxic Atlas of the Human Brain. Thieme.
Woodruff, P. W., Wright, I. C., Bullmore, E. T., et al (1997) Auditory hallucinations and the temporal cortical response to speech in schizophrenia: a functional magnetic resonance imaging study. American Journal of Psychiatry, 154, 1676–1682.