  • Research and Theory on Workplace Aggression
  • Online publication date: March 2017
  • pp 9-33


1 Measurement of Workplace Aggression

Steve M. Jex and Alison M. Bayne

Interpersonal mistreatment is a broad term that has been used to describe a myriad of negative employee behaviors within organizations that are harmful to employees, as well as to organizations as a whole (Cortina & Magley, 2003). Under this general umbrella of interpersonal mistreatment there are a number of constructs such as workplace incivility, workplace bullying, interpersonal conflict, social undermining, workplace deviance, and counterproductive work behavior. One of the biggest challenges for the interpersonal mistreatment literature has been to somehow distinguish these related constructs in a meaningful way; that is, at both conceptual and operational levels (see Hershcovis, 2011).

The most common ways of distinguishing among these forms of interpersonal mistreatment have been to look at differences in severity as well as intent to harm. If we use these two dimensions to distinguish different forms of interpersonal mistreatment, what emerges is an important sub-construct that most researchers have labeled workplace aggression. Specifically, workplace aggression represents forms of interpersonal mistreatment that are (1) relatively severe, and (2) where there is a clear intent on the part of the perpetrator to harm the victim of such behaviors.

This chapter examines and critiques five of the most frequently used measures of workplace aggression. These include the Interpersonal Conflict at Work Scale (ICAWS; Spector & Jex, 1998; 884 citations in the previous decade), the Counterproductive Work Behavior Checklist (Spector, Fox, Penney, Bruursema, Goh, & Kessler, 2006; 436 citations since publication), the Workplace Deviance Scale (Bennett & Robinson, 2000; 1,200 citations in the previous decade), the Negative Acts Questionnaire – Revised (NAQ-R; Einarsen, Hoel, & Notelaers, 2009; Einarsen, Raknes, Matthiesen, & Hellesøy, 1994; 390 citations since 2009 publication), and the Social Undermining Scale (Duffy, Ganster, & Pagon, 2002; 595 citations in the previous decade). If we apply the previously mentioned criteria of severity and intent, all of the aforementioned constructs would qualify as forms of workplace aggression; the one exception would be workplace incivility. This is because most forms of incivility (e.g., failing to return a phone call) are rather mild, and the intent behind uncivil behavior is often ambiguous. It is also worth noting that while three of the measures included in this review (the ICAWS, the NAQ-R, and the Social Undermining Scale) exclusively address workplace aggression, the other two scales (the Counterproductive Work Behavior Checklist and the Workplace Deviance Scale) assess workplace aggression in addition to other content, best described as counterproductive work behavior directed at the organization (e.g., tardiness, stealing supplies, etc.).

The focus of this chapter is to review and critique measures of the major forms of workplace aggression that are being studied by occupational health researchers. We chose to focus this review and critique on workplace aggression because there have been previous reviews that have focused on the measurement of workplace incivility (e.g., Jex, Burnfield-Geimer, Clark, Guidroz, & Yugo, 2010), and there have been few attempts to critique specific measures of any form of interpersonal mistreatment. We begin the chapter with a brief discussion of the general challenges associated with measuring workplace aggression, regardless of the specific measure used. We then focus specifically on five commonly used measures, and then discuss the problems that we identify as being common to all five of these measures. We conclude the chapter with some general suggestions to improve the measurement of workplace aggression.

The Challenges of Measuring Workplace Aggression

Regardless of the specific instrument used, measuring workplace aggression can be a challenging endeavor for researchers. One of the major reasons for this is the nature of the construct itself. Like many constructs in the organizational sciences, workplace aggression is largely subjective. In addition, as stated in the preceding section, one of the defining characteristics of workplace aggression is a clear intent to harm on the part of the person perpetrating the aggression. Intent to harm is relatively clear as a concept, but it is rather difficult to demonstrate in practice. The reality is that the only person who knows whether or not harm is intended is the person who is perpetrating the aggression.

Another major challenge in measuring workplace aggression, and most other forms of interpersonal mistreatment for that matter, is that respondents are often asked to recall behaviors that may have occurred several months or, in some cases, even more than a year ago. Long time frames are often used out of necessity, since base rates for many forms of workplace aggression are low. Nevertheless, respondents may have considerable difficulty remembering instances of workplace aggression that have occurred several months or years in the past.

A third major challenge in measuring workplace aggression is that some forms of workplace aggression are not observable to victims, and thus if measurement is done from the victim perspective (which is quite common), the level of workplace aggression would be underestimated. For example, it is quite possible for social undermining (Duffy et al., 2002) to occur without the victim of undermining being present or aware that he or she is being undermined. The same can be said for many forms of Counterproductive Work Behavior (CWB) such as theft, sabotage, or a fellow employee deliberately refusing to provide help (Spector et al., 2006).

A final challenge in measuring workplace aggression, as with all forms of interpersonal mistreatment, is that it can be measured from multiple perspectives. As stated earlier, the most common perspective used in measuring workplace aggression has been the victim (Duffy et al., 2002; Einarsen et al., 2009). An example item assessing interpersonal mistreatment from the victim’s perspective might ask how frequently a person has been shouted at in the previous six months (Einarsen et al., 2009). However, there are some forms of workplace aggression, most notably CWB, that are typically measured from the perpetrator perspective (Spector et al., 2006). An example item from the perpetrator perspective might ask whether a person has played a mean joke or prank on a coworker (Bennett & Robinson, 2000). In recent years there has also been some effort to measure mistreatment from the perspective of those who observe such behaviors being perpetrated within their organization. Such measurement might entail asking people about their reactions toward instigators and targets in an observed instance of mistreatment (e.g., Reich & Hershcovis, 2015).

What makes these multiple perspectives somewhat problematic, at least from a measurement perspective, is that very little workplace aggression research has attempted to triangulate measures from these different perspectives. One exception is in the area of CWB where it has been shown, via meta-analysis, that self-reports converge well with measures from other data sources such as supervisors or coworkers (Berry, Carpenter, & Barratt, 2012). Conversely, Spector, Dwyer, and Jex (1988) found relatively modest convergence (r = .30) between incumbent reports of the level of interpersonal conflict in their jobs and supervisor reports of incumbent interpersonal conflict in a sample of university-based clerical employees. This level of convergence is similar to that found in other studies that have attempted to triangulate self-report measures of incumbent job conditions with other data sources (e.g., Liu, Spector, & Jex, 2005; Spector & Jex, 1991). It highlights both the potential problems with viewing workplace aggression exclusively through one particular lens and, once again, the subjective nature of workplace aggression.
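The convergence figures discussed above are correlations between scores obtained from two sources on the same employees. As a minimal sketch of that computation, the following example calculates a Pearson correlation between self-reported and supervisor-reported conflict scores. The function name and all data values are hypothetical and chosen only for illustration; they are not taken from any of the studies cited.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one source has no variance")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical mean conflict scores (1-5 scale) for six employees,
# one list per data source, matched by position.
self_report = [1.50, 2.25, 1.00, 3.00, 2.00, 1.75]
supervisor_report = [2.00, 2.00, 1.25, 2.50, 1.50, 2.25]

convergence = pearson_r(self_report, supervisor_report)
print(f"self-supervisor convergence: r = {convergence:.2f}")
```

In practice the two lists must be matched employee by employee, which is one reason multi-source designs of this kind are comparatively rare.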

Measures of Workplace Aggression

Despite the considerable challenges associated with measuring workplace aggression, researchers need reliable, valid measures in order for research to proceed. Measures are also useful for Human Resource professionals who wish to assess the level of workplace aggression in their organizations prior to developing interventions. Over the past 25 years there have been a number of measures developed and used extensively in workplace aggression research. These include, in the approximate order in which they have appeared in the literature, the Interpersonal Conflict at Work Scale (ICAWS; Spector, 1987; Spector & Jex, 1998); the Negative Acts Questionnaire (Einarsen et al., 1994; Einarsen et al., 2009), which measures bullying; the Workplace Deviance Scale (Bennett & Robinson, 2000); the Social Undermining Scale (Duffy et al., 2002); and the Counterproductive Work Behavior Checklist (Spector et al., 2006). See Table 1.1 for a description and example items from each of the five scales.

The constructs measured by these five scales (interpersonal conflict, bullying, workplace deviance, social undermining, and counterproductive work behavior) all represent behaviors that are potentially harmful to organizations, and they are also largely interpersonal. Despite these similarities, however, there are also important differences. For example, these five constructs differ considerably in severity. Interpersonal conflict is generally considered to be the least severe, whereas bullying and social undermining are much higher in severity. The other major dimension along which these constructs differ is intent to harm. Bullying and social undermining involve a clear intent to harm the target of these behaviors, but intent to harm is often less clear for the other three behaviors.

Table 1.1. Summary and Examples of Workplace Mistreatment Scales

Scale: Interpersonal Conflict at Work Scale (Spector & Jex, 1998)
Example items:
“How often do you get into arguments with others at work?”
“How often do other people yell at you at work?”
“How often are people rude to you at work?”
Instructions and response scale: Scale: 1 to 5 (Never, Rarely, Sometimes, Very Often, Quite Often)

Scale: Negative Acts Questionnaire – Revised (Einarsen, Hoel, & Notelaers, 2009)
Example items:
“Being shouted at or being the target of spontaneous anger.”
“Excessive monitoring of your work.”
“Being exposed to an unmanageable workload.”
Instructions and response scale: Measures exposure to bullying within the last 6 months, with the response alternatives “Never,” “Now and then,” “Monthly,” “Weekly,” and “Daily.”

Scale: Workplace Deviance Scale (Bennett & Robinson, 2000)
Example items:
“Taken property from work without permission.”
“Repeated a rumor or gossip about your company.”
“Put little effort into your work.”
Instructions and response scale: Measures frequency of participation in behaviors over a set time period. Scale: from 1 (never) to 7 (daily)

Scale: Social Undermining Scale (Duffy, Ganster, & Pagon, 2002)
Example items:
How often has your supervisor intentionally ... “Undermined your effort to be successful on the job?”
How often has your coworker closest to you intentionally ... “Delayed work to make you look bad or slow you down?”
Instructions and response scale: Frequency of each undermining behavior from supervisor and closest coworker in the previous month. Scale: 1 (never), 2 (once or twice), 3 (about once a week), 4 (several times a week), 5 (almost every day), and 6 (everyday).

Scale: Counterproductive Work Behavior Checklist (CWB-C) (45-item) (Spector, Fox, Penney, Bruursema, Goh, & Kessler, 2006)
Example items:
“Purposely wasted your employer’s materials/supplies”
“Stayed home from work and said you were sick when you weren’t”
“Insulted or made fun of someone at work”
Instructions and response scale: How often have you done each of the following things on your present job? (Scale: 1 = never; 2 = once or twice; 3 = once or twice per month; 4 = once or twice per week; 5 = every day)

In this section we move away from general discussions of the issues surrounding the measurement of workplace aggression, and discuss each of these frequently used measures. For each measure we discuss the origin of the measure and provide information on its psychometric properties such as reliability and validity.

Interpersonal Conflict at Work Scale (ICAWS; Spector, 1987; Spector & Jex, 1998)

Interpersonal conflict, as defined by Spector and Jex (1998), is a general form of interpersonal mistreatment that ranges from minor disagreements between coworkers to physical assaults. These authors further note that interpersonal conflict may be overt (e.g., an employee intentionally being rude to a coworker) or covert (e.g., an employee spreading a damaging rumor about a coworker). Oftentimes in organizations, interpersonal conflict is tied to some specific work issue, such as how best to accomplish a particular task (Jehn, 1995). In other cases, however, interpersonal conflict may emerge due to personality conflict or dislike between employees.

Given our definition of workplace aggression provided at the beginning of the chapter, interpersonal conflict can be viewed as a somewhat “fringe” form of workplace aggression. On the one hand, there are instances in which conflicts between coworkers escalate to the point where the parties involved have a clear intent to harm each other. On the other hand, there are many other instances in which employees may disagree about something, yet there is no escalation to the point where there is any intent to harm. For example, two employees may like each other yet still disagree and argue over the best way to accomplish a work task (Jehn, 1995). This form of conflict has, in fact, been shown to enhance the performance of teams.

Scale description and history. The Interpersonal Conflict at Work Scale (ICAWS) is a four-item measure that was developed in 1987 (Spector, 1987) in order to measure what was then considered to be an emerging workplace stressor (Keenan & Newton, 1985). The “development” of the ICAWS, like many measures in the occupational stress literature, consisted of simply creating the items and using them in a particular study. Given this lack of systematic development of the ICAWS (and many other measures, we might add), it is difficult to evaluate the procedures used to develop this measure. Nevertheless, when one views the items on this measure (see Table 1.1), they certainly appear to have a high level of face validity.

Psychometric properties. Despite the fact that the ICAWS was not created using extensive, systematic scale development procedures (e.g., Hinkin, 1995), it has been used extensively in the occupational stress literature, so there is a good deal of psychometric data available. For example, Spector and Jex (1998) summarized the results of their own studies in which the ICAWS was used and found that across 13 samples the average internal consistency reliability estimate was .74. Bowling and Beehr (2006), in a much larger meta-analysis of workplace harassment, found that the average reliability of all harassment measures was .81. Since the majority of samples in this meta-analysis (k = 25) used the ICAWS, this again suggests that the internal consistency reliability of this measure is acceptable. Nevertheless, the ICAWS has been found to yield marginal Cronbach’s alphas (e.g., around .70). As a result, Beehr, Bowling, and Bennett (2010) added a fifth item in an effort to improve the internal consistency reliability of the ICAWS.
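To make the reliability figures above concrete, the following is a minimal sketch of how Cronbach's alpha is computed from item-level responses, together with the Spearman-Brown formula that is often used to project the effect of lengthening a scale. The response data are hypothetical, and the Spearman-Brown projection rests on the strong assumption that an added item is parallel to the existing ones; neither is drawn from the ICAWS studies cited.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("every respondent needs the same number (>= 2) of items")
    # Sample variance of each item (column) and of the summed scale score.
    item_variances = [variance(column) for column in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def spearman_brown(reliability, length_factor):
    """Projected reliability when a scale is lengthened by length_factor.

    Assumes the added items are parallel to the existing ones.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical responses from six employees to a four-item, 1-5 frequency scale.
responses = [
    [1, 2, 1, 1],
    [2, 2, 3, 2],
    [3, 3, 2, 4],
    [1, 1, 2, 1],
    [4, 3, 4, 5],
    [2, 3, 2, 2],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")

# Projection for going from four to five items, starting from an alpha of .74.
print(f"projected reliability = {spearman_brown(0.74, 5 / 4):.2f}")
```

Because alpha depends on the number of items as well as on how strongly they intercorrelate, a short scale can show a modest alpha even when its items are reasonably homogeneous.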

The meta-analyses by Spector and Jex (1998) and Bowling and Beehr (2006) both also presented considerable evidence bearing on the nomological validity of the ICAWS. For example, Spector and Jex (1998) reported that the ICAWS was positively related to workload (r = .20), organizational constraints (r = .44), role ambiguity (r = .29), and role conflict (r = .44), which suggests that these may be conditions that foster conflict or at least suggests that they co-occur with interpersonal conflict. Bowling and Beehr (2006) reported that the ICAWS was related to role conflict (r = .41), role ambiguity (r = .29), role overload (r = .22), and job autonomy (r = -.17).

Both of these meta-analyses also investigated the relationship between the ICAWS and a number of employee outcomes that would be predicted based on what is known about the effects of interpersonal mistreatment (e.g., Hershcovis & Barling, 2010). For example, Spector and Jex (1998) reported that the ICAWS was related to anxiety (r = .36), frustration (r = .32), depression (r = .32), job satisfaction (r = -.32), intent to quit (r = .40), and physical symptoms (r = .26). Bowling and Beehr (2006) reported similar values for all of these outcomes, supporting the notion that interpersonal conflict leads to psychological and physical strain.

Negative Acts Questionnaire – Revised (NAQ-R) (Einarsen et al., 1994; Einarsen et al., 2009)

The Negative Acts Questionnaire – Revised is a broad measure of workplace bullying. The construct of bullying covers multiple dimensions and defining characteristics, but is typically defined as any situation in which an employee is exposed to negative or aggressive behaviors at work that are intended to threaten, humiliate, intimidate, punish or frighten the target (Einarsen et al., 2009). Consistent with the definition of workplace aggression presented earlier in this chapter, bullying is characterized by both moderate to high severity and a clear intent to harm. However, bullying is further characterized by its frequency and duration: while employees may be periodic targets of isolated negative or aggressive behaviors at work, persistent exposure to such behaviors can have a greater negative effect on targets. Thus, while the definition of bullying is concerned with the nature of the bullying behavior, it is equally concerned with how long a target is exposed (duration) as well as the number of times a target is exposed (frequency).

Within workplace settings, a distinction has been made between person-related behaviors and work-related behaviors stemming from bullying (Einarsen, 1999). Person-related behaviors entail negative acts that target a person based on personal characteristics; rumors and slander are examples of such behavior. In contrast, work-related behaviors are negative acts that target a person based on their work, such as constant criticism or overloading a target with difficult or menial tasks. While the bullying construct as defined by Einarsen et al. (2009) is primarily concerned with negative acts of a psychological nature, more severe, physically aggressive behaviors such as physical intimidation or even violence are part of a wider range of aggressive acts seen in bullying cases (Leymann, 1990).

Another important aspect of the bullying construct is the idea of power imbalance between the perpetrator and victim (Niedl, 1996). In most cases, bullying is perpetrated by an employee with more power or influence than that of the target; in turn, this power imbalance makes it difficult for the target to defend themselves. This imbalance may exist naturally within the formal organizational structure in which the perpetrator and target are situated, or it may form naturally due to informal circumstances such as knowledge or tenure (Einarsen & Mikkelsen, 2003; Einarsen et al., 2009; Hutchinson, Vickers, Jackson, & Wilkes, 2006).

Scale description and history. The Negative Acts Questionnaire – Revised (NAQ-R) is a reliable and valid measure of workplace bullying, which comprises three related factors associated with person-related bullying, work-related bullying, and physically intimidating bullying. The scale can be used as a single-factor, two-factor (personal- or work-related), or three-factor measurement of bullying, enhancing its usefulness for measuring bullying in a variety of situations.

The NAQ-R is based on the original Negative Acts Questionnaire (NAQ; Einarsen & Raknes, 1997; Mikkelsen & Einarsen, 2001). The original 23-item measure had items describing both personal and work-related negative acts. However, the scale was originally developed for use in Nordic countries. When the items were translated into English, the new version had item-level issues with both face validity and cultural bias. The NAQ-R was developed in response to the weaknesses of the NAQ, and is intended as a comprehensive yet short scale that is valid, reliable, and adapted for use in Anglo-American cultures (Einarsen et al., 2009).

Scale development involved refinement of the original NAQ, as well as item generation and a focus group study that used 11 focus groups from different organizations within the United Kingdom (Hoel, Cooper, & Faragher, 2001). These efforts resulted in a 29-item version of the NAQ, which was subsequently reduced to the 22-item NAQ-R scale currently used (Einarsen et al., 2009; Hoel, Cooper, & Faragher, 2001, 2004). Given that item development for the NAQ-R was based on refinement of a previously validated measure and involved additional item generation and focus groups, the NAQ-R was developed fairly systematically; this makes the NAQ-R unique compared to many of the other aggression measures discussed in the chapter, such as the ICAWS, which was developed through the generation and testing of face-valid items.

Psychometric properties. Part of the goal in revising the NAQ was to improve the face validity and decrease the cultural bias of the items. Examining the final 22 items included in the NAQ-R, it appears these previous issues have been addressed. Reliability tests in the scale development paper (Einarsen et al., 2009) show that the measure has excellent internal consistency: Cronbach’s alpha for the 22 items was .90. Additionally, three measurement models were tested to show that the scale items loaded onto the appropriate conceptual dimensions. Analyses showed that the overall one-dimension model fit the data well, suggesting that the items can be interpreted together as an overall workplace bullying measure. However, the best fit was the three-dimension model, in which items load exclusively on a person-related bullying dimension, a work-related bullying dimension, and a dimension of physically intimidating acts. While the three-dimension model is a useful representation for the different types of workplace bullying within the NAQ-R, the three factors were found to be highly correlated, which suggests that the dimensions do not successfully discriminate between different types of bullying behaviors, and that the different types of bullying occur simultaneously.

In addition to the useful distinctions in the factor structure of the model, there is strong evidence for the nomological validity of the NAQ-R. In Einarsen et al.’s (2009) study, the scale was significantly correlated with many relevant measures, and all relationships were in the expected directions. Specifically, the NAQ-R was correlated with psychosomatic complaints (r = .41), sickness absenteeism (r = .13), self-ratings of recent work performance (r = -.22), and turnover considerations (r = .36), as well as the 12-item version of the General Health Questionnaire (GHQ-12), which asked about psychosomatic health complaints (r = .43). There were also significant, strong correlations between NAQ-R scores and measures of organizational climate (r = -.53), as well as ratings of leadership styles of one’s supervisor. The strongest correlations were between bullying and autocratic leadership (r = .52), as well as between bullying and a negative relationship experience with colleagues (r = .61), whereas the weakest were between the NAQ-R and organizational commitment (r = -.35), suggesting that the behaviors measured by this scale are more strongly associated with personal relationships than a person’s relationship with their organization.

The scale may also be used for categorizing targets by severity and persistence of their experience with bullying: latent class analysis revealed that the NAQ-R discriminates between different target groups of bullying, differentiating between both the nature and severity of exposure. This ability to discern different levels of severity and exposure also allows the scale to identify groups that face occasional aggression or incivility, rather than the persistent and severe incidents that characterize bullying.

We should note that the ability to classify scores using latent class analysis based on severity and duration is a particular strength of the NAQ-R. Whereas a wide range of severity is a potential criticism for other measures of workplace aggression, the NAQ-R provides useful classifications. Because the bullying construct captures a considerable range of workplace aggression-related behaviors, it makes sense that the scale would be broad in its range of severity.

Bennett and Robinson’s (2000) Workplace Deviance Scale

Robinson and Bennett (1995, p. 556) defined workplace deviance as “voluntary behavior that violates significant organizational norms, and in doing so, threatens the well-being of the organization, and its members, or both.” Given this definition, it is clear that many forms of workplace deviance also fit the definition of workplace aggression. As a result, many researchers use the two terms interchangeably.

According to Bennett and Robinson (2000), one of the major barriers to research on workplace deviance, at least at that time, was the lack of a psychometrically sound measure; thus, they decided to develop one, and their measure has become one of the most widely used among workplace aggression researchers. It has also been frequently used as a measure of the similar variable CWB.

Scale description and history. The scale development approach used by Bennett and Robinson was to develop a measure that distinguished between the targets of the deviant actions by employees. Using this framework as a general guide, they began by simply asking a sample of employees to describe two incidents in which someone at work did something that was deviant or wrong. These researchers also generated a number of items on their own. An initial pool of 113 items was reduced through expert analysis, item-total correlations, and item variances. Bennett and Robinson also conducted an exploratory factor analysis to verify that the items measured deviant behaviors aimed at either the organization or other people. The results of the study supported the factor structure of the measure, and reduced the number of items to 24 (16 items for Organizational Deviance and 8 items for Interpersonal Deviance). Bennett and Robinson also conducted a third study with a Confirmatory Factor Analysis (CFA) on the final items to assess the construct validity of the measure. Results of this study are discussed in the following section on the psychometric properties of this measure.
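Item reduction based on item-total correlations and item variances, as described above, can be sketched as follows. This is an illustration of the general technique only: the function names, the cutoff values, and the data are hypothetical, and nothing here reproduces the specific criteria Bennett and Robinson applied.

```python
from math import sqrt
from statistics import variance


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 when either list has no variance."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        return 0.0  # undefined correlation, treated as 0 for screening
    return cross / sqrt(ss_x * ss_y)


def screen_items(responses, min_item_total=0.30, min_variance=0.50):
    """Flag items to keep using corrected item-total r and item variance.

    The two cutoffs are hypothetical defaults, not published criteria.
    """
    n_items = len(responses[0])
    report = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        # "Corrected" total: sum of all other items, excluding item j itself.
        rest = [sum(row) - row[j] for row in responses]
        item_variance = variance(item)
        item_total_r = pearson_r(item, rest)
        keep = item_total_r >= min_item_total and item_variance >= min_variance
        report.append(
            {"item": j, "variance": item_variance,
             "item_total_r": item_total_r, "keep": keep}
        )
    return report


# Hypothetical responses: five respondents, three candidate items (1-7 scale).
# The third item never varies, so it carries no information about respondents.
pool = [
    [1, 1, 2],
    [2, 2, 2],
    [3, 3, 2],
    [4, 4, 2],
    [5, 5, 2],
]
for row in screen_items(pool):
    print(row)
```

Low-variance items matter especially for deviance measures, because behaviors with very low base rates are endorsed by few respondents and so contribute little to distinguishing among them.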

Psychometric properties. For the final version of the measure, Bennett and Robinson reported an internal consistency reliability of .81 for the Organizational Deviance subscale and .78 for the Interpersonal Deviance subscale. In addition, both exploratory and confirmatory factor analysis supported the two-factor structure. Other researchers who have used this measure in subsequent studies have reported internal consistency reliabilities comparable to these values. For example, Mitchell and Ambrose (2007) reported an internal consistency reliability estimate of .79 for the Organizational Deviance subscale and .82 for the Interpersonal Deviance subscale. The most recent reliability we could find comes from a study by Guay, Choi, Oh, Mitchell, Mount, and Shin (2016), where the internal consistency reliability was .91 for the Organizational Deviance subscale and .87 for the Interpersonal Deviance subscale. Thus, it appears that the internal consistency of this measure has been acceptable across many samples.

In Bennett and Robinson’s initial paper describing the development of their measure, they conducted a study in order to assess construct validity according to long-standing criteria (e.g., Campbell & Fiske, 1959) such as convergence, discrimination, and nomological validity. Based on the data collected in this study, they found reasonably strong support for construct validity. In terms of convergence, the Organizational Deviance subscale converged very well with Property Deviance (r = .59), Production Deviance (r = .70), Physical Withdrawal (r = .79), and Psychological Withdrawal (r = .65). The Interpersonal Deviance subscale, in contrast, only converged well with a measure of Antagonistic Work Behavior (r = .62).

Bennett and Robinson chose the variables from Farrell and Rusbult (1986) as examples of variables that were conceptually unrelated to organizational and interpersonal deviance in order to assess discriminant validity; these included Voice, Exit, and Loyalty. The only significant correlation that resulted was between Interpersonal Deviance and Loyalty (r = -.21), which largely supports discriminant validity.

In order to assess nomological validity, Bennett and Robinson investigated a number of potential causes of deviance (e.g., frustration, procedural justice, distributive justice, interactional justice, normlessness, and Machiavellianism), and investigated two dimensions of Organizational Citizenship Behavior (OCB; Courtesy and Conscientiousness) as potential effects of deviance. In terms of potential causes, Organizational Deviance was correlated only with Procedural Justice (r = -.32) and Machiavellianism (r = .26). Results for Interpersonal Deviance were much stronger, since it was correlated with all causes in the expected direction except for distributive justice. In terms of potential effects of deviance, both deviance subscales were negatively related to both dimensions of OCB as predicted.

Since Bennett and Robinson’s initial 2000 paper that describes the development of their measure, it has been used extensively, so there is considerable evidence bearing on its nomological validity (see Hershcovis, 2011; Hershcovis & Barling, 2010). The vast majority of studies that have used this measure have used it as an outcome, so there is much more evidence on correlations with potential antecedents, as compared to potential effects. In this regard, this measure has been consistently shown to be associated with workplace stressors and individual differences that have been associated with workplace deviance (e.g., Alexander, 2011; Ferris, Brown, Lian, & Keeping, 2009; Mitchell & Ambrose, 2007; Tepper, Henle, Lambert, Giacalone, & Duffy, 2008).

One form of validation that was not examined by Bennett and Robinson (2000), and that has been given scant attention in research using this measure, is convergence between incumbent reports and other data sources such as coworkers or supervisors. One study we located did suggest, however, that convergence between incumbent and coworker ratings was quite good. Specifically, Alexander (2011) examined convergence in a sample of 50 Human Resource professionals in which incumbent self-reports were matched with reports from a nominated coworker, and found that the convergence for Organizational Deviance was r = .53, while convergence for Interpersonal Deviance was r = .59. Although this represents a rather small sample of employees in one profession, it does suggest that employees are aware of the deviant behaviors of their coworkers.
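
To give a sense of how much precision a sample of this size affords, the sketch below computes approximate 95% confidence intervals for the two convergence correlations using Fisher's z transformation. It assumes, for illustration only, that n = 50 refers to the number of matched incumbent-coworker pairs; the helper name `fisher_ci` is hypothetical.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson r via Fisher's z."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                 # transform r to the z metric
    se = 1.0 / math.sqrt(n - 3)       # standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# r values as reported in the text; n = 50 matched pairs is an assumption
for label, r in [("Organizational Deviance", 0.53),
                 ("Interpersonal Deviance", 0.59)]:
    low, high = fisher_ci(r, n=50)
    print(f"{label}: r = {r:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

Under these assumptions the interval for Organizational Deviance runs from roughly .30 to .70, which is consistent with treating the result as encouraging but imprecise.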

Social Undermining Scale (Duffy et al., 2002)

The term “social undermining” was originally coined by Vinokur and van Ryn in 1993, although Rook (1984) was an early theorist who called for more researchers to pay attention to the negative or problematic aspects of social relations (Duffy et al., 2002). The first definition of the construct created by Vinokur and van Ryn (1993) included several aspects of behavior toward a target: negative affect; negative evaluation of the target in terms of his or her attributes, actions, and efforts; and/or actions that decrease attainment of goals (Vinokur, Price, & Caplan, 1996).

Subsequent research on social undermining was conducted mostly in nonwork settings, and was extended to the workplace in the scale development paper by Duffy and colleagues in 2002. For the purposes of their scale, social undermining was defined as “behavior intended to hinder, over time, the ability to establish and maintain positive interpersonal relationships, work-related success, and favorable reputation” (Duffy et al., 2002, p. 332). Like other workplace aggression constructs, social undermining is determined by the target’s perspective rather than that of the perpetrator.

The definition of social undermining has two key aspects. First, behavior is only considered undermining if it is perceived as undermining by the target. Behaviors that may be construed as negative social behaviors, such as making a rude comment or failing to help out a coworker, are only considered undermining if the target perceives them as such. If, for example, a target attributes a rude comment to the actor’s stressful personal life, they may not perceive it as undermining. Second, undermining behaviors are low-magnitude or insidious. More severe acts of aggression, such as physical violence, may negatively impact interpersonal relationships, but are not categorized as undermining behavior because of their direct nature and the immediate severity of their effects.

Undermining is further classified into several forms. First, the construct includes both direct actions and withholding. Direct undermining involves direct, visible actions, such as overtly rejecting a coworker. In turn, withholding is a more covert action that may involve behaviors such as withholding important information. Interestingly, the covert quality of withholding allows perpetrators to play negative behavior off as ambiguous, or unintentional. Because of the potentially ambiguous nature of withholding, this form of undermining is similar to workplace incivility (Andersson & Pearson, 1999).

Second, undermining can be classified as verbal or physical, including active and passive verbal undermining. Verbal undermining includes any vocal undermining, where active undermining is directed and public (such as putting down a target), and passive undermining is more ambiguous (such as giving a target the silent treatment). Physical undermining involves any physical action that affects a target’s personal relationships, success, or reputation, such as withholding needed materials or overloading a target with menial tasks.

Scale description and history. The Social Undermining Scale (Duffy et al., 2002) is a validated 26-item measure of social undermining developed for use in organizational settings. The scale was validated in the Republic of Slovenia, and was created using a combination of previous nonwork measures of social undermining, item writing, focus groups, and exploratory and confirmatory factor analyses. These procedures are in line with systematic scale development recommendations (e.g., Hinkin, 1995). The preliminary list of social undermining items was created from these processes, with a total of 72 items covering both coworker and supervisor undermining. After initial data collection, exploratory principal component analysis resulted in two strong 13-item factors, one for coworker and one for supervisor: the supervisor factor explained 33% of the variance, and the coworker factor explained 18% of the variance (Duffy et al., 2002).

Psychometric properties. In the scale development process, the results of a confirmatory factor analysis produced a two-factor solution that represented supervisor undermining and coworker undermining. However, it was also found that some items overlapped across the two facets; these included “hurt your feelings,” “let you know that they did not like something about you,” “insulted you,” and “gave you the silent treatment.” This suggests that supervisors and coworkers engage in some of the same undermining behaviors. Although reliabilities were not reported in the initial scale development paper, they have since been reported as up to α = .93 (Duffy, Scott, Shaw, Tepper, & Aquino, 2012). An examination of the items shows that they have good face validity and cover the construct domain, including items about direct social undermining and withholding, as well as verbal and physical undermining (Duffy et al., 2002).

Additionally, the social undermining scale is correlated with many work outcomes that could be predicted based on the negative outcomes of interpersonal mistreatment (e.g., Hershcovis & Barling, 2010). For instance, Duffy et al. (2002) reported that the measure is significantly related to active counterproductive behaviors (r = .30), passive counterproductive behaviors (r = .13), and somatic complaints (r = .13), which are all logical outcomes of the stress response associated with perceptions of social undermining.

Counterproductive Work Behavior Checklist (Spector et al., 2006)

Counterproductive Work Behavior (CWB) represents behaviors on the part of employees that run counter to the goals of their employing organization (Jex & Britt, 2014). Not all forms of CWB represent instances of workplace aggression, but many do, so many workplace aggression researchers have drawn from CWB measures. It is worth noting that most researchers consider CWB and workplace deviance to be identical constructs (Bennett & Robinson, 2000). One of the most widely used measures of CWB is the Counterproductive Work Behavior (CWB) Checklist, which was developed by Spector and colleagues.

Scale description and history. The beginnings of the CWB Checklist can be traced to a 1975 paper in which Spector investigated zero-order correlations between organizational frustration and behavioral reactions of employees (Spector, 1975a). Though not acknowledged in this paper, these items likely came from Spector’s doctoral dissertation, which involved the creation of a simulated organization where frustrating organizational conditions were created and participants’ reactions were observed (Spector, 1975b). The items, which originated from this initial research effort, were further refined over the years, primarily in research conducted by Spector and his students (e.g., Chen & Spector, 1992; Spector & Fox, 2002; Storms & Spector, 1987).

Although there are several forms of the CWB Checklist, we focus on the full 45-item measure along with the dimensions represented by these items. This measure consists of five distinct dimensions. Abuse represents CWBs that are directed at other people, such as rudeness, starting rumors, or making fun of others. Production deviance represents a form of CWB in which employees deliberately work slower than they are capable of, or otherwise work below their capabilities. Sabotage is a relatively serious form of CWB, which involves employees deliberately destroying company property or in some cases simply wasting materials. Theft as a form of CWB does not require a great deal of explanation: this is simply employees taking things from their employer that do not belong to them. However, as can be seen with the theft items, theft can also be from fellow employees and can also take the form of time. Withdrawal represents a fairly broad form of CWB, which involves avoidance of work by employees, and can take the form of leaving early, unexcused absences, or taking longer breaks than are allowed. As can be seen in Table 1.1, Spector and colleagues have used a time-based response scale that ranges from “Never” to as much as “Every day.”

Psychometric properties. Spector and colleagues have long maintained that the items on the CWB Checklist are formative rather than reflective indicators (Spector et al., 2006), and thus internal consistency reliability estimates are not appropriate. In essence, this means that there is no underlying latent construct that would necessarily cause the items on a given subscale to intercorrelate highly. Rather, the “construct” measured in each of these subscales is defined by the cumulative frequency with which the behaviors are performed by respondents. For example, with the dimension of sabotage there is no reason an employee who wastes his or her employer’s materials would necessarily also damage equipment or property. Rather, the sum total of responses to the sabotage items defines the construct.

Despite this argument, many internal consistency reliability estimates have been reported for the CWB Checklist over the years. Some researchers have calculated internal consistency estimates for a composite consisting of all 45 items of this measure (e.g., Sprung & Jex, 2012), others have collapsed the items into two broad factors representing CWBs directed at the organization versus those directed at people (e.g., Berry et al., 2012), while still others have calculated reliability estimates for each of the subscales representing each of the five different forms of CWB. In general, reliability estimates hovered around .70, although there was considerable variability. The largest we found was .99, which was for a composite consisting of all items on the scale (Sprung & Jex, 2012), while the smallest was .42 for the sabotage subscale (Spector et al., 2006). Generally speaking, reliability estimates were higher as items on this measure were collapsed to higher levels.
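
One plausible contributor to this pattern is purely arithmetic: holding the average inter-item correlation constant, alpha increases with the number of items. The sketch below illustrates this with the standardized (Spearman-Brown) form of alpha. The mean inter-item correlation of .20 and the item counts are hypothetical values chosen for illustration, not estimates taken from the CWB Checklist literature.

```python
def standardized_alpha(n_items, mean_r):
    """Standardized alpha from item count and mean inter-item correlation."""
    return (n_items * mean_r) / (1 + (n_items - 1) * mean_r)


mean_r = 0.20  # hypothetical average inter-item correlation
for n_items in (3, 10, 45):
    # Same item quality, different composite length
    print(n_items, round(standardized_alpha(n_items, mean_r), 2))
```

With these assumed values, a three-item composite yields an alpha in the low .40s while a 45-item composite exceeds .90, even though the items are no more strongly related to one another.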

In terms of validity, there have been two primary ways that this has been assessed with the CWB Checklist: (1) convergence between incumbent reports and other rating sources (e.g., supervisors or peers) and (2) patterns of correlations between different dimensions of the measure and different antecedents and outcomes. If one looks at convergence, there is reasonably strong support for the validity of the CWB Checklist. Specifically, Fox, Spector, Goh, and Bruursema (2007) reported a correlation of .47 between incumbent and coworker reports of CWBs directed at other people, while convergence was only .13 (n.s.) for CWBs directed at the organization. This difference in convergence is understandable, since CWBs directed at other people are often more visible than those directed at the organization.

If we assess the validity of the CWB Checklist based on the patterns of correlations with the measure, these findings also support its validity. More specifically, theoretical models of CWB (e.g., Fox & Spector, 1999) have proposed that negative emotions (e.g., frustration, anger) tend to be the most proximal causes of CWB, and negative emotions have indeed tended to show the strongest correlations with the measure (Berry et al., 2012). More distal causes, such as job stressors and injustice, tend to be more weakly correlated with the measure. In addition, individual differences that are proposed to be related to CWB, such as anger and negative affectivity, are also positively related to the CWB Checklist.

Finally, there is mixed evidence supporting the five dimensions that constitute the measure, even though most researchers have tended to collapse those subscales into a smaller number of dimensions, specifically CWBs directed at people and those directed at the organization. Marcus, Taylor, Hastings, Sturm, and Weigelt (2013) conducted a confirmatory factor analysis of the items on the CWB Checklist and did not find support for the five-factor structure proposed by Spector and colleagues. Rather, they found that the subscales on this measure loaded onto one general CWB factor. However, in defense of the five subscales, Spector et al. (2006) found support for unique, theoretically based predictions for each of the different forms of CWB, which suggests that distinguishing between these five different forms of CWB is meaningful.

Critique of Workplace Aggression Measures

Although each of the five measures of workplace aggression is unique, we believe that they share many similarities due to the constructs they are designed to measure. As a result, we chose to provide a critique that focuses on what we see as the major problems common to all five measures, rather than focusing specifically on each measure. Our purpose here is not to single out any of these measures for criticism, because all of them have been useful tools for workplace aggression researchers. Rather, our goal here is to highlight what we see as the most important problems that are common to these frequently used workplace aggression measures. The problems we identify in this section will also provide the basis for the final section of the chapter in which we provide ideas for improving these and other measures of workplace aggression.

One problem common to all five measures described in this chapter is that the behaviors described in the items are devoid of any situational context. That is to say, we do not know the circumstances under which the behaviors described on any of these measures occurred. However, this is vitally important information if we are to understand whether such behaviors are in fact workplace aggression. For example, if a person frequently gets into arguments with others at work, is excluded from work group activities, has their opinions ignored, or is otherwise treated rudely by coworkers, it may be due to the fact that his or her coworkers are acting aggressively. However, it could also be that the respondent is argumentative, rude, instigates arguments with fellow employees, and forces his or her opinions on others. If this is the case, then the behaviors reported are clearly still negative, but they take on a somewhat different meaning. The overall point is that for all of these measures, without any information other than whether or not the behavior occurred, it is impossible to say for sure whether the behavior is workplace aggression.

A second problem is that the items within each of these five measures represent considerable variability in severity. If one considers the items on the ICAWS, for example, we would argue that getting into arguments and experiencing rudeness are much milder forms of interpersonal mistreatment compared to being yelled at and having coworkers do nasty things at your expense. This is also a major issue with both the Bennett and Robinson (2000) and Spector et al. (2006) measures. Within these measures there are behaviors that clearly do not rise to the level of aggression (e.g., littering, taking a long break), mixed with items that clearly go beyond typical organization-based aggression (e.g., used an illegal drug, threatened someone at work with violence). Given this mixture of items, it is hard to interpret the composite score when the frequency of such items is added together.
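
The interpretive difficulty can be made concrete with a small, entirely hypothetical illustration. In the sketch below, the item labels loosely echo the examples above, but the respondents, the frequencies, and the split into "severe" items are invented for demonstration and do not reproduce the scoring of any published measure.

```python
# Hypothetical split of items by severity (an assumption for illustration)
severe_items = ["threatened someone with violence"]


def composite(freqs):
    """Unweighted frequency composite: the simple sum across all items."""
    return sum(freqs.values())


def severe_only(freqs):
    """Sum of frequencies for the items labeled severe above."""
    return sum(freqs[item] for item in severe_items)


respondent_a = {"littered": 3, "took a long break": 3,
                "threatened someone with violence": 0}
respondent_b = {"littered": 0, "took a long break": 1,
                "threatened someone with violence": 5}

for name, freqs in (("A", respondent_a), ("B", respondent_b)):
    print(name, "composite =", composite(freqs),
          "severe-only =", severe_only(freqs))
```

Both invented respondents receive the same composite score, yet only one of them reports any behavior that most definitions would treat as aggression.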

A third issue with all five measures is that they can potentially place fairly difficult memory demands on respondents. The time frames for these measures range from a low of 30 days (ICAWS) to as much as one year (Bennett & Robinson, 2000). Even though workplace aggression represents a relatively significant and sometimes even traumatic event for most people, estimating these behaviors even over a relatively short period of 30 days may be a challenge for many people. If a time frame of a year or more is used, this task may become extremely difficult, and respondents may ultimately fall back on relatively recent experiences in order to provide an estimate.

A final issue, and one that is loosely related to the issue of time frame, is that these five measures also vary in the response scales presented to respondents. However, all five use some type of frequency-based scale that is based either on specific frequencies (e.g., “Once a Month”) or on descriptive terms that typically range from “Never” to almost constant or “Daily.” Given the memory issues described earlier, we would argue that most respondents probably would not have a great deal of difficulty determining whether they had never experienced a behavior or whether exposure was at a constant level. Estimating frequencies in between these end points, however, may be more problematic. The other problem with such frequency-based scales is that the respondent is left to determine the meaning of each of the scale points. More specifically, terms such as “Rarely” (ICAWS), “Now and Then” (NAQ-R), and “Almost every day” (Social Undermining Scale) require varying degrees of interpretation on the part of the respondent.

Suggestions for Improvement

Based on our critique in the previous section, we see a number of ways that measures of workplace aggression could be improved. Perhaps the most obvious way would be making sure that the items on such measures reflect one construct, and one construct only. As we stated earlier, the items on these measures reflect differing levels of intensity and, as a result, reflect different constructs. Given the similarity in many of the constructs under the general workplace mistreatment umbrella (e.g., Hershcovis, 2011), it is not surprising that the internal consistencies of these measures are high; however, if a researcher is specifically aiming to measure workplace aggression, the items on such a measure should be more homogeneous. Furthermore, based on currently accepted definitions, behaviors should be at a reasonably high level of severity and the intent to harm should be unambiguous.

A second way all of these measures could be improved would be to ask respondents to recall the behaviors described in the measure over a relatively limited time period. As stated earlier, the shortest time period of the five measures was 30 days, while the longest was one year. While we believe that it may be reasonable for an employee to remember being in an argument with a coworker during the past 30 days or less, we find it much less plausible that an employee could remember such an incident after a year or more has passed. This problem could be addressed via the use of Experience Sampling Method (ESM) study designs where data are collected over a relatively short time frame (e.g., two weeks; Fisher & To, 2012). This places far fewer memory demands on respondents, although a potential drawback is that since workplace aggression is a low-base-rate behavior, this approach may lead to distributional problems. In the few ESM studies in which researchers have measured workplace aggression, however, this does not seem to be the case (e.g., Judge, Scott, & Ilies, 2006).

A third suggestion for improving these and other measures of workplace aggression is to make a greater effort to assess the context in which these behaviors occur. This could be done in a variety of ways. For example, in the instructions to such measures reference could be made to specific situations such as business meetings, specific types of tasks (e.g., transactions with customers), or types of people (e.g., employees who do the same jobs). This would help provide a much better understanding of the meaning of such behaviors, and perhaps more importantly, help provide a better overall understanding of the workplace aggression construct. It may be, for example, that some behaviors coming from fellow employees may be seen as workplace aggression, while the same behaviors coming from customers during a service transaction are simply viewed as “venting.”

A fourth and final suggestion, which does not apply specifically to the five measures discussed in this chapter, but that could be used to improve the measurement of workplace aggression, is to incorporate the use of qualitative measures when possible. Although qualitative measurement certainly does have some drawbacks (see Jex & Britt, 2014, for a discussion), it may be useful given the wide variety of settings in which workplace aggression occurs, since the existing measures might not reflect what employees actually experience. One way that qualitative measurement has already been used successfully in the workplace mistreatment literature is in having respondents describe an example of mistreatment and then asking them closed-ended questions about things such as severity, perceived motives behind the mistreatment, and respondents’ reactions (e.g., Crossley, 2009).

Summary and Conclusions

Although workplace aggression is relatively easy to define, developing reliable, valid measures of this construct is a challenge. The five measures we reviewed in this chapter have served researchers well, and more generally facilitated a much greater understanding of workplace aggression. Despite their usefulness, there are some problems with these measures, and many of those problems are common across measures. In this concluding section we summarize these common problems and offer suggestions for improving the measurement of workplace aggression.

Although the five measures reviewed all measured slightly different constructs, we believe there are three problematic issues that are common to all, which include (1) mixing together items of different levels of severity, (2) placing unrealistic memory demands on respondents, and (3) a lack of contextual information in the items. Each of these issues is discussed in turn here.

While there are many different forms of workplace mistreatment (e.g., Hershcovis, 2011), we would argue that workplace aggression is distinguishable, at least conceptually, based on both severity and intent to harm. If this distinction can be made at a conceptual level, it is certainly possible to create relatively pure measures of workplace aggression. As we indicated in describing these measures, most of these measures appear to mix incivility and aggression items together. If a researcher’s goal is to measure workplace aggression, we recommend eliminating items from measures that are more in line with “low-level” forms of mistreatment such as incivility.

For all of the measures reviewed, we also mentioned the issue of placing unrealistic memory demands on respondents. As we state in the chapter, there is no “magic number” with respect to the time frame respondents should be given. However, we think researchers should move toward shorter time frames to combat possible memory issues. Ideally, we would like to see workplace aggression researchers move toward the use of short-term ESM studies in order to ensure that memory biases do not severely impact their findings.

Finally, we pointed out that most of the items on these five measures do not contain information about the situational context in which the behaviors occur. In our opinion, this is an important omission, because situational context often determines whether or not a particular behavior is actually workplace aggression. We also point out that putting situational context into survey items is challenging, but there are methods by which it can be done. We would urge workplace aggression researchers to explore this in further efforts to refine their measures.

In summary, we believe that existing measures have greatly facilitated the study of workplace aggression. Scale development, however, is an ongoing process, so we would encourage workplace aggression researchers to focus more time on measurement issues in order to improve the overall quality of research in this area.


