
Evaluation of initial progress to implement Common Metrics across the NIH Clinical and Translational Science Awards (CTSA) Consortium

Published online by Cambridge University Press:  28 July 2020

Lisa C. Welch*
Affiliation:
Tufts Clinical and Translational Science Institute, Tufts University, Boston, MA, USA Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, MA, USA
Andrada Tomoaia-Cotisel
Affiliation:
RAND Corporation, Santa Monica, CA, USA
Farzad Noubary
Affiliation:
Tufts Clinical and Translational Science Institute, Tufts University, Boston, MA, USA Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, MA, USA
Hong Chang
Affiliation:
Tufts Clinical and Translational Science Institute, Tufts University, Boston, MA, USA Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, MA, USA
Peter Mendel
Affiliation:
RAND Corporation, Santa Monica, CA, USA
Anshu Parajulee
Affiliation:
Tufts Clinical and Translational Science Institute, Tufts University, Boston, MA, USA
Marguerite Fenwood-Hughes
Affiliation:
Tufts Clinical and Translational Science Institute, Tufts University, Boston, MA, USA
Jason M. Etchegaray
Affiliation:
RAND Corporation, Santa Monica, CA, USA
Nabeel Qureshi
Affiliation:
RAND Corporation, Santa Monica, CA, USA
Redonna Chandler
Affiliation:
The National Institute on Drug Abuse, National Institutes of Health, Bethesda, MD, USA
Harry P. Selker
Affiliation:
Tufts Clinical and Translational Science Institute, Tufts University, Boston, MA, USA Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, MA, USA
*
Address for correspondence: L. C. Welch, PhD, Tufts Medical Center, 800 Washington Street, Box 63, Boston, MA 02111, USA. Email: LWelch2@TuftsMedicalCenter.org

Abstract

Introduction:

The Clinical and Translational Science Awards (CTSA) Consortium, about 60 National Institutes of Health (NIH)-supported CTSA hubs at academic health care institutions nationwide, is charged with improving the clinical and translational research enterprise. Together with the NIH National Center for Advancing Translational Sciences (NCATS), the Consortium implemented Common Metrics and a shared performance improvement framework.

Methods:

Initial implementation across hubs was assessed using quantitative and qualitative methods over a 19-month period. The primary outcome was implementation of three Common Metrics and the performance improvement framework. Challenges and facilitators were elicited.

Results:

Among 59 hubs with data, all began implementing Common Metrics, but about one-third had completed all activities for three metrics within the study period. The vast majority of hubs computed metric results and undertook activities to understand performance. Differences in completion appeared in developing and carrying out performance improvement plans. Seven key factors affected progress: hub size and resources, hub prior experience with performance management, alignment of local context with needs of the Common Metrics implementation, hub authority in the local institutional structure, hub engagement (including CTSA Principal Investigator involvement), stakeholder engagement, and attending training and coaching.

Conclusions:

Implementing Common Metrics and performance improvement in a large network of research-focused organizations proved feasible but required substantial time and resources. Considerable heterogeneity across hubs in data systems, existing processes and personnel, organizational structures, and local priorities of home institutions created disparate experiences across hubs. Future metric-based performance management initiatives across heterogeneous local contexts should anticipate and account for these types of differences.

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Copyright
© The Association for Clinical and Translational Science 2020

Introduction

The National Institutes of Health (NIH) Clinical and Translational Science Awards (CTSA) Program, composed of about 60 CTSA hubs, is charged with growing and improving the nation’s clinical and translational research enterprise. The CTSA Consortium comprises academic health care institutions that deliver research services, provide education and training, and innovate improved processes and technologies to support clinical and translational research. A 2013 Institute of Medicine (IOM, now the National Academy of Medicine) report on the CTSA Consortium [1] recommended the institution of “common metrics” to assess and continuously improve activities at each hub and across the Consortium as a whole. In response, the NIH National Center for Advancing Translational Sciences (NCATS) and CTSA Consortium hubs implemented the Common Metrics Initiative, which comprised establishing standardized metrics and using them for metric-based performance management.

Performance management, intended to identify and act on opportunities to improve, has been implemented in a variety of related settings, including clinical care [2,3], research hospitals [4], nonprofit organizations [5], governmental organizations [6], and academic institutions [7]. There are fewer examples of implementing performance management across a network of organizations, especially in biomedical research. Federal public health programs implemented by loosely integrated networks of local organizations face three challenges in measuring performance: complex problems with long-term outcomes, decentralized organization of program delivery, and lack of consistent data [8]. This experience is informative because the decentralized organization of federal public health programs mirrors that of the CTSA Consortium. Although all CTSA hubs strive toward the same mission of catalyzing the clinical and translational research enterprise, each hub has autonomy to develop the approach and processes that are most effective in its local context. To our knowledge, the current paper reports the first evaluation of the implementation of shared metric-based performance management in a decentralized national network of health care research organizations.

Between June 2016 and December 2017, an implementation team from Tufts Clinical and Translational Science Institute (CTSI) led the rollout of three Common Metrics and the Results-Based Accountability performance improvement framework [9] across the CTSA Consortium in three waves of hubs, or implementation groups. As reported previously, implementation groups were used to manage training and coaching for a large number of hubs and were assigned based on hubs’ preferences [10]. The Common Metric topics focused on training scientists for careers in clinical and translational research, supporting efficiency by shortening Institutional Review Board (IRB) review duration, and ensuring that results from CTSA Consortium pilot studies are disseminated (Supplemental Table 1). The Tufts Implementation Program entailed training on the metrics and performance improvement framework and seven small-group coaching sessions held every other week.

A separate Tufts CTSI team conducted a mixed methods evaluation to assess initial progress with Common Metrics. A 19-month follow-up period, ending in January 2018, was intended to provide sufficient time for hubs to become oriented to the Common Metrics, incorporate the required activities into workflows, and implement performance improvement strategies. This report summarizes hubs’ progress during the follow-up period and the factors that affected that progress.

Methods

Research Design

We used an intervention mixed methods framework [11] to describe hubs’ progress and experiences implementing the Common Metrics and performance improvement framework. The posttest design integrated quantitative measures, open-ended written responses, and qualitative interview data to describe the level of implementation hubs achieved for the initial three Common Metrics and why full implementation was or was not achieved.

The primary evaluation outcome was implementation of the initial three Common Metrics and performance improvement framework for each metric. This outcome was measured quantitatively as the extent of completion of 13 activities, clustered into 5 distinct groups (Table 1). With input from the Tufts Common Metrics Implementation Team, we created a rubric with a point value for each activity. The sum of a hub’s points indicated the extent of completion of activities, regardless of the order of completing them. The activities were not weighted for relative difficulty, effort, or time required because hub experiences varied.

Table 1. Implementation of Common Metrics and performance improvement activities: definition and point assignments

* Activities did not have to be conducted sequentially.

** Each distinct activity was assigned 1.0 points. For pairs of related activities (e.g., involving different types of stakeholders when specifying underlying reasons), each part of the pair received 0.5 points to equal 1.0.
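
To make the scoring concrete, the following is a minimal sketch of how a hub’s primary-outcome score could be computed from such a rubric. The activity names and the particular 0.5-point pair shown are hypothetical placeholders, not the study’s wording; the actual 13 activities and their point assignments are those defined in Table 1 and Supplemental Table 3.

```python
# Minimal sketch of the primary-outcome scoring described above.
# The rubric is supplied as {activity: points}; activity names below are
# hypothetical placeholders, not the study's actual wording.
from typing import Dict, Set


def metric_score(rubric: Dict[str, float], completed: Set[str]) -> float:
    """Sum rubric points for the activities a hub completed on one metric."""
    return sum(points for activity, points in rubric.items() if activity in completed)


def hub_total(rubric: Dict[str, float], completed_by_metric: Dict[str, Set[str]]) -> float:
    """Primary outcome: points summed across the three Common Metrics (0-30 possible)."""
    return sum(metric_score(rubric, done) for done in completed_by_metric.values())


# Toy rubric: two distinct activities (1.0 each) plus one pair of related
# activities split into 0.5-point halves, mirroring the footnote to Table 1.
example_rubric = {
    "compute_metric_result": 1.0,
    "forecast_or_compare_result": 1.0,
    "specify_reasons_with_internal_stakeholders": 0.5,
    "specify_reasons_with_external_stakeholders": 0.5,
}

completed = {
    "Careers": {"compute_metric_result", "forecast_or_compare_result"},
    "IRB": {"compute_metric_result", "specify_reasons_with_internal_stakeholders"},
    "Pilots": {"compute_metric_result"},
}

print(hub_total(example_rubric, completed))  # 2.0 + 1.5 + 1.0 = 4.5
```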

To better understand lack of completion of each activity, we elicited reasons as open-text survey responses and conducted semi-structured interviews about contextual factors, challenges, and facilitators.

Data Collection

We collected data at various time points throughout the implementation period using four self-report surveys and a qualitative interview guide (Supplemental Table 2).

Surveys

Before starting the Tufts Common Metrics Implementation Program, participating hubs completed a cross-sectional survey about hub prior experience with metric data collection and performance improvement activities in the previous calendar year. These data were used to construct a composite measure of each hub’s prior experience with data-driven performance improvement.

Additionally, hubs completed a baseline and two follow-up surveys about progress on the 13 activities that composed the primary outcome (Supplemental Table 3). At the start (i.e., baseline), hubs were instructed to choose one of their local metrics that best exemplified how the hub had used metric data in the five months prior to starting the Common Metrics Implementation Program and to report on activities composing the primary outcome. We used these data to sample hubs for qualitative interviews (see below).

Two follow-up surveys collected data regarding hub progress on the Common Metrics. At the end of the implementation program’s coaching period, hubs were instructed to choose one Common Metric that best exemplified the hub’s use of metric data and the performance improvement framework as of that time and report progress on completing the 13 activities for that metric. The second follow-up survey was conducted 19 months after Implementation Group 1 began, which was 17.5 and 15 months after Implementation Groups 2 and 3 began, respectively. This survey recorded any additional performance improvement activities completed for the Common Metric reported on during the first follow-up survey, activities completed for the remaining two Common Metrics and related performance improvement efforts, and additional information about hub experiences.

Semi-structured interviews

The interview guide included open-ended questions and probes to elicit an in-depth understanding of challenges, facilitators, and contextual factors for implementing Common Metrics (Supplemental Table 4). The Context Matters Framework [12] was applied to capture five domains that might have influenced hubs’ experiences with Common Metrics implementation: (1) specific implementation setting, (2) wider organizational setting, (3) external environment, (4) implementation pathway, and (5) motivation for implementation.

The interview guide was adapted for three roles: the hub’s Principal Investigator, the Administrator/Executive Director (or another individual filling the role of Common Metrics “champion”), and an “Implementer” staff member knowledgeable about day-to-day implementation. We piloted each version of the guide during mock interviews with personnel from Tufts CTSI. After each interview, three qualitative team members debriefed and revised the interview guide as needed to clarify content and improve the flow of the interview.

One of the two qualitative team members conducted each interview by telephone. Interviews lasted between 20 and 60 minutes. Each participant was emailed an information sheet describing the study prior to the interview and provided verbal consent. Interviews were audio recorded and transcribed verbatim.

Interviewer training entailed mock interviews and debriefing. To ensure consistency, the two study interviewers listened to and discussed audio recordings of early interviews and more difficult interviews. During weekly meetings, three qualitative team members discussed study participants’ experiences with interview questions and, following procedures for qualitative interviewing, identified additional language to further facilitate future interviews.

Administrative data

Information on hub size and funding cohort was provided by NCATS and confirmed as current through publicly available sources when possible. Hub size was defined as total funding from NIH U, T, K, and/or R grant mechanisms for fiscal year 2015–2016. Hub funding cohort was calculated based on the year the hub was first funded.

Participants

Surveys

Sixty CTSA hubs were invited to participate in each survey via an email sent to one principal investigator per hub. The email instructed the recipient to assign one person to complete the survey with input from others across the hub. To maximize response rates, reminder emails were sent to the principal investigator. All hubs responded to the survey about prior experience and the baseline survey, 57 hubs (95%) responded to the first follow-up, and 59 (98%) responded to the second follow-up. Surveys were self-administered online using REDCap software [13].

Semi-structured interviews

Interviews were conducted with participants from a sample of 30 of the 57 hubs that responded to both the baseline and first follow-up surveys. The sampling plan sought balance primarily across hubs’ experiences with metric-based performance improvement and, secondarily, across other key hub characteristics. First, to ensure inclusion of hubs with a diversity of experiences with performance improvement, we created a matrix of hub scores on the study’s primary outcome (implementation of the three Common Metrics) at two time points: baseline (i.e., prior experience on a local metric) and the first follow-up survey (i.e., early progress on a Common Metric). Hub scores at each time point were trichotomized (minimal, moderate, or significant), yielding nine cells representing combinations of baseline experience and early implementation progress (Supplemental Table 5).

After sorting the 57 hubs into the matrix, we targeted 3 or 4 hubs within each cell to achieve a sample of 30 hubs. For cells with fewer than four hubs, all hubs were designated for inclusion. For cells with more than four hubs, we randomly sampled four hubs. We then reviewed the resulting sample to ensure balance across a range of hub characteristics (years of funding, total funding amount, region, implementation group, and number of hub implementation team members reported). Selected hubs that declined or did not respond to invitations to participate were replaced by randomly selecting another hub from the same cell, when available. If no additional hubs were available in the same cell, we recruited a hub from another cell that represented a change in baseline experience and early implementation progress, with the goal of maximizing insight into challenges and facilitators for changing scores.
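
A sketch of this sampling logic follows, under stated assumptions: hubs are held in a data frame with columns `baseline_score` and `followup_score` (hypothetical names), scores are trichotomized here by tertiles (the study’s actual cut points are given in Supplemental Table 5), and up to four hubs are drawn per cell.

```python
# Illustrative sketch of the 3 x 3 sampling matrix; column names, tertile-based
# cut points, and the fixed per-cell target are assumptions for this example.
import numpy as np
import pandas as pd


def trichotomize(scores: pd.Series) -> pd.Series:
    """Split scores into three levels (here by tertiles)."""
    return pd.qcut(scores, q=3, labels=["minimal", "moderate", "significant"])


def sample_hubs(hubs: pd.DataFrame, per_cell: int = 4, seed: int = 0) -> pd.DataFrame:
    """Draw up to `per_cell` hubs from each baseline-by-follow-up cell."""
    hubs = hubs.assign(
        baseline_level=trichotomize(hubs["baseline_score"]),
        followup_level=trichotomize(hubs["followup_score"]),
    )
    return (
        hubs.groupby(["baseline_level", "followup_level"], observed=True, group_keys=False)
        .apply(lambda cell: cell.sample(n=min(per_cell, len(cell)), random_state=seed))
    )


# Example with synthetic scores for 57 hubs.
rng = np.random.default_rng(0)
hubs = pd.DataFrame({
    "hub_id": range(57),
    "baseline_score": rng.uniform(0, 10, 57),
    "followup_score": rng.uniform(0, 10, 57),
})
print(sample_hubs(hubs).shape)
```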

Recruitment for qualitative interviews began by seeking agreement for participation from the hub’s principal investigator or designee administrator who nominated individuals in the other two roles addressed by the interview guide. If interviews for all three roles could not be scheduled, another hub was selected. A total of 90 interviews across 30 hubs were conducted.

Analytic Strategy

Quantitative and qualitative data were analyzed independently, and results were merged to develop a full description of hub experiences. Results from different data sources expanded our understanding by addressing different aspects of the experience (e.g., completion of activities vs. challenges and facilitators of that completion), and qualitative data provided insights to help explain associations identified in statistical analyses.

Statistical analyses

Hub characteristics were described overall and by implementation group using means and standard deviations for continuous variables and proportions for categorical variables. To assess differences in hub characteristics between implementation groups, we used t-tests for continuous data and chi-squared tests for categorical data. Similar numeric summaries were used to describe the frequencies of completion of activities. We also tested for differences in mean completion of activities for each Common Metric, using a linear mixed effects model with a hub-specific random intercept. Next, we fitted univariable (i.e., unadjusted) and multivariable (i.e., adjusted) linear regression models for the primary outcomes separately for each metric and for the overall sum. We included nine characteristics of hubs across three domains: hub basic attributes (hub size and initial funding cohort), previous experience with metric-based performance improvement, and participation in the Tufts Implementation Program. For the multivariable linear regression model, a stepwise variable selection procedure using Akaike information criterion (AIC) was performed, starting with a full model including all covariates and proceeding with both backward and forward selection.
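
The sketch below illustrates, under assumptions, the two model types described: a linear mixed effects model with a hub-specific random intercept comparing mean completion across metrics, and an adjusted linear regression selected by stepwise AIC. The data frame, covariate names, and values are synthetic placeholders, and the selection routine is a generic greedy implementation rather than the study’s exact procedure.

```python
# Hedged sketch of the statistical analyses described above; the data are
# synthetic and the covariate names are placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the hub-level survey data (59 hubs x 3 metrics).
rng = np.random.default_rng(0)
hubs = pd.DataFrame({
    "hub_id": range(59),
    "size": rng.choice(["small", "mid", "large"], 59),
    "funding_cohort": rng.choice(["earliest", "middle", "latest"], 59),
    "prior_experience": rng.normal(size=59),
    "sessions_attended": rng.integers(5, 14, size=59),
})
df = hubs.loc[hubs.index.repeat(3)].reset_index(drop=True)
df["metric"] = ["Careers", "IRB", "Pilots"] * 59
df["score"] = rng.uniform(2, 10, size=len(df)).round(1)

# (1) Differences in mean completion across metrics, hub-specific random intercept.
mixed = smf.mixedlm("score ~ C(metric)", data=df, groups=df["hub_id"]).fit()
print(mixed.summary())

# (2) Adjusted linear regression for one metric, bidirectional stepwise selection by AIC.
candidates = ["size", "funding_cohort", "prior_experience", "sessions_attended"]


def stepwise_aic(data, outcome, candidates):
    """Greedy bidirectional selection: apply any single add/drop that lowers AIC."""
    def fit(terms):
        rhs = " + ".join(terms) if terms else "1"
        return smf.ols(f"{outcome} ~ {rhs}", data=data).fit()

    selected = list(candidates)          # start from the full model
    current = fit(selected)
    improved = True
    while improved:
        improved = False
        moves = [("drop", t) for t in selected] + \
                [("add", t) for t in candidates if t not in selected]
        for action, term in moves:
            trial = [t for t in selected if t != term] if action == "drop" else selected + [term]
            model = fit(trial)
            if model.aic < current.aic:
                selected, current, improved = trial, model, True
    return current


adjusted = stepwise_aic(df[df["metric"] == "Careers"], "score", candidates)
print(adjusted.summary())
```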

To construct the composite measure of a hub’s prior experience with metric-based performance improvement, we conducted a factor analysis to create an experience factor score. The factor analysis used 10 survey items (Supplemental Table 6). Each response category was assigned a numerical value with a higher value indicating more experience. For questions with multiple parts, “yes” responses were summed to create a single score for that item. All 10 dimensions were used in an exploratory factor analysis, with results indicating a two-factor model based on the proportion of variance explained. After reviewing for meaningfulness, one factor was chosen. This single-factor score represented the “maturity of a performance management system” and was created using the weighted average of all dimensions involved. The resulting variable is a standardized normal score with a mean of zero and standard deviation of one. A higher score indicates a higher level of the underlying concept of maturity of systems.
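
A minimal sketch of constructing such a composite follows, assuming the 10 items are already scored numerically with higher values indicating more experience. The estimation and scoring choices here (scikit-learn’s maximum-likelihood factor analysis, no rotation, first retained factor) are illustrative assumptions, not necessarily those used in the study.

```python
# Illustrative sketch of the prior-experience factor score; synthetic item data.
import numpy as np
import pandas as pd
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
items = pd.DataFrame(
    rng.integers(0, 5, size=(60, 10)),
    columns=[f"item_{i + 1}" for i in range(10)],
)

# Exploratory two-factor fit, mirroring the reported two-factor solution.
standardized = StandardScaler().fit_transform(items)
scores_2f = FactorAnalysis(n_components=2, random_state=0).fit_transform(standardized)

# Retain the single factor judged meaningful ("maturity of a performance
# management system") and rescale it to mean 0 and standard deviation 1.
maturity = scores_2f[:, 0]
maturity = (maturity - maturity.mean()) / maturity.std()
print(maturity.mean().round(3), maturity.std().round(3))  # ~0.0, ~1.0
```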

Qualitative analyses

Semi-structured interview audio recordings were transcribed verbatim by a professional transcription company. Transcripts were uploaded into the NVivo qualitative data analysis software to facilitate coding and analysis [14].

The codebook was developed using a two-stage consensus-based process. First, the qualitative team developed an initial codebook using main topics of the interview protocol as preidentified categories. Then, analysts reviewed two transcripts, interview notes, and reflections to identify emergent concepts. The preidentified categories and emergent concepts were merged into a single initial codebook. This codebook was reviewed by the qualitative team for clarity and consistency.

Second, analysts applied the initial codebook to two small batches of transcripts (one transcript and then three more transcripts) with participants in different roles (Principal Investigator, Administrator, and Implementer) to ensure definitions were clear and codes were being used consistently. For each batch, the team met to compare the coding and resolve discrepancies, and the codebook was revised as needed.

Once consensus was reached on the codebook and coding was consistent between analysts, one team member coded the interviews using the codebook. To ensure consistency, another team member periodically reviewed a convenience sample of coded transcripts for fidelity to the codebook. The full qualitative team discussed all potential new themes or revisions before any changes were made to the codebook.

Over the course of coding transcripts, themes were grouped into four domains: metric design and content, stakeholder engagement, hub engagement, and perceived value of implementing Common Metrics. Once coding was completed, the four domains were divided among team members so that one analyst read all coded sections within one domain. Those analysts then categorized coded sections into facilitators and challenges, and summarized the range of themes, including illustrative quotations. Each analyst also identified intersections among themes that were discussed by the full team and incorporated into the presentation of results. Subanalyses investigated whether hubs’ engagement with the Common Metrics Implementation differed by participant role.

Open-ended survey responses followed similar consensus-based procedures. Two analysts independently developed initial codes and met to develop an initial codebook. Each analyst then applied the codebook to a subset of responses, met to discuss and resolve discrepancies, and modified the codebook as needed. After nine meetings, the analysts were applying the codebook consistently. At that point, one analyst coded the remaining responses and discussed questions with the other analyst. Given the straightforward nature of the responses, codes were summarized using frequencies and illustrative quotations.

Results

Description of Hubs

The primary quantitative analyses included the 59 hubs that responded to the second follow-up survey at the end of the evaluation study period (Supplemental Table 6). At the beginning of the Common Metrics Implementation Program, hubs varied substantially in the size of their annual budgets and in the year of initial CTSA funding, which spanned 10 years. Across 10 indicators of experience with metric-based performance improvement, hubs generally reported average levels of experience in the middle of the possible response ranges for each indicator.

The three implementation groups did not differ in size of annual budgets or experience with metric-based performance improvement, but their composition did vary by initial year of funding. Compared to Implementation Groups 1 and 2, Implementation Group 3 comprised more hubs first funded in the earliest or latest cohorts of CTSA funding. As reported previously [10], Implementation Group 2 attended fewer training and coaching sessions than Implementation Groups 1 and 3 (an average of 11.3, 12.6, and 11.9 sessions, respectively), and more hubs focused on the IRB Review Duration (38%) or Pilot Funding (39%) Metrics than on the Careers Metric (23%) during coaching.

Completion of Metric and Performance Improvement Activities

After 19 months, all hubs reported that they had begun the work of implementing the Common Metrics and performance improvement for all of the first three metrics. However, less than one-third of hubs (17 of 59) had completed all 13 activities for each metric (score of 30; Fig. 1). About half of hubs (29 of 59) completed between 90% and 100% of activities (score of 27 or higher), one-quarter completed between 70% and 85% of activities (score of 21–25.5), and the remaining one-quarter completed between 27% and 65% of activities (score of 8–19.5).

Fig. 1. Completion of Common Metrics and performance improvement activities per hub: three metrics combined (0–30 points possible).

On average, hubs completed almost all activities related to creating metric results, and the vast majority of activities related to understanding current performance (Table 2). However, variation was evident for activities related to developing performance improvement plans, which were completed less often for the IRB Metric compared to the Careers and Pilots Metrics. When a performance improvement plan was not developed, activities related to implementing it could not be completed. Additionally, not all hubs that developed a plan completed activities to implement the plan. Fully documenting a metric result and the four elements of the improvement plan was completed least often, on average.

Table 2. Completion of Common Metrics and performance improvement activities (N = 59 hubs*)

SD = Standard Deviation.

* One hub did not respond.

** Composition of clusters: (1) creating metric result entails data collection and computing metric according to operational guideline; (2) understanding metric result entails forecasting future performance or comparing results to any other data, and specifying underlying reasons with stakeholders; (3) developing improvement plan entails involving stakeholders, specifying actions, and prioritizing actions based on effectiveness or feasibility; (4) implementing the improvement plan entails reaching out to partners for help and starting implementation activities; (5) documenting includes entering metric result, describing underlying reasons, identifying partners, potential actions, and planned actions.

Factors Affecting Progress

Quantitative and qualitative results together identified seven key factors affecting hub progress. The characteristics that could be assessed quantitatively explained between 16% and 21% of the variation in completing improvement activities across hubs and metrics (Table 3). Qualitative results enhanced our understanding of these effects and identified additional factors.

Table 3. Results of testing for effects of hub characteristics on completion of performance improvement activities (N = 59 hubsϵ)

Ref = reference group (indicated by dashes in cell); CMI = Common Metrics Implementation.

* p ≤ 0.10; ** p ≤ 0.05; *** p ≤ 0.01.

ϵ One hub did not respond.

£ CTSA size is defined as total funding from U, T, K, and/or R grants for fiscal year 2015–2016.

¥ Attendance at a training or coaching session is defined as at least one person from the hub attended. Implementation Groups 1 and 2 were offered 7 coaching sessions; Implementation Group 3 was offered 6 coaching sessions.

Hub size and resources

Analysis of open-ended survey responses revealed that the most common reason hubs cited for not completing performance improvement activities was lack of time and resources. Hub size, defined by funding level, varied greatly, and quantitative analysis showed that funding level appeared to have some effect, particularly for the Pilot Funding Metric. Compared to the smallest hubs, mid-size and large-size hubs consistently completed slightly more performance improvement activities on average. Yet, when considering activities completed across all metrics, the effect was largest for mid-size hubs, not the largest hubs (Table 3).

Qualitative results reveal that the size of a hub’s funding award did not fully account for resource challenges. Investment from institutions within which hubs were situated, periods of interrupted funding, lack of data systems or lack of alignment of existing systems with the Common Metrics data requirements, and the availability of needed personnel and expertise all affected whether hubs could devote sufficient time and resources to fully implement Common Metrics and performance improvement activities (Table 4).

Table 4. Challenges to hub progress, with illustrative quotation*

* Unless stated otherwise, themes manifest in more than one way; a quotation represents one manifestation.

** Participant is affiliated with a medical center that functions as a CTSA without current CTSA funding.

Indicates that the challenge, under reverse conditions, becomes a facilitator.

Hubs with available evaluation and other metric-related expertise, as well as institutional knowledge and general administrative support, reported that these resources greatly facilitated implementation. Hubs often formed a core team intended to provide an organized approach to implementation activities. Teams included mutually supporting roles, such as serving as site champions to engage stakeholders, keeping the principal investigator aware of activities, and conducting hands-on data collection and reporting. Participants identified three facilitators related to effective core teams: (1) one leader who is accountable for the work, (2) a “champion” or “real believer” on the team to encourage local ownership of the initiative, and (3) a collaborative team climate with effective communication (Table 5).

Table 5. Facilitators for hub progress, with illustrative quotation*

* Unless stated otherwise, themes manifest in more than one way; a quotation represents one manifestation.

Indicates that the facilitator, under reverse conditions, becomes a challenge.

Not all hubs, however, had available metric-related expertise, and for many hubs, the local team was relatively small. In smaller hubs, core teams tended to be particularly lean, with less differentiation in roles related to Common Metrics implementation. To address this, some hubs leveraged other individuals and groups within their hub and academic institution to form extended teams that facilitated completion of data collection and performance improvement activities. In a number of cases, other stakeholders became part of extended teams to facilitate regular collaboration and sustained commitment. Directors of hub programs related to the Common Metrics’ specific topic areas, who played critical roles due to their ownership of the data and/or familiarity with the processes in their topic areas, were considered valuable for implementing improvement strategies.

Prior experience with performance improvement and alignment with needs of Common Metrics implementation

Although we anticipated that prior experience with metric-based performance improvement would facilitate completion of such activities for the Common Metrics, the quantitative measure of prior experience (maturity of a hub’s performance management system) appeared to have a small negative effect. This effect disappeared after accounting for other characteristics in multivariable statistical models, but similarly unexpected effects related to existing data collection and storage appeared more robust for the IRB and Pilot Funding Metrics (Table 3).

Qualitative results revealed that alignment (or lack thereof) of the Common Metrics and performance improvement framework with a hub’s prior experience, systems, and priorities affected implementation (Tables 4 and 5). As noted, one type of alignment was compatibility with the technical needs of the Common Metrics, including local structures, processes, metrics, and experience. If systems and processes were aligned with the Common Metrics, prior experience with similar metrics or performance improvement frameworks facilitated implementation. When existing systems and processes were not aligned, more resources were required to conduct the work of the Common Metrics, which hampered hubs’ abilities to adapt to and engage in that work. Particularly for the IRB Metric, if existing institutional data systems were not aligned with the metric definition, modifying those systems to follow the metric’s operational guidelines absorbed a great deal of time and resources.

A second type of alignment—compatibility of Common Metrics with existing institutional priorities—also shaped hubs’ progress on the work of the Common Metrics. Alignment of the Common Metrics with local priorities (or the ability to create such alignment) made the Common Metrics more useful to hubs. This facilitated institutional investment in the work. In contrast, lack of alignment had the opposite effect on the perceived usefulness of, and investment in, the metrics.

Hub authority

Participating CTSAs were diverse in how they were situated relative to their academic institutions. A hub leader’s position in the institutional authority structure was important for accessing needed data, affecting improvements, and facilitating stakeholder engagement. Hubs with leaders that did not have line authority over the data, processes, or organizational components related to Common Metrics experienced challenges in implementing performance improvement (Table 4). The complexity of processes related to the Common Metrics, such as investigators’ response times to IRB stipulations and the need to coordinate with multiple IRBs, exacerbated this challenge.

Although the problem of lack of direct authority could not be fully mitigated, some hubs noted that coupling the leadership role of the hub principal investigator with a leadership position at the school or institutional level, and integrating leadership relationships across the institution, facilitated the work of hubs generally and the work of the Common Metrics in particular. When direct lines of communication with relevant departments or leaders did not already exist, drawing on or creating personal relationships to build communication about the topics of the Common Metrics was a strategy that helped gain stakeholder buy-in (Table 5).

Hub engagement

A hub’s type of engagement with the Common Metrics was associated with the degree to which it completed the performance improvement activities. Types of hub engagement, identified through qualitative interview analyses, included actively folding Common Metrics and the performance improvement framework into standard work processes (active engagement), complying with an external requirement (compliance-based approach), or some mixture of these approaches within the hub and/or its staff. Not surprisingly, hubs in which all participants reported only a compliance-based approach to the Common Metrics in qualitative interviews completed fewer activities related to Common Metrics and performance improvement (i.e., had lower scores on the primary quantitative outcome) than hubs in which one or more participants reported active engagement (Table 6).

Table 6. Results of testing for effects of hub engagement on completion of performance improvement activities (N = 30 hubs)

Ref = reference group; SE = standard error.

*p ≤ 0.10.

Qualitative results revealed that engagement of the hub leader appeared to affect completion of activities, particularly for the IRB Review Duration Metric, which was often outside the hub’s line authority. Principal investigators played four key facilitative roles: providing strategic and operational guidance, serving as a champion who kept Common Metrics work “on the agenda,” facilitating stakeholder engagement, and providing hands-on oversight during start-up (Table 5).

Challenges for maintaining higher levels of engagement included periods of little Common Metrics-related effort because reporting occurred only annually, interruptions to hub funding, and reduced motivation due to perceptions of unclear metric definitions and lack of alignment with existing processes.

Variation in hub engagement revealed through qualitative interviews helped to explain the statistically significant effect of funding cohort. We expected that hubs funded earlier would have more established processes and stakeholder relationships to conduct the work of Common Metrics, but hubs in the middle CTSA-funded cohort completed an average of 15% more activities than the earliest funded hubs. This effect was the largest of all characteristics measured quantitatively, and it remained statistically significant when accounting for other hub characteristics in the multivariable models. Hubs funded in the earliest and latest cohorts completed about the same number of activities. Qualitative interview results explained why the potential benefit of more established processes was tempered.

Specifically, although all funding cohorts included hubs with multiple engagement approaches, a compliance-based approach was more common among hubs funded earlier while active engagement was more common among hubs funded later (Supplemental Figure 1). Qualitative results showed that this trend indicated that hub engagement, at least in part, reflected hubs’ levels of willingness or ability to adjust processes to accommodate the requirements of the Common Metrics, which differed across funding cohorts.

Hubs funded in the latest cohort were less likely to have firmly established processes, which made the introduction of a performance improvement system more useful. Yet, these hubs sometimes had difficulties with resources or contextual issues (e.g., developing relationships with stakeholders). In contrast, hubs funded in the earliest cohort were more likely to have established processes. If these processes were aligned with the Common Metrics, then the work was more easily completed within existing workflows. If their processes were not aligned, then adapting existing processes presented difficulties. Many hubs in the middle cohort had fewer unresolved contextual issues than those funded later (e.g., they had already built relationships with home institutions and stakeholders), and their existing processes and systems appeared less firmly established than those of hubs funded earlier, making it easier to adapt to the Common Metrics.

Stakeholder engagement

Engaging stakeholders is a fundamental aspect of implementing the Common Metrics using the shared performance improvement framework. Qualitative results showed that challenges for engaging stakeholders included lack of an existing line of consistent communication with other units in a hub’s academic institution, difficulty securing initial buy-in, and difficulty sustaining cooperation over time (Table 4). Difficulty with initial buy-in resulted from resistance or “pushback” from stakeholders or from hubs’ hesitancy to involve stakeholders due to an expectation of resistance.

Facilitators included personal relationships (existing or new collaborations), a culture of cooperation in the academic institution, integration of the Common Metrics with institutional priorities, and structural features of hubs that supported access to institutional leaders and stakeholders (e.g., physical location and size). Line authority over the relevant domain also facilitated engaging stakeholders.

Hubs also identified proactive strategies for enhancing their abilities to successfully engage stakeholders in the Common Metrics (Supplemental Table 7). First, as relevant stakeholders varied by metric, persuading each set of stakeholders about the benefit to them from helping implement the Common Metrics and performance improvement plans was key. Second, creating avenues for discussion, dialogue, and feedback with stakeholders was important, including listening to stakeholders at the “ground level,” not only leaders. Third, engagement of stakeholders may require persistence, both initially and over time. Fourth, positioning the CTSA hub as a “bridge” or “liaison” to engage stakeholders across the institution helped, even at times incorporating key stakeholders from other parts of the institution into roles within the hub to ensure engagement.

Training and coaching attendance

Hub attendance at training and coaching sessions provided by the Tufts Implementation Program appeared related to completion of activities according to statistical analyses (Table 3). As the number of training and coaching sessions attended by at least one hub team member increased, the average number of completed activities also increased. This trend was statistically significant for hubs that attended more coaching sessions. The benefit of attending more training and coaching sessions appeared to differ by metric.

Statistical results suggest that receiving coaching on a metric facilitated the completion of performance improvement activities for that metric. For the Careers and IRB Review Duration Metrics, receiving coaching while working on that metric was associated with completing more of the related performance improvement activities. For the Careers Metric, hubs that did not focus on this metric during coaching completed fewer of the related activities by the end of the evaluation. Although not statistically significant, hubs that focused on the IRB Metric during coaching completed about 1.55 more activities (out of 10) on the IRB Metric compared to hubs that focused on the Careers Metric during coaching.

Discussion

This mixed methods evaluation assessed progress in implementing Common Metrics and a shared performance improvement framework across the CTSA Consortium, a loosely integrated network of academic health care institutions, or hubs, charged with catalyzing clinical and translational research. After 19 months, the vast majority of hubs reported that they had computed results for the initial set of three Common Metrics and undertaken activities to understand current performance, but fewer hubs developed and carried out performance improvement plans for all metrics. Similar to performance management efforts in loosely integrated public health programs [8], heterogeneity in hubs’ local contexts affected implementation of the Common Metrics and performance improvement activities.

The most common reason cited for not completing an activity was limitation of available resources. Although the size of the hub’s funding award played a limited role, other resource-related factors, such as investment from home institutions, periods of interrupted funding, availability of needed personnel and expertise, and effectiveness of core teams, varied across hubs and affected whether they could devote sufficient time and resources to fully implement Common Metrics and performance improvement activities.

Across hubs, alignment (or lack thereof) of the Common Metrics and performance improvement framework with a hub’s local conditions and needs affected implementation. If existing local systems and processes were aligned with the needs of the Common Metrics Initiative, prior experience with similar metrics and/or performance improvement frameworks facilitated implementation. Without such alignment, more resources were required for implementation, and this hampered hubs’ abilities to adapt to and engage in that work. Similarly, alignment of the Common Metrics with existing institutional priorities (or the ability to shape such alignment) made the initiative more useful to hubs and facilitated institutional investment in the work. In contrast, lack of this type of alignment had the opposite effect on the perceived usefulness of, and investment in, the metrics.

A hub leader’s position in the institutional authority structure was important for accessing needed data, affecting improvements, and facilitating stakeholder engagement. Hubs with leaders who did not have line authority over the data or processes related to Common Metrics experienced challenges in implementation. Drawing on or creating personal relationships to build communication about the topics of the Common Metrics was a strategy to help gain buy-in of stakeholders.

Hubs also varied in their approach to engaging with Common Metrics work—including active engagement, a compliance-oriented approach or a mix—and this was associated with the degree to which hubs completed the performance improvement activities. Not surprisingly, hubs in which all participants reported only a compliance-based approach completed fewer implementation activities than those in which one or more participants reported active engagement. The engagement of hub principal investigators was found to be important, particularly to provide strategic guidance and oversight, champion the project, and facilitate stakeholder engagement.

Attending training and coaching sessions, and opportunities to share experiences and best practices, were helpful for hubs. Although there was evidence of facilitation by these services, it is not completely clear whether this related to the content of the training and coaching, the difficulty of the metric a hub focused on during coaching, or differences among hubs that chose to receive coaching on one metric rather than another.

Limitations

The design of the Common Metrics Implementation Program necessitated a descriptive evaluation study that focused on understanding hubs’ progress and experiences. First, a controlled comparison group design was not compatible with the goal of having every hub implement the Common Metrics and a shared performance improvement framework to the fullest extent possible during the same time period. Second, without a control group, we considered a quasi-experimental pre–post design but could not fully pursue this option because assessing change in Common Metrics results was not feasible for two reasons: the metric definitions were newly released, so not all hubs had retrospective data to compute the metric result for a prior time period; and even if hubs could have collected retrospective data, the anticipated timeframe for achieving change in the metric results was longer than the study period. The resulting mixed methods approach yielded a multifaceted understanding of hubs’ progress and related contextual factors, challenges, and facilitators.

Conclusion

Implementing Common Metrics and performance improvement in a large, loosely integrated network of research-focused organizations, the CTSA Consortium, proved feasible, but it required substantial time and resources. There was considerable contextual heterogeneity across hubs in their data systems, existing processes and personnel, organizational structures, and local priorities of home institutions, which created disparate experiences and approaches across hubs. To sustain engagement, future metric-based performance management initiatives should anticipate, and facilitate solutions to, barriers to implementation due to resources and authority and, for heterogeneous networks, account for local contexts. Future efforts should also consider the perceived value of the initiative, which is addressed for the CTSA Consortium’s Common Metrics Initiative in a separate report.

Supplementary material

To view supplementary material for this article, please visit https://doi.org/10.1017/cts.2020.517.

Acknowledgements

The authors are grateful for the time and effort of many people who contributed to the Tufts Common Metrics Evaluation Study. CTSA Consortium hubs across the country invested resources, personnel, and time to implement the Common Metrics and provide data for the evaluation study. Debra Lerner, MS, PhD, provided expertise in study design and survey research. Annabel Greenhill provided valuable research assistance. Members of the Tufts CTSI Common Metrics Implementation Team, including Denise Daudelin, Laura Peterson, Mridu Pandey, Jacob Silberstein, Danisa Alejo, and Doris Hernandez, designed and carried out the implementation program that is evaluated by this study.

The project described was supported by an administrative supplement award to Tufts CTSI from the National Center for Advancing Translational Sciences, National Institutes of Health (UL1TR002544 and UL1TR001064). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Center for Advancing Translational Sciences, the National Institute on Drug Abuse, or the National Institutes of Health.

Disclosure

The authors have no conflicts of interest to declare.

References

1. Committee to Review the Clinical and Translational Science Awards Program at the National Center for Advancing Translational Sciences; Board on Health Sciences Policy; Institute of Medicine; Leshner AI, Terry SF, Schultz AM, Liverman CT, editors. The CTSA Program at NIH: Opportunities for Advancing Clinical and Translational Research. Washington, DC: National Academies Press, 2013. doi: 10.17226/18323.
2. Pannick S, Sevdalis N, Athanasiou T. Beyond clinical engagement: a pragmatic model for quality improvement interventions, aligning clinical and managerial priorities. BMJ Quality & Safety 2016; 25: 716–725.
3. Patrick M, Alba T. Health care benchmarking: a team approach. Quality Management in Health Care 1994; 2: 38–47.
4. Catuogno S, et al. Balanced performance measurement in research hospitals: the participative case study of a haematology department. BMC Health Services Research 2017; 17: 522.
5. Moxham C. Understanding third sector performance measurement system design: a literature review. International Journal of Productivity and Performance Management 2014; 63: 704–726.
6. Northcott D, Taulapapa T. Using the balanced scorecard to manage performance in public sector organizations. International Journal of Public Sector Management 2012; 25: 166–191.
7. Tari JJ, Dick G. Trends in quality management research in higher education institutions. Journal of Service Theory and Practice 2016; 26: 273–296.
8. DeGroff A, et al. Challenges and strategies in applying performance measurement to federal public health programs. Evaluation and Program Planning 2010; 33: 365–372.
9. Friedman M. Trying Hard Is Not Good Enough: How to Produce Measurable Improvements for Customers and Communities. Victoria, BC: FPSI Publishing, 2005.
10. Daudelin D, et al. Implementing Common Metrics across the NIH Clinical and Translational Science Awards (CTSA) Consortium. Journal of Clinical and Translational Science 2020; 4: 16–21.
11. Fetters MD, Curry LA, Creswell JW. Achieving integration in mixed methods designs—principles and practices. Health Services Research 2013; 48(6): 2134–2156.
12. Tomoaia-Cotisel A, et al. Context matters: the experience of 14 research teams in systematically reporting contextual factors important for practice change. Annals of Family Medicine 2013; 11(Suppl. 1): S115–S123.
13. Harris PA, et al. Research electronic data capture (REDCap) – a metadata-driven methodology and workflow process for providing translational research informatics support. Journal of Biomedical Informatics 2009; 42: 377–381.
14. QSR International. NVivo qualitative data analysis software, Version 10. Melbourne, Australia: QSR International Pty Ltd, 2012.
