Personal Information as a Knowledge Commons Resource

2 How Private Individuals Maintain Privacy and Govern Their Own Health Data Cooperative MIDATA in Switzerland

Felix Gille and Effy Vayena

2.1 Background

We as private individuals provide a wide range of data about our personal lives via our smartphones, loyalty cards, fitness trackers, and other digital health devices. These devices collect increasingly diverse and comprehensive data about us. Depending on the mobile applications we use, devices collect data about our daily routines, location, preferences, desires and interests, shopping behaviors, mood, illness, and more. When this personal data is linked to health records – including genome data and phenotype data – or linked to other data collected in our environment, such as that collected by state administrations or financial systems, the data has huge potential for public health research and society in general (Wellcome Trust, 2015; Vayena and Blasimme, 2017). Precision medicine, including pharmacogenomics, particularly depends on the potential of data linkage (Huang, Mulyasasmita, and Rajagopal, 2016). New advanced data processing techniques help researchers to make sense of data in a way that was not possible before. With this new capacity to analyze linked data, researchers today can retrieve and assess valuable and clinically relevant information (Blasimme, Vayena, and Hafen, 2018). One way to develop such linked data sets and to make them available for research is through health data cooperatives. An example of such a health data cooperative is MIDATA, recently established in Switzerland and the main focus of this chapter.

In practice, our society cannot yet fully exploit the potential of linked data sets, even though data cooperatives similar to MIDATA are slowly multiplying. This is because private people act as data sources, but they have minimal control over the data collected and do not know where the data is stored. Health apps are a case in point: these private services operate under a particular business model, namely to harvest private individuals’ data and exploit it for the company’s own financial profit. In this business model, data is market capital (Sadowski, 2019). This is a lucrative business, and with an estimated annual growth rate of 20 percent, by 2020 utilizing personal data could deliver an annual economic benefit of €330 billion to enterprises in Europe. Furthermore, the combined total digital identity value of individuals could comprise around 8 percent of the EU-27 gross domestic product (Boston Consulting Group, 2012). Despite these impressive figures, the pressing question is: to what extent do we as data providers and as a society benefit from these business models? One could argue, as ETH Zürich Professor Ernst Hafen does, that ‘the data economy is broken, because we do not have control over our data’ (SWISS RE, 2018). Ordinary individuals are dispossessed of control over their data and cannot access the revenue their data generates.

Exacerbating this sense of powerlessness for private individuals, several large companies recently used personal data in ways misaligned with public norms and values. These scandals led to public outcry against privacy breaches as well as abuse of power. Facebook has been a chief offender, with noteworthy failings such as providing the political consultancy firm Cambridge Analytica access to over 50 million user profiles, which the firm used to influence the 2016 US election and the UK Brexit campaign (Cadwalladr and Graham-Harrison, 2018). In the healthcare sector, the contract between the Royal Free NHS Foundation Trust, London, and Google DeepMind breached the United Kingdom’s Data Protection Act with controversial implications for patient privacy. The Trust provided DeepMind with about 1.6 million patients’ personal data as part of a study to test a detection and diagnosis system for acute kidney injury (Information Commissioner’s Office, 2018; Powles and Hodson, 2017).

In addition to lack of control and privacy breaches, security breaches present a further challenge to the management of personal data. Whereas a privacy breach refers to a company inappropriately sharing data, a security breach occurs when hackers access data repositories, or data is leaked due to poor data security mechanisms. For example, think of the computer software updates that could have contained the WannaCry ransomware attack in 2017, had they been performed. Instead, staff neglected the updates, nefarious ransomware exploited the vulnerability, and the English NHS performance was compromised for days (National Audit Office, 2018).

These improprieties led to increased public skepticism over whether data-driven private (and to some degree, public) enterprises can be trusted, including within the healthcare system (Symantec, 2015; Hafen, 2018). We can easily observe the uptake of the term “trust” as a prominent concept in the public sphere, which hints at a public need to discuss issues of trust. This is alarming evidence that healthcare systems need to be reformed, including the data economy within them (Gille, Smith, and Mays, 2014). Prevailing public trust in the healthcare system is paramount for a healthcare system to function well (Gille, Smith, and Mays, 2017). If the public does not trust organizations in the healthcare system to protect and appropriately manage the data entrusted to them, value generation is at risk. In particular, the healthcare industry depends on access to personal data, with an estimated 40 percent of the healthcare system’s benefit generated on the basis of personal identity data (Boston Consulting Group, 2012).

To complicate the picture, Mhairi Aitken and colleagues concluded that in fact conditional support for sharing and linking health data exists. This seems to contradict the concerns raised earlier. Public concerns pertain to issues such as confidentiality, individuals’ control over their data, uses and abuses of data, and potential harms that may ensue. However, the public also supports private companies’ research when actual or potential public benefits from research are foreseeable, as well as when the public trusts the individuals or organizations leading and overseeing the research, data linkage, and data sharing (Aitken et al., 2016). Other studies in the field similarly highlight the importance of trust and clear public benefit of research (Wellcome Trust, 2015; Audrey et al., 2016).

To restore the data economy – that is, to build public trust toward data-rich enterprises, as well as trust for how data flow between different enterprises is managed; to establish individual control over personal data; and to ensure privacy as well as data security – we need to answer several practical, ethical, legal, and social questions. Ultimately, these issues can be addressed through an appropriate governance model, but this is no easy feat. What governance model do we need for aggregated personal data sets? Who should have legitimate control over personal data, and how can we foster digital self-determination? How can personal data be securely stored? How can we increase transparency about who uses people’s data, and how they use it? Who is accountable for aggregated data sets stored in research facilities?

In response to these questions, health data cooperatives can perhaps provide a suitable model to govern aggregated data sets. Private individuals in cooperatives democratically control the governance processes of the cooperative itself and the data stored within the cooperative. Health data cooperatives therefore may be able to provide a fair governance model for health data ecosystems that may benefit society through innovation, knowledge generation, improved quality of healthcare, or advances in diagnostics and therapy (Hafen, 2019). Furthermore, health data cooperatives empower private people, as control of personal data shifts from corporate enterprises back to individuals who provide their data for research. This is of particular importance, as private individuals are the legitimate controllers of their own data, especially when it comes to health data (Wilbanks and Topol, 2016). Moreover, health data cooperatives uniquely combine a list of attributes that are crucial for legitimate data aggregation. Examples include open and collective governance principles; not-for-profit status, as revenues are re-invested into the cooperative itself; and the use of open-source software to simplify the creation of new data cooperatives (Van Roessel, Reumann, and Brand, 2018).

Earlier examples of health-related cooperatives are presented in the previous volume in this series, Governing Medical Knowledge Commons (Frischmann, Strandburg, and Madison, 2017). Among these, the authors of the book discuss health data commons, such as Genomic Data Commons (Evans, 2017). Evans describes the legal and practical obstacles to aggregating genomic data in commons in the US context. In particular, these obstacles relate to decentralized data storage, consent alignment and data access, as well as aggregation (Evans, 2016). Evans encourages the professional community to overcome these obstacles and to find appropriate ethical governance mechanisms for such commons.

Also, in the 2014 volume Governing Knowledge Commons, both Contreras and Van Overwalle analyze the construction of genome commons (Van Overwalle, 2014; Contreras, 2014). However, in this example commons are constructed in a different format in which data is not aggregated, but a public network was built allowing data sets to be shared. They observe that the rapidly growing data volume, described by some as a data tsunami, will flood data cooperatives. Importantly, they observe that commons structures should be designed to fit the complex and highly specialized nature of genetic research structures:

Failing to appreciate the structural rules implemented to address these issues, or seeking to dispense with them in favor of a more broadly “open” public goods models … could have adverse consequences. In particular the elimination of rules regulating human subject protection could limit the willingness of individuals to participate in genomic research, and the elimination of data-generator priorities could weaken the incentives of data-generating scientists. Each of these effects could negatively impact the growth of the commons itself.

(Contreras, 2014, 130)

Taken together, the two examples present several structural, practical, ethical, and legal challenges that are inherent in the development of medical commons and likely also apply to the development of health data cooperatives, such as the example presented in this chapter. Tying in with the previous examples and in response to the societal challenges described earlier, the Data and Health Association, founded in 2012, aimed to establish a health data cooperative for Switzerland. As a result of these efforts, the health data cooperative MIDATA was co-founded by a group of researchers of ETH Zürich and the University of Applied Sciences Bern, in 2015 (MIDATA Genossenschaft, 2017; Mòdol, 2019). The basic idea behind MIDATA is conceptually similar to a bank account. A person can open an account to deposit copies of her data (which was collected and stored elsewhere), and then she can choose to make the data accessible to researchers to advance science. This is shown in Figure 2.1.

Figure 2.1 Simplified overview of the MIDATA cooperative (MIDATA Genossenschaft, 2019c)

In addition, people can become formal cooperative members, in contrast to those who only open an account. Eventually, each individual’s account will contain a wide range of different data sets that belong to that one person. The data stored by MIDATA on servers located in Switzerland is encrypted and can only be accessed by the account owner, unless they release it for a specific purpose. To access this rich data source, external parties can submit a proposal for data use. If the proposal is positively reviewed by the ethics committee, each account holder can consent to release her data to the specific project. Account holders need to release their data for each project individually, as this action is a central privacy control mechanism of the data cooperative.
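The release mechanism described above can be illustrated with a minimal sketch. It is not MIDATA's actual software: the class names `Project` and `Account`, their fields, and the methods `consent` and `read` are hypothetical, and encryption and server location are deliberately left out. The sketch only shows the two gates named in the text, namely ethics review of a proposal and per-project consent by the account holder.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """A research proposal from an external party (hypothetical model)."""
    name: str
    ethics_approved: bool = False  # set once the ethics committee approves


@dataclass
class Account:
    """A personal data account; data stays private unless released per project."""
    owner: str
    records: list = field(default_factory=list)
    released_to: set = field(default_factory=set)  # names of consented projects

    def consent(self, project: Project) -> None:
        # Consent can only be given to projects that passed ethics review.
        if not project.ethics_approved:
            raise PermissionError(f"{project.name} has not passed ethics review")
        self.released_to.add(project.name)

    def read(self, project: Project) -> list:
        # Access is checked per project; consent never carries over.
        if project.name not in self.released_to:
            raise PermissionError(
                f"{self.owner} has not released data to {project.name}"
            )
        return list(self.records)
```

In this sketch, consenting to one approved project leaves every other project without access, which mirrors the statement that account holders must release their data for each project individually.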

Culturally, MIDATA is embedded in a society with a cooperative tradition in many fields apart from healthcare. This is arguably an advantage for the implementation of MIDATA, as Swiss society is well familiar with the basic principles of cooperatives. Nowadays, some of the most prominent enterprises in the Swiss public sphere are cooperatives. Two notable examples are the grocery chain Migros and the car sharing platform Mobility, run by the Swiss Federal Railways (SBB). Migros is one of the most recognized grocery stores in Switzerland. The Migros cooperative comprises 2.1 million members (growth rate 0.7 percent in 2016) and about 50 enterprises that are linked to the Migros group, and the Migros group had total sales of 27,738 million CHF in 2016. Migros’ roots stretch back to 1925, when five Ford Model T cars started selling groceries in Zürich. Gottlieb Duttweiler (1888–1962), the founder of Migros and a well-known Swiss personality, aimed to build a direct link between producers and consumers, similar to the health data cooperative described in this chapter. Also, Duttweiler with his wife formulated fifteen theses that make up the moral spirit of Migros. To highlight a few key theses, Duttweiler pointed out the importance of transparency, accountability, and the involvement of women in the decision-making and governance of the cooperative. The cooperative is present in the public sphere, and Migros recently broadcast a TV advertising campaign called the “Migros Besitzer” – the Migros owner – showing the benefits of membership in the Migros cooperative (Migros, 2017). Similarly, Mobility started as a cooperative in 1987, and is now the largest car sharing platform in Switzerland. Mobility is present in every village with more than 10,000 inhabitants. Today, Mobility has more than 50,000 cooperative members and over 120,000 customers. The declared goal is to minimize the traffic burden in Switzerland and contribute to a more efficient and individualized mobility solution for customers (Mobility, 2019). Mobility’s signature red cars help distinguish Mobility as a highly visible and recognizable cooperative within the Swiss public sphere. Mobility and Migros are just two prominent examples among many that indicate the cooperative idea is already well established in the Swiss media landscape and public sphere.

Given that people in Switzerland are familiar with the concept of cooperatives, and in light of the pressing need to find an alternative governance model for the use of personal data in research, the founders of MIDATA currently have high hopes that MIDATA is a platform that could resolve the challenges raised earlier.

2.1.1 Tool of Analysis: Governing Knowledge Commons Framework

To describe the MIDATA cooperative in a structured and detailed way, the remainder of this chapter will apply the Governing Knowledge Commons (GKC) framework to MIDATA. Brett Frischmann, Michael Madison, and Katherine Strandburg developed the GKC framework, drawing inspiration from Elinor Ostrom and her colleagues’ work on the institutional analysis and development framework (Ostrom, 1990; Frischmann, Madison, and Strandburg, 2014; Strandburg, Frischmann, and Madison, 2017). This framework allows researchers to analyze “institutionalized community governance of the sharing and, in some cases, creation, of information, science, knowledge, data, and other types of intellectual and cultural resources” (Frischmann, Madison, and Strandburg, 2014, 3). In combination with Helen Nissenbaum’s theory that understands privacy as contextual integrity (Nissenbaum, 2010), the GKC framework is useful when examining how individuals maintain privacy and govern their own health data cooperative. Broadly, the framework considers the background environment, attributes, governance, patterns, and outcomes of knowledge commons. The remainder of this chapter will follow the structure of the GKC framework to ease comparison across the different case studies of this and earlier volumes of the knowledge commons book series (Strandburg, Frischmann, and Madison, 2017, 16–17).

At present, MIDATA is in the buildup phase. Several small research projects, as presented later, contribute to the testing and refinement of MIDATA. This chapter is part of an ongoing (2018–2021) health ethics and policy research project at ETH Zürich, Switzerland, where we aim to develop further the existing governance model of MIDATA. Our main purpose in this research is to create a systemic oversight model for MIDATA (described later) that is considered trustworthy by the general public and MIDATA members. In this research we engage with governance theory, law, and policy, and will conduct interviews with different stakeholders, such as researchers who work within the cooperative, members of the cooperative, and members of the general public. For this chapter we gathered background information by interviewing MIDATA co-founder Ernst Hafen. For the interview we developed questions that follow the content of the GKC framework and Nissenbaum’s privacy theory. In addition, we examined the statutes of MIDATA as well as further policy and administrative documents of the cooperative. The advantage of applying the GKC framework and insights from privacy theory as contextual integrity at this stage is the possibility to leverage the perspective the framework provides to inform MIDATA’s ongoing development. Applying the GKC framework to MIDATA can help us better understand how MIDATA processes and structural components contribute to the community governance of MIDATA. The systematic design of the GKC framework allows us to dissect MIDATA to unfold the involved attributes, the present governance structure, and anticipated outcomes. In addition, the theory of privacy as contextual integrity can help us to improve the governance processes and structures that apply to privacy within MIDATA.

2.2 Attributes of MIDATA

MIDATA is a member-owned cooperative that aims to store and aggregate personal data from people who open an account at MIDATA. Cooperative members are the main actors in the cooperative as well as the main resource providers for the cooperative itself. Ultimately, the goal of MIDATA is to provide a secure storage for personal data, in which account holders themselves retain full control over their data. As MIDATA members, people contribute to research by granting others access to their data. The following sections will describe MIDATA’s resources, goals, and objectives in more detail, and the role private individuals take in this cooperative.

2.2.1 MIDATA Resources

The resources pooled in the MIDATA cooperative are copies of account holders’ personal data. Such data can be transferred to the account by the account holder him/herself; or in some cases apps use MIDATA to store data, and account holders using such an app can allow the data to be deposited into their MIDATA accounts directly. Accessing copies of personal data has been simplified within the European Union by the newly established General Data Protection Regulation (GDPR). Article 20 of this regulation emphasizes that individuals have the right to copies of their data. This right applies to EU residents, and non-EU residents can ask for copies of data stored by companies based within the EU (European Parliament Council of the European Union, 2016). It is anticipated that this European regulation will supportively affect the data transfer processes for MIDATA, even though the regulation does not apply in Switzerland as Switzerland is not an EU member state (Ngwa and Hafen, 2017). Swiss residents can make use of the GDPR when they request data that is stored within the EU.

The allergy app Ally Science is an example of an app that stores data on MIDATA. Launched by Bern University of Applied Sciences and University Hospital Zurich, Ally Science is part of a research project to collect pollen allergy symptoms data in combination with location data. Thereby, the study investigates pollen allergies in Switzerland (MIDATA Genossenschaft, 2019a). If the app user has no MIDATA account, the app user is asked to open a MIDATA account before s/he uses the app. By July 2018, 8,100 app users had registered (Hafen, 2018). Another use case example is a research study involving multiple sclerosis patients. Study participants use an app called MitrendS to capture their neurological development over time. Generally speaking, each citizen generates a huge amount of data that can be stored in MIDATA accounts so that each citizen can contribute to the development of the cooperative’s resource pool (Mòdol, 2019).

2.2.2 MIDATA Account Holders, Cooperative Members, and the Swiss Community

Because the cooperative model of MIDATA is designed to be regional, MIDATA is open to any person residing in Switzerland. To open an account on MIDATA, one does not need to be a member of the cooperative. Yet a person who would like to actively participate in the governance of MIDATA needs to become a cooperative member for a fee of 40 CHF. The cooperative has an altruistic motivation that goes beyond the MIDATA community itself and seeks to benefit the general society, and this is evident in the objectives outlined later.

2.2.3 Goals and Objectives of MIDATA

The overarching goal of MIDATA is to establish regional, member-owned data cooperatives that contribute to research and ultimately to the benefit of society via their stored data. In more detail and as stated by Article II of the MIDATA statutes, the objectives are:

(1) “The Cooperative pursues as a non-profit organization the following objectives:
(a) it operates a secure IT platform (‘MIDATA platform’) for storage, management, and sharing of personal data of any kind, in particular health and education data, and to provide related services;
(b) it makes the MIDATA platform available to natural persons (members and non-members) who may use the platform as personal data account holders (‘account holders’);
(c) it promotes broad Cooperative membership among account holders, thus allowing them to partake in the governance of the Cooperative, and it helps members pursue common interests;
(d) it promotes the digital self-determination of the population by enabling account holders to use their personal data as self-determining agents and according to their wishes, in particular to support research purposes;
(e) it promotes the collective interests of the account holders and it enables the utilization of their personal data as a common resource. This is achieved by enabling individual account holders to accept requests for the analysis of their data and to give explicit informed consent for the secondary use of their personal data by third parties in return for an economic remuneration to the cooperative;
(f) by providing the MIDATA platform, it fosters the development of an innovative ecosystem in which third parties can offer data-based services to the account holders;
(g) it promotes medical research projects and projects that aim to realize a fair digital society and that promote the digital self-determination of the population; and
(h) it employs the scientific results and income derived from the secondary usage of personal data in the framework of the aforementioned objectives.
(2) With its operative and commercial activities, the Cooperative strives to achieve a positive effect upon society and the environment.
(3) The Cooperative may engage in all activities that are directly or indirectly related to its purpose.
(4) The Cooperative may support the founding of cooperatives of equal purpose in Switzerland and abroad, and it may form a federation of cooperatives together with them.
(5) The Cooperative may establish branches and subsidiaries in Switzerland and abroad, hold interests in other companies in Switzerland and abroad, and acquire, hold and sell real estate.” (MIDATA Genossenschaft, 2017, 2,3)

Key values represented by the objectives are data security, being open to all people, promotion of cooperative membership, promotion of digital self-determination, promotion of collective interest, fostering innovation and medical research, and re-investment in the goals of the cooperative. Together these values eventually lead into the overarching aim to achieve a positive effect on society and environment. Furthermore, to build a network of cooperatives, MIDATA may help to facilitate similar cooperatives. To finance itself, the cooperative may engage in financial investment activities.

All key values are relevant to this cooperative’s character, but we consider promotion of digital self-determination the one value that makes this cooperative distinctive. As we presented in the introduction of this chapter, outside of the cooperative model, private individuals’ ability to determine how their personal data is used is limited, if not impossible. Therefore, MIDATA aims to foster digital self-determination as one of the key incentives for members of the general public to participate in MIDATA and eventually become MIDATA cooperative members. Objective 1 (d) underlines this clearly by promoting the personal use and free choice over what one wishes to do with his/her data, focusing in particular on research.

When we compare the MIDATA objectives to the cooperative principles as stated by the International Co-operative Alliance (Voluntary and Open Membership; Democratic Member Control; Member Economic Participation; Autonomy and Independence; Education, Training, and Information; Cooperation among Cooperatives; and Concern for Community), it becomes clear that MIDATA is in line with the cooperative tradition (International Co-operative Alliance, 2019). Based on the similarity between the MIDATA cooperative values and general cooperative principles, and given the prevalence of cooperative membership already present in Switzerland, it seems likely people will easily familiarize themselves with the governance principles of MIDATA as they participate in MIDATA activities. This should facilitate easy access and participation in MIDATA. Nevertheless, MIDATA’s focus on scientific data is clearly different from grocery trading and car sharing. It is essential to educate participants so they are equipped to contribute to the MIDATA governance in a meaningful way. Yet, MIDATA participants can build on their previous familiarity with cooperatives in other areas of their lives.

As MIDATA focuses at present on healthcare and public health research, data stored within MIDATA may contribute to the improvement of health for all. Furthermore, it is anticipated that the cooperative as a whole could advance public literacy and public control in the field of digital self-determination. These two aspirations together comprise the key value of knowledge production within the cooperative.

2.3 Governance of MIDATA

The legitimate action arena for MIDATA is research, development, and education. At the moment, MIDATA focuses on healthcare research, in particular research that exploits aggregated data sets. We anticipate that in the future MIDATA will extend to other research fields such as education. In doing so, MIDATA’s success depends on cooperative members investing significant trust in MIDATA, and in the public and research institutions that apply to access their data. Therefore, MIDATA needs to maintain not only an appropriate governance model but also trustful relationships with all stakeholders to be able to compile meaningful data sets, and also to appear as a valuable partner for researchers, so that they invest their resources into MIDATA. When it comes to the involvement of private companies, it will be fundamental to adhere to robust governance structures within MIDATA. In particular, it will be necessary to show how corporate research will benefit the wider society. Furthermore, private companies will need to show how privacy is maintained and disclose their accountability structures (Aitken et al., 2016). That is, they need to show, in an understandable way, who is accountable for the research conducted with the data provided by MIDATA members. Such structures need to not only meet ethical values but also align with good governance. A governance model that is likely to be particularly suitable for MIDATA is the systemic oversight approach (Vayena and Blasimme, 2018). As data volume increases, the situation requires adaptive governance models that are able to respond to the challenges that come with big data and the accumulation of data ecosystems. Among other challenges, experts anticipate that current informed consent processes are limited in their capacity to provide a meaningful choice to data donors about how they control large volumes of their data. In addition, broad consent to a frankly unlimited future use of data, as often seen in current consent designs for medical research, is not only ethically questionable but also provides the donor with no control over future use of their data. Finally, the increasing use of machine learning algorithms in data-intense research challenges research accountability in a way that an informed consent process is not able to cover appropriately (Vayena and Blasimme, 2018). In response to these challenges, which are also relevant to MIDATA, systemic oversight is a governance model that builds on the principles of adaptivity, flexibility, monitoring, responsiveness, reflexivity, and inclusiveness. These principles should not be understood as fixed mechanisms but rather as a Leitmotiv for MIDATA governance. As data handling is at the core of MIDATA, it will be pivotal to implement governance mechanisms that are adaptive to new types of data as well as increasing data volumes. Also, as different research projects apply for data use, MIDATA governance mechanisms need to be flexible to meet the requirements of how the data will be used, as opposed to governance mechanisms that are tailored towards the origin of data. Furthermore, as nowadays data sets are linked, and therefore source data will be used to develop new data sets, it is essential to monitor data use beyond the initial research proposal approval, especially since novel data mining and machine learning methods potentially pose risks to privacy and may lead to discrimination. To accommodate potential malicious privacy breaches or other failures, governance mechanisms need to be responsive and prepared to address such problems. Also, data sets provide information not only about the data donor but also potentially about his/her environment. Therefore, governance needs to be reflexive about these issues. This requires reflexive analysis of assumptions and biases that are embedded in machine learning algorithms. Lastly, governance should include all relevant stakeholders in governance processes so as not to exclude underrepresented groups. This engagement should foster public dialogue and learning (Blasimme and Vayena, 2018; forthcoming). As mentioned earlier, our present research activities in the field of governance at MIDATA precisely test the suitability of systemic oversight for health data cooperatives.

Currently, MIDATA comprises four bodies: the general assembly, the administration, the audit office, and the ethics committee. The general assembly is the highest body, qualified among other competences to elect members for the administration, audit office, and ethics committee, as well as to amend the statutes and close down MIDATA. The administration includes the management board and coordinates the operational work of the cooperative. The audit office is an independent body in line with the Swiss Code of Obligations. Last, the ethics committee reviews the quality of projects that apply for the use of data stored in the cooperative. This review also assesses how applicants aim to ensure privacy and what privacy-preserving mechanisms they propose. Furthermore, the ethics committee advises the general assembly regarding reviewed proposals (MIDATA Genossenschaft, 2017).

Actors involved in the governance process can be divided roughly into four groups: first, cooperative members, on whom membership confers formal governance powers; second, private individuals who have an account on the MIDATA platform but who lack formal governance power as they are not cooperative members (they can easily acquire governance power by becoming a member of the MIDATA cooperative); third, professionals who are members of different committees and provide expertise and make decisions, for example about the ethical validity of research applications; and fourth, administrative staff who run the cooperative on a daily basis and maintain the IT infrastructure. From a decision-making point of view, three decision points are of crucial importance. First and foremost, account holders have the exclusive decision power over what data should be stored within the cooperative as well as which data they would like to release into a research project. Then, members of the ethics committee (elected by the general assembly) review and decide which research projects are deemed to be in line with ethical as well as cooperative norms and therefore are approved to request data from account holders. Last, the general assembly is the highest decision-making body within the cooperative and therefore has the last word when it comes to committee elections, statutes amendments, or any other structural and far-reaching decision. At the current stage of governance development, people who have an account but are not cooperative members have no powers over cooperative governance. Nonmembers of MIDATA who do not hold an account with MIDATA can open an account at any time. The public visibility of MIDATA mainly relies on media coverage and academic events related either to research projects that work with MIDATA, as described earlier, or to professionals who are involved in MIDATA. 
So far, MIDATA has been covered in several local and national news articles in Switzerland, as well as in scientific journals (MIDATA Genossenschaft, 2019b).

With respect to privacy governance, at present MIDATA operates two mechanisms to maintain privacy. On the institutional level, there are ethics committee reviews. At the account holder level, there is the dynamic consent process, whereby account holders must actively consent to research as well as release their data into a research project in order for an applicant to access it. Here, MIDATA’s Privacy Policy in articles 5 and 6 clearly puts the account holder in the center of activities that are related to the account holder’s data. Only law enforcement can override the exclusive data access rights of account holders (MIDATA Genossenschaft, 2018). Both measures are crucial to maintain account holders’ privacy. The ethics committee works with professional expertise to review how the applying research projects will preserve privacy within their research projects. Both mechanisms together also control appropriate data flow. In the last instance, it is the account holder’s decision to consent to data sharing depending on whether s/he finds the data flow appropriate.
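
The two mechanisms described above can be pictured as two gates that a data request must pass in sequence. The following is a minimal illustrative sketch only, not MIDATA’s actual implementation: the names `Project`, `AccountHolder`, and `may_access` are hypothetical, the ability to withdraw consent is an assumption about what a dynamic consent process entails, and the law enforcement override mentioned above is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """A research project applying for data (hypothetical model)."""
    name: str
    ethics_approved: bool = False  # outcome of the ethics committee review


@dataclass
class AccountHolder:
    """An account holder who decides, per project, whether to release data."""
    name: str
    consents: set = field(default_factory=set)  # names of consented projects

    def give_consent(self, project: Project) -> None:
        self.consents.add(project.name)

    def withdraw_consent(self, project: Project) -> None:
        # Assumption: consent can be revoked at any time.
        self.consents.discard(project.name)


def may_access(project: Project, holder: AccountHolder) -> bool:
    """Both gates must be open: institutional review and individual consent."""
    return project.ethics_approved and project.name in holder.consents


if __name__ == "__main__":
    study = Project("example-study", ethics_approved=True)
    alice = AccountHolder("Alice")
    print(may_access(study, alice))  # no consent given yet
    alice.give_consent(study)
    print(may_access(study, alice))  # approved project, consent given
```

In this sketch neither gate suffices on its own: an approved project cannot read data without the account holder’s consent, and consent to a project that has not passed ethics review has no effect.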

2.4 Patterns and Outcomes of MIDATA

It is anticipated that MIDATA will lead to a range of benefits for not only cooperative members but also society in general. Considering the objectives of MIDATA, cooperative members and account holders will benefit foremost from the ability to control and manage their own personal data repository due to their exclusive control rights over their data. Nested within a governance framework designed to promote ethical, secure, and transparent data sharing, account holders should find a dependable platform to store and collect copies of their personal data. In addition, cooperative members have governance powers over the cooperative itself, and can therefore directly and in a democratic manner influence the governance processes within the cooperative. Through personal control over their data and the opportunity to take part in and shape governance processes, cooperative members maintain high levels of control over their own privacy.

Furthermore, as the cooperative is open to the general public, everybody has the opportunity to open an account on the MIDATA platform and to become a cooperative member. MIDATA membership growth will not only strengthen the cooperative by increasing the data volume and thereby the value for research, but a growing cooperative will also democratize the data economy within Switzerland. This is because ultimately MIDATA provides a governance tool that allows individuals to determine what happens with their data. Now, if a high proportion of Swiss residents store copies of their data in data cooperatives, they will eventually take back the legitimate control of their own data, which was one of the driving motivations for the foundation of MIDATA. This entire process is supported by the coordination role MIDATA plays by connecting its members and building an exchange network for members and account holders.

The expected social benefit of MIDATA will depend on the research conducted with data stored in MIDATA and MIDATA’s actions in the area of public relations, advocacy, and education. MIDATA understands itself as a platform that drives innovation, facilitates medical research projects, and promotes the digital self-determination of private individuals. With this focus, MIDATA may also be able to raise digital literacy among the general public and to act as an advocate for a fair data economy within Switzerland. In addition to the focus on Switzerland, the founders of MIDATA also work actively to develop relationships with other research institutions across Europe, to spread the MIDATA model and to drive the health data cooperative movement. At the moment, MIDATA is building a cooperative ecosystem with the Berlin Institute of Health, Charité, Germany; Medical Delta, City of Rotterdam, the Netherlands; Vito Research Institute, Belgium; Oxford University Hospital Foundation Trust, UK; and the INDEPTH-Network.org. Together, they focus on informing health policy through improved health information in low- and middle-income countries (Hafen, 2018).

Considering the benefits for cooperative members, wider society, and international partners, MIDATA has the potential to contribute to legitimate research outputs and innovation in society. As MIDATA is structured following cooperative principles, in combination with the Swiss tradition of corporativism, the actions and outputs of MIDATA are likely to be perceived as legitimate by the general public as well as cooperative members. Nevertheless, it will be crucial to maintain and build a governance structure that addresses the concerns raised in the introduction of this chapter to make MIDATA a true alternative for member-controlled data sharing, and eventually a competitive data platform that attracts stakeholders from across society and research.

2.5 Conclusion

This chapter discussed the Switzerland-based health data cooperative MIDATA. In response to concerns about the present health data economy, MIDATA was founded to provide a governance structure for data storage that supports individuals’ digital self-determination, by allowing MIDATA members to control their own personal data flow and to store such data in a secure environment. The aim of MIDATA is to give data control back to the legitimate data controllers, the people, and thereby to allow individuals to regulate their own personal privacy. In addition, in line with basic cooperative principles and considering MIDATA’s aim to advance science, MIDATA may contribute to the advancement of society and innovation.

MIDATA will refine its governance structure to account for the challenges that burgeoning data volumes and diversity present. In particular, current research activities focus on making the governance structure even more robust by adopting the systemic oversight approach. Then, it is anticipated that MIDATA will grow and that the cooperative data storage model will establish itself as a serious alternative to existing data repository models.

3 Pooling Mental Health Data with Chatbots

Michael Mattioli ¹

3.1 Introduction

For an invisible condition, clinical anxiety casts a long shadow. In the United States, about 40 million adults – a full 18.1 percent of the population – suffer from an anxiety disorder, making it the most common type of mental health problem in the country [1]. Anxiety is insidious; it can crop up like a weed at any point in life, coloring one’s days and nights with unaccountable feelings of dread. Anxiety is expensive. One study estimated that in the United States, the total costs of treatment, lost productivity, and lost wages are about $42 billion every year (Du Pont et al., 1996) [2]. This amount exceeds the most recent annual revenues of Facebook, Coca-Cola, Nike, and Morgan Stanley. Anxiety is thriving. There is a shortage of mental health workers (clinicians) in the United States, and experts believe that this problem of supply and demand will worsen substantially by 2025. Even when help is available, high prices and perceptions of social stigma prevent many people from seeking help [1].

This chapter explores a new type of technology that relies in part upon large sets of patient data to relieve anxiety symptoms (Dobbs, 2017). The technology is software that can simulate certain kinds of conversations with human therapists. Referred to as “conversational agents” or “therapy chatbots,” the idea might sound like science fiction, but it is a commercial reality. Recent studies indicate that conversational agents are useful in helping to reduce the symptoms of anxiety and depression. A recent peer-reviewed study showed that the program highlighted in this chapter, “Woebot,” can significantly reduce depressive symptoms in two weeks of regular use [3]. Some therapy chatbots have also been able to predict the onset of panic attacks or depressive episodes based on patterns in user behavior (Mathotaarachchi et al., 2017). A key feature of therapy chatbots is their ability to improve over time by drawing insights from the ever-growing pools of information they receive from their conversations with users.² The potential for good seems significant.

An even more tantalizing possibility is that therapy chatbots can solve the problem of collecting, pooling, and drawing helpful insights from large sets of mental health data. In the past, only individual therapists or scientists have been able to collect such information in the course of their work. Patient confidentiality, funding constraints, a lack of technological infrastructure and expertise, and limited numbers of patients have made it infeasible to build useful pools of mental health data in treatment or research settings. One such obstacle is the fact that many people regard their mental health conditions as private. As mentioned, the desire for privacy often stems from a sense of social stigma that many attach to mental health care. Individuals struggling with, say, anxiety, might prefer to benefit from the insights of a data pool without contributing to it themselves. If everyone felt this way, there would be no data to collect in the first place. The effort would fail – a classic free-rider dilemma.³

Woebot and conversational agents like it address this problem in a unique way. The system obscures its most valuable asset – broad knowledge or wisdom about how to help people – behind an automated chat interface that the company tightly controls. Unlike a database, this information cannot easily be copied or otherwise expropriated. The asset, so to speak, can be accessed only indirectly through conversing with the software. Chatting in this way requires users to share information with the system. (Importantly, the company keeps all user data private and has pledged not to monetize any user data.) Woebot shows how a chat interface might be able to regulate the flow of mental health data, and in doing so, ameliorate the collective action problems that make aggregating mental health data difficult.

Students of commons governance could regard chatbots like Woebot as technologies that provide an “imposed pattern” for governing the flow of private information. Like other “imposed commons” that scholars have examined in this series of volumes, therapy chatbots are affected by a vast number of exogenous factors, such as laws. The Governing Knowledge Commons (“GKC”) framework helps situate these new technologies in their broader cultural, medical, legal, and technological contexts.

Drawing upon the GKC framework, this chapter presents an ethnographic study of Woebot – a therapy chatbot designed to administer a form of Cognitive-Behavioral Therapy (“CBT”). Section 3.2 explains the methodology of this case study. Section 3.3 describes the background contexts that relate to anxiety as a public health problem. These include the nature of anxiety and historical approaches to diagnosing and treating it, the ascendency of e-Mental Health therapy provided through apps, and relevant laws and regulations. Section 3.4 describes how Woebot was developed and what goals its designers pursued. Section 3.5 describes the kinds of information that users share with Woebot. Section 3.6 describes how the designers of the system seek to manage this information in a way that benefits users without disrupting their privacy.

3.2 Methodology

This chapter’s approach follows the GKC framework developed by Katherine Strandburg, Brett Frischmann, and Michael Madison (2014). The framework is an adaptation of Elinor Ostrom’s Institutional Analysis and Development (IAD) framework (Ostrom, 1990; Strandburg, Frischmann, and Madison, 2014). I proceeded as follows.

A literature review. To gather general information about the state of e-mental health apps and services, I surveyed recently published books, newspaper articles, and academic works related to this topic. This research also covered general interest publications on anxiety and treatments such as Cognitive-Behavioral Therapy. From these sources, I identified software and services (apps) designed to help people manage their anxiety. I then contacted individuals who led the companies that produce these apps.
Semi-structured interviews. I interviewed ten experts with knowledge of either Woebot or the e-mental health landscape more generally. I selected some of these individuals because they were cited or quoted commonly in news articles, books, and academic works. I learned of others in the interview group by asking for contacts and referrals in the first interviews I conducted for this project. Interview subjects included professors, technologists, and CEOs. In keeping with the GKC framework, these interviews were semi-structured and focused on the following topics: (1) the scientific, technological, and social contexts in which this form of information-sharing takes place; (2) the various types of data and related informational assets this group seeks to aggregate and organize access to; (3) the “default” status of these assets; (4) the players involved, including corporations and health-care institutions; (5) the community’s goals; (6) rules and related internal governance mechanisms; (7) the technological infrastructure supporting the community.

I conducted all interviews by telephone and recorded them through handwritten notes. The average duration of the interviews was 45 minutes. Some interviews were supplemented with brief follow-up email exchanges. In keeping with Institutional Review Board procedures, I furnished each interview subject with an information sheet describing the goals of this study.

3.3 Background Environment: Contexts

This section focuses on the most prominent landmarks in the anxiety landscape: how the disorder has been understood and treated historically, how anxiety sufferers typically encounter and experience the problem, the evolving nexus between mental health and technology, and relevant laws and regulations in the United States.

3.3.1 A Brief History of Anxiety

Culture has shaped how people have understood and treated anxiety throughout history (Horwitz, 2013). The very language we use to describe the problem is rooted in cultural lore: The word “panic” derives from the Greek god Pan, whose battle cry was said to fill his enemies with uncontrollable fear. In ancient Greece, soldiers who experienced panic in battle were often regarded as cowards and imprisoned or executed. Some historians credit Aristotle (384–322 BCE) for taking the first step toward understanding anxiety as a pathological condition. In exploring the idea of fear, he wrote, “For the man who is by nature apt to fear everything, even the squeak of a mouse, is cowardly with a brutish cowardice, while the man who feared a weasel did so in consequence of disease.” In other words, fear can sometimes be healthy and warranted and other times a sign of illness. (We can surmise that in Aristotle’s time, mice were generally considered a rational source of fear, whereas weasels were not.)

Other Greek philosophers made some of the earliest and most enduring contributions toward Western understandings of anxiety. Hippocrates (460–367 BCE), for example, sought to understand medical disorders by observing his patients in a systematic, scientific manner. This process led him to develop a system for classifying mental disorders, including anxiety. Today, a similar classification system called the Diagnostic and Statistical Manual of Mental Disorders (DSM) is widely used in the medical profession to identify mental health disorders (Ghinassi, 2010). Hippocrates conjectured that the key to treating anxiety was in achieving a kind of harmony between the body and mind (Kleisiaris, Sfakianakis, and Papathanasiou, 2014).

Socrates (470–399 BCE) expanded the set of tools that experts had to investigate anxiety. The Socratic Method, a process of deeply questioning one’s underlying assumptions, is perhaps his most lasting methodological contribution. Like Hippocrates, Socrates also believed that the key to mental health had to do with the relationship between the body and the mind. Understanding this relationship, he believed, required not only physical examination but also introspection on the part of the patient. The idea is captured well in his famous injunction to his followers, “know thyself” (Ghinassi, 2010). Plato (427–347 BCE) believed that anxiety was rooted in incorrect or distorted beliefs about the world. His solution was straightforward: educating sufferers about their erroneous beliefs and, through persistent reminding and reasoning, attempting to alter those beliefs.

Arguably, even more nuanced insights about mental health appear in Buddhist teachings from the fifth century BCE. In his teachings, Siddhartha Gautama (The Buddha) explains that people fundamentally misperceive the true nature of the world. The path away from anxiety and other afflictions, he explains, lies in gaining an accurate view of reality as it is, rather than how we imagine it to be. Buddhist traditions usually focus on better aligning the body and mind to gain this clearer view. As Thich Nhat Hanh, a widely known Vietnamese Buddhist master, explains, “Wrong-thinking causes us to see the world in an upside-down way. Our mind is often thinking about one thing while our body is doing another. Mind and body are not unified.”

The Middle Ages, which lasted roughly from the year 476 to 1000 CE, saw a retreat from the scientific methods developed in ancient Greece in favor of the supernatural. In some cultures, the belief emerged that mental disorders are evidence of demonic possession. This idea is likely the historical root of the stigma that many sufferers of anxiety and depression experience today. In some cultures, the notion of demonic possession persists.

The European Renaissance sparked a return to ancient insights. In the seventeenth century, for instance, Descartes developed the idea of “mind-body dualism” – the notion, as Descartes described it, that the mind and the body “compose a certain unity.” This insight led to the modern recognition of the fact that the mind and the body influence each other – “bidirectionality” – a core principle of contemporary treatment. These ideas were not new, of course. The Latin phrase “mens sana in corpore sano” (“a healthy mind in a healthy body”) dates to the second century.

Psychology, the modern scientific study and treatment of behavior and the mind, took root and flourished in the late nineteenth and early twentieth centuries. Some leaders in the field, such as Sigmund Freud and Carl Jung, focused heavily on the subjective experiences of patients. Freud believed, for instance, that the roots of anxiety, depression, and related mental health disorders lay in repressed sexual energy. Although this view is not widely credited today, Freud made a lasting contribution to the modern understanding of anxiety by classifying it into different forms: generalized anxiety, panic disorders, phobias, and obsessions (Horwitz, 2013).

Others in the then-emerging field searched for objective sources of information on the disorder. Ivan Pavlov (1849–1936) illustrated that anxiety responses could be conditioned in animals – a discovery that suggested people might develop anxiety disorders as a learned response to life experiences. Pavlov’s work inspired the field of behavioral psychology, which explores the theory that people acquire mental conditions entirely through learning. John Watson (1878–1958), Rosalie Rayner (1898–1935), and B.F. Skinner (1904–1990) are widely known names in this field.

In the 1950s, researchers challenged the notion that anxiety is solely a learned behavior. These experts argued that cognition, or how one sees and appraises the world, must play a role too. George Kelly (1905–67) advanced the idea that people perceive the world as a series of mental constructs. Mental disorders may arise when these constructs are inaccurate. This view seems to align remarkably well with the core teachings of Buddhist philosophy, as well as Plato’s teachings, both mentioned earlier. Recent research indicates that life experiences, including stressful environments and incidents, particularly in childhood, can “activate” the disorder.

Cognitive-behavioral therapy (CBT), one of the most effective techniques to treat anxiety today, is grounded in the idea that anxiety is a self-reinforcing behavior that stems from our misunderstanding of the world. Patients who undergo CBT treatment learn about common types of cognitive distortions, and then learn to spot these distortions in their thinking. For instance, some people who suffer from anxiety tend to think about future events in binary or black-and-white terms. In the mind of a person experiencing anxiety, the results of a routine blood test for cholesterol, for instance, will reveal either perfect health or imminent death. The simple act of recognizing this form of thinking and placing a label on it has been shown to drain it of its power.

In the 1950s, the American Psychiatric Association published the first edition of the DSM. Now in its fifth edition, the DSM is the primary tool that medical professionals use to diagnose mental health disorders, including anxiety. Some critics of the modern psychological establishment argue that the DSM’s focus on classification implies a level of clarity and precision to the diagnosis of mental health disorders that does not exist. This line of criticism is mostly academic, however, and removed from the day-to-day business of treating people with mental health disorders.

Alongside CBT, drugs are another important set of tools for treating anxiety. People have used chemicals derived from plants to manage stress and anxiety for millennia. Alcohol, one of the earliest such substances, remains one of the most popular. Opium, a drug derived from a type of poppy plant, has been used as a sedative at least since 4000 BCE. In the late nineteenth century, a chemist working at the Bayer Company successfully synthesized diacetylmorphine from morphine, the principal active molecule in opium. Bayer branded the chemical as “heroin” and commercialized it in 1898. The drug’s dangerousness and addictive qualities were evident within just a few years of public use. Opiates were eventually replaced by barbiturates, another depressant, which were developed chiefly by a German chemist and Nobel Prize winner named Adolf von Baeyer.

In the 1950s, the pharmaceutical company Smith-Kline began selling a drug it called Thorazine. The drug’s popularity for treating mental health disorders fueled research into the development of new drugs designed to address specific disorders. The first drug that was developed specifically to target anxiety was called Miltown. Although highly effective, it was, unfortunately, also highly addictive. Continued research led to the creation of benzodiazepines, including diazepam (Valium) and alprazolam (Xanax). These drugs are often prescribed to treat acute episodes of panic, but like earlier drugs, they can be highly addictive. In the 1970s, Eli Lilly and Company developed fluoxetine (Prozac), a new kind of medicine called a selective serotonin reuptake inhibitor (SSRI). Approved by the FDA in 1987, the drug remains widely used and highly regarded, along with other SSRIs, such as Zoloft, Paxil, Luvox, Celexa, and Lexapro. SSRIs are the most commonly prescribed drugs in the United States and the frontline defense against anxiety. Despite the significant advances in drug development that took place in the twentieth century, very few new drugs have been developed to target anxiety since the 1990s.

Recent studies have shown that today, drugs (serotonin reuptake inhibitors) and CBT are the two most effective and standard tools to manage clinical anxiety [4]. Several complementary and integrated health treatments have also been shown to reinforce the benefits offered by conventional medical treatment. These include stress and relaxation techniques, yoga, meditation, and acupuncture [5]. Getting adequate sleep, exercising regularly, maintaining a healthy diet, having a robust social support system, and living with low levels of stress are also important [6]. For people who have access to adequate mental health care, educational resources, and the ability to follow a beneficial lifestyle, anxiety can be highly manageable. Unfortunately, as discussed in the following section, there is a shortfall in access to mental health care.

3.3.2 The Patient Context

Nearly all anxiety disorders involve intense feelings of fear and the anticipation of imminent harm (Hilty and Mucic, 2016; Rachman, 2020). A short review of several types of anxiety shows subtler aspects of the problem. Under the heading of Anxiety, the DSM identifies the following: (i) generalized anxiety disorder, (ii) panic disorder, (iii) social anxiety disorder, (iv) obsessive-compulsive disorder, (v) post-traumatic stress disorder, and (vi) specific phobias:

(i) In generalized anxiety disorder, feelings of worrying and fear permeate everyday life. Sometimes, these feelings are tied to real causes for worry, but they are disproportionately intense. (Recall Aristotle’s mouse.) Other times, the feelings may be disconnected from any cause that the sufferer can identify – a kind of free-floating dread. These feelings are often accompanied by physical discomfort, such as muscle tension, restlessness, queasiness, and nausea. Those with generalized anxiety disorder are focused heavily on the future, both near- and long-term, and unable to tolerate feelings of uncertainty (Rachman, 2020).
(ii) Panic disorders, another class of anxiety disorder, are typified by intense physical and mental episodes of uncontrollable fear brought on by a perceived threat that typically does not exist. These episodes are different from the chronic worrying and fear that generalized anxiety disorder brings on: they are more physically intense and often mistaken by sufferers for heart attacks.
(iii) Social anxiety is identified by unwarranted feelings of embarrassment and self-consciousness. Unsurprisingly, people who suffer from this problem often isolate themselves.
(iv) Obsessive-compulsive disorder is characterized by repetitive or compulsive behaviors, such as hoarding, arranging objects in specific patterns, and extreme avoidance of germs.
(v) Another manifestation of anxiety, post-traumatic stress disorder (“PTSD”), has received a wealth of attention in the media because soldiers often experience it. Unlike generalized anxiety disorder or panic disorders, which inherently focus on the future, this problem is heavily focused on the past. Often a traumatic experience gives rise to nightmares and related, intensely upsetting memories.
(vi) Lastly, there are specific phobias – the experience of anxiety symptoms brought on by certain situations or objects.

As mentioned in the introduction, anxiety is the most common mental health disorder in the United States. According to the Anxiety and Depression Association of America, 40 million adults age 18 or older, or about 18.1 percent of the population, are affected by anxiety every year [1]. Although the United States has the highest rate of anxiety in the world, the disorder is prevalent elsewhere. Interestingly, there does not seem to be a correlation between anxiety and industrialization: rates are very high in France, Colombia, and Lebanon. China and Nigeria, meanwhile, have relatively low rates. Moderate evidence suggests that culture and context impact how often the problem arises. Poverty correlates with higher rates of generalized anxiety disorder, for instance.

Recent research indicates that life experiences, including stressful environments and incidents, particularly in childhood, can “activate” the disorder (Ghinassi, 2010). Studies have shown that the development of anxiety can be mitigated, however, by positive protective factors. Chief among these are parents who are warm and sensitive to the feelings of a child, and who impart to the child an internal “locus of control” – a feeling that the child has control over his or her life. Although childhood experiences seem to be foundational, anxiety disorders can arise at any age in a person’s life. The average age of onset for generalized anxiety disorder, for instance, is thirty-one. As mentioned in the introduction, the total cost of anxiety is estimated to run in the neighborhood of $42 billion annually.

One of the reasons that anxiety rates are soaring in the United States could be the fact that the problem is widely under-treated (U.S. Department of Health and Human Services, 2018). One reason for this is stigma: Many people with anxiety are reluctant to seek treatment because they believe others (family members, neighbors, etc.) will judge them poorly if they learn they suffer from anxiety. As with so many aspects of anxiety, the perception of stigma varies greatly across age, ethnic, religious, and other cultural lines.

In addition to stigma, recent studies have shown that in many areas of the United States, there is a workforce shortage in child psychiatry and among other behavioral health practitioners (Health Resources and Services Administration/National Center for Health Workforce Analysis, 2015). The problem of access is particularly acute in rural areas: A recent study found that 65 percent of non-metropolitan counties do not have a single psychiatrist, and 47 percent do not have a single psychologist (Andrilla et al., 2018). This problem is likely to worsen. The U.S. Department of Health and Human Services has projected that by the year 2025 there will be shortages of between 6,080 and 15,400 psychiatrists; between 8,220 and 57,490 clinical, counseling, and school psychologists; and as many as 26,930 mental health counselors (Health Resources and Services Administration/National Center for Health Workforce Analysis, 2015). Closely tied to the access problem is the cost of mental health care: Many people with anxiety cannot afford treatment.

3.3.3 Technological Contexts

3.3.3.1 E-Mental Health Services

Alongside CBT and drugs, several new technologies have been developed in the past twenty years that have helped anxiety sufferers. In the medical world, many of these technologies are referred to generally as “e-Mental Health Services.” Since the early 2000s, online services have helped connect patients to therapists online, for instance. Today, several companies provide remote talk therapy sessions carried out over smartphone apps. Similarly, there are a number of services that have made it possible for trained practitioners to administer CBT remotely. Two such services are Joyable and Talkspace. Joyable delivers an eight-week CBT program administered by a trained practitioner, and Talkspace connects users to licensed mental health professionals.Footnote ⁴

Recent studies have shown that online mental health interventions can be as effective as traditional in-person therapy for treating both depression and anxiety. In one study, for instance, the recovery rates of individuals treated with online CBT for panic disorder and agoraphobia were comparable to those of people treated in person. Based on a thirty-month follow-up study for the treatment of social phobia, the long-term effects of CBT delivered over the internet were comparable to those of CBT delivered in person (Mucic, Hilty, Parish, and Yellowlees, 2016).

Sometimes, e-Mental Health Services can be even more effective than traditional care. As mentioned earlier, the stigma that some patients associate with visiting a clinician can discourage them from seeking help. Because patients can access e-Mental Health Services from their homes, this barrier disappears (Shoemaker and Hilty, 2016). Recent studies reveal that LGBTQ+ youth, for example, prefer to seek mental health care online (Dreyfuss, 2019). As discussed later in this chapter, the privacy of user data could either reinforce or erode this perception of anonymity.

In addition to linking patients to traditional mental health providers, many apps facilitate complementary services and treatments that may be helpful. These include thousands of apps and online services related to yoga, mindfulness, eating disorders, tracking mental health conditions, and general stress reduction (Mucic, Hilty, Parish, and Yellowlees, 2016). A website called “Psyberguide” provides a helpful catalog of these services, along with descriptions and ratings. The website is maintained by Dr. Stephen Schueller, an associate professor of psychological science at the University of California at Irvine [8].

3.3.3.2 Social Networking

Social media marks another nexus between anxiety and technology. At least in theory, an online social network could help anxiety sufferers learn more about their problems and potentially connect with others who are navigating this complex and challenging disorder. This could help to alleviate feelings of isolation and helpfully establish new social support systems for sufferers.

An emerging body of research has shown that mainstream social networks such as Facebook and Twitter can contribute to anxiety (e.g., Hoge, Bickham, and Cantor, 2017).Footnote ⁵ The chief symptom is the anxiety a person experiences when unable to check their social media notifications for even a short time [9]. Separately, the use of Facebook and Twitter has been linked to symptoms of depression. As of this writing, however, this body of research is still developing, and it is unknown how significant or widespread these problems are (Pantic, 2014).

Beyond the most widely visited social networks, a number of online communities exist for people who suffer from anxiety. Health Unlocked, one such service based in London, administers a collection of chat forums where patients can connect and share advice and encouragement with one another online. Interviewed for this book chapter, a person with knowledge of the company’s services described it as “a complement to clinical treatment” that “breaks down the isolation” and possibly provides access to “helpful information you wouldn’t always find from a doctor.” According to the individual interviewed, the service is free, and the company generates revenue by selling anonymized and aggregated data to partners, including pharmaceutical companies.

3.3.3.3 Therapy Chatbots

Therapy chatbots mark another nexus between mental health and technology. The development of such apps could be traced back to the 1950s, when Alan Turing, an English mathematician, considered whether a computer could simulate a human conversation. To analyze the question, Turing developed a thought experiment that involved two people and a machine designed to mimic a person. One of the two human participants, whom Turing called the “interrogator,” types questions to a subject located in a different room. The subject responds through typed messages. The interrogator must then guess whether the subject is the other human or the machine. Turing used this hypothetical setup to show that, in principle, there is nothing to stop a cleverly programmed computer from fooling the interrogator. Turing’s work solidified his place as the grandfather of modern AI.

Turing’s work also inspired generations of researchers to build chatbots. The first, a chatbot named “Eliza” and developed in 1966, simulated a psychotherapist. Although the system could not comprehend what was typed to it, Eliza was able to simulate human-like understanding through simple pattern matching. In 1972, a chatbot named “Parry” cleverly fooled many human interrogators by simulating a person suffering from paranoid schizophrenia. In 1992, a chatbot “therapist” named “Dr. Sbaitso” was developed for MS-DOS-based computers. This “doctor” was notable because it was one of the first chatbots that were widely available to the public. Ultimately, a limitation of the early chatbots was the fact that their behavior was scripted through rules. Given a certain input, they would always produce a pre-programmed output. Programming all of the possible conversations that a person might wish to have with a machine would, in theory, require providing the machine with an internal model of the universe.Footnote ⁶
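The rule-based behavior described above can be illustrated with a minimal sketch. The patterns and replies below are hypothetical and written only for illustration; they are not Eliza’s actual script. The sketch assumes nothing more than regular-expression matching: each rule pairs a pattern with a canned reply, and the same input always yields the same output.

```python
import re

# Hypothetical rules for illustration only; not Eliza's actual script.
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (mother|father|family)\b", re.IGNORECASE), "Tell me more about your {0}."),
]
FALLBACK = "Please go on."


def respond(message: str) -> str:
    """Return the scripted reply for the first rule whose pattern matches."""
    text = message.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Echo the captured fragment back inside the canned template.
            return template.format(*match.groups())
    # No rule matched: fall back to a generic prompt.
    return FALLBACK


print(respond("I feel anxious."))
print(respond("The weather is nice."))
```

The program has no understanding of what “anxious” means; it simply reflects the user’s words back. Any input that no rule anticipates falls through to the generic reply, which is the limitation the paragraph above describes.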

Since the late 2000s, chatbots or conversational agents have grown more sophisticated thanks to a new approach called “machine learning” (“ML”). In contrast to rule-based systems like Eliza, machine learning is a process by which a computer can identify and match patterns in massive sets of data. Provide an ML-based chatbot with enough examples of typical two-sided human conversations, and over time, the software will learn, through statistics and probability, what a good response to a new question it has never seen should look like. Many people saw the technology on display for the first time when IBM’s Watson system defeated the Jeopardy Champion, Ken Jennings, before a national audience in 2011. For many of us, examples closer to home include personal assistants built into our phones and smart speakers. Apple’s Siri, Google’s Assistant, Amazon’s Alexa, and Microsoft’s Cortana have (for better or worse) integrated themselves into billions of people’s daily routines.
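The contrast with rule-based systems can be made concrete with a deliberately tiny stand-in. The sketch below does not train a model; it assumes a handful of hypothetical example conversations and uses a simple statistical measure (cosine similarity over word counts) to pick the stored reply whose prompt most resembles a new message. Production systems rely on far larger data sets and far more capable models, so this is an illustration of the idea rather than a description of any real product.

```python
import math
from collections import Counter

# Hypothetical example conversations; a real system would learn from far more data.
EXAMPLES = [
    ("i feel anxious about my exam", "Exams can be stressful. What worries you most about it?"),
    ("i cannot sleep at night", "Poor sleep is hard. What is on your mind at bedtime?"),
    ("i feel lonely at school", "Feeling lonely is painful. When do you notice it most?"),
]


def bag_of_words(text: str) -> Counter:
    """Count how often each lowercase word occurs in the text."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_reply(message: str) -> str:
    """Return the reply attached to the most similar stored prompt."""
    query = bag_of_words(message)
    _, reply = max(EXAMPLES, key=lambda pair: cosine(query, bag_of_words(pair[0])))
    return reply


print(best_reply("my exam makes me anxious"))
```

Unlike a hand-written rule, nothing here anticipates the exact wording of the new message; the reply is chosen because the message statistically resembles an earlier example.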

Against this backdrop, a new generation of therapy chatbots has recently been developed and provided to the public (Hernandez, 2018). The first such service, Woebot, is the focus of this study. Tess, a chatbot developed by San Francisco-based X2AI, also delivers CBT interventions, along with similar techniques, including mindfulness therapy, acceptance and commitment therapy, self-compassion therapy, and interpersonal psychotherapy (Fulmer et al., 2018; Gionet, 2018; Green et al., 2019; Joerin, Rauws, and Ackerman, 2019; Stephens et al., 2019) [14]. The service works entirely through a text-like exchange. Wysa, a startup based in London and Bangalore, similarly delivers “CBT, mindfulness techniques, and other techniques via in-app text changes” (Wallach, 2018). When a patient uses Wysa for the first time, the app asks a series of questions designed to help it understand what sorts of problems the user is grappling with. Some chatbot designers are moving beyond text in order to better diagnose patients. AiME (Chin, 2018), a therapy chatbot developed by a team of researchers at UCLA, aims to detect risks of depression, anxiety, and addiction based on several minutes of video footage of a user’s face (Condliffe, 2017; Garg and Glick, 2018; McCaney, 2019; Morris, Kouddous, Kshirsagar, and Schueller, 2018; Rieland, 2018; Rosso, 2018). To capture video and audio, the app requests access to the microphone and camera on a user’s device.

Early evidence, while limited, suggests that these tools are effective. A peer-reviewed study shows that Tess can significantly reduce symptoms of anxiety and depression, for instance. Youper, another entrant in the therapy chatbot industry, claims that more than 80 percent of its users “experience a reduction in negative moods after just one conversation.” Studies show that Wysa has promise in reducing anxiety symptoms, particularly in assisting children [10].

Because therapy chatbots need to collect private information from users in order to converse with them, these apps also carry privacy risks. These risks may seem especially serious today, as the technology industry is facing a widespread reckoning with user privacy. The privacy problem has played out in the form of countless user data leaks, hacks, and unauthorized disclosures by large technology companies – most notably, Facebook. The privacy issue is explored more deeply later in this chapter.

3.3.4 The Legal and Regulatory Landscape

The emerging therapy chatbot industry is taking form against a complicated patchwork of law and regulations. This policy framework significantly influences how these technologies are developed, funded, and provided. It is important to note, however, that the areas highlighted here are simply the most visible and may not provide a complete picture of the laws and regulations relevant to every therapy chatbot.

3.3.4.1 Privacy Laws

The Health Insurance Portability and Accountability Act of 1996 (HIPAA) forbids health-care providers such as hospitals from disclosing specific types of patient data. These include patient names, zip codes, and dates of treatment.Footnote ⁷ A number of other privacy laws enacted at the state and federal levels impose civil liabilities for wrongful disclosure of such data by health-care providers.Footnote ⁸ Interestingly, HIPAA rules pertain only to so-called covered entities, which include health-care providers (e.g., doctors, clinics, psychologists), health plans (e.g., health insurance companies, HMOs), and health clearinghouses (entities that process health information into standard formats).

Therapy chatbots appear to operate outside the purview of HIPAA by billing themselves as “wellness apps” rather than new forms of traditional psychotherapy. This distinction seems murky, however, especially in light of the fact that patients are encouraged to share the same kinds of information with these apps as they might traditionally share with a therapist. John Torous, the chair of the American Psychiatric Association’s smartphone app evaluation group, has said that the “gap” in HIPAA coverage raises privacy concerns (De Salvo and Samuels, 2016; Nutt, 2017).Footnote ⁹ A recent government report on this coverage gap reached the same conclusion, stating, “As the electronic sharing and storage of health information increases, and as individuals become more engaged in sharing personal health information online, organizations that are not regulated by HIPAA, the FTC, or state law may collect, share, or use health information about individuals in ways that may put such data at risk of being shared improperly” (Department of Health and Human Services, 2016).

Depending on its geographic reach, a therapy chatbot could also be subject to the requirements of the EU’s General Data Protection Regulation (GDPR). The GDPR ensures that individuals in the EU are able to access their personal data, as well as information about how their data is processed. It also gives individuals the right to have their personal data deleted promptly upon request. The regulation also requires that companies collecting personal data design their products around user privacy. This requirement could include encrypting all personal data, making sure such data isn’t transferred unnecessarily, and ensuring that encryption and decryption of data aren’t conducted on remote servers.

3.3.4.2 The FDA

The Food and Drug Administration (“FDA”) has had the authority to oversee the safety of medical devices since the 1930s. It does so by requiring device manufacturers to demonstrate the safety of their products. (If they cannot do so, the devices may not be made available for sale.) Traditionally, the FDA’s definition of “medical device” offered a fairly predictable and clear indication of what sorts of products might receive oversight. The recent rise of smartphones has presented a new and challenging question, however: Can a smartphone running an app constitute a medical device?

Under the Federal Food, Drug, and Cosmetic Act’s broad definition of “medical device,” the agency has the authority to regulate medical software before it is released to the public. Although the FDA has declined to oversee most types of mobile health apps carefully, it has published guidance documents that leave open the possibility that it could regulate mental health apps. The FDA has said it intends to exercise enforcement discretion over apps that (i) help patients (i.e., users) self-manage their disease or conditions without providing specific treatment or treatment suggestions; (ii) provide patients with simple tools to organize and track their health information; (iii) provide easy access to information related to patients’ health conditions or treatments; (iv) help patients document, show, or communicate potential medical conditions to health-care providers; and (v) automate simple tasks for health-care providers. The 21st Century Cures Act (CURES) excludes from FDA oversight apps “for maintaining or encouraging a healthy lifestyle and is unrelated to the diagnosis, cure, mitigation, prevention, or treatment of a disease or condition.”Footnote ¹⁰ By contrast, according to an expert at Woebot, the FDA might be more motivated in the future to regulate technologies that make clear claims regarding a particular clinical indication.

Some legal commentators are troubled by the FDA’s current approach, in light of the potential risks that patients may face when using psychotherapy apps that don’t work in accordance with accepted clinical procedures. Specifically, on the subject of CBT, Theodore Lee has commented, “untreated or poorly treated conditions can lead to adverse outcomes for the patient or others. The lack of conformity with established CBT principles for most CBT-based apps suggests that patients may not be getting adequate treatment” (Lee, 2018). Despite these concerns, however, the FDA appears reluctant to regulate mobile mental health apps, and Congressional action like the CURES Act has reinforced this hands-off stance by removing many such apps from FDA purview.

3.3.4.3 The FTC

The Federal Trade Commission (“FTC”) is charged to “protect consumers against unfair or deceptive acts or practices in commerce.” The agency accomplishes this in a variety of ways, including legal enforcement and providing education. Although the FTC has not initiated legal action against any manufacturers of mental health apps, it recently filed a complaint against a “brain training” app called Lumosity that claimed to provide cognitive benefits. Specifically, the app promised to “protect against … age-related conditions such as mild cognitive impairment, dementia, and Alzheimer’s disease; and will reduce cognitive impairment associated with … post-traumatic stress disorder, traumatic brain injury, attention deficit hyperactivity disorder, Turner syndrome, stroke, and other health conditions.”Footnote ¹¹ The FTC claimed that this language was deceptive. Under a settlement agreement, Lumosity paid $2 million and was enjoined from future deceptive conduct.Footnote ¹² The suit serves as a cautionary tale to any company that cannot substantiate the benefits it claims to provide.

3.3.4.4 The VA and DARPA

Interestingly, the US Department of Veterans Affairs (“VA”) has developed and delivered e-Mental Health tools to soldiers as well as the public. As Davor Mucic and Donald Hilty explain in their book e-Mental Health,

Once again, the military may be in the lead for providing Internet-based care options, with foci of remote screening and assessment, post-deployment adjustment, suicide prevention and management, and delivery of training/education. Many personnel prefer to receive care at home – particularly those with anxiety, PTSD, and phobia – in addition to those who prefer to avoid the stigma of entering a mental health facility (Mucic, Hilty, Parish, and Yellowlees, 2016).

The VA has developed and made available several apps for treating PTSD and other forms of anxiety. These include PTSD Coach, an app that provides users with the ability to track and monitor PTSD symptoms; CBT-i, an app designed to deploy CBT to treat insomnia; ACT Go, which delivers acceptance and commitment therapy; and Moving Forward, an app designed to help reduce stress. All of these apps are provided to the public at no charge.

DARPA, the branch of the US military that focuses on developing new technologies, has also been involved in developing therapy chatbots. In 2014, the agency funded a study of “Ellie,” a chatbot therapist developed by researchers at the University of Southern California’s Institute for Creative Technologies (Gratch, 2014).

3.3.4.5 Intellectual Property

Intellectual property protection is also relevant to chatbot therapy apps. Patent protection is designed to encourage investments into developing new technologies by offering inventors an exclusivity window of twenty years. To receive patent protection, an invention must be new, non-obvious, and useful. Importantly, it also must qualify as patentable subject matter. For companies seeking to patent chatbot therapy software, this requirement could present some challenges. Although software has enjoyed robust patent protection since the late 1990s, a Supreme Court ruling from 2014 appeared to narrow this form of protection to include only software that does not constitute an abstract idea and that includes an inventive concept. The Court’s lack of specificity over the meaning of these terms has called into question the validity of many software patents since the ruling.

Despite these uncertainties about patenting algorithms, there still appear to be opportunities for patenting in the realm of chatbot therapy apps. Ginger.io, a company that provides a therapy chatbot service, has received eleven patents for, among other things, the modeling of psychological states (e.g., moods, conditions) based on what a user has expressed to their app, and for providing therapeutic interventions such as CBT to users based on those models. IBM, which created the widely known “Watson” AI system, has also received a number of patents in this space. The potential for acquiring patents may explain why academics who have sought to commercialize therapy chatbots have left their universities: Universities often require their employees to assign to the university any intellectual property they develop.

Trade secrecy, meanwhile, seeks to encourage investments in innovation by a different means: providing a cause of action that can be asserted against anyone who misappropriates valuable and secret information. Trade secret protection is provided through state and federal statutes. Because valuable software systems and data can often be kept secret through tools like encryption and other security measures, trade secret protection has played an important role in many industries that rely upon data.

3.3.4.6 Public–Private Partnerships

Public–private partnerships are yet another place where law and policy have affected the mental health app industry. In 2018, state and county mental health officials in California worked closely with two Silicon Valley companies, Mindstrong and 7Cups, to evaluate the efficacy of mental health apps for patients of the public mental health system (Carey, 2019). Mindstrong’s app excels in identifying patterns in user behavior that could signal the onset of certain types of psychological episodes. 7Cups, meanwhile, connects patients with trained “listeners” who can connect them to trained therapists.

As of this writing, press reports indicate that the effort has faced challenges related to recruitment and informed consent relating to the disclosure of user data. Commenting to the New York Times, Dr. John Torous, director of the division of digital psychiatry at the Beth Israel Deaconess Medical Center in Boston, stated, “If we’re excited about the potential of data, we should be equally worried about the risks, and those seem to be evolving faster than the scientific benefit … There may be guarantees the companies make about not sharing data, but if the company is sold to another company, that data goes with it.” He continued, “A lot of apps have that clause buried in 13 pages in mouse print” (Carey, 2019).

3.4 Goals, Objectives, and History

The Woebot story began in 2016 when Alison Darcy, a Stanford University clinical research psychologist, began exploring the potential of a therapy chatbot to help people with common mental health disorders. At the time, Dr. Darcy was working alongside Andrew Ng, an eminent expert in the field of artificial intelligence. Dr. Darcy became convinced that a well-designed chatbot could truly make a difference in the lives of people who suffer from anxiety and depression – so convinced, in fact, that she decided to leave academia to build a commercial product full-time.

The science behind Woebot is based largely on cognitive-behavioral therapy. As Darcy explained in a recent interview with Wired, “CBT asks people to recast their negative thoughts in a more objective light. Patients are encouraged to talk about their emotional responses to life events, and then stop to identify the psychological traps that cause their stress, anxiety, and depression” (Molteni, 2017). Woebot delivers CBT in the form of tightly scripted conversations, which are usually triggered when users describe problems they are experiencing. In response to a user who tells the system they are anxious, for instance, Woebot might tell the user about common distorted patterns of thinking in people who experience anxiety. The system then might walk the user through a common CBT exercise, such as writing down a recent anxious thought and searching for cognitive distortions in it.

Dr. Darcy and the team she assembled believed that Woebot couldn’t succeed if users didn’t enjoy using it and returning to it. The threshold challenge for her and her team thus became a question: “How do you make CBT in a way that people want to use every day?” Dr. Darcy and her team found answers to this question in an unlikely place: the world of gaming. “As a field, gaming has developed some of the most sophisticated engagement methodologies you can find,” she explained. Although gaming and therapy might seem like disconnected fields on the surface, Dr. Darcy and her team saw a meaningful connection: “A game places you in a suspended reality, and therapy is also a suspended reality. For example, we typically want someone to ask themselves, ‘What if you didn’t have to think in a particular way?’ In a sense, you get to try out, and experience, a different identity.” One technique that seems to reflect Woebot’s gaming DNA is the quiz: after teaching a user a new idea, the system often presents a short quiz designed to test the user’s understanding.

During the development process, which took about nine months (Molteni, 2017), Dr. Darcy and her colleagues asked a team of writers and therapists to help make Woebot’s conversational abilities sound as natural as possible [10]. According to the company, this work was successful. “People started calling it by name,” Dr. Darcy explained. She emphasized, however, that Woebot is not designed to fool users into thinking it is a human. Instead, it is very transparent – humorously so, at times – that its entire personality is nothing more than artifice. Interestingly, Dr. Darcy believes that a human user’s awareness that Woebot is not a human could lead to more candid conversations.

A peer-reviewed study conducted by researchers at Stanford and Dr. Darcy concluded that Woebot was “a feasible, engaging, and effective way to deliver CBT.” The study compared the outcomes for two groups of individuals recruited from a university setting who agreed to either engage with Woebot or receive information about mental health from the National Institute of Mental Health’s website. The study reported as follows: “The study confirmed that after 2 weeks, those in the Woebot group experienced a significant reduction in depression” (Fitzpatrick, Darcy, and Vierhile, 2017).

3.5 Attributes: The Characteristics of Information Shared

Woebot collects a wide variety of data from users. Nearly all of this information is unstructured text collected through conversational prompts presented to users in the Woebot app, which is available for iOS and Android operating systems (Molteni, 2017). Importantly, as Dr. Darcy explains, the system absorbs “only as much as you divulge to it.” Woebot could, but does not necessarily, record a user’s name, the problems the user wishes to address – for example, anxiety, depression, loneliness – and specific thoughts the user has had associated with those problems – for instance, “I’m not popular enough” and “I’m worried I’ll fail an upcoming test.”

To appreciate the variety of data that can be collected from users, consider the following exchange. (This is a real exchange that I carried out with Woebot for this book chapter.)

Woebot: Howdy Mike, so nice to hear from you. How are you feeling today?

User: Anxious.

Woebot: Are you experiencing these feelings right now?

User: Yes, now.

Woebot: I see. Let’s get ready to identify & write your negative automatic thoughts in this moment …

User: OK, I’m worried that I might develop a serious illness because I’m feeling a bit unwell.

Woebot: btw, I’ve remembered your most frequent distortions in thinking. Would you say that “All or Nothing” was in the thought?

User: Yes, actually.

Woebot: Now it’s time to rewrite your thought without any distortions. Give it a go!

User: Feeling a bit unwell does not mean that I have, or will have, a serious illness.

This short dialog shows how Woebot uses its memory of a user’s name and a cognitive distortion the user has experienced to deliver a CBT technique. The exchange also illustrates why a patient-driven modality like CBT is so well suited to a chatbot interface. As Dr. Darcy explained in an earlier interview with Wired, “A good CBT therapist should facilitate someone else’s process, not become a part of it” (Molteni, 2017). According to Woebot, the service engaged in two million conversations per week in early 2019 (Nutt, 2017).

According to Dr. Darcy, the system retains records of all of the chats it has with users and stores them on a secure server. The company then de-identifies and aggregates these records to draw new insights from them. This underscores an important point: Woebot does not use machine learning techniques to generate its written responses to users. Instead, it uses ML to initiate partially scripted exchanges with users. Commenting for a recent magazine interview, Darcy explained, “Woebot uses machine learning to direct the right content or methods to that person in that moment. However, it is true that we do this in a highly controlled way” (Lee, 2018).
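To make the idea of de-identifying and aggregating chat records more concrete, consider the following minimal sketch. It is not Woebot’s pipeline, which is not public; the record layout, the theme keywords, and the function names are all hypothetical. The sketch masks a stored first name and any email address in each message and then counts how many messages touch on each theme. Real de-identification is considerably harder, since free text can reveal identity in many other ways.

```python
import re
from collections import Counter

# Simple email pattern; an assumption for illustration, not a complete validator.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# Hypothetical theme keywords for illustration only.
THEMES = {
    "loneliness": ("lonely", "alone"),
    "anxiety": ("anxious", "worried"),
}


def de_identify(record: dict) -> str:
    """Mask any email address and the user's stored first name in the message text."""
    text = EMAIL.sub("[EMAIL]", record["text"])
    name = record.get("name")
    if name:
        text = re.sub(rf"\b{re.escape(name)}\b", "[NAME]", text, flags=re.IGNORECASE)
    return text


def aggregate(records: list) -> Counter:
    """Count how many de-identified messages mention each theme."""
    counts = Counter()
    for record in records:
        text = de_identify(record).lower()
        for theme, keywords in THEMES.items():
            if any(word in text for word in keywords):
                counts[theme] += 1
    return counts


sample = [
    {"name": "Mike", "text": "Mike here, I feel lonely. Reach me at mike@example.com"},
    {"name": "Ana", "text": "I am worried and anxious"},
    {"name": None, "text": "I feel alone"},
]
print(de_identify(sample[0]))
print(aggregate(sample))
```

Only the theme counts leave the aggregation step, which is the property that allows patterns to be studied across many users without examining any one person’s conversation.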

The company has already found interesting insights in its pool of data. “Some of the things we’ve learned have surprised us in the aggregate,” Dr. Darcy explains. “There are things you just can’t see in a traditional setting with one data point. We have been surprised, for instance, to see how many people experience loneliness. Some younger users report feeling lonely while they are in social settings. We’ve also learned from people reaching out in the middle of the night that they have postpartum depression.” Dr. Darcy and her team have also been struck by the remarkable similarity in what users write. “It’s not just that people say similar kinds of things – they say the exact same thing,” she says. Woebot’s data scientists apply insights like these to improving the topics that Woebot focuses on, and how the software interacts with users. In this way, the more that each user interacts with the system, the more helpful the system can become to all users.

3.6 Governance

Woebot Labs is a privately held corporation founded by Dr. Alison Darcy in 2017. The company’s headquarters are in San Francisco. According to reports provided by PrivCo, a service that provides research data on private companies, Woebot received $8 million from venture capital investors in March 2018. In exchange for this “Series A” investment, the investors received equity in the company. The company’s website lists the CEO (Alison Darcy), the Chief Clinical Officer (Athena Robinson), the Chairman of the Board of Directors (Andrew Ng), and the Chief Technology Officer (Casey Sackett) (Knight, 2017). Decisions relating to Woebot and the user data that underlies it are entirely made within the hierarchical structure of the corporation. Corporate decisions may be constrained to some extent by the company’s contractual relationship with its source of venture capital. Although the contract is not public, such deals often grant the shareholders significant power over key corporate decisions, such as the sale of the company.

Beyond the formal corporate structure, professional standards and codes of conduct constrain the behavior of Woebot’s leadership. The CEO, for instance, is a clinical research psychologist and adjunct faculty member of Stanford School of Medicine. The company’s CCO is a licensed clinical psychologist and an adjunct clinical associate professor in psychiatry and behavioral sciences at Stanford School of Medicine. Thus, the two most important decision-makers at the company are familiar with and must adhere in varying respects to codes of conduct and ethical standards promulgated by Stanford University. These require a psychologist to take reasonable precautions to protect patient confidentiality. Moreover, as a clinical psychologist, the CCO must adhere to similar codes and standards promulgated by the American Psychological Association and the state of California. However, whether Woebot is regarded as a therapeutic intervention and whether users of a therapy chatbot might be “patients” under a particular rule could be debated.

The company’s privacy policy offers a view of how user data is treated [11]. Importantly, the service is available in two forms: through a stand-alone app and through Facebook’s Messenger chat service. Woebot’s privacy policy states, “Conversations with Woebot within Facebook Messenger are subject to the Facebook privacy policy. Facebook can see that you are talking to Woebot, and they can see the content of the conversations.” Users seeking greater privacy than this will likely opt for Woebot’s standalone app, which is available on iOS and Android, the two most widely used mobile operating systems. Without Facebook acting as an intermediary, Woebot alone can control the flow of user data.

To help explain this information flow, the company helpfully provides a chart on its website that divides user data into different categories and explains the company’s purpose for collecting and processing it. The categories of user data include: “Personal information (including first name), password, referral code for participation in special programs, and email address,” “Financial and billing information,” and “Conversation data,” which the company explains includes “Information, participation data, text, graphics, video, messages, responses to treatment and satisfaction surveys, or other materials generated through your interactions with Woebot.” Under a section of the policy titled, “Where does Woebot obtain my personal data from,” the policy states, “Most of the personal data we process is obtained from you when, through the application you: register for a Woebot account and exchange messages with Woebot.”

The policy then defines what the company may do with this data. Under a section of the policy titled “Use of De-identified and Aggregated Information,” the policy states that user data may be used to “create de-identified and aggregated information,” for “analyses we create.” Such analyses would presumably include the types of internal studies of user interactions with the system described by Darcy earlier in this chapter.

Woebot does not make user data directly available to any users. As explained earlier in this chapter, the company takes precautions against unauthorized data access by storing user data on a secure server. The company’s privacy policy states, however, that the company may share user data (with some restrictions) with certain third parties. According to the policy, these parties include “service providers who provide technical and support services to help us provide and improve the product and its services.” The policy also explains that the company may share personal data with third parties for a variety of legal reasons, including compliance with applicable law, law enforcement or national security requests, subpoenas and the like. The policy also explains that the company reserves the right to share user data with insurance companies, financial institutions, or in connection with a corporate reorganization, merger, or acquisition. Finally, it states that user data may be shared with third parties in connection with “Special Programs.” The term “Special Program” is not defined in the policy, but the document states that relevant third parties under this provision may include “your employer, certification authorities, or other medical and academic partners who conduct the study.”

On the whole, Woebot’s governance of user data shows the unique potential that chatbots may have to aggregate health data and, in turn, address the mental health crisis. By allowing users to receive help through a smartphone app, the service removes the sense of embarrassment that some people may feel about visiting a therapist in person. This, along with the company’s treatment of user data, gives potential users a sense that their conversations will be kept private. To benefit from the service, users must converse with Woebot. With each such conversation, the designers of Woebot have more data with which they can improve the service for others. And, because Woebot can talk to many users at the same time, the service can connect with more people than the current population of mental health workers is able to.

3.7 Challenges

Because therapy chatbots have been available to the public for just a short time, a clear picture of the challenges Woebot and its competitors might face hasn’t yet developed. It is possible, however, to discuss challenges that the company and experts in the industry perceive.

Although many bodies of law and regulation are relevant to therapy chatbots, one of the most significant relates to user privacy. Privacy is critical to Woebot for several reasons. First, the product solicits or otherwise prompts users to share information they consider private related to mental health conditions. Interestingly, a 2018 public blog post written by Dr. Darcy suggests that decisions about user data will be guided by long-standing rules and norms from the worlds of medicine and science: “First and foremost, this company was established by psychologists, which means we are subject to an ethical code of conduct. For us, this is so obvious that it seems self-evident.”

The economic potential for mental health data seems significant. As an expert on medical ethics interviewed for this book chapter explained, insurance companies have an enormous interest in obtaining this data. “The question that policymakers should focus on,” she said, “is whether they should have access to it. Should people with preexisting conditions revealed by their data still receive coverage? Can an insurer require you to sign up for one of these apps and use them?” These questions led the expert to opine that new laws limiting the data that insurers may receive and act upon should be enacted.

In addition to insurers, advertisers have an interest in this data. In the 2010s, a common business model for online services has been targeted advertising – the practice of directing advertisements to users based upon what they might like to buy. Online services usually accomplish this by collecting and processing data about their users and building up profiles that reflect users’ probable likes, dislikes, interests, and so forth. For a fee paid by retailers, they direct ads to users. This business model has been phenomenally lucrative to ad sellers, helpful to ad buyers, and it has helped fuel countless avenues of important research that stands to benefit the public. Privacy advocates have long argued, however, that this practice can be harmful when users do not have a clear view of what information has been collected about them, with whom this information has been shared, and the purposes for which it has been shared (Wu, 2017). These perspectives gained support when, in 2018, press reports revealed that user data collected by Facebook had been shared with an organization called Cambridge Analytica for the purpose of political advertising. The news stories garnered enough of the public’s attention to blossom into a scandal that, over time, has brought privacy to the forefront of public policy discussions about technology.

Woebot’s creators are keenly aware of how these developments might harm existing users and discourage potential users. Since the product was launched in 2017, Woebot has been unequivocal in its pledge not to monetize user data for third-party advertising (Molteni, 2017). Interviewed for this book chapter, Dr. Darcy explained that advertising runs counter to the company’s core mission: “What we’re best at,” she explained,

is this ability to challenge distorted thinking. And one of the most commonplace distorted thoughts is “I’m not X enough.” Almost everybody has that core distortion, and it’s something the ad industry has fed off of. It is the belief that allows ads to be effective. The idea that we would sell our data for advertising goes against everything the company stands for.

Dr. Darcy’s blog post goes on to explain that selling data to advertisers would “instantly undermine” the trust users have placed in Woebot. “Our investment has been in symptom change outcomes – not in data gathering. If it were the latter, we would have built our product to orient around gathering as much data as possible” [12].

Consistent with this position, the company has been a champion of the GDPR. As Dr. Darcy explained, “We supported GDPR, and last May we did a lot of work towards complying with it. [The GDPR] shifts technology companies toward providing data ownership at the individual level, and we go for that level of privacy because we believe in it.” Dr. Darcy added that, because leaders within Woebot come from human subjects research backgrounds, they are familiar with obtaining “seriously painstaking informed consent.” She added, “Since the company launched in 2017, all user data is limited to the minimum amount required for the service to function. Data is not linked to users in an identifiable way” [13].

Interestingly, Woebot isn’t the only mental health firm in favor of robust privacy. Another CEO interviewed for this article suggested that HIPAA should be extended to cover mental health data collected by corporations. “[I]f we can get a policy that says digital biomarkers are protected health information, then companies like Facebook and Google which could collect this stuff in the past suddenly couldn’t do that anymore. You’d need to treat the data in a manner consistent with HIPAA.”

Another problem that could undermine public confidence in therapy chatbots is lack of demonstrated efficacy. As discussed earlier, Woebot and other services like it are not required to obtain FDA clearance because of the FDA’s hands-off approach. (As mentioned, if the FDA decides to be more involved in the future, it will likely focus on technologies that make clear claims regarding particular clinical indications – so-called digital therapeutics.) For now, though, consumers have no guarantee that these apps will deliver the results they promise. Additionally, a mental health app that is poorly designed could theoretically harm someone struggling with a mental health disorder.

Regulation isn’t the only way to address this problem, but it might be the most effective. Peer-reviewed studies like the one cited earlier, for instance, could help consumers believe that an app works, but consumers are unlikely to search for and read scientific literature. Psyberguide, also discussed earlier in this chapter, helpfully provides a review system [8]. Ultimately, though, it cannot block a deficient app from being available to the public. Companies that administer app stores, such as Apple and Google, do have this kind of gatekeeping power. Experts interviewed for this article suggested that app stores are not the best solution, however. As the CEO of a Silicon Valley mental health startup commented, “You wouldn’t want Walgreens to put a stamp of approval on drugs, so why would you want Apple to be making decisions about the efficacy and safety of mental health interventions?”

Dr. Darcy suggests that more regulation could be helpful in this area: “I would love to see more regulation, honestly, because the fact is, there hasn’t been any. If health and wellness apps are quickly adopted but lack any support for efficacy, that threatens the public’s confidence.” Of course, a lengthy FDA review process would seem to run counter to the fluid, ongoing process of product improvement that therapy chatbots are based on. To balance the need for regulation with the realities of innovation, Dr. Darcy suggests a new approval process that focuses on companies rather than products – a process that is already in play at the FDA. “I love the idea of shifting away from approving products to approving companies. Allowing them the freedom to innovate quickly. The government might say, ‘you have to show us how your company operates, how your QA systems are carried out, and so forth.’”

3.8 Conclusion

The emerging body of scientific literature on therapy chatbots suggests that these services could help large numbers of people reduce their levels of anxiety and depression. This possibility seems important for all of the reasons that these mental health disorders are so harmful: they can drain the enjoyment from people’s lives, interfere with their personal relationships, and undermine their productivity. Moreover, the problem only seems to be growing more widespread.

Therapy chatbots also may offer some unique advantages over traditional therapy. They can serve millions of people in a cost-effective way, for example. Because they can be accessed within one’s home, they don’t involve the stigma that some people associate with seeking mental health care. They can be accessed by people who cannot easily travel to a clinic. Finally, although these tools are not designed to replace traditional mental health treatment, they could be a cost-effective supplemental form of help for many. These technologies are still new, however, and some in the field of psychology believe their full potential remains an unknown. As one expert recently stated in a newspaper interview, “These things can work well on a superficial level with superficial conversations, but are they effective tools, do they change outcomes, and do they deliver more efficient care? It’s still early” (Nutt, 2017).

Ultimately, the success of therapy chatbots may depend not only on how well they can be engineered but also on how effectively they can manage pools of private user data. As discussed in the introduction to this chapter, chatbots suggest a promising new technique for aggregating and making use of large sets of mental health data. By conditioning access to the benefits of the existing pool – for example, knowledge – on new interactions – for example, typed exchanges – they effectively establish an incentive that might not otherwise exist. By delivering aggregated knowledge through an interactive system that is tightly scripted – rather than, say, a database – chatbots can effectively obscure private user data. This privacy is reinforced by other measures the company has taken, such as encryption.

As effective as this approach seems, the Knowledge Commons Framework shows why privacy remains an important area of work for the company. In light of the stigma that some people feel regarding mental health disorders, there is a need for users to feel that their data will not be disclosed. The degree of privacy the law ensures, meanwhile, is relatively low. HIPAA doesn’t appear to cover chatbots billed as wellness apps, and the FDA has not sought to regulate data security in these types of apps tightly. Meanwhile, high-profile hacks and disclosures at companies like Facebook have made the public aware that companies have an enormous financial incentive to use private data to make money. As a result, Woebot’s creators and companies like it must rely on public messaging. This has taken a variety of forms, including pledges not to use user data for advertising, corporate advocacy in favor of more robust privacy laws, and frank statements to the press that professional codes of conduct limit the behavior of corporate leaders. Ultimately, changes to the law combined with ancillary efforts like these may lead to a future where therapy chatbots can help people who suffer from anxiety and reveal new insights about the disorder itself.

4 Privacy in Practice A Socio-technical Integration Research (STIR) Study of Rules-in-Use within Institutional Research

Kyle M. L. Jones ¹ and Chase McCoy ²

4.1 Introduction

The ubiquity of information systems on university campuses for supporting university work has led to an undeniable increase in the quantity of institutional data. Higher education institutions have taken note of the trove of data to which they now have access, arguing that they have a responsibility to use data in service to their administrative and educational missions and to act upon accountability pressures from external constituents to use data to identify actionable insights directed toward institutional improvement (Prinsloo and Slade, 2014). In response to this institutional data influx and to address mounting external pressures, learning analytics (Johnson, Smith, Willis, Levine, and Haywood, 2011) and other data-based research and practitioner communities have emerged, while existing communities, such as institutional research, are transforming their practices to account for the evolving data environment (Zilvinskis, Willis, and Borden, 2017).

Within this landscape, significant new privacy issues are emerging as a result of changing data use practices and the sociopolitical pressures on higher education institutions to surface, analyze, and act on data. One of the questions associated with these issues concerns how higher education actors are handling private data, especially student data, in praxis given the increasing sensitivity of the data (Slade and Prinsloo, 2013). However, the existing rules and norms that govern the privacy practices of institutional researchers and other data practitioners are often unable to account for the nuances of data privacy in praxis (Fuller, 2017b; Zeide, 2016), which has led to informal and implicit institutional policies about student data privacy (Fuller, 2017b).

The rules and norms that shape the practices of institutional researchers and other data practitioners in regard to student data privacy within higher education could be researched using descriptive methods, which attempt to illustrate what is actually being done in this space. But we argue that it is also important for practitioners to become reflexive about their practice while they are in the midst of using sensitive data in order to make responsive practical and ethical modulations.

To achieve this, we conducted a socio-technical integration research (STIR) study (Fisher, 2012). STIR provides structured opportunities for research participants to integrate perspectives and methods from the social sciences and humanities. The STIR method targets small teams or groups of participants, often scientific laboratory researchers. We adapted this method to STIR a single institutional researcher over an extended period of time. The participant’s responsibilities entailed, among other things, conducting statistical analyses on important administrative metrics, such as retention, recruitment, and enrollment for their university’s administration. Drawing on Crawford and Ostrom’s (1995) institutional grammar, we assessed the rules, norms, and strategies that governed the participant’s practices as they related to data privacy.³ This theoretical join of STIR and institutional grammar helped us to answer the general research question: What rules-in-use govern the participant’s privacy practices, and how might STIR lead to modulations in those practices? In summary, the findings reveal that the participant was encouraged to reflect on the conditions of her context and her agency to make modulations to her own work and to consider whether existing rules, norms, and strategies were justifiable. These reflections, in turn, led to active modulations where her practices were modified to more explicitly consider privacy or, at the least, brought about ideas for future privacy-focused initiatives (e.g., data management strategies and documentation processes).

4.2 Data Analytics in Higher Education and Challenges to Contextual Integrity

4.2.1 The Value of Analytics

The advent of new technologies and analytical techniques is enabling the proliferation of data and information within higher education institutions. Goldstein and Katz (2005, 11) explain that “the challenge [to colleges and universities] is no longer the lack of access to timely information”; it is the ability to make actionable decisions based on available information. In the early aughts, universities began to develop capacity for what was then called “academic analytics.” Like business intelligence, academic analytics is the use of various technological systems and applications to analyze accessible institutional data in support of decision-making.

Much of the capacity-building done in support of academic analytics has led to additional analytic practices to serve various ends, in part due to function creep. Most prominent among these practices is the learning analytics movement. Since 2010, institutions have methodically worked to make data about students that were once “unseen, unnoticed, and therefore unactionable” visible and analyzable (Bienkowski, Feng, and Means, 2012, ix). Defined, learning analytics is “the interpretation of a wide range of data produced by and gathered on behalf of students in order to assess academic progress, predict future performance, and spot potential issues” (Johnson et al., 2011, 28). A driving goal of learning analytics is to “tailor educational opportunities to each student’s level of need and ability” (Johnson et al., 2011, 28), but learning analytics is not just about learners: it is also about the learning context and can be used to “assess curricula, programs, and institutions” (Johnson et al., 2011, 28). To a lesser extent than learning analytics, institutions have also begun using their information infrastructures to mine and analyze data about faculty performance and productivity (see Flaherty, 2016; Patel, 2016).

Why are higher education institutions pursuing analytics (academic, learning, faculty, or otherwise)? Campbell, DeBlois, and Oblinger (2007, 42) present analytics as a sort of salve for higher education, writing that “academic analytics is emerging as a new tool that can address what seem like intractable challenges.” As in other contexts, institutional actors and higher education pundits have applied powerful metaphors to express – and influence – the role of data mining at the university level (Stark and Hoffmann, 2019). Some argue that the data and information institutions can aggregate and analyze are akin to valuable natural resources, like oil and gold, in that they have social, political, and economic value (see Mayer-Schönberger and Cukier, 2014; Watters, 2013). Proponents of analytics argue that the insightful information they create can help institutions defend themselves against mounting accountability pressures and provide useful insights regarding resource usage in languishing economic times (Prinsloo and Slade, 2014).

4.2.2 Competing Interests

The turn toward data analytics in higher education raises particular questions about the effects caused by an increasingly data-driven, technocratic institution (Slade and Prinsloo, 2013). Maturing institutional data infrastructures enable the administrative surveillance of researcher productivity, instructional methods, and the day-to-day life of students, which in turn allows for granular reforms of programs, practices, and people – all in the name of institutional effectiveness (Selwyn, 2014; Williamson, 2018). Reflecting on this point, Johnson (2016, 27) argues:

Data systems … are too often assumed to be value-neutral representations of fact that produce justice and social welfare as an inevitable by-product of efficiency and openness. Rarely are questions raised about how they affect the position of individuals and groups in society. But data systems both arbitrate among competing claims to material and moral goods and shape how much control one has over one’s life.

It could be that data analytics privilege bureaucratic and politically expedient outcomes in ways that suppress what is otherwise “educationally desirable” (Slade and Prinsloo, 2013), including developing just educational systems that support student autonomy and well-being (Rubel and Jones, 2016). Important questions emerge: Who has the power to wield institutional data, to what ends are analytics directed, and whose interests are served (or ignored)? Kitchin (2014, 165) reminds us that “there is a fine balance between using data in emancipatory and empowering ways, and using data for one’s own ends and to the detriment of others, or in ways contrary to the wishes of those the data represent.”

Government actors, institutional administrators, parents and guardians, and, among others, companies who develop and participate in educational analytics all have varying interests in maximizing value from analyzable data (Ferguson, 2012; Rubel and Jones, 2016). The stated benefits include increases in academic success for students, but analytics also enable others to gain financial, social, and reputational advantages.

Consider the following examples of plausible conflicts of interest. Administrators want to decrease time-to-degree measures and increase graduation rates. One method may be to use analytics to direct students to enroll in academic programs or courses for which they meet the threshold for predicted success, say 75 percent. Students share the same goals, but forcing them down an academic path not of their choosing will not benefit them if they find their future careers to be dull and uninteresting. Where faculty are concerned, analytics may enable tenure and promotion committees to do peer-institution comparisons of research output and impact, which help them to make quicker recommendations and to strategically build a core faculty according to standardized metrics. However, these analytics are decontextualized and limited; tenure and promotion candidates may not be provided the opportunity to tell a complete story about their body of work. These competing interests highlight the fact that data and information are not neutral artifacts, but instead they are “cooked” with the motivations of those who wield data and analytic tools (Bowker, 2013).

4.2.3 The Appropriate Flow of Information

When the appropriate flow of information changes, and those changes run counter to normative expectations, privacy is put at risk within a given context (Nissenbaum, 2010). With higher education analytics, the creation of new information flows – many of which contain identifiable data – and the alteration of existing flows to support analytic practices have raised privacy concerns, primarily but not exclusively regarding students (Pardo and Siemens, 2014; Rubel and Jones, 2016). The problem with higher education analytics is that all of the parts of an information flow, including its actors, the types of information involved, and the principles governing its transmission, are affected in some way by emerging analytic infrastructures, related practices, and changes in who is able to access and use data – which are indubitably affected by shifting politics and administrators’ neoliberal interests (Heath, 2014). Some existing informational norms are, therefore, incapable of providing clear direction in this era of analytics. As a result, institutional actors may find themselves making sensitive, and often critical, data privacy decisions based on their own personal values and ethical judgement. For the purposes of this chapter, we focus on institutional researchers whose very role dictates that they access, manage, and analyze an array of data to inform institutional practices.

4.3 A Socio-Technical Integration Research Study of an Institutional Researcher

4.3.1 Downstream Effects, Impacting Midstream Practice

There is a need to better understand how higher education’s information workers, like institutional researchers, make sense of their moral practices as they implement data analytics into important decision-making strategies. Instead of looking at downstream effects and then shining the proverbial light after the fact, there is a need to look at – and influence – the design of ethically sensitive data technologies and practices closer upstream. These efforts are crucial for identifying problems before they are baked into socio-technical data analytics systems and individuals are made into and considered as data (Jones and McCoy, 2018). We argue that the socio-technical integration research (STIR) method can lead to positive upstream engagement and useful modulations at the midstream level.

STIR enables research practitioners – laboratory scientists, engineers, technologists, and information professionals – to consider perspectives from the humanities. STIR projects pair practitioners with embedded social scientists who together work to “unpack the social and ethical dimensions of research and innovation in real time and to document and analyse the results” (Fisher, 2010, 76). These partnerships enable researchers to study the practices of their research practitioner partners, while engaging them in conversations that explore the societal and ethical dimensions of their work. Surfacing these issues provides the conditions necessary for research practitioners to reconsider their efforts and make midstream modulations that reduce downstream harms.

4.3.2 Socio-Technical Awareness

During their time together, the STIR researcher takes the opportunity to move the practitioner toward “reflexive awareness,” or an attentiveness to “the nested processes, structures, interactions, and interdependencies, both immediate and more removed, within which they operate” (Fisher, Mahajan, and Mitcham, 2006, 492). Such awareness provides the conditions necessary for the practitioner to consider one’s socio-political position, usage of resources, ethical reasoning, among other things, which can give rise to “goal-directed” (Fisher, Mahajan, and Mitcham, 2006, 492) modulations that directly impact current practices. To build toward this opportunity for change, the researcher structures discussion protocols around these basic questions, which are asked in relation to a specific practice:

1. What are you doing?
2. Why are you doing it?
3. How could you do it differently?
4. Who might care?

The first question establishes the particulars of a practice (e.g., cleaning a laboratory table with disinfectant or developing an algorithm), while the second prompts the practitioner to take up the underlying justification(s) for the action. Question three begins to nudge the practitioner toward reflexivity by providing the intellectual time and space to consider alternative ways of doing and other justificatory reasons. The fourth and final question stimulates the practitioner to reflect on the present and proposed altered practice by considering relevant stakeholders and downstream effects thereon.

4.3.3 Modulations in Practice

There are three stages of identifiable modulations: de facto, reflexive, and deliberate. With de facto modulations, research data indicates that socio-technical integration occurs, but the research participant does not actively reflect on the integration because there is no incentive to do so. Reflexive modulations by participants arise because of heightened awareness of socio-technical considerations brought about by working with the researcher. In these cases, participants explicitly notice how social influences (e.g., actors, politics, values, resources, etc.) interact with a given practice. At the deliberate modulation stage, participants begin to act on their reflexive modulations. They take stock of their heightened awareness of the socio-technical milieu to plan strategies, curate resources, and make changes in their practices. Such changes may simply make their current practice more efficient and effective, and this would be a first-order deliberate modulation. But if the participant makes changes to alter the goals, objectives, and assumptions of the project due to enhanced social sensitivity, then these changes would be second-order deliberate modulations. In the remainder of this chapter, we discuss our work using STIR to study an institutional research practitioner and the participant’s privacy practices in praxis.

4.3.4 Joining STIR with Institutional Grammar

We integrated the STIR method (Fisher, Mahajan, and Mitcham, 2006; Fisher and Schuurbiers, 2013) and Crawford and Ostrom’s (1995) institutional grammar for identifying rules-in-use as expanded to address information privacy concerns (Sanfilippo, Frischmann, and Strandburg, 2018). The STIR approach was used to prompt the research participant to consider and acknowledge the implicit socio-technical characteristics that guided the participant’s practice and those within the participant’s office, with the intention that, once these had been made explicit, they would lead to identifiable modulations in the participant’s privacy practices. Institutional grammar’s rules-in-use were used to assess rules such as policies and laws, norms that govern privacy practices in institutional research, and strategies that shape the privacy practices of institutional researchers.

4.3.5 Study Design

The study’s participant was a single institutional researcher at a mid-sized public university. The participant’s institutional research responsibilities entail, among other things, conducting statistical analyses on important administrative metrics, such as retention, recruitment, and enrollment, and providing this information to their institution’s administration. Over four months, we conducted twelve semi-structured in-person and virtual interviews with the participant. Furthermore, during the interviews, the participant often shared data artifacts, such as an ongoing project on enrollment projections and trends, while discussing the practices associated with their everyday work. While a sample of one participant is unusually small, the STIR method has traditionally been used with small teams of scientific laboratory workers. Studying just one institutional researcher is adequate given the often solo nature of this type of professional’s work. Moreover, working with one individual allowed us to develop an intimate rapport and gain access to sensitive information shared by the participant, which may have been held back if we had also been working with her peers.

We developed an interview protocol to guide the participant to reflect on her privacy practices and those of her staff within the office of institutional research. The interviews sought to elicit from the participant reflections upon four decision components: the institutional research activities that they engage in (opportunities); the reasons for and against their practices (considerations); possible alternative approaches to their activities and reasons that might lead to acting on those alternatives (alternatives); and the possible outcomes if such alternatives were acted upon (outcomes) (Flipse, van der Sanden, and Osseweijer, 2013). These reflections were used to identify the rules-in-use, values, goals, and other socio-technical variables that shaped the practitioner’s privacy practice.

4.3.6 Data Analysis Procedures

We digitally recorded all interviews, using the audio to create transcriptions for coding purposes. We imported transcripts into MAXQDA, a qualitative data analysis application, and then coded interviews based on a two-stage approach. First, Crawford and Ostrom’s (1995) institutional grammar approach was used to identify the rules-in-use that governed the practitioner’s institutional research privacy practices. These codes assessed the rules, norms, and strategies, along with the attributes, aims, conditions, deontics, and consequences of each rule-in-use, as set out in the Governing Knowledge Commons (GKC) framework devised by Sanfilippo, Frischmann, and Strandburg (2018). At the same time as these items were coded, we coded for the level at which a particular rule-in-use existed: individual, office, institution, or external to the institution. Second, the interviews were coded based on the STIR approach to identify the four socio-technical decision components, followed by codes for identifying the various socio-technical modulations that emerged throughout the interview process. What follows is relevant background information on institutional researchers and the findings we uncovered from our GKC-informed STIR.
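The first coding stage described above can be pictured as a small data structure. The sketch below is illustrative only and is not the authors' MAXQDA codebook: the class name `CodedStatement`, its field names, and the two example statements are hypothetical. It assumes the standard institutional grammar convention in which a strategy has an attribute, aim, and condition; a norm adds a deontic; and a rule further adds a consequence.

```python
from dataclasses import dataclass
from typing import Optional

# Levels at which a rule-in-use was coded, as listed in the text.
LEVELS = ("individual", "office", "institution", "external")


@dataclass
class CodedStatement:
    """One coded interview statement (hypothetical structure)."""
    attribute: str                     # to whom the statement applies
    aim: str                           # the action or outcome in question
    condition: str                     # when or where the statement applies
    level: str                         # one of LEVELS
    deontic: Optional[str] = None      # e.g. "must", "may", "must not"
    consequence: Optional[str] = None  # sanction for non-compliance

    def __post_init__(self) -> None:
        if self.level not in LEVELS:
            raise ValueError(f"unknown level: {self.level}")

    def kind(self) -> str:
        """Classify by which grammar components are present."""
        if self.deontic and self.consequence:
            return "rule"
        if self.deontic:
            return "norm"
        return "strategy"


# Invented examples loosely modelled on practices discussed in the chapter.
ferpa = CodedStatement(
    attribute="institutional researchers",
    aim="follow FERPA guidelines",
    condition="when handling student records",
    level="external",
    deontic="must",
    consequence="loss of data access",  # hypothetical wording
)
monitors = CodedStatement(
    attribute="office staff",
    aim="position monitors away from passers-by",
    condition="when student-level data is on screen",
    level="office",
)
print(ferpa.kind(), monitors.kind())
```

Keeping the level as a separate field mirrors the simultaneous coding of rule-in-use type and level, so the same statement type can be compared across the individual, office, institution, and external levels.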

4.4 The Role of Institutional Researchers in Higher Education’s Analytic Practices

4.4.1 Higher Education Intelligences

Institutional research is a branch of educational research that concentrates on improving “understanding, planning, and operating of institutions of postsecondary education” (Peterson, 1999, 84). The role of institutional researchers is then to provide information to institutional administrators to aid in the improvement of planning, policy generation, and effective decision-making. Volkwein, Liu, and Woodell (2012, 23) suggest that the institutional researcher is engaged in three areas of study, which they call the “golden triangle of institutional research”:

1. institutional reporting and administrative policy analysis;
2. strategic planning, enrollment, and financial management;
3. outcomes assessment, program review, accountability, accreditation, and institutional effectiveness.

Furthermore, institutional researchers are called upon to not only provide information to facilitate improvement in these areas, but to actively engage in information sharing practices that contribute to organizational learning and, in turn, improve institutional effectiveness (Borden and Kezar, 2012).

Effective institutional research practices require that the institutional researcher engage three types of intelligences as they relate to their institution and higher education in general: technical and analytical; issues; and contextual and cultural (Terenzini, 1999). Given the diversity of these intelligences, institutional researchers have to balance various, and often competing, demands from administrators internal to their institutions, and from their external constituents, including state and federal policy makers, and their local communities (Volkwein, 1999). One such balancing act concerns what Volkwein (1999, 13) calls “enrollment pressures.” Institutions are “asked to simultaneously admit more students (for financial health and access) and become more selective (to bolster academic standards and performance measures).” For this reason, he likens the institutional researcher to Janus, the two-faced Roman god of “doors and gateways,” in that they have to look inward toward internal improvement, while contemporaneously facing outward to ensure that they are attuned to external accountability demands.

4.4.2 Information and Knowledge Managers

In addition to appraising the demands of variegated internal and external actors, Serban (2002) emphasizes the institutional researcher’s role regarding managing the flow of institutional data and information throughout their institutions. It is for this reason that the institutional researcher should also be understood as their institution’s knowledge manager, whereby they are responsible for the “processes that underlie the knowledge management framework – creation, capturing, and sharing of knowledge – that serve both internal and external purposes and audiences” (Serban, 2002, 105). Understanding and addressing the complexity of the flow of institutional information and data throughout institutions of higher education is necessary given growing interest in advanced analytic practices.

Where once institutional researchers served as their institution’s “one source of truth,” this new environment is leading to situations where “decision makers at all levels are establishing their own data collection processes and analytics” (Swing and Ross, 2016, 5). Zilvinskis, Willis, and Borden (2017, 12) argue that broad interest in analytics across campus units and offices creates a new situation where institutional researchers are playing a different role, writing: “[w]orking on learning analytics projects requires IR staff to engage with colleagues who tend to use information in operational and individualized contexts rather than the more strategic and aggregate uses to which they are accustomed.” This is so because institutional research offices are no longer the primary source of data, information, and analytic insights; each academic unit and office increasingly uses highly contextual data to serve their information needs. For instance, advisors are adopting analytic systems to analyze student movement through curricula, and information technology offices are developing their own metrics and data dashboards to evaluate system usage and services. Swing and Ross (2016) contend that because data flows are becoming more complex and analytics more widespread, institutional researchers should become more actively engaged in managing and shaping policies regarding the flow and use of institutional information and data.

4.4.3 Governing Sensitive Institutional Data and Information

Among professional institutional research associations and in the research literature, there have been ongoing conversations about principles, rules, and national and institutional policies that do or should govern uses of institutional information (Shiltz, 1992). Much of this work concerns privacy as it relates to security, confidentiality, and appropriate use. And since student data and information is of chief importance in institutional research, policy conversations tend to revolve around students and less around faculty and staff.

On a national level, institutional research practices are bound by the Family Educational Rights and Privacy Act (FERPA). The law dictates that educational institutions that receive federal funding must protect and hold in confidence student data and information considered a part of a student’s identifiable educational record. Institutional actors have the right to gain access to such records when they have a legitimate educational interest in doing so. However, Fuller (2017a) argues that institutional researchers are often unaware or undertrained regarding FERPA. According to a survey of 232 institutional researchers, 53 percent taught themselves about FERPA, while 22.5 percent had received no training. A lack of knowledge of FERPA and its relation to data privacy matters should be a matter of concern, given that data breaches and other FERPA violations have led to litigation against numerous institutions in recent years (Fuller, 2017b). Knowledge of and training with regard to FERPA is especially important given that the law’s definitions and requirements are imprecise and/or able to be bent based on an institution’s interpretations (Zeide, 2016).

Information sharing and information flow practices are also guided by professional ethics, institutional policies, and personal values (Fuller, 2017b). Regarding professional ethics, the Association for Institutional Research (2013) outlines how institutional researchers should handle privacy issues in its Code of Ethics. However, the code is scant on this issue and merely states that institutional researchers should balance privacy risks and confidentiality against the potential benefits that the information can provide to the institution. Additionally, institutional researchers’ practices are supposed to be informed by internal institutional policies on data privacy. However, Fuller (2017b) claims that many institutions do not have formal written policies.

4.4.4 Ethical Murkiness

The ambiguity in federal law, the “squishiness” of codes of ethics, and possibly the lack of guiding institutional information policy lead institutional researchers into a murky, ethical gray area. Researchers acknowledge that educational data analytics – a social and technological practice – raises significant ethical concerns (see Slade and Prinsloo, 2013). If data analytics produced by institutional researchers and others are to be considered trustworthy and legitimate, then they must attend to the ethical issues, the so-called “critical barriers” that will determine success and failure of data-based analytic initiatives (Gašević, Dawson, and Jovanović, 2016, 2). Higher education analytics are “moral practice[s]” (Slade and Prinsloo, 2013, 1519) that must account for actual and potential harms brought about by data and information access, analysis, and use (Pardo and Siemens, 2014).

4.5 Governance in Practice: STIR Findings

4.5.1 Attributes

The findings below highlight how various information resources, policies, institutional actors, and various – and sometimes divergent – goals and objectives influence and frame the work done by the institutional researcher who participated in our socio-technical integration research (STIR). But briefly, it is important to highlight how these attributes make up the contextual background of the participant.

The participant’s resources are data-based and informational. She relied on datasets in various forms to complete her responsibilities, and she primarily used a centralized data warehouse to access and export data to her local computer for statistical analysis and data visualization purposes. Some but not all datasets were shared on a local network in the office of institutional research with specific user permissions set to limit access and protect sensitive data. Notably, the office’s information infrastructure was described differently in terms of its data security protections when compared to other offices on campus. The datasets comprised identifiable and de-identified student data, in addition to “raw” and aggregate data provided by other institutional offices, including, among others, human resources and admissions.

The participant’s office was, as expected, comprised of a staff of roughly twenty individuals. The makeup of the staff included administrators, data analysts, institutional researchers, and part-time graduate assistants. It would not be accurate to think that this office worked in a silo; they often collaborated with institutional administration to provide them actionable information and worked with other offices on campus when specific projects needed access to and analysis of institutional data. The work the institutional research office did was shaped and limited at times by the political interests of those to whom they reported data findings, as well as policy set by the institution’s office of information technology.

The goals and values of the institutional researcher and her office were not made explicit during interviews. However, they became clear upon examining the office’s documentation. The office strives to provide actionable information to support decision-making throughout the campus, as well as support the institution’s wider goals around student success and the effective operation of the campus. It is notable that the office explicitly aims to provide access to a data infrastructure and related tools, signaling that its staff wishes to be enablers of data – not gatekeepers – and help institutional colleagues leverage data in innovative ways.

4.5.2 Existing Rules-in-Use

The analysis of the conversations with the participant uncovered the rules-in-use (norms, rules, and strategies) that govern her work practices with institutional data. Furthermore, the analysis found that addressing the levels at which rules-in-use occur is important for understanding how rules-in-use emerge and differ in practice. For this analysis, we found rules-in-use at the following levels: individual, office, institution, and external-to-the-institution. The level of the rules-in-use impacted their scope and determined what and how they were governed. Understanding rules-in-use and the levels at which they occur will be important for our to-be-discussed STIR findings, where we explore the participant’s reflections upon and her modulations of her privacy practices and, in turn, the rules-in-use that govern her practice.

The participant’s privacy practices are governed by a variety of rules, most of which occur at the external-to-the-institution, institution, and office levels. Regarding the rules external to the institution, she and her colleagues are required to follow the appropriate FERPA guidelines. At the institution level, university rules require that the participant and her colleagues comply with rules related to data sharing and use, such as ensuring that data consumers have the proper data use training and have signed the institution’s data use agreement in order to receive and share data. At the office level, the participant described rules primarily related to working with the university’s institutional review board (IRB) prior to conducting research and ensuring that she and her colleagues are up to date with institutional and federal privacy policies.

Regarding the norms, these predominantly occur at the institution and office levels. At the institution level, the norms revolve around data use for institutional improvement. According to the participant, the institutional norm is that data should be shared and made available to those seeking to improve the educational mission of the institution to develop useful insights. As the participant stated, “I think it’s been a policy [at the university] that we share information and we don’t try to silo things.” Given this, it is expected that her office collaborates with and supports other offices across the campus. The office level norms that guide the participant and her office’s practices relate primarily to protocols for how to share data with those external to their office and how student data should be de-identified in their institutional research products, such as in reports, dashboards, and data sets.

The strategies that govern the participant’s privacy practices primarily occur at the office and individual levels. These relate to spatial privacy practices and the appropriate use of student data. Spatial privacy practices refer to how the participant and her colleagues consider and modify their work and office spaces to ensure that student data are kept secure. Regarding office level strategies for appropriate data use, the participant stated that they assist data consumers to develop their business use cases when requesting access to institutional data. In addition, her office collaborates with other campus offices when they have questions about data access and use rules and norms.

Based on the analysis of the rules-in-use occurring at the office and individual levels, an interesting trend emerged. The office and individual rules-in-use that the participant and her colleagues developed were in response to instances when the institution failed to adequately govern the participant’s privacy practices. In some instances, the office and individual rules-in-use explicitly contradicted those of the institution, but in other situations rules-in-use at the office and individual levels were developed in response to gaps at the institutional and federal levels. These contradictions and gaps will be further explained in the following section, where we discuss the findings of the STIR analysis.

4.5.3 Rethinking Rules-in-Use with Socio-Technical Integration Research

Conversations with the participant probed the socio-technical conditions of her work regarding data practices and privacy. These probing questions enabled the participant to reflect on her workaday routines, and also nudged her to examine the criteria (e.g., values, principles, procedures) that inform the decisions that are part of her routines. As we describe below, the participant was acutely knowledgeable of privacy issues and related processes and procedures, and had even developed unique privacy-protecting strategies.

The participant was keenly aware of the fact that data to which she had access, especially student data, were sensitive and needed to be kept secure. She expressed a personal ethos of responsibility, suggesting several times that data handling actions needed care and attention to potential downstream privacy effects. When asked why she felt this normative responsibility, she replied, “Why is it so important to protect student data in the way that we are? Because we’re here for the students because we want to make sure that we’re not creating any kind of violation, that we’re not violating this trust that they have.” Notably, she suggests that her ethos is one to which her institutional research peers subscribe as well. Additionally, the motivation for protecting the privacy of those she analyzes in data is due to a sense of obligation to uphold the trust data subjects have in her, but also the institution, to use data appropriately. If trust was something that could be violated, we asked, then what would be the consequences? To this probing question, she suggested that 1) students would not be willing participants in research projects and 2) her office would “lose access to some of the data that we need to be able to do our jobs” due to non-compliance with existing policies.

Before beginning her analytical work, the participant claimed that she strategically worked with her institution’s institutional review board (IRB), using it as a means to discuss and protect data privacy. “We tend to err on the side of caution,” she said, “and at least talk to IRB about every single project that we do … . We want to at the very least make IRB aware of [the project] and get some sort of approval.” Pursuing an “ethical consultation” with IRB, to the participant, would help her understand if her work was “consistent with good ethical research practices,” in compliance with federal rules, and in alignment with what the institution expects regarding access to student data. The IRB could help her limit downstream harms, such as the following she expressed in a conversation:

It is easy to see how that sort of access to data could be abused, um, should it get into the wrong hands. People could theoretically be linking, you know, survey responses with income data from the [Free Application for Federal Student Aid (FAFSA)] or things like that if it’s not used properly.

After working with the IRB, the participant described the process for gaining access to data. Institutional policy limited who within the institution could gain access to different sources of data. Depending on the source of the data, the participant would have to consult with a dedicated data steward (e.g., the registrar for enrollment data, the bursar for financial data, or a library administrator for library data). Conversations with the stewards required her to “build a business use case to justify” data access and use. The participant emphasized that the creation of the “business use case” was a collaborative process, and she stated that finding the necessary justification would not be as easy at other institutions where data sharing was more restrictive and there was less value placed on analytics based on combinable data from across various offices.

Before conversations gained significant traction, the participant would have to prove that she had successfully passed the institution’s FERPA training and consented to its data use agreement; both processes informed the participant of her legal and institutional responsibilities. Since her office also serves as a source of institutional data and information, the participant also asked for proof of the same compliance credentials from those with whom she worked within the institution. Notably, she questioned whether others considered the compliance measures as thoughtfully as she did, saying, “I would hope that everybody reads that information and takes it seriously; I don’t have any kind of assurance.” When she provides a data set, she makes the data requester “promise” that data will not be shared unless carefully outlined and approved ahead of time, detail how the data will be used, and share their data deletion strategies, all in order to make sure that institutional policy compliance is assured. When access and use privileges for a data request proved difficult to determine, she consulted with other data stewards to seek their interpretations. The participant did not detail requirements that guided these types of conversations and blindly trusted that all data stewards would be just as rigorous in their analysis of data use requests.

About her office’s data privacy practices, she revealed a significant detail concerning the physical layout of her own office and that of her colleagues. The conversation unfolded in this way:

Researcher: You were saying that each analyst I think has a door, right? It’s not in a cubicle. And you were saying your monitors are faced away from the wall.

Participant: We’re all kind of positioned in a way that nobody just walking by can just take a look and peek at your machine.

Researcher: So how did that come about? It’s an interesting decision to make.

Participant: We were just very sensitive about the fact that we had student-level data in our records on our computer at any given time, and we just wanted to be cognizant of the fact that somebody could just come in just happen to accidentally peep over and take a look at something that they weren’t supposed to be looking at. We are responsible for data security for this information … . It is something that I think we do need to be conscious about.

When her office hires student workers who do not have the benefit of a secure office, students are made aware of the fact that their work may involve sensitive data and that they should situate their computer screens to limit others from looking over their shoulders. The participant noted that her office’s privacy-protecting strategies were not as stringent as those in other offices, such as financial aid, whose employees “keep the windows drawn” and do not allow unaccompanied visitors.

Regarding digital data privacy practices, the participant expressed two strategies. First, another employee in the office was in charge of maintaining scoped data pulls from administrative systems (e.g., the student information system) and subsequently checking the veracity of the data. Having a point person for this data practice reduced inaccurate data and limited access to data unnecessary for informing analytic projects (i.e., they followed data minimization principles). Second, any analytics created by the office abided by their own rule that for data including a sample size of five or less, the reportable number changes to an asterisk. For instance, if a data dashboard includes aggregate data demonstrating that three Hawaiian/Pacific Islander students reflected a certain behavior or outcome, then the number would be masked to reduce reidentification risks. The participant emphasized that this rule went beyond less stringent requirements set by FERPA and guidance by the registrar’s office.
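The office's masking rule can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the office's actual tooling: the function name `mask_small_cells`, the group labels, and the counts are all hypothetical, and the threshold of five is taken from the rule as the participant described it.

```python
# Threshold taken from the office rule described above: counts of five or
# less are replaced with an asterisk before reporting.
SUPPRESSION_THRESHOLD = 5


def mask_small_cells(counts, threshold=SUPPRESSION_THRESHOLD):
    """Return a copy of `counts` with small cells replaced by '*'.

    `counts` maps a group label to the number of students in that group.
    Assumption: a count of zero is masked as well, since it also falls at
    or below the threshold.
    """
    return {
        group: "*" if n <= threshold else n
        for group, n in counts.items()
    }


# Hypothetical dashboard figures, invented for illustration.
dashboard = {"Group A": 3, "Group B": 5, "Group C": 42}
print(mask_small_cells(dashboard))
```

A rule of this kind reduces but does not eliminate reidentification risk: if a total and all other cells are published, a single masked value can be recovered by subtraction, so reports that include totals typically need additional care.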

When the participant described her privacy practices – or a lack thereof in some cases – we prompted her to discuss alternative ways of thinking, doing, and valuing. The purpose of this strategy was to provide intellectual space and time to consider how outcomes of her privacy practices could be different and to reconsider the stimuli motivating standard practices. Responses to this strategy ranged from affirmative alternative designs (e.g., “I could do this … ”) to negative responses (e.g., “There is no other possibility … ”) due to existing conditions. The following highlights two instances where the participant outlined possible alternatives and outcomes.

An ongoing frustrating experience for the participant concerned her relationship with the institution’s office of information technology (OIT). Part of her position’s responsibility covered creating internal use and publicly available data dashboards, which required permission from an internal review panel and OIT. She expressed that even in cases where the internal review panel gives permission, OIT removes the dashboard – but they often fail to tell the participant that they had done so. Consequently, she “has to play ball” with OIT even if their decision would align with the internal review panel in the first place; if she does not, she loses her completed work.

The purpose of OIT rules is to protect data and the privacy of data subjects. But the participant argued that these rules were too restrictive and unconstructive:

[OIT] kind of feels like everybody outside of [OIT] is the crazy grandmother who’s going to be signing up for Nigerian banking schemes, and they’re going to click on every link and wantonly do all kinds of stuff to make the databases vulnerable … . I do have a degree of empathy. I just kind of wish they would go about it in a way that they don’t treat anybody outside of [OIT] like they’re an idiot.

We confirmed with the participant that the issue boils down to a lack of trust between OIT and the office of institutional research, among others, and asked what she could do to get a different result. Even though she expressed skepticism that OIT would change its behavior and views, she noted that communications between her office and OIT could be more strategic. For instance, any issues with OIT decisions should be communicated by administrators in the office of institutional research, and not by staffers. Additionally, attempts to “make nice” with OIT are preferable and probably more efficacious in the long run than battling OIT’s decisions. These alternatives were not optimal, but the participant perceived they could prove to be better than existing practices.

Another instance of alternative practices and outcomes concerned the development of new policy. The participant’s status as an administrator, not just an analyst, meant that she had policy-making privileges. If she desired and felt it would be useful, she could develop standardized data use practices with related compliance measures to guide her work and that of her peers within the office of institutional research. When the conversation shifted to this possibility, the response was negative. Her argument against forcing new policy was as follows (note: names changed to protect those referenced):

Well, because we have, like I said, Jane has got a very vested interest in, you know, FERPA and a lot of experience with that. Um, Danica was one of the data stewards for institutional research-level data. You know, we have some expertise in this office, you know, Jared and Kristin manage all of the survey information. Jonathan deals a lot with [human resource] data. We have a lot of expertise in a lot of different data sources and we want to consult everybody and also make sure that we’re on the same page. That’s just kinda the culture of the office. I think also that Susan [a peer administrator] has established that we’re a collaborative group and we want to make sure that we have buy-in from everyone before moving forward with that kind of thing.

Considering the alternative enabled the participant to take stock of a potential outcome, which even though it was denied still proved useful. Thinking through the possibilities enabled her to consider the expertise of her peers (e.g., Jane and FERPA), knowledge of institutional policy and procedure (e.g., Danica as a data steward), location of various data and who knew of such data (e.g., Jonathan and HR data), and the norms and expectations around collaboration (e.g., as developed with Susan’s leadership).

4.5.4 Modulating Practice

With the participant made more aware of the socio-technical dimensions of her work due to the STIR conversations, she began to think through different strategies for navigating rules-in-use regarding her data practices and data privacy. Analysis of the findings suggests that the participant engaged in a greater number of reflexive deliberations than deliberate modulations. In what follows, we report on the participant’s most clearly articulated reflexive deliberation regarding foregrounding privacy and a series of deliberate, second-order modulations.

Towards the end of the interview sessions, the participant began to reflect on the topics covered, issues discussed, and useful takeaways. Not all of these contemplations led to weighty considerations, but one significant reflexive deliberation concerned a new approach to her thinking on privacy. When asked to consider what, if anything, had been influential about her time with the researchers, she answered in detail:

Just to be constantly cognizant about who we’re sharing data with, what relevant policies exists, what are the complications with being able to share information and things like that? Just being cognizant. A lot of times I do try to be cognizant about what are the FERPA implications, what are the [institutional] data sharing policy implications. Um, admittedly I always need refreshers. I feel like I, I’ve done several trainings on them, but they’re, they’re so detailed that I’ve always constantly needing refreshers and I usually err on the side of caution. Um, but to just be constantly cognizant about that when I’m sharing information, I think, is a good step … . These conversations have again, kind of pushed it more to the forefront.

The simple statement of “being cognizant” reveals a heightened awareness in the participant’s mind about the importance of privacy in her daily practices. It also demonstrates that she recognizes that privacy entails a variety of different rules-in-use, not just following institutional policy (though noted). This reflexive deliberation suggests that “being cognizant” takes focus and awareness, yet the common tasks and pressures – not to forget institutional politics – of the job intervene. To be cognizant involves pursuing strategies, such as vetting individuals requesting sensitive data and more carefully navigating information sharing expectations. Moreover, having privacy in the forefront of her mind, she believes, will enable her to think more carefully about the downstream consequences of sharing sensitive data. Finally, this deliberation also gave rise to a recognition that committing to ongoing professional education about her privacy responsibilities and the policies that govern her practices would better assist her in her work.

Based on the conversations with the participant, three types of deliberate modulations emerged: documentation of office data sharing and security practices; collaboration with data consumers on campus to help them use institutional data properly; and creation of a campus group to determine appropriate data sharing practices. Notably, these modulations reflect an “action arena” in the Governing Knowledge Commons (GKC) framework (Sanfilippo, Frischmann, and Strandburg, 2018). These modulations will be addressed shortly, but it should be noted first that all of the participant’s modulations are considered second-order deliberate modulations, as opposed to first-order.

First-order modulations focus on actual changes in the STIR participant’s practices, whereas second-order modulations are those where the participant alters her project’s goals, objectives, and assumptions to such a degree that they “come to challenge their own established routines of thought and practice, and also crucially, the various external forces which shape these” (Wynne, 2011, p. 794). In the case of this study’s deliberate modulations, by the end of the study the participant had yet to make actual changes in her practices; rather, she reflected upon the need for substantial changes in her future practices and, in some cases, set the stage for first-order modulations to occur.

The first second-order deliberate modulation that the participant reflected upon focuses on creating opportunities for her and her staff to document the implicit strategies and norms that guide their practices, such as spatial privacy practices and the norms that guide how they share data with campus data consumers. As the participant stated, documentation had not been an integral part of her office’s culture: “I think being a little bit more intentional about documenting policies for how we share data and things like that would probably be a good idea for our office and something that we haven’t really thought too much about.”

The participant reflected upon how the conversations led her to start a conversation with her supervisor about creating documentation opportunities, which could include setting aside dedicated office time during the week, such as “documentation Fridays,” focusing on documentation strategies as a team during a staff retreat or during office meetings, or by encouraging staff to document their practices concurrently as they work. The participant reflected that documenting office practices would help to reify and make explicit her office’s rules and norms, and to help justify their practices to others within the institution:

We should be able to justify what we’re doing. We’ve always done it that way is not a good excuse for doing anything. So, we should be able to justify what we’re doing and we should be able to document it for our own purposes as well as to better explain to people how we’re doing something.

For the second of the second-order deliberate modulations, the participant reflected upon the need for her and her staff to actively work with data consumers on campus to inform them about proper institutional data use in ways that align with campus and federal-level rules and norms. As the findings suggest, the campus requires data consumers to complete appropriate FERPA training and sign data use agreements. However, the participant stated the need to ensure that the campus’ existing rules and norms are followed by working with data consumers so that they understand proper data storage practices and protect the institutional data they receive:

[Our conversations have] gotten me to think a little bit more intentionally about making sure that people have use agreements about making sure that people understand data appropriately. Making sure that people are understanding data security and how they keep their information a little bit better and just being a little bit more thoughtful in those kinds of conversations.

The final second-order deliberate modulation addresses the need for creating a campus-level initiative focused on developing standardized data sharing best practices. As addressed in the previous section, throughout the conversations the participant reflected upon the lack of campus-level rules and norms governing data sharing practices. Given this, the various offices on campus with data sharing responsibilities, including her own, have created their own rules and norms governing how data should be shared.

Historically, campus offices had limited interactions with one another regarding campus data sharing practices; data analysis had not been central to their respective work. But with it becoming so and the pressures increasing to make data-informed decisions, the participant recognized the need for developing consistent practices and policies across campus. “We’ve never really gotten together a group of people,” she said, “and just kind of discussed it out here, discussed things with the exception of like a handful of large-scale projects, um, that are about to be released.” There was a need, the participant expressed, for creating a campus group to discuss standardized data sharing practices on the campus, and that it was “not something that I had really thought to do before.” She and her supervisor were planning to meet to discuss how they might go about creating this campus group.

4.6 A Concluding Discussion

The socio-technical milieu within higher education is drastically influencing data and information practices, according to the literature. With various analytics initiatives emerging and institutional actors trying to determine the right sources and types of data as inputs, it should be expected that the rules-in-use, especially policies, governing cutting-edge practices are not clear-cut and standardized. Moreover, as these actors take on new data-driven roles and responsibilities, especially within institutional research, it will take time for useful strategies to form and norms to settle.

We see in the data, the STIR of a single institutional researcher, some evidence of changes in information flow, reactions to it, and ways of thinking and doing to reestablish privacy-protecting rules-in-use. A single participant does not make for generalizable results about changes in higher education writ large. However, using STIR to address rules-in-use about privacy has led to notable insights and a potentially valuable research agenda.

The norms, strategies, and rules that govern interactions with sensitive data and information are often taken for granted. They may drive workaday practices, but they rarely give rise to reflexive or deliberative moments about alternative ways of doing. Additionally, rules as they exist as policies are to many individuals simply things one follows – not things one seeks to create or change. But with STIR, and as made evident in the findings, we see that there is an opportunity to make rules-in-use worthy of deliberation: given the space and time to consider them, a participant can rethink and revise them. As the findings suggest, the act of naming and describing what structures privacy practices creates the circumstances necessary to then evaluate rules-in-use, solidify and support those that are successful, fill gaps where they exist, and plan for improvements.

Within the context of higher education and in other contexts where data analytics are gaining interest and momentum, it is an opportune time – if not a necessary responsibility – to investigate data practices. The consequences of predictive analytics, algorithms, black-boxed technological systems, and the data on which they all rely are getting serious scholarly consideration. But looking downstream is only one way of approaching these issues. Looking upstream at seemingly boring and benign practices, and prodding those actors to reflect on their practices, can produce significant insights for the actors-cum-research participants that lead to altered or new practices more attuned to the socio-technical mélange and its implications. Applying the STIR method to address informational privacy rules-in-use, ethics, or otherwise can advance research in this important area.

5 Public Facebook Groups for Political Activism

Madelyn Rose Sanfilippo ¹ and Katherine J. Strandburg ²

5.1 Introduction

The rise of social media has raised questions about the vitality of privacy values and concerns about threats to privacy (Marwick and boyd, 2014). As online communities have flourished, debate over appropriate information flows among users, as well as to platforms, service providers, and surveillance networks, has grown (Ellison et al., 2011; Marwick and boyd, 2014). Facebook has received considerable scholarly attention (e.g. Acquisti and Gross, 2006; boyd and Hargittai, 2010) in this debate. Social media use has become pervasive not only in economic and social interactions (Kaplan and Haenlein, 2010) but also in politics and political organizing (González-Bailón and Wang, 2016; Nam, 2012). The convergence of politics with social media use amplifies the privacy concerns traditionally associated with political organizing (Breuer, 2016; Nam, 2012), particularly when marginalized groups and minority politics are involved (Stacheli, 1996).

Despite the importance of these issues, there has been little empirical exploration of how privacy governs political activism and organizing in online environments. This project explores how privacy concerns shape political organizing on Facebook, through detailed case studies of how groups associated with March for Science, Day Without Immigrants (“DWI”), and Women’s March govern information flows. Each of these groups emerged from distributed grassroots efforts, gaining critical visibility and participation because of their use of social media both to organize demonstrations and events and to share political knowledge and ideas. This comparative case study employs an empirical framework that we developed in earlier work (e.g. Sanfilippo, Frischmann, and Strandburg, 2018), in which we synthesized contextual integrity analysis (Nissenbaum, 2009) with the Governing Knowledge Commons (GKC) framework (e.g. Frischmann, Madison, and Strandburg, 2014) for institutional analysis of commons governance of knowledge (e.g. Ostrom, 2011).

The specific activist movements we have studied – the March for Science, the DWI, and the Women’s March – are recent examples through which we can explore how privacy governs both the organization of activism, particularly in an age of social media and surveillance, and the sharing of knowledge in public Facebook groups. All three movements grew quickly in scale and structure following the 2016 presidential election, leading up to demonstrations in early 2017 through extensive use of social media and under the attention of traditional media.

The January 21, 2017, Women’s March on Washington, along with the 672 other satellite marches that took place on all seven continents, drew between three and five million people (e.g. Boothroyd et al., 2017; Wallace and Parlapiano, 2017). While the Women’s March focused on women’s issues, it became a venue for expressing solidarity and concern relative to many other human rights issues (Boothroyd et al., 2017; Fisher, Dow, and Ray, 2017). This movement originated with a single individual on Facebook proposing a march on Washington, DC, and eventually evolved into a sustained organization that advocates and informs on gender and equality issues at a moment when opposition to misogyny and sexual violence is historically prevalent (Fisher, Dow, and Ray, 2017; Moss and Maddrell, 2017). While many movements have historically petered out after large-scale demonstrations, the Women’s March organization has continued to organize events and information campaigns, including the Women’s Convention and a second annual Women’s March in 2018. It has also grown in public support, as the #MeToo movement has intersected with it. The national movement also has been criticized, however, by some who see it as less inclusive of women of color than some of the local efforts have been (Boothroyd et al., 2017; Rose-Redwood and Rose-Redwood, 2017).

The April 22, 2017, March for Science drew on many existing advocacy networks, piggy-backing on and transforming an annual Earth Day observance in response to a post-truth era in which the legitimacy of science and objectivity are questioned; 1,070,000 people are estimated to have attended worldwide (Milman, 2017). While the movement was not initiated on Facebook, it employed Facebook to muster pro-science support and respond to perceptions of manipulation of scientific information and “fake news” (e.g. Reardon et al., 2017). Like the Women’s March movement, the March for Science movement has maintained visibility after the March and continues to respond to anti-science, anti-fact rhetoric in politics, media, and social networks. Unlike the Women’s March, however, it did not coalesce into a formal organization.

The February 16, 2017, DWI was different from the other two movements we studied in that it did not focus on a large-scale march in Washington, DC. DWI was primarily intended to illustrate the importance of immigrants to the economy, as well as to advocate for immigration reform. It also expressed opposition to the proposed border wall and to racial profiling (Robbins and Correal, 2017). Demonstrations were held in thirty cities across the United States, while individuals who did not march expressed solidarity through other forms of protest and boycotts. Some restaurants and businesses also closed to express their support (Stein, 2017). The DWI movement was organized through more distributed networks of immigrants and their supporters, working in various ways to advocate and make their influence felt through a day in which they did not participate in the economy (e.g. Blitzer, 2017). Facebook’s network structure was employed by the movement to allow groups to connect, learn from one another, and share information (e.g. Hamann and Morgenson, 2017).

These three cases are particularly fruitful to compare because they were similar in many respects, but differed in significant ways relevant to privacy. All three movements emerged in response to the rhetoric and political positions, or actions, of President Donald Trump and were active during the same period. Yet the historical legacy of each movement also shaped each response in a distinctive way. The Day Without Immigrants and Women’s March movements stemmed from historically entrenched marginalization and inequalities based on ethnicity and gender, respectively – and from a concern that society was beginning to move backward on these issues. The March for Science, on the other hand, was a response to a more recent and emerging concern about disintegrating societal respect for and trust in science.

Social media also played an enormous role in facilitating interactions between organizers and publicizing each of these movements and their events. These cases, addressing distinct issues, while operating in similar contexts and on the same timescales, thus allow for the exploration of privacy, as it plays out through governance of personal information flows, for both political organizing and Facebook sub-communities. Privacy practices and concerns differed between the cases, depending on factors such as the nature of the group, the political issues it confronts, and its relationships to other organizations or movements.

In order to better understand what these cases can reveal about privacy as governance of personal information flow, in the context of political activism and for organizations coordinating through Facebook, we employ the Governing Knowledge Commons framework, which we have recently adapted for the study of privacy (Sanfilippo, Frischmann, and Strandburg, 2018).

5.2 Research Design

This case study focuses on six March for Science groups (Austin, TX; New Jersey; New York, NY; Princeton, NJ; Seattle, WA; and Washington, DC), seven Women’s March groups (Ann Arbor, MI; Atlanta, GA; Global, which represented all international events; Madison, WI; New York, NY; Princeton, NJ; and Washington, DC), and eight DWI groups (Baltimore, MD; Elkhart, IN; Greenville, NC; Nashville, TN; Students; VA; WA; and Washington, DC). In addition to analysis of 4,352 posts from the DWI, 294,201 from the March for Science, and 196,358 from the Women’s March, we conducted interviews with decision-makers from these twenty-one groups, structured by the questions within the GKC framework, and distributed surveys to their members. Our textual analysis placed particular emphasis on public Facebook discussions of privacy, information flows, and institutional legitimacy and development. We also used network analysis to interrogate the relationships between national and satellite groups within each movement, between groups associated with different movements, and between each movement and external organizations.

Additional details about our methods, and these cases, can be found in two companion journal articles, analyzing aspects of participatory privacy (Sanfilippo and Strandburg, under review) and how privacy in knowledge resources governs online political movements (Sanfilippo and Strandburg, 2019).

5.3 Background Environments for the Case Studies

5.3.1 Day Without Immigrants

Participants, including members and followers, and non-member discussants within various DWI Facebook groups provided a rich depiction of the background context in which they were organizing, emphasizing how both supportive participants and their opposition saw the movement as fitting into a larger narrative of social and political dimensions, including recognition of how the movement related to the 2016 presidential election. Discussion, and embedded arguments, focused on what was perceived to have changed under President Trump; while many recognized the historical path, “The immigration system is broken and for most of the Central America and South America countries the only way to come here is a stupid lottery created 50 years ago,” others recognized the rhetorical changes in the national dialogue as leading to something different, for example contesting that “Donald Trump was talking about … CRIMINAL illegal immigrants.” Action was perceived to be required in order to fight against both the historical trend and the current threats. As one organizer explained, “Gracias. Injustice under Trump is not only seen more, but more severe. It is an important time to act and I need to be a part, for myself and for others.”

The DWI movement thus emerged from a background environment in which many supporters, as potential participants, had serious concerns about potentially dire consequences for themselves or their family members if they were identified in a public forum. President Trump’s anti-immigrant campaign rhetoric and promises heightened fears of discrimination and racial violence for many immigrants, especially Latinos, whether documented or undocumented, as well as fears of mass deportation for undocumented immigrants, including hundreds of thousands of DACA recipients and millions of DREAMERS (Kocher, 2017).

One consequence of the heightened vulnerability of undocumented immigrants was that many of their documented friends and relatives felt a similarly heightened responsibility, despite their own fears of discrimination and retaliation, to take up visible roles in the movement, since such roles were perceived as too dangerous for the undocumented.

5.3.2 Women’s March

The Women’s March was galvanized by President Trump’s election in a contest that most expected Hillary Clinton, the first female presidential nominee of a major US party, to win. Many women experienced Trump’s win over Clinton, despite her substantial expertise and government experience and his complete lack of either, as a wake-up call. Women’s outrage over Trump’s election was heightened by his record of disrespectful behavior and comments, including those recorded on the infamous Access Hollywood tape, and the numerous sexual harassment and assault allegations that emerged shortly before the election.

The background environment included the long-standing movement for women’s rights. Equality and respect, as both objectives and core values of the historic women’s movement, shaped Women’s March institutions in fundamental ways. The values emphasized centered on inclusion and privacy: safety concerns, arising in the wake of the sexual harassment and assault allegations against President Trump and the recent history of #gamergate, made the risk of individuals being targeted a serious one (Moss and Maddrell, 2017).

5.3.3 March for Science

Unlike the other two movements, the March for Science did not grow out of a historical political movement responding to discrimination or hostility. Rather, it was a response to a fear that science was in danger of losing the nearly universal status and acceptance that it has enjoyed in modern times. An anti-science strain had been emerging in US politics for a few years, particularly within the Republican Party and especially in relation to issues such as energy and climate change (Selepak, 2018). The 2016 campaign, culminating in President Trump’s election, crystallized and heightened scientists’ fears that facts, objectivity, and scientific evidence were being supplanted in public discourse and decision-making by reliance on experience, personalization of information in favor of beliefs over facts, and conspiracy theories. These threats to scientific values were accompanied by more practical concerns about the possibilities of funding cuts, disruption of accepted scientific standards for peer review and allocation of funding, and corruption of the output of government-supported scientific endeavors.

Reflecting its emergence out of concerns for loss of historical status, the March for Science drew from the outset directly on the support and resources of large and established scientific professional organizations. In this respect, it differed notably from both the Women’s March and DWI. In deciding whether to become involved in the political arena by organizing a March for Science, scientific organizations and individual participants confronted the question of whether participating, as scientists, in a movement that responded directly to the election of a particular president would violate scientific norms of objectivity that had ordinarily been understood to include political nonpartisanship. Some members of the scientific community viewed participation as a means of opposing an already dangerous politicization of science, while others feared that employing the tools of political advocacy on behalf of science would contribute to its further politicization.

5.4 Attributes

5.4.1 Community Members

5.4.1.1 Membership

In the most expansive sense, community members could include everyone who supported the movement in any way at all, including expressing agreement with its aims, providing financial support, attending a national or satellite march, RSVPing on EventBrite, joining an associated Facebook group, attending offline events or meetings, or serving as an organizer for one of the events. Beyond even the most inclusive definition of community “members” taking part in the movement, each of the movements we studied had the potential – and the goal – to impact outsiders. In a broad sense, each of these movements had the potential to affect society at large. More specifically, potentially impacted outsiders encompassed all women, immigrants, and scientists who did not participate, including some who remained outside because they were concerned about the consequences of participation.

Our discussion of “members” of a particular group or march will ordinarily refer to individuals who joined one of the Facebook groups studied. Membership in these public Facebook groups was not representative of all members or supporters. This is most emphatically the case for the DWI movement because privacy concerns were a serious barrier to publicly affiliating with the movement on Facebook. To give us further insights into community membership, we have supplemented the detailed picture available from our main sources with estimates of march attendance and numbers of EventBrite RSVPs for our case study groups, as well as data about all organizations affiliated with the national groups and geographic co-location data for national and all satellite marches.

Table 5.1 shows the numerical relationships between estimated attendance, EventBrite RSVPs, and Facebook group members for the groups included in our study. For each movement, we compare projected attendance based on different platforms with estimated aggregate attendance at the 2017, or first wave of, national and satellite marches. (We do not consider follow-up marches in this study.)

Table 5.1 Attendance versus expectations

Movement	Day Without Immigrants	March for Science	Women’s March
Attendance	3940	206000	1578500
EventBrite RSVPs	1278	172000	798800
Facebook RSVPs	621	61500	402026

For all three movements, the number of march attendees was larger than the number of EventBrite RSVPs, which was in turn larger than the expected numbers based on responses within the associated Facebook groups. As a fraction of attendees, DWI supporters were the least likely to register in advance, either by RSVPing through EventBrite or by participating in a Facebook group, including by formally joining or by responding to event invitations. Though there are various possible explanations for this trend, it seems plausible that this failure to register in advance reflects greater concern within DWI about leaving persistent, personally identified, public trails of their involvement in the movement.
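The comparison "as a fraction of attendees" can be checked directly against Table 5.1. The following is a minimal sketch, not part of the study's own analysis: the figures are copied from the table, while the dictionary layout and the names `TABLE_5_1` and `registration_shares` are illustrative assumptions.

```python
# Figures copied from Table 5.1; the names used here are illustrative only.
TABLE_5_1 = {
    "Day Without Immigrants": {"attendance": 3940, "eventbrite": 1278, "facebook": 621},
    "March for Science": {"attendance": 206000, "eventbrite": 172000, "facebook": 61500},
    "Women's March": {"attendance": 1578500, "eventbrite": 798800, "facebook": 402026},
}


def registration_shares(table):
    """Return advance registrations as a fraction of estimated attendance."""
    return {
        movement: {
            "eventbrite": row["eventbrite"] / row["attendance"],
            "facebook": row["facebook"] / row["attendance"],
        }
        for movement, row in table.items()
    }


shares = registration_shares(TABLE_5_1)
for movement, share in shares.items():
    # Print each share as a percentage of estimated attendance.
    print(f"{movement}: EventBrite {share['eventbrite']:.1%}, "
          f"Facebook {share['facebook']:.1%}")
```

On these figures, DWI has the lowest share on both platforms (roughly a third of attendees on EventBrite and about a sixth on Facebook), which is consistent with the pattern described above. The ratios are only as reliable as the attendance estimates they are divided by.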

The preference for EventBrite over Facebook across all three movements also reflects privacy concerns, since EventBrite RSVPs are not public, do not link to vast amounts of personal information, and can be made effectively anonymous. As one key organizer from the New York City Women’s March explained:

We needed to have a way of developing unique registrations, but at the same time, we’re dealing with people involved in activism, and we want people to know that their information is secure, right? So, that’s one reason we started using EventBrite, just to get the numbers. Now, the thing is, you don’t have to put your real name in there. You don’t even have to put in a real email address. You can put in an address, you know, that’s a temporary one. People make those all the time … but, you know, it was a way for us to keep track.

Participation in all three movements was open to the public at nearly every level of involvement. Anyone could (and was encouraged to) attend the marches. All three national marches, as well as the satellite groups that we studied, maintained public Facebook groups. Anyone could join these groups, yet membership was not required to participate in public discussions and there were few limits placed on discussion contributions. Though membership in the public Facebook groups associated with the various national and satellite marches was generally open to anyone, privacy concerns created significant barriers to membership in the Facebook groups associated with DWI, as discussed in greater detail later.

The openness of these public Facebook groups meant that not only supporters but also individuals who were hostile to a group’s goals and objectives could join. This openness to dissenters sometimes resulted in contentious – or even troll-like – exchanges. Maintaining this openness, despite the dissension it invited, was considered important by many groups not only as a means to inform and engage the broader public but also as a way to signal willingness to engage in reasonable debate. In particular, discussions within the Facebook groups of various DWI movement groups emphasized the hope that openness would help to legitimize the planned marches in the eyes of the public.

5.4.1.2 Community Demographics

The demographic makeup of each movement largely aligned with its objectives. Thus, Women’s March organizers and other participants were overwhelmingly female – and, as some impacted outsiders critiqued, overwhelmingly white (Fisher, Dow, and Ray, 2017; Rose-Redwood and Rose-Redwood, 2017). Many DWI group demographics roughly reflected the demographics of the undocumented population, though organizers specifically were more often those with more legal security, such as DACA recipients, Green Card holders, and legally documented relatives. Furthermore, a minority of satellite groups within this movement more broadly reflected the demographics of recent immigrants, rather than reflecting the undocumented population. Moreover, participants in both of these movements included family members and other supporters who did not fit the mold.

The March for Science was supported by numerous scientific organizations, and many participants were research scientists and medical professionals. However, participants, and many organizers among them, also included K-12 educators, students, university administrators, and corporate R&D staff. Families turned out for the affiliated Earth Day events, capitalizing on the openness of their designs and responding to assertions that trust in the objectivity of science should be learned early and believed pervasively. Moreover, while STEM fields have continuing, and historical, gender diversity problems, various March for Science groups were not only more gender-balanced but organized by women.

Moreover, some satellite marches, such as the Atlanta March for Social Justice and Women, intentionally defined themselves differently and in more inclusive ways, to better reflect the diversity of stakeholders in their local communities. As one co-organizer of the Atlanta event explained:

We are a non-profit organization and we organized the March for Social Justice, which coincided with the Women’s March all over, to represent all marginalized communities and to stand up for our rights and opportunities in the face of opposition. We wanted to stand together, for women, for our LGBTQIA friends, for people of color, for my fellow Muslim Americans and all religions, for … autonomy over our bodies and safety, for voting rights, for things we deserve and expect, but are held back and repressed, for victims of sexual violence who aren’t believed. For immigrants, of all statuses … I am involved because I want to stand up for myself and my community, not just by marching, but by making the demonstration and ongoing activism as impactful as possible.

In contrast to efforts by this satellite march to be inclusive in organization and participation, DWI movement groups were almost intentionally obfuscating in their design; many of these groups were composed, at least with respect to public affiliation, primarily of the friends, documented significant others, and children of undocumented immigrants, rather than of undocumented individuals – the most seriously impacted stakeholders. The absence of these individuals from publicly acknowledged membership did not necessarily preclude their participation or the reflection of their preferences, but rather mediated it for safety. As one organizer explained:

My mom and my aunt took me to a protest when I was a kid and from then on, I knew this was important and I wanted to help. I was born here, but people in my family are undocumented. … How can you just watch bias or discrimination and not do something, you know? So in this case, my aunt had been discussing this with other activists around immigration, but wasn’t going to go on social media, but I could and did, and then it could get around to lots of other people.

This individual was thus simultaneously in the roles of organizer-group member and proxy for family members, in order to protect the privacy of those family members.

5.4.1.3 Outside Groups

In addition to their individual members and organizers, these movements were supported in various ways and to greater and lesser degrees by pre-existing organizations. The Women’s March and March for Science were each affiliated with a large number of outside organizations, including thirty-nine organizations that interacted with both. The DWI movement, by contrast, had very few connections with outside organizations. Only one organization, the AFL-CIO, supported both DWI and the Women’s March. DWI shared no external affiliates with the March for Science.

These external affiliations reflect the dramatically different roles that outside organizations played in different movements. At one extreme, many March for Science events were not only supported by but also directly organized by pre-existing organizations – professional, academic, and advocacy-oriented. At the other extreme, the DWI movement intentionally eschewed most public affiliations with advocacy groups, even when receiving other kinds of support from such groups. The Women’s March was positioned at a relative middle ground with respect to the influence of outside organizations. Interestingly, the Women’s March also institutionalized itself, formalizing an organization that was much more lasting and stronger than either of the other movements.

5.4.2 Resources

Across all three movements, information resources were developed, aggregated, and shared to inform members, other participants, impacted individuals, and the general public about the movements’ objectives and events, as well as to share human interest stories and news coverage of their activities, and provide information resources for impacted populations. By using public Facebook pages, groups from all three movements made these shared information resources open resources, leveraging the visibility Facebook, and other social media platforms, provided to cultivate their own knowledge resources for use by affected individuals beyond participating stakeholders.

Prioritization of choice and opportunity shaped not only each group’s online information sharing practices, as described with respect to the mailing lists and organizational partnerships, but also the information resources constructed (Kitch, 2018). Facebook and social media were most visibly central to the Women’s March, perhaps partly because of the role social media played in the group’s origins, but also because social media were seen as legitimate forums for personal information and sharing by participants in this movement. Facebook was not, however, considered to be perfect for this or any movement. Other tools and channels, such as EventBrite, were deemed necessary both to allow organizers to better anticipate numbers of demonstrators and to ensure that feminists and supporters who did not participate in Facebook could be adequately up to date on important information.

Facebook also had a significant impact on these activist movements because of the way it affected dialogue, by increasing disinhibition and opening groups to criticism, rather than simply insulating them. Interestingly, the smaller the group, the less it appears to attract critical posts, even when a small satellite was organized in a place where one would expect little ideological alignment between the movement and the populace at large. This was especially the case for small DWI groups, but also seemed to be true for smaller Women’s March satellites. Furthermore, some groups defused opposition by describing their membership in more socially acceptable terms; for example, groups organized by students and “DREAMERS” were perceived to be “sort of good that kids care about STEM” and “the kind of immigrants we want” (Facebook posts). “Asian and well-educated” immigration groups were also complimented, even by individuals who publicly stated their opposition to the DWI movement overall. In contrast, plenty of opposition was expressed relative to “Mexican, undocumented” groups. Obvious stereotypes played out in this environment, particularly relative to the perceived legitimacy of action arenas and resources, as individuals were open about their fears and prejudices in the mediated environment. In the words of one Facebook post, “Like you would even say that if this wasn’t fb!”

In the course of their activities, these groups handled large amounts of personal information. Some of this information was in the obvious form of contact information for members of the Facebook groups and those who RSVP’d using their real identities on EventBrite. But other personal information emerged less obviously, from discussions in the Facebook groups as individuals debated issues and told personal stories, or in photos taken at demonstrations. All of these forms of personal information were absolutely central resources for these movements, enabling coordination, publicity, public engagement, and more for these groups.

5.4.2.1 Contact Information

Norms about how to generate and secure RSVP lists and contact information reflected at least a partial consensus across movements about how to handle this form of personal information in online political organizing. Groups in all three movements used both public Facebook pages and EventBrite RSVPs to develop lists of those who intended to participate in the marches. EventBrite was considered to be more accurate than Facebook, as well as more private, though neither provided complete lists of attendees. As an organizer of the NYC Women’s March explained, “I really recommend that to people if they’re doing their own event. Have multiple channels, even if they’re different, have multiple channels. Because we had people registered through Facebook, and I think it was only 50,000. [laughter] And over 200,000 in EventBrite alone.”

Yet while their basic approaches to gathering information resources were superficially similar, the movements diverged in important ways in their assessments of appropriate flow of these information resources. The DWI movement eschewed affiliations with outside groups in order to create a barrier to information flow about the identities of those who were involved with the movement. The Women’s March was affiliated with a large number of outside organizations, but groups often did not share member lists or contact information with those organizations.

The New York City Women’s March group, for example, did not share contact information or other personally identifiable information about members, attendees, or subscribers with outside organizations, no matter how closely interests and objectives overlapped. Nor did it accept similar information from such organizations. Instead of sharing contact information, the new group pursued its shared objectives with outside affiliates by publicizing events and information relating to its partner organizations, so that the group’s members could find out about other organizations and events reflecting their interests. The group went so far as to delete the contact information and email list it had generated for the 2017 march, so as to prevent anyone under any circumstances from obtaining that information. As one key organizer explained,

we needed to have a way of developing unique registrations, but at the same time, we’re dealing with people involved in activism, and we want people to know that their information is secure, right? So, that’s one reason we started using EventBrite, just to get the numbers. Now, the thing is, you don’t have to put your real name in there. You don’t even have to put in a real email address. You can put in an address, you know, that’s a temporary one. People make those all the time … but, you know, it was a way for us to keep track. We were still, you know, knowing some people would put their real name in there, we wanted to establish a list for when the march was over. We had an opt-in process, after the march, for people wanted to continue to receive email from us and then we destroyed the original database.

To work around the difficulties imposed by the decision to delete the lists, subscribers had to opt in to be contacted about future events.

Both of these approaches were in stark contrast to the easy exchange of contact information that occurred between science and technology organizations and March for Science groups. For these groups, the appropriateness of sharing contact information with affiliate organizations was established by the identification of these affiliates as reputable professional organizations. One Women’s March organizer, who was also involved, to a lesser degree, in organizing a March for Science satellite march, identified a source of this difference in perspective between the March for Science and the other movements, stating “it’s an issue of trust in the powers that be. People who have been historically discriminated against don’t trust the status quo in the same way that scientists who are respected professionals might. It’s completely different.”

5.4.2.2 Stories, Narratives, and Other Forms of Personal Information

All of these movements also accumulated personal information in the form of stories and other personal details and each dealt with such information differently. The desire to use personal stories as a way of appealing to the public was common to all three movements, but levels of comfort in linking those humanizing stories to real identities differed greatly. The March for Science encouraged publicly identifiable scientists to share their personal stories and commitment to the movement. One March for Science organizer explained that “it was best when we could get famous faculty to buy-in and talk, not just demonstrations and things for kids, but in a way that was more political, because it got attention and was … I guess, more credible.” However, very real fears about employability were expressed by junior scientists, making privacy a core organizing value. In contrast to leveraging privacy by obscurity, as within the Women’s March, or privacy by proxy, as within the DWI movement, participants within the March for Science used established figures and institutions as shields, providing privacy to graduate students, for example, who served in key organizational roles.

DWI groups, at the other end of the spectrum, were more uniformly averse to sharing personal stories in ways that identified the status of particular individuals. Posts on public Facebook groups devoted to the movement illustrate direct admonitions against individuals who shared the stories of their friends, as well as metadata revealing deleted posts preceding these discussions as evidence of moderation to control information flows regarding immigration status. Furthermore, comments posted provide general warnings to be careful with information about immigration statuses, along with extensive discussions revealing support for this specific aversion to sharing among discussants and members of these groups.

Facebook discussions and interviews relating to the DWI movement included many expressions of fear and concern about what ICE and law enforcement might do with the personal information of any undocumented members who were publicly associated with the movement. These fears strongly affected the willingness of DWI supporters to formally affiliate with the movement on Facebook, as participants in the groups, or even to maintain Facebook accounts. As a result, while Facebook was often used to publicize demonstrations and share resources, other, less permanent and less public tools, such as Snapchat or the encrypted messaging apps promoted by Cosecha, were employed for organizing and connecting undocumented individuals to the movement. Even Snapchat was often avoided, because of its relationship to Facebook.

Women’s March groups varied in their approaches to sharing personal stories. Rather than avoiding documentation of personal information or identities, as with immigrants and their advocates, the Women’s March sought safety in numbers and privacy through obscurity, tightly controlling images documenting the events and contact lists, as much as possible. Because the #MeToo movement unfolded during the course of interviews for this study, a number of interviewees mentioned the impact that the movement was having on women’s willingness to share their stories, yet worried about the potential consequences of being able to connect stories shared online with real world identities. For example, one Women’s March interviewee noted that “we didn’t really have strict rules about people disclosing personal information, sensitive stories or anything, but I’m getting worried about it, still moderating the group, with #MeToo, because I’ll feel responsible when some weirdo decides to target someone because of what they’ve posted in our group, like the physical threat is real.” These comments evoke similar concerns to those expressed by members of the DWI movement relative to threats of deportation or problems from legal authorities that might arise if the immigration status of anyone associated with the movement could be gleaned from its Facebook groups.

Most groups addressed concerns about inappropriate personal information flows primarily using rules and norms about sharing using the group’s Facebook pages and other, now traditional, communications technologies, such as email. Additionally, however, interviewees from a number of DWI groups reported receiving lessons from organizations such as Cosecha, to help them understand how to appropriately use privacy-enhancing technologies, such as encrypted messaging or Slack channels, rather than more visible platforms like public Facebook groups.

5.4.2.3 Photos and Other Visual Resources

One arena of particular concern was whether those who attended events could be identified from photographs, taken or posted, that captured their participation in political demonstrations. While most groups, across all three movements, gladly accepted photographs of themselves that individuals shared with the group, some struggled with how to manage photographs of others that had been taken or shared without their consent. As one March for Science organizer explained:

You sort of expect to be seen when you go to a protest in a public place, but there is a sense of safety in a crowd. You’re not the only one, but you can be singled out in a picture, and that becomes more complicated … especially when it’s suddenly online forever. We get that this is a real career risk.

In addition to privacy concerns, sharing of photographs and other information resources sometimes raised questions about intellectual property and ownership, as one organizer of a Women’s March satellite explained:

You’ve got volunteers creating graphics for logos and banners and things … tee-shirts …. And those people aren’t savvy enough to negotiate the rights to that stuff, so then you have weird rights issues that come up. Umm … there’s navigating that whole nightmare, then, umm … and something we’re still navigating, is that we had a whole group of photographers taking pictures. Umm … and we were lucky, in that our lead, the leader of the photographers got written agreement with the group of photographers that we had, to get the rights for use of those images collected.

Groups also worried about how to ensure that photographs of their events and marches were preserved. Many followed the early example provided by the NYC Women’s March, in securing the photos on their own machines or servers, rather than entrusting them to social media alone. The NYC Women’s March began this practice in hopes of developing an archive of such images.

5.4.2.4 Information Quality and Fake News

Organizers also worried about ensuring the reliability and quality of the information that was shared on their groups’ Facebook pages. As one March for Science organizer explained, “It’s not unfounded to be concerned that facts and science are undermined anytime any influential person says, ‘Oh, fake news,’ it’s actually happening.” Many organizers expressed concern about establishing trust in the resources they developed and shared on behalf of their groups. To address these concerns, groups not only double-checked FAQs, to ensure that the correct and most up-to-date information was being shared, but also made efforts to provide quality control regarding news stories and links shared by participants within Facebook groups.

These information quality concerns extended in some ways to personal information. Organizers were concerned not only about the obvious things, such as obtaining correct contact information and ascertaining RSVP quality, but also about ensuring that lies weren’t being spread through their platforms via personal stories. As one DWI organizer explained:

Fake information is a problem, but honestly … if people take 5, 10 minutes of time to really look into a story or recheck the facts … it’s not any longer than a few google clicks away to verify. There have been … I have seen some fake news about some woman who got deported, like, making fun of her … it turned out it was fake news. What ended up happening? Nothing really happened. It’s a thing, it looks sensational, but it’s usually obvious if it’s really if you look up the story and only find one fringe article … I kind of double check all my stuff before I post it … you’ve got to be careful, that’s the problem with … but a hoax will die out or blow up really quickly … you can update or repost to keep things prominent, to folks everywhere, all the time.

Many groups, across movements, documented their processes of vetting stories, including news stories, that were shared within the group’s Facebook discussions. For example, a DWI post reflected a norm of vetting through careful reading, “Thanks for sharing this. I will wait to comment until I have a chance to read all of this!” while a March for Science post reflected a norm of confirming sources, “[T]his looks great, but we will get more sources before we add it.”

Organizers across all three movements emphasized that it often took only a little work to verify information.

5.4.3 Goals and Objectives

The objectives underlying the movements are tied strongly to the exogenous influences associated with their formation, as well as to historical interests shaping their contexts. These objectives reflect common values such as equality, transparency, truth, and fairness, as well as interests unique to each group. The general goal of each of these three movements, though details are specific to each movement and group, is, in part, to resist Trump’s agenda in favor of more progressive policies and changes. All groups also shared the objective of informing the public about their focal issues. Groups uniformly placed high value on knowledge production and, as described in the previous section, on maintaining the veracity and quality of the group’s information resources.

All of the movements were confronted by entrenched political opposition. Sometimes there were subareas within a movement’s objectives where consensus with at least some opposition actors could be reached. For example, children of immigrants were viewed relatively favorably and DACA was a more acceptable issue to address than amnesty. As one Facebook user stated “Yes the people who have crossed the border without a choice are the children they bring, are they criminals too?” Supporters and many who originally identified as the opposition agreed that children had no choice and thus this was seen to be a legitimate objective.

Sometimes, however, opposition led to counter-movements, or even harassment and threats. While some Women’s Marches saw anti-abortion, or more specifically, anti-Planned Parenthood, counter-protesters, DWI groups experienced the most visible detractors, with bigoted and hateful language directed toward them, as well as threats to expose individuals to scrutiny over their own statuses. Even the well-sourced resources and campaigns developed by the March for Science were subjected to the same hostility to objectivity that the movement emerged to address. When intelligent public information campaigns were disseminated, comments were posted decrying them as propaganda; for example “Drinking the left wing koolaid made by the C.linton N.ews N.etwork.”

The national March for Science group developed and disseminated a relatively clear set of top-down objectives, while the Women’s March and DWI movements presented patchworks of objectives even at the national level. Regardless of the degree of consistency at the national level, satellite groups unavoidably augmented and modified national objectives to reflect local contexts.

Many March for Science satellite groups were defined more specifically by goals and objectives determined in the local context. In particular, the backgrounds of the individuals determining the agenda and specifying the goals varied from group to group: one group might be organized by members of the American Association for the Advancement of Science (AAAS), while another was organized by high school science teachers. Differences between groups within the March for Science movement could be sufficiently stark to provoke some prospective participants to eschew the closest group in favor of a geographically inconvenient alternative. For example, an organizer for the March for Science Austin reported:

[I]t didn’t really seem like I was going to be able to get to DC right around the beginning of the semester, and the Stand up for Science event on my campus, while technically a satellite march, seemed to have some influences that I didn’t agree with … I figured if I was, umm, going all the way to Austin, I may as well make sure it was something I wanted to be a part in.

This interviewee joined and helped to organize a satellite group located hours away because of dissatisfaction with the fact that the local satellite group was focused too much on public health and improving trust in biomedical resources, rather than encompassing broader issues such as climate change or general esteem for science.

The Women’s March had a strong national presence, but many individual satellite groups were very well organized and developed their own values and goals very independently, with only the loosest of coordination with the national group. DWI groups were the most grassroots in nature, yet shared information, practices, and institutions horizontally to great effect, illustrating the strength of the network in developing something to a large scale, despite the fact that some satellite demonstrations did not even occur on the same date as the national event. For example, while most demonstrations took place on February 16, 2017, Milwaukee had its demonstrations three days prior. Throughout interviews, organizers of individual satellite marches and walk-outs emphasized the importance of solidarity in objectives, as much as in actions. One organizer explained group-to-group interaction as a process of assimilation that did not involve any top-down pressure or aggregation, but rather depended on coordinated interactions between many individuals:

Yeah, well we do things a lot, but also with other lists and groups, that I follow, a lot. One of them, it’s called … I’m going to look it up real quick, hold on … anyways, it’s One Texas Resistance, that’s close to the border, you know what I mean? People here feel pressure, people here connect, but to connect to people there, through social media, is powerful. Share words, show solidarity, share pictures … it’s a good thing … And the fact that I have an education and the knowledge to spread the word, that’s what I’m trying to make a difference on, get it all out there. The fact that I have this information, I want to spread it to more people.

Satellite groups often wanted to learn from one another and leverage national numbers to attract attention to their causes, while serving their own specific communities.

Tensions and dissension about goals and objectives sometimes led satellite groups to differentiate themselves from the national group. Even provocations by outsiders might lead to productive discussions about how the scope of a group’s objectives could be narrowed in a way that was acceptable to at least some detractors. For example, a post stating “Immigrants are great! But illegal aliens should not be here. I wish the two wouldn’t be used to mean the same thing” triggered a discussion that led one local movement to focus on demands for immigration reform, consideration for families, and respectful dialogue, rather than including calls for amnesty. This approach made that satellite group a relative outlier within the larger movement.

Even the language used to frame problems and objectives was sometimes contested. For example, posts in various DWI Facebook groups related to whether the objectives, and underlying problems, were about illegal or undocumented individuals, “aliens” or “criminals.” Extreme objectors posted things like:

Ok so you don’t like “illegal”. We will just start calling them what they really are. Criminals..I am a “legal” US citizen, but if I break the laws in our country I am ..wait for it … A Criminal

In response to these debates, very different agreements and discussions evolved within different local groups across the country as outside stakeholders contested the legitimacy of the groups and groups responded.

Other provocations re-shaped demonstrations. For example:

If all of the undocumented students don’t show up to school on one day

Then you’ve got a real good list of all the undocumented students to be reported to ICE or whatever

Doesn’t it seem counterintuitive saying “hey I’m an illegal” during this presidency?

In response to this post, this group designed a more inclusive demonstration, with friends and supporters obfuscating the status of their undocumented classmates.

5.5 Governance

5.5.1 Institutions

Polycentric institutional arrangements defined governance within these cases. Within legal and regulatory institutions regarding freedoms of association and expression, Facebook and other platforms, including EventBrite, Twitter, Slack, and Snapchat, provided overarching infrastructure within which individual groups created their own institutions, which alternately abided by, worked around, or contradicted the constraints placed upon them within the nested structure.

Governance mechanisms within these movements were distinctive, yet membership rules, resource contribution, and knowledge sharing expectations had many similarities. For example, very broadly, the membership norm was to have low barriers to entry, without expectations to contribute information or actively organize. Yet there was a clear norm that attendees should publicize and share event details and information resources that were generated by the more active participants.

Across all three movements, there was little explicit discussion of rules within Facebook groups or on other public platforms. Nonetheless, normative expectations about civility in dialogue were often specified (e.g. “yes, i’m sorry it is a little condescending, please edit that out if you could or i’ll delete it when i get back :) your heart is in the right place but no human is superior than another human”). Norms of authority were reflected in ubiquitous requests to provide references or sources in order to back up claims (e.g. “I didn’t see that. Where did you read that? I’m not saying it isn’t true, but”) or to assist other members (e.g. “can you share those details”).

The logic, as well as the process, of decision-making in these groups was often non-transparent, despite the groups’ ostensibly open nature. For groups at the national level, as well as satellites in large cities such as NYC or Seattle, the lack of transparency sometimes stemmed from the fact that such large groups became dysfunctional when too many people were involved in organizing and decision-making. For smaller satellites, lack of transparency often emerged simply because one or two individuals alone acted as the driving forces of these groups. Furthermore, because much of the decision-making process in these groups occurred off of the public Facebook groups, the best sense of how governance was handled comes from the Facebook group FAQs and from interviews. Moderation of discussion in the Facebook groups was an apparent form of governance. And, of course, governance is also formalized in the technological infrastructure and configurations of platforms.

Informal norms and strategies, relative to how people interacted and what was acceptable information to share, emerged in an ad hoc fashion and by example, rather than from thoughtful strategic planning. Thus, there were many levels of institutionalization and few internal objections to governance designs for these groups. Overall, responses to our survey suggest that the governance was perceived to be legitimate, though there was sometimes disagreement about particular choices.

Privacy, as governance of personal information flow, was often not discussed publicly, even though it was a primary focus of many FAQs and discussions amongst organizers, and provided the logic behind numerous decisions. It is notable that interviewees cited their movements’ privacy values very confidently, even when they had not discussed them explicitly with others in the group: “I’m not sure … I … as we have talked, this idea has come to me, but it’s not something we decided on … I think we are informal and local, loose connections to other immigrant groups, because it provides privacy and keeps people safe.”

Some specific rules-in-use about personal information were discussed prominently, however, including those against oversharing in light of threats associated with disclosure outside of the group (e.g. “Everybody: Don’t share immigration statuses. Privacy settings don’t hide comments. It’s a public group”). Questions about appropriateness were common in the Facebook postings, for example “should we really post pictures? there is no expectation of privacy at a public event” (Facebook post). Beyond privacy-related rules-in-use reflected in the posted discussions, heavy moderation was readily evident on the public Facebook pages of some groups. For example, of 4,352 posts associated with the seven DWI groups that we studied, only 3,946 still have readable text, indicating that 406 have been deleted. Timestamp and post ID, without a post, are downloadable metadata through the API, and in some cases it is possible to infer what the post may have discussed, by looking at responses that are still visible on the page. This illustrates both a privacy threat and an instance in which polycentric governance is competitive, rather than complementary. While many of the deletions by organizers moderating these pages were intended to remove inappropriate flows of personal information that had been shared, as enforcement of rules governing DWI groups, overarching platform level governance by Facebook undermined this effort, to an extent, by maintaining context and metadata.

There were very real concerns expressed by movement members about surveillance and about what Facebook would do with the data it had amassed about individual participants and groups overall. While this concern was most prominently expressed in relation to DWI groups, members of groups across all three movements echoed the concern. As one organizer explained:

I think Facebook makes a lot of decisions that don’t make people happy, that may not … well affect people, about their privacy, but most of that is not special to us. What is different, what is important and maybe a problem, is how Facebook works with law enforcement, with ICE. I don’t know, really, what happens … I don’t think they have to tell us, but people talk and people are afraid.

Many individuals, across all three movements, expressed doubts about the legitimacy of Facebook’s role in making decisions about their groups’ information. Furthermore, there was entrenched distrust that Facebook would adhere to what was stated in policies; “We agree to Facebook when we use it, but they don’t always do what they say and it changes” (March for Science participant, survey response). Nonetheless, most felt it was necessary to sacrifice in order to benefit from Facebook’s social capabilities. They did not see any alternative.

While rules and expectations about conduct within groups were often implicit, aside from rules about sharing information and admonitions to mind norms about civil conversations, there was active enforcement of content and civility standards through moderation. These practices were frustrating to some good-faith participants, who did not understand the implicit expectations, yet faced explicit consequences in moderation. Moderation more often manifested when group organizers simply deleted posts that didn’t reflect community standards but, occasionally, organizers would articulate expectations and give participants an opportunity to edit their posts, with deletion serving as a last resort. Beyond censorship, there were no other real consequences or sanctions for violations, and conflicts between members, or between members and outsiders, were resolved on an ad hoc basis or through excision by moderators, rather than through any formal or consistent procedure.

5.5.2 Actors in Governance

Actors involved in governance, regardless of movement or satellite group, were not always key stakeholder groups and did not include all members in any case. Yet interactions between groups and the permeability of boundaries were often relatively open, allowing for interested and committed individuals to be as involved as they wanted to. DWI groups differed most from the other groups, in terms of having most key organizing roles filled by advocates and family members, rather than undocumented individuals themselves, at least in terms of how these groups documented and presented themselves to outsiders in order to protect impacted individuals’ privacy. As one DWI organizer, for a student-led satellite group, explained:

There are a lot of immigrants within the community. Stamford, as a city, is made up of over 30% immigrants, or non-citizens, so really like a third of people here were born outside of the United States and so we were trying to show that. We were also trying to get rid of peoples’ fear, particularly those of students, so that … there was a lot of confusion.

The organizers, in some cases relatives of immigrants, were themselves simply “civically engaged” and wanted to ensure that their families and classmates were welcome in the community. Other DWI groups were organized by DREAMERS, as well as friends and relatives, and made decisions independent of, yet representing, the impacted communities. In contrast, most Women’s Marches were organized by women.

March for Science decision-makers often included a combination of scientists, science educators, and STEM students, along with other STEM advocates. As one March for Science organizer explained:

it seemed that a lot of science faculty were eager to attend, and some of them to speak or do demos for kids or something, but they didn’t have time or experience to organize. Also, some of them were concerned about whether that was really professional for them. For me, I work in communications for the university, I didn’t have the same concerns and I had the time. It was important to me, in terms of my kids and a world I wanted to live in. I started our local movement, but had a lot of really great students, from biology and chemistry, even engineering, who wanted to help and did a great job.

This movement was in many cases more collaborative.

Actors involved in governance often expressed their desire to be inclusive, yet also their fears of being influenced by those with contrary values, particularly given concurrent activism by “white supremacists and Neo-Nazis.” Boundaries were open, but actively policed, in order to avoid co-option. It was also particularly interesting to see who the groups, or the interacting public, attempted to exclude, given that they were working within open, public Facebook groups. Stakeholder groups wanted to engage the public, but not necessarily everyone in their own networks. As one organizer of a DWI satellite group responded, when asked about establishing boundaries:

Well, sort of. We do not want our employers involved. We do not share with police or immigration officers … we prefer those we fear not to be too close, though I think they say keep enemies close … I do not think they are all enemies, but there is fear.

In this sense, members within groups often leveraged their control, as enabled by Facebook, over who specific posts would be shared with, based on whether they were friends and family, or colleagues, driven by concerns about repercussions from nested oversight.

Decision-makers in all groups were often those who felt personally invested and frustrated with current politics, becoming involved through social media and interpersonal interactions and relationships. Actors who engaged in organizing were in most cases in small, close-knit groups, yet at the national level and in some of the largest satellites, large groups coordinated. Smaller satellites were often organized by 1 to 25 individuals, in contrast to the roughly 1,000-member group that organized the national Women’s March. Overall, decision-makers were generally perceived to be legitimate in organizing, though criticism remained about the non-representative nature of many of the Women’s March groups, in particular.

5.6 Patterns and Outcomes

Public support across all three movements illustrated their relative success in shaping debate, yet the consequences varied dramatically. While the March for Science supporters and organizers faced relatively little opposition, the Women’s March has faced audible criticism, and the DWI movement has faced numerous repercussions, documented within media coverage in the form of the number of individuals who lost their jobs for failing to report to work during the demonstrations.

Benefits to members and others are relatively difficult to ascertain at this point in time, though many have discussed attitudinal changes relative to gender discrimination, manifesting in #MeToo and accountability for aggressors and perpetrators of sexual harassment and assault as being tied to the Women’s March movement. However, the attention gained by all three movements is important, as they demonstrated more widespread support than anticipated. Their creative outputs, in the forms of informative campaigns and knowledge resources about their positions, also persist and are being broadly disseminated, establishing social interactions from both the resources themselves and sustained from interactions at demonstrations. Many of the interactions that emerge from the communities’ activities are perceived to be a wonderful benefit toward larger community building by active participants. As one Women’s March satellite organizer explained, “I showed up to the meeting alone. I met some great ladies, most of whom I’m still in contact with …. It’s wonderful.”

Perceptions of success by many members, stakeholders, and the media, were contested, however, by ideological opposition, particularly around the legitimacy of the groups as grassroots, rather than mouthpieces of existing organizations, like Planned Parenthood, and in accepting the attendance numbers as fact.

5.7 Implications

5.7.1 Privacy Values in Public Facebook Groups

Despite the obvious impact of privacy values and practices on political demonstrations and movements, the most innovative applications of privacy as governance within these communities shaped the use of public Facebook groups, often in ways specific to ideological or sensitive uses of this technology. Four distinct privacy values emerge, shaping public Facebook groups as commons arrangements: privacy through obscurity; privacy and chilling effects; privacy through autonomy; and participatory privacy.

Privacy was, in many instances, sought through obscurity. Various groups and individuals emphasized the anonymity provided to them by the crowds in demonstrations and the relative safety they felt within the sheer number of supporters, often beyond the expectations of organizers and the media in advance of events. Governance structures around photographs and publicity particularly emphasized this preference, minimizing the ability to single out participants publicly, to the best of their abilities. Large groups also emphasized privacy through obscurity in their configurations of Facebook groups and common patterns of engagement, allowing public numbers, without identities, within RSVP design and through the use of “following” rather than “joining” mechanisms.

Privacy concerns also generated chilling effects, regarding participation patterns. This was closely coupled to fear of repercussions, rather than inappropriate flows alone, with many informants for this research conflating these two problems. Comparing immigration advocacy groups to the Women’s March, there were differences in not documenting or “lurking” participation for the immigrants’ groups, in comparison to subtle behavioral shifts, or deleting documentation, for the women’s groups. While merely symptoms of the more primary surveillance harms and possible repercussions, they represent distinct burdens and tradeoffs surrounding participation (Brennan-Marquez & Susser, work in progress). In this sense, privacy dramatically shaped participation, in addition to resources and governance.

Participatory privacy – including anonymous and pseudonymous participation, as well as non-identifiable participation in a crowd – presented a related set of strategies yet was independent of chilling effects and drew on diversity of modes of interaction. Specifically, the use of multiple platforms and institutional designs that obfuscated identification of sensitive attributes associated with stakeholders in the communities provided a means of participatory privacy, whether through encrypted channels for organization of immigration groups or proxy participants, as well as the use of existing organizational infrastructure, by March for Science groups, to protect junior scientists’ careers. Central to Facebook specifically, choices made by individual participants to utilize the least publicly transparent modes of interaction with public groups illustrate nuance to participatory privacy. Following, rather than joining, a group allowed individuals to include information from the groups in their newsfeeds and notifications, without anyone else knowing they were followers. Similarly, liking a group without joining it supported sharing the interest only with friends, rather than with the general public.

Privacy was also attained by many through autonomy. A commons arrangement, in which individuals and collectives had control over flows, rather than accepting other decision-makers’ choices around appropriateness or imposed flows, provided acceptable outcomes regarding personal information. Groups made choices to use multiple channels and delete contact resources, in order to contradict default practices or designs of platforms and prevent privacy harms imposed by platforms, law enforcement, or political opposition. These creative choices and work-arounds reflected a desire to overcome the structural and institutional efforts by online platforms, like Facebook, to undermine social norms about appropriate information flows (e.g. Strandburg, 2004, 2006) and were relatively successful at providing privacy to participants and stakeholders.

5.7.2 Commons Governance for Grassroots Political Organizing

The GKC provided a useful lens through which to explore how grassroots political organizing in public online spaces functions through commons governance, including privacy as governance. Not only were numerous norms and strategies revealed, as evidence of dependence on lower levels of institutionalization in complex and diverse contexts, but also the polycentric nature of governance arrangements was made visible.

Specifically, not only did individual movements have unique needs, which generated arrangement patterns, but individual groups illustrated creative ways of interacting with platforms like Facebook in their efforts to appropriately structure their communities. Differences in choices about platforms, as well as in the arrangements of tools and configurations within those platforms, illustrated experimentation and context specificity, as well as distinct limitations of Facebook. While Facebook allows groups to control events, those created by individuals cannot be shared with other accounts, from an administrative standpoint, leading to inconveniences, as well as lack of functionality, depending on the scope and development patterns for each group. Another flaw for organizing related to what groups had access to, which varied by scope; larger groups did not have access to full lists of respondents to verify against EventBrite RSVPs, though Facebook has access to who might attend. Furthermore, many unique arrangements were designed to protect privacy, through the use of less public channels.

5.7.3 Emerging Best Practices

While the scale and frame of this study do not lend themselves to generalizable principles for online organizing, a number of privacy practices can be identified from these groups as emerging best practices, reflecting a grounded approach toward description, rather than prescription. Specific privacy institutions are shared within and across movements, with respect to how to handle personally identifiable stories, contact information, event photographs, and discussion moderation.

First, and most broadly accepted, it appears to be a key to success to use multiple platforms, particularly for predicting turnout through RSVP functions and in generating and protecting email lists or other contact information. This allows groups to reach more diverse communities with shared interests, as well as to triangulate. While groups had different expectations around what to do with those forms of information, it was considered best practice to have multiple platforms and not entrust everything to a single platform.

Second, photographs should be archived off of public platforms, regardless of whether they are deemed appropriate to also be hosted on commercial public platforms. Private servers, rather than cloud services, were more often trusted, though some did depend on secure cloud back up. Many groups also backed up contact lists, though there was not a clear consensus on how to do this, and notable exceptions actually deleted all of this information, as well.

Third, in organizing around contentious political issues within public online spaces, moderation has been deemed absolutely critical, even if the extent of moderation and the underlying rationales are different. While some groups were concerned about removing hate speech or fake information, others were concerned about detraction from their carefully constructed messages; whatever the reason, some control over the content is important in pursuing specific goals and maintaining civil spaces.

Fourth, efforts were made in diverse groups to protect the personal stories and prevent identifiability of vulnerable members. Even though scientists perceived as public intellectuals were encouraged to disclose personal stories, junior scientists were protected in ways that were similar to individuals’ immigration statuses and women in vulnerable situations, often associated with sexual harassment or domestic violence.
