Introduction
Standard surveys often exclude members of certain groups, known as hard-to-survey groups. One reason for this exclusion is that group members are difficult to access. Persons who are homeless, for example, are very unlikely to be reached by a survey that uses random digit dialing. Other individuals can be accessed using standard survey techniques but are excluded because of issues in reporting: members of these groups are often reluctant to self-identify because of social pressure or stigma (Shelley, Bernard, Killworth, Johnsen, & McCarty, 1995). Individuals who are homosexual, for example, may not be comfortable revealing their sexual preferences to an unfamiliar survey enumerator. A third set of groups, such as commercial sex workers, is difficult to reach because of issues with both access and reporting. Even basic demographic information about these groups is typically unknown, especially in developing nations.
One approach to estimating demographic information about hard-to-reach groups is to reach members of these groups through their social networks. Some network-based approaches, such as respondent-driven sampling (RDS), recruit respondents directly from other respondents’ networks (Heckathorn, 1997, 2002), making the sampling mechanism similar to a stochastic process on the social network (Goel & Salganik, 2009). RDS (see Chapter 24 in this volume) affords researchers face-to-face contact with members of hard-to-reach groups, facilitating exhaustive interviews and even genetic or medical testing. The price of entry to these groups is high, however, because RDS requires a specially designed link-tracing framework for sampling. Estimates from RDS are also biased because of the network structure captured during selection, and much of the statistical work surrounding RDS aims to reweight RDS observations so that they have properties resembling those of a simple random sample. Thus, although methods such as RDS offer real advantages, such as direct interviews with members of hard-to-survey groups, financial and logistical challenges often prevent researchers from employing them, especially on a large scale.