Sampling

Bernard Chazelle

doi:10.1017/CBO9780511626371.005

4 - Sampling

Published online by Cambridge University Press: 05 October 2013


Affiliation: Princeton University, New Jersey


Summary

This chapter is about extracting small representative samples from large data sets. In the process we develop a complete computational theory of geometric sampling, with an eye toward the derandomization applications discussed in later chapters. It is difficult to overestimate the impact that this theory had on computational geometry in the 1990s.

The combinatorial discrepancy of a set system indicates how well, relative to its constituent subsets, we can sample the ground set by selecting about half of it. It is natural to ask what happens for different sample sizes. At one extreme, we might wonder how well we can sample a set if we are allowed to pick only a constant number of elements. For example, given a finite collection of points in the plane, is it possible to choose a subset of constant size, such that any disk that encloses at least one percent of the points also includes at least one sample point? Surprisingly, the answer is yes.
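The disk property described above can be checked directly on a concrete point set. The sketch below is only an illustration under stated assumptions: the helper names `in_disk` and `missed_heavy_disks` are hypothetical, and the sample is drawn uniformly at random, which satisfies the property only with high probability and is not claimed to be the construction developed in the chapter.

```python
import random

def in_disk(p, center, radius):
    """True if point p lies in the closed disk with the given center and radius."""
    dx, dy = p[0] - center[0], p[1] - center[1]
    return dx * dx + dy * dy <= radius * radius

def missed_heavy_disks(points, sample, disks, eps):
    """Return the disks holding at least eps * len(points) points but no sample point."""
    threshold = eps * len(points)
    missed = []
    for center, radius in disks:
        weight = sum(in_disk(p, center, radius) for p in points)
        hit = any(in_disk(s, center, radius) for s in sample)
        if weight >= threshold and not hit:
            missed.append((center, radius))
    return missed

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
    points = [(rng.random(), rng.random()) for _ in range(2000)]
    sample = rng.sample(points, 400)  # assumed sample size, not a bound from the text
    disks = [((rng.random(), rng.random()), 0.3 * rng.random()) for _ in range(200)]
    # Number of test disks with at least 1% of the points that miss the sample.
    print(len(missed_heavy_disks(points, sample, disks, eps=0.01)))
```

An empty result for a given family of disks means the sample hits every heavy disk in that family. Checking a finite family is evidence only, since the property in the text quantifies over all disks.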

In fact, something even stronger and stranger is true. Suppose that we want to estimate how many people live within 10 miles of a hospital in a given country. We can do this by sampling the population carefully, answering the question for the sample, and then scaling up appropriately. What is amazing is that, for a given relative error, the same sample size works just as well whether the country is Switzerland or China! Furthermore, we can change metrics and even lift the problem into higher-dimensional space, and the same remains true.
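The "answer for the sample, then scale up" step can be sketched in a few lines. This is a minimal illustration, assuming points in the plane, Euclidean distance, and a uniformly random sample. The names `within`, `true_count`, and `estimate_count` are hypothetical, and the error is measured here as a fraction of the whole population, which is one reading of "relative error".

```python
import random

def within(p, center, radius):
    """True if point p is at Euclidean distance at most radius from center."""
    dx, dy = p[0] - center[0], p[1] - center[1]
    return dx * dx + dy * dy <= radius * radius

def true_count(population, center, radius):
    """Exact number of population points within the given distance of center."""
    return sum(within(p, center, radius) for p in population)

def estimate_count(population, sample, center, radius):
    """Count hits in the sample, then scale by population size over sample size."""
    hits = sum(within(s, center, radius) for s in sample)
    return hits * len(population) / len(sample)

if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed, chosen arbitrarily for repeatability
    center, radius = (0.5, 0.5), 0.25
    for n in (10_000, 200_000):  # a small and a large "country"
        population = [(rng.random(), rng.random()) for _ in range(n)]
        sample = rng.sample(population, 1000)  # same sample size for both
        exact = true_count(population, center, radius)
        approx = estimate_count(population, sample, center, radius)
        # Error expressed as a fraction of the whole population.
        print(n, abs(approx - exact) / n)
```

The loop uses the same sample size for both population sizes so that the two error fractions can be compared side by side.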

Type: Chapter
Information: The Discrepancy Method: Randomness and Complexity, pp. 169-202


Publisher: Cambridge University Press

Print publication year: 2000
