Adversaries can also execute attacks designed to degrade the classifier's ability to distinguish between allowed and disallowed events. These Causative Availability attacks against learning algorithms cause the resulting classifiers to have unacceptably high false-positive rates; i.e., a successfully poisoned classifier will misclassify benign input as potential attacks, creating an unacceptable level of interruption in legitimate activity. This chapter provides a case study of one such attack on the SpamBayes spam detection system. We show that cleverly crafted attack messages—pernicious spam email that an uninformed human user would likely identify and label as spam—can exploit Spam- Bayes' learning algorithm, causing the learned classifier to have an unreasonably high false-positive rate. (Chapter 6 demonstrates Causative attacks that instead result in classifiers with an unreasonably high false-negative rate—these are Integrity attacks.) We also show effective defenses against these attacks and discuss the tradeoffs required to defend against them.
We examine several attacks against the SpamBayes spam filter, each of which embodies a particular insight into the vulnerability of the underlying learning technique. In doing so, we more broadly demonstrate attacks that could affect any system that uses a similar learning algorithm. The attacks we present target the learning algorithm used by the spam filter SpamBayes (spambayes.sourceforge.net), but several other filters also use the same underlying learning algorithm, including BogoFilter (bogofilter.sourceforge. net), the spam filter in Mozilla's Thunderbird email client (mozilla.org), and the machine learning component of SpamAssassin (spamassassin.apache.org). The primary difference between the learning elements of these three filters is in their tokenization methods; i.e., the learning algorithm is fundamentally identical, but each filter uses a different set of features. We demonstrate the vulnerability of the underlying algorithm for SpamBayes because it uses a pure machine learning method, it is familiar to the academic community (Meyer & Whateley 2004), and it is popular with over 700,000 downloads. Although here we only analyze SpamBayes, the fact that these other systems use the same learning algorithm suggests that other filters are also vulnerable to similar attacks. However, the overall effectiveness of the attacks would depend on how each of the other filters incorporates the learned classifier into the final filtering decision.