Many text mining systems introduced in the late 1990s were developed by computer scientists as part of academic “pure research” projects aimed at exploring the capabilities and performance of the technical components that make up these systems. Most current text mining systems, however, whether developed by academic researchers, commercial software developers, or in-house corporate programmers, are built for specialized applications that answer questions peculiar to a given problem space or industry need. Such specialized systems are especially well suited to academic or commercial activities in which large volumes of textual data must be analyzed to support decisions.
Three areas of analytical inquiry have proven particularly fertile ground for text mining applications. In corporate finance, bankers, analysts, and consultants have begun using text mining to sift through vast amounts of textual data in order to create usable business intelligence, note trends, identify correlations, and research references to specific transactions, corporate entities, or persons. In patent research, specialists across industry verticals at some of the world's largest companies and professional services firms apply text mining to investigate patent development strategies and to find better ways of exploiting existing corporate patent assets. In the life sciences, researchers are exploring enormous collections of biomedical research reports to identify complex patterns of interaction among proteins.
This chapter discusses prototypical text mining solutions adapted for use in each of these three problem spaces.