  • Print publication year: 2006
  • Online publication date: August 2009

Appendix A - DIAL: A Dedicated Information Extraction Language for Text Mining



This appendix presents DIAL (declarative information analysis language), a dedicated information extraction language. Its purpose is to show the general structure of the language and to offer code examples that demonstrate how it can be used to extract concepts and relationships; it therefore does not cover every aspect and detail of the language.

DIAL is a dedicated information extraction language that lets the user define concepts whose instances the DIAL engine finds in a body of text. A DIAL concept is a logical entity that can represent a noun (such as a person, place, or institution), an event (such as a business merger between two companies or the election of a president), or any other entity for which a text pattern can be defined. An instance of a concept is found whenever the DIAL engine succeeds in matching the concept's pattern to part of the text it is processing. Concepts may have attributes, which are properties of the concept whose values are taken from the text of the concept instance. For instance, a “Date” concept might have numeric day, month, and year attributes and a string attribute for the day of the week.
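To make the notions of concept, instance, and attribute concrete, the following is a minimal sketch in Python rather than in DIAL itself. It is not DIAL syntax and does not reflect how the DIAL engine is implemented; the names `DateInstance`, `DATE_PATTERN`, and `find_dates`, and the single date format the pattern accepts, are assumptions made purely for illustration.

```python
import re
from dataclasses import dataclass
from typing import Iterator

# Month names, used both in the pattern and to derive the numeric month.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Hypothetical text pattern for one date format only,
# e.g. "Monday, 14 August 2006". A real concept would accept many formats.
DATE_PATTERN = re.compile(
    r"\b(?P<day_of_week>Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday),\s+"
    r"(?P<day>\d{1,2})\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<year>\d{4})\b"
)


@dataclass(frozen=True)
class DateInstance:
    """One instance of the illustrative "Date" concept."""
    day: int          # numeric attribute
    month: int        # numeric attribute (1-12)
    year: int         # numeric attribute
    day_of_week: str  # string attribute
    text: str         # the span of text that matched the pattern


def find_dates(text: str) -> Iterator[DateInstance]:
    """Yield a DateInstance for every match of DATE_PATTERN in text."""
    for match in DATE_PATTERN.finditer(text):
        yield DateInstance(
            day=int(match.group("day")),
            month=MONTHS.index(match.group("month")) + 1,
            year=int(match.group("year")),
            day_of_week=match.group("day_of_week"),
            text=match.group(0),
        )


if __name__ == "__main__":
    sample = "The merger was announced on Monday, 14 August 2006 in London."
    for instance in find_dates(sample):
        print(instance)
```

Here the regular expression plays the role of the concept pattern, each successful match is a concept instance, and the attribute values are filled in from the matched text. The sketch does not check that the day of the week is consistent with the calendar date.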

A DIAL concept declaration defines the concept's name, attributes, and optionally some additional code common to all instances of the concept.