On the Relationships between Natural Language Processing and Cognitive Sciences
This introduction aims to give an overview of the questions and problems addressed jointly in natural language processing and cognitive science. More precisely, the idea of this introduction, and more generally of this book, is to address how these fields can cross-fertilize, each drawing on recent advances in the other to produce richer studies.
Natural language processing fundamentally deals with semantics and, more generally, with knowledge. Cognitive science also deals mostly with knowledge: how knowledge is acquired and processed in the brain. The two domains have developed largely independently, as we discuss later in this Introduction, but there are obvious links between them, and a large number of researchers have investigated problems involving both fields, in either the data or the methods used.
A Quick Historical Overview
The landscape of natural language processing (NLP) has changed dramatically in recent decades. Until recently, it was generally assumed that one first needs to adequately formalize an information context (for example, information contained in a text) in order to subsequently develop applications dealing with semantics (see, for example, Sowa 1991; Allen 1994; Nirenburg and Raskin 2004). This initial step involved manipulating large knowledge bases of hand-crafted rules, and gave rise to the field of “knowledge engineering” (Brachman and Levesque 2004).
Knowledge can be seen as the result of the confrontation of our a priori ideas with the reality of the outside world. Formalizing it therefore raises several difficulties: (1) the task is potentially infinite, since people constantly perceive a multiplicity of things; (2) perception interferes with information already registered in the brain, leading to complex inferences with commonsense knowledge; (3) very little is known about how information is processed in the brain, which makes things even harder to formalize.
To address some of these issues, a common assumption was that knowledge could be disconnected from perception, which led to projects aiming to develop large static databases of “common sense knowledge,” from CYC (Lenat 1995) to more recent general-domain ontologies like ConceptNet (Liu and Singh 2004). However, these projects have always led to databases that, despite their size, were never sufficient to completely and accurately formalize a given domain, and domain-independent applications were thus even more unattainable.