
Anniversary article: Then and now: 25 years of progress in natural language engineering
John Tait, Yorick Wilks
Journal: Natural Language Engineering / Volume 25 / Issue 3 / May 2019
Published online by Cambridge University Press: 15 May 2019, pp. 405-418
- Article
The paper reviews the state of the art of natural language engineering (NLE) around 1995, when this journal first appeared, and makes a critical comparison with the current state of the art in 2018, as we prepare the 25th Volume. Specifically the then state of the art in parsing, information extraction, chatbots, and dialogue systems, speech processing and machine translation are briefly reviewed. The emergence in the 1980s and 1990s of machine learning (ML) and statistical methods (SM) is noted. Important trends and areas of progress in the subsequent years are identified. In particular, the move to the use of n-grams or skip grams and/or chunking with part of speech tagging and away from whole sentence parsing is noted, as is the increasing dominance of SM and ML. Some outstanding issues which merit further research are briefly pointed out, including metaphor processing and the ethical implications of NLE.

10 - Language and communication
- By Yorick Wilks, University of Sheffield
Edited by Keith Frankish, The Open University, Milton Keynes, William M. Ramsey, University of Nevada, Las Vegas
Book: The Cambridge Handbook of Artificial Intelligence
Published online: 05 July 2014
Print publication: 12 June 2014, pp 213-231
- Chapter
Summary

Language and communication are considered as relevant to artificial intelligence. Linguists are not the only scientists wishing to test theories of language functioning: so do psychologists and neurophysiologists. This chapter briefly looks at samples of important and prescient early work, and shows two contrasting, slightly later, approaches to the extraction of content, evaluation, representation, and the role of knowledge. It considers a range of systems embodying natural language processing (NLP)/computational linguistics (CL) aspects since the early seventies, and divides them by their relationships to linguistic systems and in relation to concepts normally taken as central to AI, namely logic, knowledge, and semantics. Broadly, statistical methods imply the use of only numerical, quantitatively based, methods for NLP/CL, rather than methods based on representations, whether those are assigned by humans or by computers. The chapter discusses the role of annotations to texts and the interpretability of core AI representations.

Are there really two types of learning?
Yorick Wilks
Journal: Behavioral and Brain Sciences / Volume 9 / Issue 4 / December 1986
Published online by Cambridge University Press: 04 February 2010, p. 671
- Article

Artificial intelligence and real constraints
Yorick Wilks
Journal: Behavioral and Brain Sciences / Volume 1 / Issue 1 / March 1978
Published online by Cambridge University Press: 04 February 2010, p. 120
- Article

Relevance must be to someone
Yorick Wilks
Journal: Behavioral and Brain Sciences / Volume 10 / Issue 4 / December 1987
Published online by Cambridge University Press: 04 February 2010, pp. 735-736
- Article

Searle's straw men
Yorick Wilks
Journal: Behavioral and Brain Sciences / Volume 5 / Issue 2 / June 1982
Published online by Cambridge University Press: 04 February 2010, pp. 344-345
- Article

Leibniz, location, and distinguishing types of sensation
Yorick Wilks
Journal: Behavioral and Brain Sciences / Volume 1 / Issue 3 / September 1978
Published online by Cambridge University Press: 04 February 2010, p. 369
- Article

Lamarck, Artificial Intelligence (AI), and belief
Yorick Wilks
Journal: Behavioral and Brain Sciences / Volume 32 / Issue 6 / December 2009
Published online by Cambridge University Press: 28 January 2010, pp. 538-539
- Article
Nothing in McKay & Dennett's (M&D's) target article deals with the issue of how the adaptivity, or some other aspect, of beliefs might become a biological adaptation; which is to say, how the functions discussed might be coded in such a way in the brain that their development was also coded in gametes or sex transmission cells.

AI and Anglo-Saxon Attitudes: a response to Martin Lam
Yorick Wilks, Rio Grande
Journal: The Knowledge Engineering Review / Volume 5 / Issue 4 / December 1990
Published online by Cambridge University Press: 07 July 2009, pp. 285-288
- Article

Part 3 - Experiments in machine translation
Margaret Masterman
Edited by Yorick Wilks, University of Sheffield
Book: Language, Cohesion and Form
Published online: 22 September 2009
Print publication: 16 January 2005, pp 147-148
- Chapter

7 - Mechanical pidgin translation
Margaret Masterman
Edited by Yorick Wilks, University of Sheffield
Book: Language, Cohesion and Form
Published online: 22 September 2009
Print publication: 16 January 2005, pp 161-186
- Chapter
Summary

This chapter gives an estimate of the research value of word-for-word translation into a pidgin language, rather than into the full normal form of an output language.
Introduction
The basic problem in machine translation is that of multiple meaning, or polysemy. There are two lines of research that highlight this problem in that both set a low value on the information-carrying value of grammar and syntax, and a high one on the resolution of semantic ambiguity. These are:
matching the main content-bearing words and phrases with a semantic thesaurus that determines their meanings in context;
word-for-word matching translation into a pidgin language using a very large bilingual word-and-phrase dictionary.
This chapter examines the second.
The phrase ‘Mechanical Pidgin’ was first used by R. H. Richens to describe the output given at the beginning of Section 2 of this chapter (below), which, he said, was not English at all but a special language, with the vocabulary of English and a structure reminiscent of Chinese. Machine translation output always is a pidgin, whose characteristics per se are never investigated. Either the samples of this pidgin are post-edited into fuller English, or the nature of the output is explained away as low-level machine translation, or rough machine translation, or some vague remark is made to the effect that pidgin machine translation is all right for most purposes.
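As a rough illustration of the second line of research, word-for-word matching against a bilingual word-and-phrase dictionary can be sketched as a longest-match lookup. This is a minimal sketch, not the procedure used in the chapter: the function name `pidgin_translate`, the `max_phrase_len` parameter, the angle-bracket marking of unmatched words, and every entry in `TOY_DICTIONARY` are hypothetical and chosen only for illustration.

```python
def pidgin_translate(text, dictionary, max_phrase_len=3):
    """Word-for-word (and phrase-for-phrase) matching translation.

    At each position the longest dictionary phrase is tried first.
    Words with no entry are passed through in angle brackets, so the
    output stays a pidgin rather than a post-edited sentence.
    """
    words = text.lower().split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_phrase_len, len(words) - i), 0, -1):
            key = " ".join(words[i:i + n])
            if key in dictionary:
                output.append(dictionary[key])
                i += n
                break
        else:
            # No entry at any length: keep the source word, marked.
            output.append(f"<{words[i]}>")
            i += 1
    return " ".join(output)


# Hypothetical toy dictionary (illustrative glosses only).
TOY_DICTIONARY = {
    "agricola": "farmer",
    "in curvo": "with-curved",
    "in": "in",
    "terram": "earth",
    "aratro": "plough",
}

result = pidgin_translate("Agricola in curvo terram dimovit aratro", TOY_DICTIONARY)
print(result)
```

No reordering or grammatical repair is attempted, which is what leaves the output with source-language structure and target-language vocabulary.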

1 - Words
Margaret Masterman
Edited by Yorick Wilks, University of Sheffield
Book: Language, Cohesion and Form
Published online: 22 September 2009
Print publication: 16 January 2005, pp 21-38
- Chapter
Summary

To the question ‘What is a word?’ philosophers usually give, in succession (as the discussion proceeds), three replies:
‘Everybody knows what a word is.’
‘Nobody knows what a word is.’
‘From the point of view of logic and philosophy, it doesn't matter anyway what a word is, since the statement is what matters, not the word.’
In this paper I shall discuss these three reactions in turn, and dispute the last. Since it is part of my argument that the ways of thinking of several different disciplines must be correlated if we are to progress in our thinking as to what a word is, I shall try to exemplify as many differing contentions as possible by the use of the word ward, since this word is a word which can be used in all senses of ‘word’, which many words cannot.
Two preliminary points about terminology need to be made clear. I am using the word ‘word’ here in the type sense as used by logicians, rather than in the token sense, as synonymous with ‘record of single occurrence of pattern of sound-waves issuing from the mouth’. Thus, when I write here ‘mouth’, ‘mouth’, ‘mouth’, I write only one word.
The second point is that I use in this paper, in different senses, the terms ‘Use’, ‘usage’ and ‘use’. The question as to how the words ‘usage’ and ‘use’ should be used is, as philosophers know, a very thorny one.

11 - Braithwaite and Kuhn: Analogy-Clusters within and without Hypothetico-Deductive Systems in Science
Margaret Masterman
Edited by Yorick Wilks, University of Sheffield
Book: Language, Cohesion and Form
Published online: 22 September 2009
Print publication: 16 January 2005, pp 283-298
- Chapter
Summary

1. Current relativist conceptions of science depend widely, though vaguely, upon the insights of T. S. Kuhn (1962), and, in particular, upon his notion of a paradigm. This notion is being used by relativists to support the contention that, since scientific theory is paradigm-founded, and therefore context-based, there can be no one discernible process of scientific verification. However, as I have shown in an earlier paper (1970a), there is another, more exact conception of a Kuhnian paradigm to be considered: namely, that conception of it which says that it is either an analogically used artefact, or even sometimes an actual ‘crude analogy’, that is, an analogical figure of speech expressed in a string of words.
This alternative conception of paradigm, far from supporting a verification-deprived conception of science (which, for those of us philosophers who are also trying to do technological science, just seems a conception of science totally divorced from scientific reality) can, on the contrary, be used to enrich and amplify the most strictly verification-based philosophy of science that is known, namely the Braithwaitean conception of it as a verifiable hypothetico-deductive (H-D) system. For such a paradigm, even though, in unselfconscious scientific thinking, it is usually a crude and concrete conceptual structure, can yet be shown to yield a set of abstract attributes.

Index
Margaret Masterman
Edited by Yorick Wilks, University of Sheffield
Book: Language, Cohesion and Form
Published online: 22 September 2009
Print publication: 16 January 2005, pp 311-312
- Chapter

Other References
Margaret Masterman
Edited by Yorick Wilks, University of Sheffield
Book: Language, Cohesion and Form
Published online: 22 September 2009
Print publication: 16 January 2005, pp 304-310
- Chapter

8 - Translation
Margaret Masterman
Edited by Yorick Wilks, University of Sheffield
Book: Language, Cohesion and Form
Published online: 22 September 2009
Print publication: 16 January 2005, pp 187-224
- Chapter
Summary

The purpose of this chapter is to present a philosophical model of real translation. ‘Translation’ is here used in its ordinary sense: in the sense, that is, in which we say that passages of Burke can be translated into Ciceronian Latin prose, or that the sentence ‘He shot the wrong woman’ is untranslatable into good French. The term ‘philosophical’, however, needs some explaining, since, so far as I know, no one has made a philosophical model of translation as yet. I shall call a model of translation ‘philosophical’ if it has the following characteristics:
It must not only throw some light on the problem of transformation within a language, but must deal also with the problem of reference to something. That is to say, it must relate the strings of language units in the various languages with which it deals to public and recognisable situations in everyday life. It is characteristic of philosophers that, unlike most linguists, they do not regard a text in language as self-contained.
It must deal in concepts, not only in words or terms. All philosophers believe in concepts, though they sometimes pretend not to.
It must face, and not evade, the problem of constructing a universal grammar, while yet recognising fully how greatly languages differ, and how peripheral is the whole problem of determining the nature of language.

3 - Classification, concept-formation and language
Margaret Masterman
Edited by Yorick Wilks, University of Sheffield
Book: Language, Cohesion and Form
Published online: 22 September 2009
Print publication: 16 January 2005, pp 57-80
- Chapter
Summary

The argument of the paper is as follows:
The study of language, like the study of mathematical systems, has always been thought to be relevant to the study of forms of argument in science. Language as the scientist uses it, however, is assumed to be potentially interlingual, conceptual and classificatory. This fact makes current philosophical methods of studying language irrelevant to the philosophy of science.
An alternative method of analysing language is proposed. This is that we should take as a model for language the classification system of a great library. Such a classification system is described.
Classification systems of this kind, however, tend to break down because of the phenomena of profusion of meaning, extension of meaning and overlap of meaning in actual languages. The librarian finds that empirically based semantic aggregates (overlapping clusters of meanings) are forming within the system. These are defined as concepts. By taking these aggregates as units, the system can still be used to classify.
An outline sketch is given of a mathematical model of language, language being taken as a totality of semantic aggregates. Language, thus considered, forms a finite lattice. A procedure for retrieving information within the system is described.
The scientific procedures of phrase-coining, classifying and analogy-finding are described in terms of the model.
The point of relevance of the study of language to the philosophy of science
Two very general disciplines have always been thought especially relevant to our understanding of the nature of science.

5 - What is a thesaurus?
Margaret Masterman
Edited by Yorick Wilks, University of Sheffield
Book: Language, Cohesion and Form
Published online: 22 September 2009
Print publication: 16 January 2005, pp 107-146
- Chapter
Summary

Introduction
Faced with the necessity of saying, in a finite space and in an extremely finite time, what I believe the thesaurus theory of language to be, I have decided on the following procedure.
First, I give, in logical and mathematical terms, what I believe to be the abstract outlines of the theory. This account may sound abstract, but it is being currently put to practical use. That is to say, with its help an actual thesaurus to be used for medium-scale mechanical translation (MT) tests, and consisting of specifications in terms of archeheads, heads and syntax markers, made upon words, is being constructed straight on to punched cards. The cards are multiply punched; a nuisance, but they have to be, since the thesaurus in question has 800 heads. There is also an engineering bottleneck about interpreting them; at present, if we wish to reproduce the pack, every reproduced card has to be written on by hand, which makes the reproduction an arduous business; a business also that will become more and more arduous as the pack grows larger. If this interpreting difficulty can be overcome, however, we hope to be able to offer to reproduce this punched-card thesaurus mechanically, as we finish it, for any other MT group that is interested, so that, at last, repeatable, thesauric translations (or mistranslations) can be obtained.

Part 2 - The thesaurus as a tool for machine translation
Margaret Masterman
Edited by Yorick Wilks, University of Sheffield
Book: Language, Cohesion and Form
Published online: 22 September 2009
Print publication: 16 January 2005, pp 81-82
- Chapter

6 - ‘Agricola in curvo terram dimovit aratro’
Margaret Masterman
Edited by Yorick Wilks, University of Sheffield
Book: Language, Cohesion and Form
Published online: 22 September 2009
Print publication: 16 January 2005, pp 149-160
- Chapter
Summary

This chapter examines a first-stage translation from Latin into English with the aid of Roget's Thesaurus of a passage from Virgil's Georgics.
The essential feature of this program is the use of a thesaurus as an interlingua: the translation operations are carried out on a head language into which the input text is transformed and from which an output is obtained. The notion of ‘heads’ is taken from the concepts or topics under which Roget classified words in his thesaurus. These operations are of three kinds: semantic, syntactic and grammatical.
The general arrangement of the program is as follows:
1. Dictionary matching: the chunks of the input language are matched with the entries in a Latin interlingual dictionary giving the raw material of the head language; this consists of heads representing the semantic, syntactic and grammatical elements of the input.
2. Operations on the semantic heads: these give a first-stage translation.
3. Operations on the syntactic heads: giving a syntactically complete, though unparsed, translation.
4. Operations on the grammatical heads: giving a parsed and correctly ordered output.
5. Cleaning up operations: the output is ‘trimmed’ by, e.g., insertion of capital letters, removal of repetitions like ‘farmer-er’.
Only Stage 2 of the procedure is given in detail here.
Information obtained from stage 1
The Latin sentence to be translated was chunked as follows:
AGRI-COL-A IN-CURV-O TERR-AM DI-MOV-IT AR-ATRO
A number of these generated syntactic heads only. Those with semantic head entries are AGRI-COL-IN-CURV-TERR-DI-MOV-AR-.
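The result of the dictionary-matching stage described above can be sketched as a simple partition of the chunks. This is a minimal illustration only: the chunked sentence and the set of chunks with semantic head entries are taken from the text, while the function names `split_chunks` and `partition_chunks` are hypothetical, and the actual Roget heads assigned to each chunk are not given here and are not modelled.

```python
# Chunked input exactly as given in the text; hyphens separate chunks.
CHUNKED_SENTENCE = "AGRI-COL-A IN-CURV-O TERR-AM DI-MOV-IT AR-ATRO"

# Chunks the text lists as having semantic head entries.
SEMANTIC_CHUNKS = {"AGRI", "COL", "IN", "CURV", "TERR", "DI", "MOV", "AR"}


def split_chunks(chunked_sentence):
    """Flatten the hyphen-chunked words into one ordered list of chunks."""
    return [chunk
            for word in chunked_sentence.split()
            for chunk in word.split("-")]


def partition_chunks(chunks, semantic_chunks):
    """Separate chunks with semantic head entries from the remainder,
    which (per the text) generated syntactic heads only."""
    with_semantic = [c for c in chunks if c in semantic_chunks]
    syntactic_only = [c for c in chunks if c not in semantic_chunks]
    return with_semantic, syntactic_only


with_semantic, syntactic_only = partition_chunks(
    split_chunks(CHUNKED_SENTENCE), SEMANTIC_CHUNKS
)
print("-".join(with_semantic))  # AGRI-COL-IN-CURV-TERR-DI-MOV-AR
print(syntactic_only)           # ['A', 'O', 'AM', 'IT', 'ATRO']
```

The semantic chunks are what Stage 2 would then operate on; the remaining inflectional chunks would feed the later syntactic and grammatical stages.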
