The Japanese lexical transducer based on stem-suffix style forms

Published online by Cambridge University Press:  01 December 1996

MASAKAZU TATENO, HIROSHI MASUICHI and HIROSHI UMEMOTO
Affiliation:
Corporate Research Laboratories, Fuji Xerox Company Limited, 430 Sakai, Nakai-machi, Ashigarakami-gun, Kanagawa 259-01, Japan. e-mail: tateno@rsl.crl.fujixerox.co.jp, masuichi@rsl.crl.fujixerox.co.jp, umemoto@rsl.crl.fujixerox.co.jp

Abstract

A Lexical Transducer (LT), as defined by Karttunen, Kaplan and Zaenen (1992), is a specialized finite-state transducer (FST) that relates citation forms of words and their morphological categories to inflected surface forms. LTs are advantageous because the same structure and algorithms serve both morphological analysis (stemming) and generation. Morphological processing (analysis and generation) is also computationally faster, and its data can be compacted more tightly, than with other methods. The standard way to construct an LT consists of three steps: (1) constructing a simple finite-state source lexicon LA, which defines all valid canonical citation forms of the language; (2) describing morphological alternations by means of two-level rules, compiling the rules to FSTs, and intersecting them to form a single rule transducer RT; and (3) composing LA and RT.
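The three construction steps can be illustrated with a minimal sketch in Python. It is not the authors' system and does not compile real finite-state machines: it models each relation extensionally as a set of (lexical, surface) string pairs, which is only workable for a tiny lexicon. The toy English lexicon, the two simplified rules, and all names (`LEXICON`, `FEASIBLE`, `rule_y_to_i`, `rule_e_insertion`, `build_transducer`, `generate`, `analyze`) are assumptions made for illustration; the rules deliberately ignore contexts such as "toy+s".

```python
from itertools import product

# Step 1: a toy source lexicon LA (hypothetical English forms, "+" = morpheme boundary).
LEXICON = {"cat", "cat+s", "spy", "spy+s"}

# Feasible surface realizations per lexical symbol; "0" is the empty surface symbol.
# Symbols not listed here can only be realized as themselves.
FEASIBLE = {"y": "yi", "+": "0e"}

def rule_y_to_i(pairs):
    """Simplified two-level rule  y:i <=> _ +:e  (y surfaces as i exactly before +:e)."""
    for k, (lex, surf) in enumerate(pairs):
        if lex == "y":
            before_e = k + 1 < len(pairs) and pairs[k + 1] == ("+", "e")
            if (surf == "i") != before_e:
                return False
    return True

def rule_e_insertion(pairs):
    """Simplified rule  +:e <=> y: _ s:  (boundary surfaces as e between lexical y and s)."""
    for k, (lex, surf) in enumerate(pairs):
        if lex == "+":
            in_context = (k > 0 and pairs[k - 1][0] == "y"
                          and k + 1 < len(pairs) and pairs[k + 1][0] == "s")
            if (surf == "e") != in_context:
                return False
    return True

RULES = [rule_y_to_i, rule_e_insertion]

def rule_relation(lexical):
    """Step 2: keep only alignments accepted by ALL rules (the intersection, RT)."""
    options = [FEASIBLE.get(ch, ch) for ch in lexical]
    for surface in product(*options):
        pairs = list(zip(lexical, surface))
        if all(rule(pairs) for rule in RULES):
            yield "".join(surface).replace("0", "")

def build_transducer(lexicon):
    """Step 3: compose LA with RT by restricting the rule relation to lexicon entries."""
    return {(lex, surf) for lex in lexicon for surf in rule_relation(lex)}

LT = build_transducer(LEXICON)

def generate(lexical):
    """Generation: lexical form -> surface forms."""
    return sorted(surf for lex, surf in LT if lex == lexical)

def analyze(surface):
    """Analysis: surface form -> lexical forms, using the same relation."""
    return sorted(lex for lex, surf in LT if surf == surface)
```

Because `LT` is a single relation, `generate("spy+s")` and `analyze("spies")` are lookups over the same data in opposite directions, which mirrors the bidirectionality the abstract attributes to LTs. A real implementation would compile the lexicon and rules into finite-state networks rather than enumerating string pairs.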

Type
Research Article
Copyright
1997 Cambridge University Press
