UIMA: an architectural approach to unstructured information processing in the corporate research environment

DAVID FERRUCCI; ADAM LALLY

doi:10.1017/S1351324904003523

Abstract

IBM Research has over 200 people working on Unstructured Information Management (UIM) technologies with a strong focus on Natural Language Processing (NLP). These researchers are engaged in activities ranging from natural language dialog, information retrieval, topic-tracking, named-entity detection, document classification and machine translation to bioinformatics and open-domain question answering. An analysis of these activities strongly suggested that improving the researchers' ability to quickly discover one another's results and rapidly combine different technologies and approaches would accelerate scientific advance. Furthermore, the ability to reuse and combine results through a common architecture and a robust software framework would accelerate the transfer of research results in NLP into IBM's product platforms. Market analyses indicating a growing need to process unstructured information, specifically multilingual natural language text, coupled with IBM Research's investment in NLP, led to the development of a middleware architecture for processing unstructured information, dubbed UIMA. At the heart of UIMA are powerful search capabilities and a data-driven framework for the development, composition and distributed deployment of analysis engines. In this paper we give a general introduction to UIMA, focusing on the design points of its analysis engine architecture, and we discuss how UIMA is helping to accelerate research and technology transfer.
