Book contents
- Frontmatter
- Dedication
- Contents
- Foreword
- Preface
- Acknowledgements
- Making search work – critical success factors
- 1 Search must work
- 2 How search works
- 3 The search business
- 4 Making a business case for search
- 5 Specifying and selecting a search engine
- 6 Optimizing search performance
- 7 Search usability
- 8 Desktop search
- 9 Implementing web search
- 10 Implementing search for an intranet
- 11 Enterprise search
- 12 Multilingual search
- 13 Future directions
- Appendix Search software vendors
- Further reading
- Glossary
- Subject index
- Company index
Summary
In this chapter:
■ Why multilingual search is so difficult
■ The value of Unicode
■ The problems of transliteration
Searching the Tower of Babel
The issues involved in searching across multiple languages are often poorly understood, yet the ability to do so is becoming increasingly important. Research from Byte Level Research (http://bytelevel.com) indicates that the majority of internet users are not native English speakers.
From the perspective of website search, this means that a considerable number of visitors searching the site may have only a limited range of synonyms and limited linguistic awareness in the language of the site. This is not just a cross-national issue. It is estimated that over 300 languages are spoken in London alone, which is probably the most linguistically diverse city in the world. Clearly, in the period up to the 2008 Olympic Games, the growth in the number of Chinese users will be considerable.
The management of multiple languages also needs to be carefully considered in the enterprise environment. Just because an organization has English as its global corporate language does not mean that all documents will be in English. Documents relating to staff contracts and policies, and contracts with local suppliers, will invariably be in local languages. Patents and other legal documents will also be in more than one language. If any individual user is to have global access to the resources of the organization, two problems need to be addressed: how to search independently of the language skills of the searcher, and how to search independently of the languages of the documents.
Searching multiple languages
Many search engines claim to be able to search in multiple languages, but care must be taken over just what this means. It usually means that the search engine can parse documents written in a wide range of languages, create an index, and then run a query against that index to return a set of relevant documents. Although not easy to implement, this is now quite well-developed technology. It relies on Unicode to represent the text of each language in a standardized (or rather normalized) format. This enables a search to be carried out using a query in the destination language, that is, the language of the documents being searched.
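The parse, index and query sequence described above can be illustrated with a minimal sketch. It is not how any particular search engine works. It assumes Python's standard `unicodedata` module, NFKC normalization plus case folding as the normalized form, and whitespace tokenization (which would not be adequate for languages such as Chinese or Japanese that do not separate words with spaces). The function names `normalize`, `tokenize`, `build_index` and `search`, and the sample documents, are hypothetical and chosen only for illustration.

```python
import unicodedata
from collections import defaultdict


def normalize(text: str) -> str:
    """Map text to one canonical form so equivalent spellings compare equal.

    Assumption: NFKC plus case folding is used as the normalized form.
    """
    return unicodedata.normalize("NFKC", text).casefold()


def tokenize(text: str) -> list[str]:
    """Split normalized text on whitespace (a deliberately naive tokenizer)."""
    return normalize(text).split()


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Build an inverted index: normalized token -> ids of documents containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the ids of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample documents. The French text stores its accented letters
# as a base letter followed by a combining accent (U+0301).
docs = {
    "fr": "Le cafe\u0301 est ferme\u0301",
    "de": "Die Stra\u00dfe ist gesperrt",
    "en": "The cafe is closed",
}
index = build_index(docs)

# The query uses the precomposed character U+00E9, a different byte sequence
# from the stored text, but normalization makes the two forms identical.
print(search(index, "caf\u00e9"))
print(search(index, "STRASSE"))
```

Because the documents and the query pass through the same `normalize` function, a query typed with a precomposed accented character still matches text stored with combining accents, and case folding lets "STRASSE" match "Straße". Note that the sketch only matches a query against documents in the same language: a search for "cafe" finds the English document but not the French one.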
Chapter in Making Search Work: Implementing web, intranet and enterprise search, pp. 127-134. Publisher: Facet. Print publication year: 2007.