Djoerd Hiemstra
University of Twente/CTIT
P.O. Box 217, 7500 AE, Enschede
The Netherlands, Tel: +31 53 4892335
hiemstra@cs.utwente.nl
Franciska de Jong
University of Twente/CTIT
P.O. Box 217, 7500 AE, Enschede
The Netherlands, Tel: +31 53 4892335
fdejong@cs.utwente.nl
Wessel Kraaij
Netherlands Organization for Applied Scientific Research (TNO)
Stieltjesweg 1, 2628 CK, Delft
The Netherlands, Tel: +31 15 2692259
kraaij@tpd.tno.nl
Imagine yourself sitting behind a PC that is connected to the Internet. You are looking for information on some environmental issue: acid rain, tropical woods, solar energy, recycling, or whatever. You have found some documents in your own language, but they do not satisfy you. Being an exploring and curious person, you would like to browse through documents in another language that is not your mother tongue, even if you do not understand a word of this other language. Suppose this other language is French. You want to find out whether there are French documents on the Internet about the topic you are interested in. Imagine a computer system that allows you to type in a query in your own language and get back documents, originally in French, but translated by the system into your language, so that you can read them! Current translation technology is insufficient for this purpose. Still, the scenario described above is a realistic one. But how can it be done?
The answer is to simply accept a crippled syntax, because in information retrieval it is not the syntactic correctness of a text that is important but the relevance of the information to the user’s need. To judge whether a piece of information fits the user’s request, there is no need for grammatical correctness. Moreover, if the information appears to be important but is too ungrammatical or incomprehensible, the user can have it translated manually. In this way, only those texts that are actually used are translated manually.
Translation might also seem a strange request in a virtual world called the Internet that is dominated by the English language, nowadays generally accepted as the standard scientific and cultural language throughout the western world. But the real world is different. The majority of people are still very reluctant or unable to cross language borders. Moreover, a lot of very interesting information, especially in the environmental field, is simply not available in English. This information remains hidden from most people who do not speak the language in which it is written. Twenty-One aims at changing this situation radically, without the mega-effort of translating every piece of information into every possible language. By making use of modern natural language technology, documents will be translated fully automatically into the user’s language, whatever this language may be. For the moment, however, Twenty-One will be restricted to English, Dutch, French and German, with a tentative extension to Spanish.
Twenty-One is much more than automatic translation. The project aims at multimedia objects, rather than just pieces of textual information. Twenty-One can be characterized by the following keywords:
In 1992 the UNCED Conference in Rio de Janeiro, attended by more than 170 countries from all over the world, yielded a document called “Agenda 21”. This document outlines the basic principles and strategies for setting up sustainable development projects, and has become one of the most important documents for environmental organisations and local authorities all over the world. The conference had a follow-up in 1995 in Berlin. But two years earlier, in 1993, the European Commission started a huge “Awareness Programme” in Europe, to help local authorities, non-governmental organisations (NGOs) and citizens become aware of their potential role in the sustainable development of their own environment. The basic issue within this programme was (and is): how can one party make use of the experience of another party? For example: if a city in Holland has successfully carried out a project on vandalism, and the results might be applicable to a city in Italy, how and when does this information go from one place to another? Within the European Awareness Programme, networks and protocols were developed to monitor the collection and distribution of local sustainable development projects and initiatives. But there is still virtually no technical infrastructure. Moreover, the language borders and the translation of documents they require make it far too expensive to achieve the desired distribution. To fill this gap, the Twenty-One project started in January 1996 as a European project sponsored by the Telematics Programme of the European Commission, Sector Information Engineering. Through automatic disclosure and translation of documents, valuable information from any source, about any subject, in any language comes within reach of even the smallest wallet.
By July 1997 the Twenty-One project will be halfway through its duration. An intermediate demonstrator has already been realized. It contains most of the functionality mentioned above. The demonstrator allows users to look for documents in English and Dutch, to search for noun phrases, to search for similar documents (relevance feedback) and to look at bitmaps of the original (paper) versions of the documents. The demonstrator is available via the Twenty-One web page at: http://twentyone.tpd.tno.nl/21demomooi
Partners in Twenty-One are:
Industrial partners: Getronics (main contractor, The Netherlands), Highland Software Systems (Scotland), Rank Xerox (France)
Research Organisations and Universities: TNO-TPD (The Netherlands), DFKI (Germany), University of Tuebingen (Germany), University of Twente (The Netherlands)
Environmental organisations: Stichting MOOI (The Netherlands), Friends of the Earth (Belgium), VODO (France), Environ Trust (United Kingdom), Climate Alliance (Germany)
For more information please contact: Dr. W.G. ter Stal, Project Coordinator Twenty-One, Getronics Software, PO Box 22678, Amsterdam, the Netherlands, phone: +31-20-4306126, telefax: +31-20-4306030
This document was generated using the LaTeX2HTML translator Version 0.6.4 (Tues Aug 30 1994) Copyright © 1993, 1994, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
The translation was initiated by Djoerd Hiemstra on Tue Jun 16 16:09:01 MET DST 1998