Learning Machine Translation
Cyril Goutte, Nicola Cancedda, Marc Dymetman, and George Foster

Print publication date: 2008

Print ISBN-13: 9780262072977

Published to MIT Press Scholarship Online: August 2013

DOI: 10.7551/mitpress/9780262072977.001.0001

Automatic Construction of Multilingual Name Dictionaries

Chapter:
(p.59) 3 Automatic Construction of Multilingual Name Dictionaries
Source:
Learning Machine Translation
Author(s):

Bruno Pouliquen

Ralf Steinberger

Publisher:
The MIT Press
DOI:10.7551/mitpress/9780262072977.003.0003

Machine translation and other natural language processing systems often lose performance when they process texts containing unknown words, such as proper names. Proper name dictionaries are rare and can never be complete, because new names appear all the time. One way to address this problem is to recognize and mark named entities in a text before translating it, and to carry them over untranslated. This chapter presents a method and a system that recognizes named entities of the types “person” and, to some extent, “organization” in multilingual text collections, and automatically identifies which of the newly found names are variants of a known name. Applying this to nineteen languages over a period of years has produced a multilingual name dictionary containing over 630,000 names and over 135,000 known variants, with up to 170 multilingual variants for a single name. The automatically generated name dictionary is used daily in NewsExplorer, a publicly accessible multilingual news aggregation and analysis system.

Keywords: machine translation, name recognition, multilingual recognition, proper names, language processing systems, NewsExplorer
