Named Entity Transliteration and Discovery in Multilingual Corpora
This chapter presents a novel algorithm for cross-lingual discovery of multiword named entities (NEs) in a bilingual, weakly temporally aligned corpus. It shows that using two independent sources of information, transliteration and temporal similarity, together to guide NE extraction yields better performance than using either one alone. The algorithm requires almost no supervision or linguistic knowledge. Evaluated on an English-Russian corpus, it showed a high level of NE discovery in Russian.
Keywords: algorithm, name recognition, transliteration, temporal similarity, English, Russian, machine learning