Learning Machine Translation

Cyril Goutte, Nicola Cancedda, Marc Dymetman, and George Foster

Print publication date: 2008

Print ISBN-13: 9780262072977

Published to MIT Press Scholarship Online: August 2013

DOI: 10.7551/mitpress/9780262072977.001.0001



Combination of Statistical Word Alignments Based on Multiple Preprocessing Schemes

Chapter:
(p.93) 5 Combination of Statistical Word Alignments Based on Multiple Preprocessing Schemes
Source:
Learning Machine Translation
Author(s):

Jakob Elming

Nizar Habash

Josep M. Crego

Publisher:
The MIT Press
DOI:10.7551/mitpress/9780262072977.003.0005

This chapter presents an approach to using multiple preprocessing (tokenization) schemes to improve statistical word alignments. In this approach, the text to align is tokenized before statistical alignment, and the resulting alignment is then remapped to the original word form. Multiple tokenizations yield multiple remappings (remapped alignments), which are then combined using supervised machine learning. The remapping strategy improves alignment correctness on its own. The combination of multiple remappings also improves measurably over a commonly used state-of-the-art baseline. A relative reduction in alignment error rate of about 38% is obtained on a blind test set.

Keywords:   multiple preprocessing, tokenization, word alignment, remapping, machine learning
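
The remapping step described in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation: it assumes each original word is split into one or more consecutive tokens, that an alignment is a set of (source index, target index) links, and that remapping simply projects each token-level link back onto the words the tokens came from. The names `token_to_word_map` and `remap_alignment` are hypothetical.

```python
from typing import List, Set, Tuple

# An alignment link: (source position, target position).
Link = Tuple[int, int]


def token_to_word_map(token_counts: List[int]) -> List[int]:
    """Build a token-index -> word-index map.

    token_counts[i] is the number of tokens that original word i was
    split into by a given tokenization scheme (assumed to be >= 1).
    """
    mapping: List[int] = []
    for word_idx, n_tokens in enumerate(token_counts):
        mapping.extend([word_idx] * n_tokens)
    return mapping


def remap_alignment(
    token_links: Set[Link], src_map: List[int], tgt_map: List[int]
) -> Set[Link]:
    """Project token-level links back onto original word positions.

    Links whose tokens come from the same word pair collapse into one
    word-level link because the result is a set.
    """
    return {(src_map[s], tgt_map[t]) for s, t in token_links}


# Illustrative data (made up): source word 0 was split into two tokens,
# the target side was left untokenized.
src_map = token_to_word_map([2, 1, 1])   # [0, 0, 1, 2]
tgt_map = token_to_word_map([1, 1, 1])   # [0, 1, 2]
token_links = {(0, 0), (1, 1), (2, 1), (3, 2)}
word_links = remap_alignment(token_links, src_map, tgt_map)
```

Running one such remapping per tokenization scheme gives several word-level alignments over the same original text, which is the form of input a combination step would need. How the chapter's supervised combiner uses them is not specified in the abstract and is not sketched here.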
