Learning Machine Translation

Cyril Goutte, Nicola Cancedda, Marc Dymetman, and George Foster

Print publication date: 2008

Print ISBN-13: 9780262072977

Published to MIT Press Scholarship Online: August 2013

DOI: 10.7551/mitpress/9780262072977.001.0001

PRINTED FROM MIT PRESS SCHOLARSHIP ONLINE (www.mitpress.universitypressscholarship.com). (c) Copyright The MIT Press, 2022. All Rights Reserved. An individual user may print out a PDF of a single chapter of a monograph in MITSO for personal use. Date: 28 June 2022

Mining Patents for Parallel Corpora

(p.41) 2 Mining Patents for Parallel Corpora

Masao Utiyama

Hitoshi Isahara

The MIT Press

Large-scale parallel corpora are indispensable language resources for machine translation (MT). However, there are only a few publicly available large-scale parallel corpora. This chapter describes a Japanese-English patent parallel corpus created from patent families filed in Japan and the United States. The parallel corpus contains about 2 million sentence pairs that were aligned automatically. This is the largest Japanese-English parallel corpus and will be available to the public after the NTCIR-7 workshop meeting.

Keywords: large-scale parallel corpora, machine translation, Japanese-English patent parallel corpus, patent families
