Other sub-corpora
Books – A collection of translated literature
DGT – A collection of EU Translation Memories provided by the JRC
DOGC – Documents from the Catalan Government
ECB – European Central Bank corpus
EMEA – European Medicines Agency documents
The EU bookshop corpus
EUconst – The European constitution
EUROPARL v7 – European Parliament Proceedings
giga-fren – French-English Giga-Word Corpus
GNOME – GNOME localization files
Global Voices – News stories in various languages
The Croatian – English WaC corpus
JRC-Acquis – legislative EU texts
KDE4 – KDE4 localization files (v.2)
KDEdoc – the KDE manual corpus
MBS – Belgisch Staatsblad corpus
memat – Xhosa/English parallel data
MontenegrinSubs – Montenegrin movie subtitles
MultiUN – Translated UN documents
News Commentary, v9.0, v9.1
OfisPublik – Breton – French parallel texts
OO – the OpenOffice.org corpus
OpenOffice.org 3 corpus
OpenSubtitles – the opensubtitles.org corpus
OpenSubtitles2011, OpenSubtitles2012, OpenSubtitles2013
OpenSubtitles2016 – snapshot from 2016
OpenSubtitles2018 – new complete version
ParaCrawl corpus
ParCor – A Parallel Pronoun-Coreference Corpus
PHP – the PHP manual corpus
Regeringsförklaringen – a tiny example corpus
SETIMES – A parallel corpus of the Balkan languages
SPC – Stockholm Parallel Corpora
Tatoeba – A DB of translated sentences
TedTalks hr-en
TED Talks 2013
Tanzil – A collection of Quran translations
TEP – The Tehran English-Persian subtitle corpus
Ubuntu – Ubuntu localization files
UN – Translated UN documents
Wikipedia – translated sentences from Wikipedia
WikiSource
WMT News Test Sets
The Xhosa – English Navy corpus
Mainstream CAT tools
SDL Trados
BasicCAT
Déjà Vu
MemoQ
Snowman CAT (雪人CAT)
OmegaT
Across
Transmate
WordFast
Yaxin CAT (雅信CAT)
Wordbee
SmartCAT
MateCAT
CafeTran
Source: Xibei Blog (西贝博客), https://qinghe.me/translators-resource-list.html