EU Acquis translation memory

I quote Bettina (in Linklogbuch):

The European Commission's Directorate-General for Translation has made the acquis communautaire (legislation, treaties and judgments of the European Court of Justice) available for download in 22 languages, in a format from which a translation memory in any language combination can easily be created.

I completely missed this, whenever it happened. The acquis communautaire (more or less) has now been made available by the Directorate-General for Translation in 22 languages in a form that can easily be converted into a translation memory (assuming one has enough RAM and disc space to use it). That will be a highly useful thing to have, although at the same time, I suspect, very diverse in the suggestions it makes.

There is a tool to extract the language pair you’re interested in.
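To give an idea of what such an extraction involves, here is a minimal sketch in Python. It is not the DGT's own tool. It assumes the data is in TMX, with one `<tu>` element per translation unit and one `<tuv xml:lang="...">` element per language. The function name `extract_pair`, the `SAMPLE` data and the language codes in it are my own illustrative choices.

```python
import xml.etree.ElementTree as ET

# Fully qualified name under which ElementTree exposes the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def extract_pair(tmx_text, src, tgt):
    """Return (source, target) segment pairs for one language pair.

    Hypothetical helper: units that lack either language are skipped.
    """
    src, tgt = src.upper(), tgt.upper()
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            # Accept both xml:lang (TMX 1.4) and the older plain lang attribute,
            # and compare on the primary subtag only ("EN-GB" -> "EN").
            code = tuv.get(XML_LANG) or tuv.get("lang") or ""
            lang = code.split("-")[0].upper()
            seg = tuv.find("seg")
            if seg is not None:
                segs[lang] = "".join(seg.itertext()).strip()
        if src in segs and tgt in segs:
            pairs.append((segs[src], segs[tgt]))
    return pairs


# Invented sample data, only to show the shape of the input.
SAMPLE = """<tmx version="1.4">
  <header/>
  <body>
    <tu>
      <tuv xml:lang="EN-GB"><seg>Article 1</seg></tuv>
      <tuv xml:lang="DE-DE"><seg>Artikel 1</seg></tuv>
      <tuv xml:lang="FR-FR"><seg>Article premier</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="EN-GB"><seg>Only in English</seg></tuv>
    </tu>
  </body>
</tmx>"""

if __name__ == "__main__":
    for source, target in extract_pair(SAMPLE, "en", "de"):
        print(source, "|", target)
```

This version reads the whole file into memory, which is fine for a sample but probably not for the full collection; a streaming parser such as `ET.iterparse` would be the more careful choice there.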

There is not only the DGT TM collection, but also a corpus for linguistics research called JRC-Acquis. Read more at:

Information in English

As of November 2007, the European Commission’s Directorate General for Translation (DGT) made publicly accessible its multilingual Translation Memory for the Acquis Communautaire – a collection of parallel texts (texts and their translation, also referred to as bi-texts) in 22 languages. On this page, you will find a summary of this unique resource and instructions on where to download it and how to produce bilingual aligned corpora for any of the 231 language pair combinations (462 language pair directions).
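The figures of 231 and 462 follow directly from having 22 languages. As a quick check (the function name `count_pairs` is just for illustration):

```python
import math


def count_pairs(n_languages):
    """Return (unordered pairs, ordered directions) for n languages."""
    combinations = math.comb(n_languages, 2)      # e.g. EN/DE counted once
    directions = n_languages * (n_languages - 1)  # EN->DE and DE->EN counted separately
    return combinations, directions


if __name__ == "__main__":
    print(count_pairs(22))  # (231, 462)
```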

5 thoughts on “EU Acquis translation memory”

  1. Interesting – at the ASETRAD/FIT Europe seminar on IP & CAT tools in Barcelona in September, the lawyer from the Commission was pretty adamant about copyright on the AC. And as creators of the databases, the DGT (and, therefore, the Commission) would have copyright on the TMs. Difficult to see how these corpora could be used commercially without affecting those copyrights – wonder what restrictions are placed on the usage of the database?

    • I’ve only ever used Celex for quoting legislation and decisions, which must be permitted. I wonder how far there would be copyright in phrases?

      Reading down, I find the third paragraph under 5) rather odd. If the legislation and quasi-legislative documents are in the public domain, how can their dissemination be acceptable only for ‘non-commercial use’? And you have to use a disclaimer that only the paper version of the OJ is authoritative! Does the DGT have every electronic citation checked against that?

      Another thing: English is treated as the source language throughout. Are there implications here? What about the legislative fiction that all the legislation is valid in all languages?

    • I’m not doing it until I get a new computer in a few weeks’ time, but I will be interested to hear how it goes. Maybe Bettina has done it.

      • Yes, she has. Some 379 000 segments were imported very smoothly into TRADOS. If I remember correctly, TRADOS reported a small number of errors, but so few that I didn’t bother to find out why. Happy New Year!
