Corpora/Korpora

(I drafted this entry before I read about the Utah Court)

I recently ‘attended’ a webinar about how translators can use corpora to investigate their target language.

I’ve been fascinated by corpora since I first encountered the Collins Cobuild English dictionary when I was teaching English – I think it was in the 1990s. The dictionary was quite a milestone: it used a database of usage examples to show that what people say is not always the same as what language teachers say they say.

Once I even tried to learn Python, after Mark Liberman said it was a good project for Christmas, but I did not get very far and suspect he has a larger brain than I have.

If you’ve got a free weekend or two, you could do a lot worse than to spend some time messing around with Python and NLTK — there’s even an online book to guide you.

I’ve also been to a (bricks-and-mortar) seminar on corpora, but at that time I did not follow it up by preparing my own corpus.

And this is where the eCPD webinar was so helpful, because it got me that far in half an hour the following evening, using free software (BootCat and AntConc).

In a later post I will give a description of how it works.

What you are doing with BootCat is creating a corpus made of texts from the internet, presumably HTML ones, so it’s not that different from a Google custom search engine, which you can make for yourself.

But with the second program (AntConc in my case) you can analyze the language in a variety of interesting ways.
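To give an idea of the kind of analysis meant here, the following is a minimal Python sketch of a keyword-in-context (KWIC) concordance, the sort of view a concordancer typically offers. It is not how AntConc itself works internally; the function name `concordance`, the `width` parameter and the sample sentences are all made up for illustration.

```python
import re


def concordance(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of keyword.

    Hypothetical helper for illustration only, not part of AntConc or BootCat.
    """
    lines = []
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        # Take up to `width` characters of context on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines


# Invented sample text standing in for a real corpus.
sample = (
    "The court gave judgment for the claimant. "
    "The judgment was handed down on Friday. "
    "In my judgment the appeal must be dismissed."
)

for line in concordance(sample, "judgment"):
    print(line)
```

With a real corpus you would read the text from the files collected earlier instead of a hard-coded string; lining the keyword up in a column is what makes recurring patterns such as "in my judgment" easy to spot.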

The webinar was ‘Using Corpora in Translation’, run by eCPD Webinars.

What I don’t know is how useful corpora are for legal translation. The third part of the webinar was on this subject. It was suggested that analyzing legal English in this way would be particularly interesting for legal translators who are not lawyers. I should think it would be interesting for legal translators who are lawyers too, and for lawyers who want to talk about law in English. Most lawyers forget a lot of their law, and they don’t necessarily think about the language of the law in the way a translator needs to.

But translating legal texts from one language and system to another is not the same as writing about a legal system in its original language.

A German lawyer who wants to write legal English could learn a lot from a well-constructed corpus. Areas of law that suggest themselves: judgments (the formal style), legal correspondence (formal style). Areas that seem more problematic are contracts (where the legal equivalence must be checked and apparently similar clauses cannot be taken over lock, stock and barrel) and all law which differs between the source and target legal systems (here an explanation is needed).

A potential source for a corpus of English judgments is www.bailii.org.

One might assume that the EU bilingual databases/TM would be useful for all lawyers writing about EU law in various languages. But these are inconsistent.

More weblinks/Nochmal Internetlinks

Still not posting much, so here are five links.

1. TestYourVocab is a site where you can get an informed estimate of the size of your English vocabulary. Average sizes for native speakers and non-native speakers are given at the end. Via Johnson.

2. Stan at Sentence First has an entry on Online IPA Keyboards, and the comments give more suggestions (this interests me because I sometimes want to discuss pronunciation).

3. The Obiter J blog, which I’ve recommended before for information on English law, has a summary of the differences in Norwegian law – a very different legal system. Among other links is one to a Guardian article on Norwegian prisons.

4. A nice post on Amy Winehouse on the LRB blog. You can’t hear the version of ‘Valerie’ linked to in Germany, but there are others on Youtube you can hear.

5. A new version of ApSIC X-Bench has appeared – see post by Lisa on Ü wie Übersetzen – XBench lets you search bilingual texts for vocabulary and is free of charge (I haven’t used it for a while so won’t say more than that it is very useful).

Photos from London/Fotos aus London

Statue of Yuri Gagarin (only there for a year, alas – the British Council wouldn’t let me see his spacesuit as you have to book in advance):

Lloyds:

Tottenham Court Road tube station:

Valuable cure for high blood pressure. I fancy the green hand:

Heirloomic assistance:

Corpus used in Utah court/Gericht in Utah benutzt Korpus

Mark Liberman at Language Log cites a Utah court that has relied on corpus linguistics. It was necessary to define the word custody in connection with a child, and some of the dictionary definitions were irrelevant. He cites Gordon Smith at Conglomerate:

Today, my former colleague and current Utah Supreme Court Justice Tom Lee used corpus linguistics in a lengthy concurring opinion (the relevant section starts at page 34). In this opinion, Justice Lee is interpreting the word “custody,” and he brings corpus linguistics to the fight. Of course, it’s no accident that Stephen Mouritsen is Justice Lee’s law clerk, but the bigger point here is that Justice Lee was persuaded — as I am — of the value of corpus linguistics to shed light on this interpretive question. Justice Lee’s colleagues are not enamored with the approach, but you can read the opinions for yourself and see who gets the better of the argument.

This was apparently the first judicial decision ever to rely on a corpus.

The question arises, and I want to turn to it shortly, whether corpora are useful for legal translation. My feeling is that a corpus could be useful to reveal the style of judgments, but less so when it comes to contracts. That might be because I believe that a translation of German law into English should not deny its foreign origins. But more shortly, in connection with a webinar on corpora I ‘attended’ recently.

Liver birds/Deutscher 50 Jahre nach seinem Tod geehrt

The Liver birds, a Liverpool landmark, were made by a German who had taken British citizenship twenty years earlier but was nevertheless treated as an enemy alien during the First World War and, after the war, forced to return to Germany, leaving his wife and child behind. Carl Bernard Bartels, whose father was a Black Forest woodcarver, did manage to return to England, but it’s only this week that he has been given some honour – on the centenary of the building. The Guardian:

“He also made artificial limbs for servicemen in the second world war,” said his great-grandson Tim Olden, a graphic artist from Southampton who is one of 13 family members travelling to Liverpool to receive the award. “But it’s only very recently that he has started to get real recognition. My mother took a ‘let things lie’ attitude, but one of her last wishes was to go and see the birds, and Liverpool gave her a warm welcome.”

The visit in 1998 began Liverpool’s rediscovery of Bartels, including his skill as the first person to sculpt a nonexistent bird only previously portrayed in drawings and paintings. He also managed to create a male and female, giving rise to the scouse legend that one or the other flaps its wings if a virgin or an honest man walks along Pier Head.

Presumably that’s why they never flap their wings.

Interpreting and translation links/Links zu Übersetzen und Dolmetschen

A new book – in German – Gerichtsdolmetschen, by Christiane Driesen and Haimo-Andreas Petersen – table of contents here.

(Via uepo.de)

Peter Newmark has died – more here. I believe he was originally a Germanist.

There has been a bizarre German blog contest called Frauen-Blog-WM 2011. This has been won by a translator’s blog, buurtaal by Alexandra Kleijn.