An English-language article in Technology Review describes the attempts of the FBI and CIA to analyse foreign-language documents using modern technology. Machine translation is no longer being pursued; the computer is meant to support human translators (in German "Humanübersetzer", a lovely word), no longer to replace them.
In the Technology Review, there is an article by Michael Erard called Translation in the Age of Terror. The topic is the National Virtual Translation Center in Washington D.C., the FBI & CIA’s joint project to expand the use of computer-assisted translation technologies in the intelligence community (to quote Michael Erard’s own description on the Forensic Linguists mailing list).
bq. In a Washington, DC, conference room soundproofed to thwart eavesdropping, five linguists working for the government – speaking on condition their names not be published – describe the monumental task they face analyzing foreign-language intercepts in the age of terror.
In view of the mass of material collected, technology has to assist translators. It seems that attempts to use machine translation have been given up in favour of techniques to assist human translators. For example, software might make Arabic easier to read – the documents are often in bad condition. In one example of the way things might work, a document containing one suspicious word is farmed out to a retired translator in Idaho (!), who uses translation memory with a shared database of translated phrases to determine that the document is not suspicious. (This is an odd example – I would have thought a quick human inspection would have done that, saving the OCR time, and if not, that machine translation rather than translation memory would have been used to give the document a once-over.)
Other software processes Arabic text to make it easier to search.
Of translation memory it says:
bq. A translation memory works sort of like a spell-check application; it selects a chunk of text – whether a word fragment, several words, or whole sentences – and matches that chunk against previously translated material, saving time and improving accuracy by providing at least a partial translation. It’s already a key tool in the medical and legal industries – where the same jargon frequently crops up in different languages.
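To make the matching step in that description concrete, here is a minimal sketch in Python. It is not how the tools mentioned in the article work internally, which the article does not say. The tiny English-German memory, the @lookup@ function and the 0.75 threshold are all invented for illustration, and the similarity score comes from the standard library's @difflib.SequenceMatcher@ rather than from any commercial product.

```python
from difflib import SequenceMatcher

# Hypothetical toy memory: previously translated source segments
# mapped to their stored translations (invented examples).
MEMORY = {
    "The contract is void.": "Der Vertrag ist nichtig.",
    "The contract is valid.": "Der Vertrag ist gültig.",
    "The court dismissed the claim.": "Das Gericht wies die Klage ab.",
}


def lookup(segment, memory=MEMORY, threshold=0.75):
    """Return (score, stored_source, stored_translation) for the most
    similar stored segment, or None if nothing reaches the threshold.

    The threshold is an assumed value chosen for this sketch.
    """
    best = None
    for source, target in memory.items():
        # ratio() is a similarity score between 0.0 and 1.0.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[0]:
            best = (score, source, target)
    if best is not None and best[0] >= threshold:
        return best
    return None


if __name__ == "__main__":
    # A near match: the translator is offered the stored translation
    # of the closest segment as a starting point.
    print(lookup("The contract was void."))
    # No useful match: the segment has to be translated from scratch.
    print(lookup("Zzz qqq 12345"))
```

A near match like the first one is only a suggestion: the stored translation still has to be adapted by the human translator, which is the point the article makes about assisting rather than replacing.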
The reference to the ‘legal industry’ is rather brief and I have my suspicions. The article also has a sidebar on ‘white elephants’ – transliteration and multimedia systems that have not lived up to the hopes invested in them in the past.
For more forensic linguistics links, see my earlier entry.