Automatic OCR and translation in the field / Maschinelle Dokumentenübersetzung im Feld

The Defense Advanced Research Projects Agency (DARPA) (Wikipedia), which is responsible for developing new technology for the US military, has commissioned a Multilingual Automatic Document Classification, Analysis and Translation (MADCAT) system to convert documents of every kind, initially in Arabic, into English, reports Mark Rutherford on CNET.

If and when it pans out, MADCAT is expected to provide “relevant, distilled, actionable information” to commanders and troops on the ground by translating foreign language text images accurately and automatically without bothering with linguists and analysts, according to the contract specifications. During the MADCAT proposal process DARPA demanded bidders demonstrate a “revolutionary approach,” one that will produce a new benchmark in language translation. Specifically excluded were “minor evolutionary” improvements or “narrow applications” to current technology.
BBN says it plans to pull it off by integrating “optical character recognition with state-of-the-art translation and distillation techniques,” while developing “novel methods for processing handwritten text,” according to its press release.

DARPA’s original announcement is here.

Nomen est omen?

(via MacLingua at Yahoogroups)

6 thoughts on “Automatic OCR and translation in the field / Maschinelle Dokumentenübersetzung im Feld”

  1. I like the “Metrics” on the last page of the original announcement, especially the statistics for recognition and translation of handwriting.
    The “Baseline” (i.e. where we are at now) is 2%. Phase 5 (completed) is 90% accuracy.

    Alice in Wonderland anybody?

    • Hello,
      My name is Erez and I am a Media Buyer at Babylon (Babylon is a leading international translation software which has millions of customers and clients all over the world.


    • But did you see on Wikipedia what these people have developed? ARPANET, for example.

      Still, I liked the picture captioned ‘Is this graffiti or is it important?’ – I have the feeling there might not be time to OCR that in some situations in Iraq.

  2. The scary thing is that they hope the pussy will be capable of “automatically providing relevant, distilled ACTIONABLE (!!) information to military command and personnel”.
    So they want to ask an MT program whether they should blow the house up. And all because of a home artist’s painting on the wall.

    Wikipedia makes the worries worse – a house full of tunnel vision IT geeks who think that computers can do anything.

    • Yes, I suppose you’re right. It is as ridiculous as all these schemes are. I wonder if the tender by BBN promises everything DARPA wants.
      It does have the great advantage, though, of eliminating the poxy translator. It could indeed save money, on the basis of their figures, and as a result fewer interpreters and more soldiers might be killed.
