DARPA and Darmok Act 2: SpeechTrans Test
April 28, 2011 - In: Machine Translation

The Defense Advanced Research Projects Agency is spending $50 million a year in search of a universal translator just like the kind you can buy for $19.95 at the iTunes store. DARPA’s linguistic skunkworks have been searching for a universal translator ever since they saw those flip phones on Star Trek re-runs. But as we know from Captain Picard’s epic translation fail in the “Darmok” episode, an alien encounter that ends with a stabbing is never a good outcome for a machine-translation-fueled meeting, even if it’s only the alien who ends up bleeding out in the dust.

Spencer Ackerman whispers over research costs for such a tool in Psst, Military: There’s Already a Universal Translator in the App Store, recommending the SpeechTrans app that’s available for the iPhone. So I figured I would put the tool to the test at the TranslationGuy Translation Tool Torture Test Track, kind of as a public service to do my bit, but more in the hope that if I can save the gov’mint $50 million, I might get a bigger tax refund.

Now, since we don’t get any of that DARPA money, our budget was pretty constrained ($19.95 for the app), so to create the most realistic combat situation possible, testing was conducted via a conversation with my wife. We compared it with the Google Translate app, available for free at the App Store. If I understand correctly, SpeechTrans is basically a mash-up of the Nuance speech recognition engine and the Google Translate app, so the contest came down to interface and voice recognition.

So I walked in on the missus cold, whipped out my G4, opened the app, selected my language pair and started talking. “I’m going to have to leave lunch early because I have a conference call at 1:30.” And presto, my iPhone began speaking in Japanese. Barely audible, true, and absolute nonsense, but recognizably Japanese.

You’ve got to watch the text display of the speech recognition pretty closely, because if it gets even one word wrong, the machine translation will be wacky as hell, as we learned through repeated efforts to get it right. The most irritating feature is that the Nuance voice recognition engine cannot tell time. “One-thirty” is recognized as “130” instead of “1:30,” and with that missing colon you are going to miss lunch. Curiously, “half past one” worked just fine, but no one’s spoken that way since watches went digital. So despite five minutes of back and forth, we were never quite able to firm up our lunch plans.

Lunch was not translating very well, either. It took multiple recordings and recast sentences to get to a phrase that could be recognized by the translation engine, and then it took a lot of puzzling on the part of the listener to get the gist of the translation. For some inexplicable reason, the audio of the translated text is nearly inaudible. And there is no default language pair: you must select your languages afresh for every translation.
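For what it’s worth, the missing colon in “130” looks like the kind of thing a small cleanup pass between the speech recognizer and the translation engine might patch. Here is a minimal Python sketch of that idea, assuming the recognizer hands back plain text. The function name restore_clock_times and the list of cue words are made up for illustration; none of this is part of SpeechTrans, Nuance, or Google Translate.

```python
import re

# Hypothetical cue words that suggest the following number is a clock time.
# This list is an assumption for illustration, not taken from any real app.
_TIME_CUE = re.compile(r"\b(at|by|around|until)\s+(\d{3,4})\b", re.IGNORECASE)


def restore_clock_times(text: str) -> str:
    """Rewrite bare numbers like '130' as '1:30' when they follow a time cue."""

    def fix(match: re.Match) -> str:
        cue, digits = match.group(1), match.group(2)
        # Last two digits are minutes, whatever precedes them is the hour.
        hours, minutes = int(digits[:-2]), int(digits[-2:])
        # Assumes a 12-hour clock; anything else is left untouched.
        if 1 <= hours <= 12 and minutes < 60:
            return f"{cue} {hours}:{minutes:02d}"
        return match.group(0)

    return _TIME_CUE.sub(fix, text)


if __name__ == "__main__":
    print(restore_clock_times("I have a conference call at 130."))
```

A rule this crude will also mangle “meet me at 130 Main Street,” so treat it as a sketch of where a fix could go rather than a real solution.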

None of these problems exist with the Google Translate app. I’ve never been a fan of the Nuance speech recognition engine, I guess from all those hours spent in frustration with old versions of Dragon NaturallySpeaking, but I’ve been impressed by the high accuracy of Google’s voice recognition system from the first time I heard it. But as with any real-time translation system, success or failure is defined by the user interface. Google Translate defaults to previous language pairs, displays translation history, and toggles easily between languages on a single input screen. Record and translate with a single click. Very nice, even if it doesn’t translate very well. It took two or three passes to get the message across, but it was doable.

SpeechTrans also does machine translation for Twitter and Facebook, which strikes me as such a useless feature that I can’t be bothered to test it.

Look. I’m an enthusiast. We’ve been trying to make money off of machine translation for years, and the web-based stuff is incredibly useful, but I don’t get it with the hand-held apps. They are certainly the coolest props on the Star Trek set, but I just can’t see yet how they could be better than gesturing for most encounters. Maybe there’s a learning curve or some way of using these tools that I don’t get just because I’m so close to the business.

So the next step is to take some of these tools off the Translation Tool Torture Test Track and out into the mean streets of NYC and go talk with the tourists. If any readers would like to join me some Saturday afternoon, we can take out a video camera and see what we get. Comments from users are of course welcome, particularly those in uniform.

In the meantime, my official recommendation to DARPA: Keep spending.

Post slots and my attention span permitting, I’ll be taking a look at other MT tools in the days ahead. I’m thinking of devoting a page to just that subject as a way to help me and my readers stay on top of a quickly changing technology landscape.
