On the treadmill with voice recognition and machine translation.
NTT Docomo, Japan’s leading cell phone carrier, has unveiled its real-time speech-to-speech translation service, reports Geek.com. As a supplier of telephone interpreters, I’ve been waiting for this for a long time, since I used to imagine it would put us all out of work. But I’ll hold off on searching Craigslist for a new job just yet. The fact is, these tools are just window dressing, press release fodder for the uninitiated, because they just don’t work that well. I haven’t tested the Docomo tool, but I’m sure it’s the same old tech BS.
How do I know? Because I’m using the best speech recognition tool available to compose this post. That’s right, no more typing for me. It’s all voice recognition for TranslationGuy.
Tendinitis has ended my typing days. I’m using Dragon NaturallySpeaking 12, the latest and greatest voice recognition tool from Nuance. It is an amazingly powerful technology, a real game-changer. I’m writing twice as fast, with half the pain, after only a few weeks of intensive frustration. Thinking before speaking does not come naturally to me.
Nuance technical support tells me that the tool will never be able to transcribe the name of 1-800-Translate because of the unfortunate use of hyphens in our brand. And to Dragon’s tin ear, “Ken” remains indistinguishable from “10” or “can,” no matter how many times I program the tool. Unless I change our brand and my name (any suggestions welcome), a careful post-edit is always required.
Which reveals the big dirty secret behind Apple’s iOS, and it’s not that the maps are stupid, which I guess is not really a secret. The secret is that Siri sucks. Apple’s voice-recognition tool is powered by the same Nuance engine I am struggling to train right now. After hours of training to my voice, and with careful editing at my hand, it is an amazing tool, but when applied to real-time communication the error rate is too high. Siri may be geek-sexy, but her output does not put out.
This is great for my business, because it means our telephone interpreters are going to be around for a long time. We still turn the wheels of real-time translation, just as Fred Flintstone’s saber-tooth squirrels on their treadmills kept Stone-Age Bedrock rolling. Yabba dabba doo!
Without training first and post-edit last, voice recognition and machine translation are just useful-ish. Combining two halves of two incomplete processes just doubles your troubles. But once we get some saber-toothed linguists into the automation dust-up, wonders can be achieved.
If I could, I’d make everyone on my team use them all the time. The training would be killer, but the payoff is we would all make a lot more money, at least until the other guys figured it out.
Mastering these automation processes requires patience, persistence and practice (one of my Dad’s favorite lines, although I think he was mostly referring to sex whenever he said it). Making good use of these tools requires planning and changes to old habits of work and thought. The only thing more painful is carpal tunnel syndrome, or business lost to tech-savvy competitors, which hurts my wallet even more. Ouch!