Google Translates Dead Languages Better

October 14, 2010 - In: Machine Translation

The article Google Translates Dead Languages Better was originally posted on Technorati on October 11, 2010 by featured Technorati author, Kenneth Richard Clark.

Google’s newest machine translation language is Latin. The announcement was posted on the Official Google Blog in Latin, and when it was translated using the Google Latin machine translation tool, the results were pretty good. Or pretty bad, when you consider that practically every bit of the Latin that Google used to build its translation tool had already been translated dozens of times over by serious human translators.

So this:

“Hoc instrumentum convertendi Latinam rare usurum ut convertat nuntios electronicos vel epigrammata effigierum YouTubis intellegamus. Multi autem vetusti libri de philosophia, de physicis et de mathematica lingua Latina scripti sunt. Libri enim vero multi milia in Libris Googlis sunt qui praeclaros locos Latinos habent.”

Became this:

“This Latin translation system rarely be used to translate e-mails or understand the subtitles of YouTube videos. But many that are ancient books of philosophy, of physics and of mathematics are written in Latin. But many thousands of books are in Google Books, who have whole passages in Latin.”

OK. It’s not Cicero, but it’s good at what it does, which is to transform the ancient wisdom of the world into awkward, anodyne computer-prattle. Just what you’d expect from some barbarian gewgaw.

Good thing for Google and developer Jakob Uszkoreit that Latin is a dead language, because it’s a pretty strict language, even after death. Those Brit public school kids were always getting beaten for dropping a case or getting their veni, vidi, vici mixed up. And back when Latin was a living language, those old-timers left their canes in the shed and did their grammar instruction with the business end of a gladius. You didn’t want to mess with the Romans: they got even. So a message to Jakob: Caesar si viveret, ad remum dareris (If Caesar were alive, you’d be chained to an oar).

This is one dead language, which means fewer people fed to the lions and more previously translated phrases to dump into the machine memory. Because so much of Latin has already been translated, Google machine translation is consistently accurate for it. That’s also why in living languages, where maybe 1% or so has been translated, Google MT is sometimes so great and other times not too great at all. So if you stick to the classics in your Latin, nihil est (no problem). But reaching beyond the classics, to, say, tattoos, is going to be a bit trickier.
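
For the curious, here is a toy sketch in Python of why coverage matters so much. It is only an illustration of the "previously translated phrases" idea, not how Google's system actually works: the phrase memory, its entries, and the translate function are all made up for this example, and real machine translation is far more involved.

```python
# Toy phrase memory. Hypothetical data for illustration only;
# this is NOT Google's system or its data.
PHRASE_MEMORY = {
    ("carpe", "diem"): "seize the day",
    ("odi", "et", "amo"): "I hate and I love",
    ("alis", "volat", "suis"): "she flies with her own wings",
    ("veni",): "I came",
    ("vidi",): "I saw",
    ("vici",): "I conquered",
}

# Longest phrase (in words) stored in the memory.
MAX_PHRASE_LEN = max(len(key) for key in PHRASE_MEMORY)


def translate(text):
    """Greedy longest-match lookup.

    Returns (translation, coverage), where coverage is the share of
    input words that were found in the phrase memory.
    """
    words = text.lower().replace(",", "").split()
    output = []
    covered = 0
    i = 0
    while i < len(words):
        # Try the longest known phrase starting at position i first.
        for n in range(min(MAX_PHRASE_LEN, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + n])
            if chunk in PHRASE_MEMORY:
                output.append(PHRASE_MEMORY[chunk])
                covered += n
                i += n
                break
        else:
            # Nothing in memory: flag the word and pass it through.
            output.append("[" + words[i] + "?]")
            i += 1
    coverage = covered / len(words) if words else 0.0
    return " ".join(output), coverage


if __name__ == "__main__":
    for phrase in ["Carpe diem", "Veni, vidi, vici", "Carpe noctem"]:
        print(phrase, "->", translate(phrase))
```

Stick to phrases somebody has already translated and the lookup looks brilliant. Change one word, as in "Carpe noctem", and this toy has nothing to offer, which is roughly the tattoo problem in miniature.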

Classicist grad student J. Harker has taken a look at the problems with grammar among tattooists who favor the classics. He has divided Latin tattoos into three categories:

  1. Traditional, quoted good Latin – Carpe Diem, Odi et Amo, Alis Volat Suis, etc.
  2. Dog Latin, largely ‘incorrect’ but in wide use – Nolite te bastardes carborundorum, etc.
  3. Absolute fucking gibberish.

Some great pictures and commentary at Tales of a Wayward Classicist.

Final analysis: Based on the expertise I’ve accumulated in what sometimes seems like a lifetime in the translation business, there is a crying need for bad translation in the tattoo sector, and Google’s public-spirited contribution will meet that demand. Not to mention help kids to cheat on their Latin homework. Two thumbs up!