10 Billion Words Translated Daily

by Translation Guy on April 30, 2012

The largest translation effort in the world serves 200 million users a month, translating the textual equivalent of 1 million books a day. Can you guess who that might be? With numbers like that, you know it’s gotta be Google.

Franz Och, Distinguished Research Scientist (his title, not mine) at Google Translate, marked six years of progress in a recent post on the Google Translate Blog.

In just one day, Google translates as much as all the professional translators in the world translate in a year, says Google. A million books multiplied by 100,000 words per book is 10 billion words.

Only Google has the rack space to run this kind of volume, and the technology to do it efficiently. When they started out, the distinguished Och reports, it took 40 hours and 1000 machines to translate 1000 sentences, which for those of you lacking a technical background means that it sucked. “So we focused on speed, and a year later our system could translate a sentence in under a second, and with better quality. In early 2006, we rolled out our first languages: Chinese, then Arabic.” Now Google offers 62 languages total, and has a pretty good kit for the less commonly translated languages, so there are many more to come.

While those human translators billed billions for their billions of words, Google just gives the translations away. It does now cost money to license the app, at something like a penny or two per page, which by my best guess is about 0.03% of the cost of human translation. (Note: that’s 3/100 of a percent, not 3%.)

And a tip of the hat (from me, not the distinguished Och) to all of you who spend your lives translating words for a few pennies each. Thanks to you for providing all the good, paid translation that Google then scoops up in order to turn into bad, free translation. As the distinguished Och notes, “we believe that as machine translation encourages people to speak their own languages more and carry on more global conversations, translation experts will be more crucial than ever.” Note that he said “translation experts,” not “translators.”

But dollars and cents really don’t give a sense of what a transformative technology machine translation has become thanks to Google’s reach. Google Translate reports that 92% of queries originate outside the United States, and that use of Google Translate on mobiles is increasing at four times the rate of desk-bound systems. So Google Translate will become even more ubiquitous in the future than it is now.

Google has taken their current statistical approach as far as it can go, say I. For years statistical machine translation guys have been asking for more data.  But Google Translate reached critical mass a long time ago, and even vast new stores of data have had only the most incremental impact on the quality of their machine translation.

Alexis Madrigal argues in The Atlantic that Google is going to have to come up with some new tricks to make better use of the data they already have. “Google (or any other translation software) will have to start understanding (in some way) the semantic content of the words it is arranging.” Which strikes me as a kind of knuckle-headed comment that at least demonstrates that his blogging heart is in the right place.

Ilia Kaufman of NoBable was developing AI algorithms to identify domain (subject area) in text as a means to refine machine translation output. So that when the computer translates “the server is down” it will be translated one way for an IT text, i.e., “the computer has failed,” and another way for the hospitality industry, as in, “the waiter is injured.” That’s where I would put my money. But NoBable went out of business last week.
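To make the domain idea concrete, here is a toy sketch. It is not NoBable’s method or anyone’s production system: the keyword lists, the `SENSES` table, and both function names are hypothetical, made up purely for illustration. The guess at the domain is nothing fancier than counting keyword overlap with the surrounding text.

```python
import re

# Hypothetical keyword lists; a real system would learn these from data.
DOMAIN_KEYWORDS = {
    "it": {"computer", "network", "reboot", "database", "login"},
    "hospitality": {"restaurant", "table", "menu", "tip", "kitchen"},
}

# Hypothetical domain-specific readings of an ambiguous phrase.
SENSES = {
    "the server is down": {
        "it": "the computer has failed",
        "hospitality": "the waiter is injured",
    }
}


def guess_domain(context):
    """Pick the domain whose keywords overlap most with the context text."""
    words = set(re.findall(r"[a-z]+", context.lower()))
    scores = {d: len(words & kw) for d, kw in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None  # None means no evidence


def paraphrase(phrase, context):
    """Return the reading for the guessed domain, or the phrase unchanged."""
    readings = SENSES.get(phrase.lower(), {})
    return readings.get(guess_domain(context), phrase)


print(paraphrase("The server is down",
                 "Please reboot the computer and check the network"))
print(paraphrase("The server is down",
                 "Our restaurant kitchen is short-staffed tonight"))
```

With no domain clues in the context, the sketch leaves the phrase alone rather than guess, which is roughly the position any translation engine is in when it sees the sentence on its own.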

You can see Google Translate in action against Bing and BabelFish at my Free Translation Challenge. I thought this page would be a big hit, but it’s not as easy to give away free translation as it used to be. Thanks to Google Translate.

Comments

  1. Noel Alfari says:

    I can’t fathom that amount of text being translated in a year, let alone a day. Wow.

  2. I would think that MT would eventually reach its peak and it certainly won’t result in perfection. Too much for the machine to consider when coming up with the true meaning in the text.

  3. Thank you for another take on machine translation, I couldn’t help laughing in a few places! Isn’t Google Translate similar to Wikipedia in the sense that both receive a lot of criticism, but are nonetheless useful for both professional and casual use? As translators, we can be unhappy about Google’s stealing our clients, but it’s a valuable asset to us anyway. In the same vein, Wikipedia is grabbing market share from encyclopedias, etc., but folks who suffer as a result of Wikipedia’s rise also use Wikipedia themselves. Changes can be hard to accept, but resisting progress can be worse.

    • Ken says:

      My hope is that you were laughing at the right places, Roman, but one can never be sure. Interesting analogy. Thanks for your comments.

  4. At least you got called an “Expert.” :)

  5. Jamie Caub says:

    If Google Translate reports that 92% comes from outside the US, what languages are so popular? Or are they translating to English?

  6. I think they are overestimating their translating. If they are counting books at an average of 100,000 words each, those are some pretty good sized books to be translating on a daily basis. I would think that a lot of the translation on Google would be simple words and phrases.

  7. Governments better come up with laws to keep Google in check. Pretty soon they will be like Buy n Large from that movie WALL-E and control the whole planet.

    • Ken says:

      Bathroom scale data indicates we are headed in that direction. Great film.

  8. Vrba Rozim says:

    Google is the bomb. They are the one-stop shop for everything (I guess they are the Walmart of the internet). Anyway, even though they have so many useful products and apps, I really feel that people will go the avenue that they need. When they need something “expertly” translated, they will find the right person to do it and not rely on something free from the internet.

  9. Joshua Hanff says:

    I feel like the scarecrow in the picture when I read about Google. Is this company all we have to look forward to in the future? Will anyone or anything else ever compare (or compete)?

    • Ken says:

      Sure. Something worse.

  10. I think it’s great that there are free places for help translating materials. However, if I ever need something really important (legal or whatnot) translated, I’m going with a human and not a machine. If I need prostate surgery I will opt for the machine.

    • Ken says:

      May your prostate enjoy long and un-enlarged health.

  11. Djodja Nacuk says:

    When I read “the server is down,” I thought he was depressed (or it was).

  12. Jaden Tarle says:

    Those mobile inquiries are all of the tourists trying to read what the billboards and street signs are saying in Florida.

  13. Jirka Maczny says:

    On the level, Ken. How much do you think that free translation from sites like Google has negatively affected your business and others like it? Do you see translators having to close shop?

    • Ken says:

      Hmmm. Overall I would say the benefits outweigh the losses. Machine translation is like a gateway drug, facilitating relationships that may later require human translation. Translators who incorporate MT into the workflow and are paid at human rates profit in a way that we cannot, and the old draft and summary translation business is gone. The most painful bite for us was the big hit we took on selling our in-house machine translation capability, when our legal clients started programming illegal apps to take advantage of GT’s then un-metered service.

  14. I’m a translator and I’m not concerned about free translation. People will always look for a bargain and you get what you pay for. Do you want nice solid wood bookshelves or particle board ones that sag in the middle? Same for your translation.

    • Ken says:

      The Ikea model of translation. Innovation always comes at the low price point and not the other way around. I just decided to do a post on that.

  15. My guess is the data Google has is reviewed by machines and then used by machines. I can’t see that getting much better. They will need to come up with some new tricks.

  16. Sounds like you’re not really a fan of this Google stuff.

    • Ken says:

      I just like to mock those who use “distinguished” in their job title. I think Google Translate is Google’s greatest gift to mankind (in a Trojan Horse kind of way). I really think it is amazing and important.

  17. The free stuff is fine for the relatives living in different countries and they are trying to translate Christmas cards. The real translations take place by “experts” like Och says.

  18. What does Google get out of all of the free translation? It must cost them a small fortune to do this.

    • Ken says:

      I think it has something to do with global domination.

  19. Cassy says:

    Wow! That’s a lot of words translated in just a day.

    • Ken says:

      Too damn many, as far as I’m concerned.
