Google Translate Trashes the Web

by Translation Guy on June 20, 2011

The Internet is definitely worse. And it’s all because of the reckless and unsafe abuse of Google Translate.

Have you noticed? The scum floating to the surface of your searches? Despite the best efforts of the Borg-like algorithms of the Google Skynet collective consciousness, a whole class of content keeps bobbing up in your results. You’ve read it by mistake… acres of stupid, mindless content stretching on and on half-coherently to the digital horizon. A waste of time, but delicious Google spider-bait. It’s written for the spiders. In fact, almost everything you read on the web is Google spider-bait, since you wouldn’t be able to find it otherwise.

Sometimes that’s good. For example, when I do it. Like the birds and the bees. What my marketing guys make me do is write this blog as a kind of honey trap. I get you readers to click here (God bless each and every one of you) and read this stuff, maybe you tell your friends, send a link or a tweet, and with every action you also exude a little drop of honey-like link juice, which draws the Google spiders to weave their magic web around my site and boost us in the page rankings, higher and higher, towards a top-three position in the search engines. This tiny miracle is repeated billions of times a day in the busy beehive that is the Web.

But writing decent content for people goes, I figure, at about 300 words per hour, so a dedicated writer can produce about 10 pages a day, tops. It’s much easier to copy content from elsewhere on the Web, so now a lot of the Web is just a copy of other parts of the Web. But Google spiders are wise. They pass over duplicate content now and will punish you if they catch you doing too much of it. So to outsmart the spiders, savvy programmers have been violating Google Translate’s API rules and doing translation party tricks to hide that kind of cut-and-paste plagiarism. By translating and back-translating the content between language pairs, black hats can automate the creation of unique content, a link-juice bonanza for the spiders, and the bane of human searchers, who can recognize its uselessness at a glance.
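To see why the trick works, here is a minimal sketch of a naive duplicate check. It assumes word-shingle overlap (Jaccard similarity), which is a common textbook technique and not a description of what Google actually runs. The sample sentences are made up for illustration, and the "reworded" one is a hand-written stand-in for round-trip machine translation output, not real Google Translate output.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=3):
    """Jaccard similarity of two texts' shingle sets: 1.0 means identical sets."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two texts too short to shingle are treated as identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample text, invented for this sketch.
original = "the quick brown fox jumps over the lazy dog near the river bank"
copied = original
# Hand-written stand-in for what a translate-and-back-translate pass might produce.
reworded = "a fast brown fox leaps over a lazy dog close to the river shore"

copy_score = jaccard(original, copied)
reworded_score = jaccard(original, reworded)

print(f"straight copy: {copy_score:.2f}")
print(f"reworded copy: {reworded_score:.2f}")
```

The straight copy scores as a perfect match, while the reworded version shares almost no word sequences with its source even though any human reader would spot it as the same sentence. That gap between what a simple overlap check sees and what a person sees is the hole the back-translators are crawling through.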

And that’s just one particularly nefarious way in which Google Translate trashes the Web. Consider that the same productivity rules apply to translators and translation automation. All you busy translators here have been typing away today at about the same rate of 300 unique words per hour, the most microscopic trickle when compared to the Niagara Falls-like capability of Google’s vast server farms.

Now even in my Lord of the Flies-like Boy Scout Troop, we knew enough to drink upstream from where we peed. But Google can’t yet tell which way is downhill when it comes to translation. Good, bad, or incomprehensible: it’s all the same to the spiders. Thus the Web is trashed, and Google Translate did it.

I’ve already posted on this, and I plan to post more. The implications of this are far-reaching, and Google’s radical decision to “deprecate” the Google Translate API is only the start of a fundamental debate about Google’s role in organizing the Web. Just goes to show you that “Don’t be evil” is a lot harder than it sounds.

Thanks to commenters Prevedi and Mark, and to Kirti Vashee and Dion Wiggins of Asia Online, for your thoughts on these issues. For more on this, see Tim Carmody of Fast Company (if you can stand all the Joycean name-dropping) and the Atlantic’s James Fallows, the guy who always gets it.

2 Comments

  1. Mirko says:

    I just wonder if we don’t all overestimate the significance of the Google API move. If that MT/SEO scheme was really such a threat to Google’s core mission, wouldn’t they have teamed up with at least Microsoft, who also offer such an API? After all, they do partner with Bing on schema.org. I somehow can’t help but think that someone at Google relatively autonomously made a poor decision and then tried to correct it a few days later. That type of thing happens in any company all the time, and there is no reason at all to think it doesn’t happen at Google all the time too. That person was convinced to do the right thing because Larry Page wanted to bring back some focus and decided to deprecate a bunch of APIs, not just MT. Also, had Google identified the usage of the MT API as a menace to their search effectiveness, they would have switched it off immediately, not just announced its end of life for December. Sorry, this is the least exciting interpretation of an otherwise earth-shattering event, but that’s only because it’s probably closest to reality.

  2. Catherine says:

    Personally, I’m not a huge fan of GT, but it’s the misuse of GT that’s the problem. Here in Sweden, municipalities, local authorities, regional councils, tourist offices and businesses use GT to translate websites, marketing material, even press releases. It is this decision to pinch pennies when translating what they want to convey – after spending hundreds of thousands on graphics and interfaces – that’s the culprit. Poor decisions and no attention to detail or quality. Still haven’t figured out how to tackle it either. The general attitude to writing “weather” instead of “whether” appears to be “it’s good enough”.
