Translation Memory is the gravy on top of the translation business, the fat of the translation land, the easy money, the cherry on top…. So who gets the cherry? (Go to 5:58 in “The Fighting 69 and 1 Half,” an old Warner Bros cartoon on YouTube, to see how these things can wind up.) Moral of the story: Never bring live steel to vendor meetings.
Thanks to the vast amount of data now being translated, and the ease with which existing translations can be automatically aligned into translation memory text strings, translation memory technology now reaches into every nook and cranny of a domain, which means that the bigger a translation memory, the better. So an industry-specific translation memory is better than a client-specific one. But if you want a share of that industry data, then you’ve got to share your own. That’s the way it works: for what you get, you’ve got to give.
So I was keen to see that the Translation Automation User Group (TAUS) just published guidelines on sharing those snowballing translation memories. “The TAUS Data Association [TDA] is a non-profit organization providing a neutral and secure platform for sharing language data.” Jaap van der Meer, the director of TAUS, on top of being a great ice skater, has a great vision for the future of the industry, so I’m glad to see he’s doing the Zamboni work necessary to keep the expansion of translation memory sliding along smoothly.
The guidelines proposed by the TDA establish a set of conditions for language data sharing. Data can be uploaded only by the owners of the data or with owner permission. The data owner and the data provider are identified whenever data is uploaded. And these owners can pull their data out of the pool at any time, since they maintain their intellectual property rights. But so long as the translation data remains in the pool, members of the association can use the data for translation memory and for derivative work (building machine translation tools, etc.), and can use and resell the same.
Members also have to pass the baton in this speed skate relay. They can download no more than ten times what they upload. Since sharing means caring, the TDA does not care to share with freeloaders. “This reciprocity―‘give and take’―is an essential principle of TDA.”
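For the bookkeeping-minded, the ten-to-one rule is easy to picture as a running tally. What follows is only a rough sketch of how such a quota could be tracked, not how the TDA platform actually does it. The `Member` class, its method names, and the idea of counting in abstract "units" (words, segments, whatever the association really measures) are all assumptions made for illustration. It also leaves out what happens to a member's allowance when an owner withdraws data, since the guidelines as quoted here don't say.

```python
from dataclasses import dataclass

# From the guideline described above: download at most ten times what you upload.
DOWNLOAD_RATIO = 10


@dataclass
class Member:
    """Hypothetical tally of one member's give and take."""
    name: str
    uploaded: int = 0    # units contributed (the unit itself is an assumption)
    downloaded: int = 0  # units taken out of the pool

    def download_allowance(self) -> int:
        """Units this member may still download under the 10x rule."""
        return max(0, DOWNLOAD_RATIO * self.uploaded - self.downloaded)

    def upload(self, units: int) -> None:
        """Record a contribution to the shared pool."""
        if units <= 0:
            raise ValueError("upload must be a positive number of units")
        self.uploaded += units

    def download(self, units: int) -> bool:
        """Record a download if it fits the allowance; return False otherwise."""
        if units <= 0:
            raise ValueError("download must be a positive number of units")
        if units > self.download_allowance():
            return False  # freeloading: the request exceeds ten times the upload
        self.downloaded += units
        return True
```

With this tally, a member who has uploaded 1,000 units could take out 8,000, would then have 2,000 left in the allowance, and would be turned away on a further request for 3,000 until more data goes into the pool.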
And for the fine print in those translator contracts and letters of engagement, TDA proposes a few boilerplate shavings for members to make sure clients know where their translation by-products are headed. “Client and vendor agree that copies of the translation memories generated as part of the translation project will be shared and uploaded to the TDA repository for optimal use in a collaborative industry environment.”
Will that read like “Danger, Thin Ice” to clients with all their little secrets? The TDA ROT (Rule of Thumb): “Share only what has already been published and keep from sharing any unannounced and unpublished products, features and services.”
And when translation memory skaters do fall through with added secret data, how much corporate intelligence can be gleaned from a bunch of new language pairs? You can imagine some spooky algorithms that can bare a company’s soul to competitors rooting through old translation memory, but it seems like a lot of trouble for the uncertain promise of buried corporate intelligence. Privacy in the age of the one big database can be a real lace bite.