
Translating Big Data: How Tech Giants Reach Out to the World

How Google, Facebook, Twitter, and Other Tech Industry Giants Reach Out to the World

As with any other industry, the translation services industry has had to grow, change, and adapt with the times. As technology has evolved and people’s dependence on the internet, smartphones, and social media has expanded, the traditional methods of translation have become inadequate for keeping up with the stream of information and communication that constantly bombards us.

Translating the web: the old-fashioned method.

Once upon a time, the internet was a lot more like the print media that preceded it; information was out there, written and designed by its authors and presented as a single published work. Web pages, like printed pages, were compiled into directories manually and rarely modified.

For this reason, translating the web back in its pioneering, “wild west” days was handled much like translating the printed word. Translators were brought in to review and translate the content, and the translated versions were meticulously composed by hand by editing the original code directly, then stored separately alongside the original content.

Translation in the age of blogging.

With the rise of blogging in the early 2000s there was a content explosion on the web. Suddenly, you didn’t have to know how to design, code, or even host your own site. You could register for any number of free blogging services and make your voice heard to anyone who shared your interests, or at least knew your blog’s URL.

With the internet becoming an even more free and open place for the exchange of ideas, it became increasingly obvious that the old-fashioned way of web translation simply wasn’t going to cut it. People sharing their voice through free blog platforms certainly weren’t ready to start paying to translate their content into other languages, if they even had the opportunity or knowledge to make a translated version available.

And beyond that, with the sheer amount of text being produced every day, it would have been practically impossible to have it all translated professionally in the first place, never mind keeping the translations current when content was updated or changed slightly after it had been published. What was the solution?

With the rise in popularity of blogging came a solution, of sorts. Online machine translation services like Babel Fish and Google Translate moved beyond merely translating words and phrases piecemeal and allowed users to enter a web address to translate entire websites one page at a time, instantly and right on screen. The results weren’t perfect, but they were a big step forward in making all the internet’s content accessible to people around the world.

Translation and big data: how social media changed our perception of translation services.

Eventually, blogs gave way to the next wave of internet communication: social media. Major players like Twitter and Facebook were handling enormous volumes of data every day. Even machine translation, which requires no human interaction whatsoever, wasn’t able to keep up with the sheer amount of information.

Additionally, with social media itself breaking down the barriers of international communication, the fast but highly inaccurate translations provided by free translators weren’t giving users who spoke foreign languages the same experience. The result was alienating.

Interestingly, this led to the next step in website translation, which was a kind of combination of the previous two methods: a site’s own users would help translate the core content of the website, and machine translation would optionally be provided to help fill in the rest when foreign-language content was presented to a user who couldn’t read it. It was a surprisingly effective method, and one that’s still in use by these big players today.

Facebook

Facebook developed its own internal “Translate Facebook” app that lets bilingual users review text snippets from the site, vote on correct translations, flag incorrect ones, and submit their own. Users, who typically aren’t professional translators, volunteer their services out of a sense of pride and community, and also to help make the site a positive experience for themselves and for their fellow users in whichever languages they speak.
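To make the voting idea concrete, here is a minimal sketch of how a crowd-voted translation might be selected. It is not Facebook’s actual logic: the `Candidate` structure, the `best_translation` function, and the `min_score` threshold are all hypothetical names and assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One user-submitted translation of a text snippet (hypothetical structure)."""
    text: str
    upvotes: int = 0
    flags: int = 0

    @property
    def score(self) -> int:
        # Net approval: votes for the translation minus reports against it.
        return self.upvotes - self.flags


def best_translation(candidates, min_score=1):
    """Return the highest-scoring candidate text, or None if none is trusted yet."""
    trusted = [c for c in candidates if c.score >= min_score]
    if not trusted:
        return None
    return max(trusted, key=lambda c: c.score).text


# Three competing Spanish translations of the same English snippet.
candidates = [
    Candidate("Me gusta", upvotes=42, flags=1),
    Candidate("Gustar", upvotes=3, flags=9),
    Candidate("Me encanta", upvotes=5),
]
print(best_translation(candidates))
```

Requiring a minimum score before a translation is shown is one simple way to keep a single unreviewed submission from going live; a real system would likely weigh voter reputation as well.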

For post content, Facebook (in which Microsoft had bought a minority stake in 2007) built Bing translations into its site’s framework in 2011 to help users translate posts by other users and pages when they were written in a language other than their own. This was an “on demand” service, with users having to click a link to view the translated version of the text if they desired.

In 2016, Facebook dropped Bing in favor of its own internal translation logic, and in January of this year the company announced that it had replaced its previous “neural network” based technology with a new “vector-based” “multilingual word embedding” technology which it claims is “20 to 30 times faster than the natural language processing it had been using,” according to The Next Web.
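The general idea behind multilingual word embeddings is that words from different languages are mapped into one shared vector space, so that words with similar meanings end up close together. The sketch below illustrates only that idea and is not Facebook’s system: the three-dimensional vectors are invented for the example, and `nearest_word` is a hypothetical helper.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# Real embeddings are learned from data and have hundreds of dimensions.
embeddings = {
    ("en", "dog"): [0.90, 0.10, 0.00],
    ("es", "perro"): [0.88, 0.12, 0.02],
    ("es", "gato"): [0.10, 0.90, 0.00],
    ("es", "casa"): [0.00, 0.10, 0.95],
}


def cosine(a, b):
    """Cosine similarity: 1.0 means the two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def nearest_word(word, source_lang, target_lang):
    """Find the target-language word whose vector is closest to the source word."""
    query = embeddings[(source_lang, word)]
    candidates = [
        (w, vec) for (lang, w), vec in embeddings.items() if lang == target_lang
    ]
    return max(candidates, key=lambda item: cosine(query, item[1]))[0]


print(nearest_word("dog", "en", "es"))
```

Because the comparison is simple vector arithmetic, the same lookup works for any pair of languages that share the space, which is part of why vector-based approaches can be fast.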

Twitter

Twitter takes a similar approach to translation. Since 2009 it has relied on the power (and numbers) of its own userbase, with volunteers providing its translations. It launched its “Translation Center” in 2011 to make it easier for users to collaborate and translate each and every segment of text for its web interface and apps.

As Facebook once did, Twitter also uses Bing Translator to help translate its Tweet content. Twitter has taken it a step further, however, and also allows bilingual users to suggest their own translations of tweets if they feel the automatically generated Bing translation is not adequate. Because the content itself is so short, the likelihood of receiving a user-submitted translation is much higher than on services like Facebook, where posts can be composed of multiple paragraphs of text.

Google

And of course, Google has its own proprietary methods for translation as well.

With so many different services, apps, and projects, Google doesn’t have just one single approach to translation. Its methods for translating subtitles for YouTube, translating Android apps submitted through Google Play, and translating Google Drive documents and spreadsheets are handled in completely different ways. But at their core, they all have one thing in common: the Google Translate service is used as a sort of “base framework” for providing initial translations, which can then be edited, perfected, or even outright ignored and replaced by translations submitted by humans.

This method of translation is typically referred to as “hybrid translation,” as it combines fast machine translation technology with the improved comprehension, context, and readability provided by real, human translators.
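In code, the hybrid approach boils down to a lookup with a fallback: use a human translation if one exists, and machine output otherwise. The following is a minimal sketch under stated assumptions, not Google’s implementation; `machine_translate` is a placeholder standing in for a real machine translation service, and every name here is hypothetical.

```python
# Human-reviewed translations, keyed by (source text, target language).
human_translations = {
    ("Sign up", "es"): "Regístrate",
}


def machine_translate(text, target_lang):
    """Placeholder for a real machine translation call (hypothetical)."""
    return f"[machine:{target_lang}] {text}"


def translate(text, target_lang):
    """Prefer a human translation; fall back to machine output otherwise."""
    human = human_translations.get((text, target_lang))
    if human is not None:
        return human, "human"
    return machine_translate(text, target_lang), "machine"


def submit_human_translation(text, target_lang, translation):
    """Record a human translation that replaces the machine version from now on."""
    human_translations[(text, target_lang)] = translation


print(translate("Sign up", "es"))   # served from the human-reviewed store
print(translate("Log out", "es"))   # no human version yet, so machine output
submit_human_translation("Log out", "es", "Cerrar sesión")
print(translate("Log out", "es"))   # now served from the human-reviewed store
```

Returning the source of each translation alongside the text lets an interface label machine output as such, so readers know when they are seeing only the gist.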

If you’ve used Google Translate to translate just a sentence or snippet of text recently, you’ve probably seen the “Click to edit” pop-up that appears when you hover over the translation it provides. When you submit a translation, Google Translate supposedly doesn’t just store it for future retrieval if it’s deemed an improvement; it’s also used to train the “neural network” AI that provides the translations. This approach is said to be closer to real human thought than previous methods, which worked by breaking text down into recognizable fragments (phrase by phrase or word by word) and translating them piece by piece.

Google has also hooked the translation service into its other services. Because Google Translate is connected as a component of Google’s image recognition technology, for example, the two can learn from and improve each other: Google Translate helps recognize and comprehend text in foreign languages, providing additional data that’s usable by both services.

Looking to the future.

So, translating big data on the web has, in a way, come full circle—web translation started with pure human translation for content before moving in a completely different, machine-based direction, and has now finally swung back around to relying on humans for “important” text and using automated translations only when the gist of something is needed quickly.

With each new step in the evolution of the web—from static, handmade content to blogging platforms to social media—the need for translation, and the methods for providing it, have evolved as well. To know how the web will be translated in the future, we’ll only need to look for the next wave of change in the way we communicate on the web.




Translation Services USA® is the registered trademark of Translation Services USA LLC, New York, New Jersey