Machines, Lost in Translation: The Dream of Universal Understanding
It was early 1954 when computer scientists, for the first time, publicly revealed a machine that could translate between human languages. It became known as the Georgetown-IBM experiment: an "electronic brain" that translated sentences from Russian into English.
The scientists believed a universal translator, once developed, would not only give Americans a security edge over the Soviets, but also promote world peace by eliminating language barriers.
They also believed this kind of progress was just around the corner: Leon Dostert, the Georgetown language scholar who initiated the collaboration with IBM founder Thomas Watson, suggested that people might be able to use electronic translators to bridge several languages within five years, or even less.
Progress proved far slower. (So slow, in fact, that about a decade later, funders of the research launched an investigation into its lack of progress.) And more than 60 years later, a true real-time universal translator, à la C-3PO from Star Wars or the Babel Fish from The Hitchhiker's Guide to the Galaxy, is still the stuff of science fiction.
How far are we from one, really? Expert opinions vary. As with so many other areas of machine learning, it depends on how quickly computers can be trained to emulate human thinking.
Vikram Dendi says we're very close.
"It's cool to stand here and look back and say, 'We really turned science fiction into a reality,' " Dendi, the technical and strategy adviser to the chief of Microsoft Research, tells All Tech.
Microsoft's translation work has produced apps that can translate voice to voice and voice to text in addition to the familiar text-to-text. The big rollout this year was the Skype Translator, which takes what you say over video chat and turns it into spoken or written translations, currently in seven languages.
Microsoft, of course, is far from alone. A company called Voxox does Internet calling and chat, and has a text-to-text translation service for its messaging app. Google, in addition to its familiar text translations, has introduced a feature in its Translate app that uses your phone camera to scan an image of a foreign text and display the translation.
Stimulating Machines' Brains
After decades of linguistic and technological hurdles, scientists today rely on a technical approach known as the neural network method, in which machines are trained to emulate the way people think: in essence, an artificial version of the neural networks of our brains.
Neurons are nerve cells that are activated by all aspects of a person's environment, including words. The longer someone exists in an environment, the more elaborate that person's neural network becomes.
With the neural network method, the machine converts every word into its simplest representation: a vector, the equivalent of a neuron in a biological network, that contains information not only about each word but about a whole sentence or text. In machine learning terms, a neural network produces more accurate results the more translations it attempts, with limited assistance from a human.
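To make the idea of words as vectors concrete, here is a minimal sketch in Python. It is not how Skype Translator or any real system works: the three-number vectors are hand-picked for illustration (real systems learn vectors with hundreds of dimensions from data), and the names `TOY_VECTORS`, `cosine` and `sentence_vector` are hypothetical. It only shows how related words can sit close together and how word vectors can be combined into one vector for a whole phrase.

```python
import math

# Hypothetical, hand-picked vectors for illustration only.
# A trained network would learn these values from data.
TOY_VECTORS = {
    "bank": [0.9, 0.1, 0.0],
    "money": [0.8, 0.2, 0.1],
    "bench": [0.0, 0.9, 0.3],
    "park": [0.1, 0.8, 0.4],
}


def cosine(a, b):
    """Cosine similarity: near 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sentence_vector(words):
    """Average the word vectors to get one crude vector for a phrase."""
    vectors = [TOY_VECTORS[w] for w in words]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


if __name__ == "__main__":
    print(cosine(TOY_VECTORS["bank"], TOY_VECTORS["money"]))
    print(cosine(TOY_VECTORS["bank"], TOY_VECTORS["bench"]))
    print(sentence_vector(["bench", "park"]))
```

With these toy values, "bank" scores as far more similar to "money" than to "bench." Averaging is only the simplest way to combine word vectors; it is used here as an assumption to keep the sketch short.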
Though machines can now "learn" similarly to the way humans learn, they still face some limits, says Yoshua Bengio, a computer science professor at the University of Montreal who studies neural networks. One of the limits is the sheer amount of data required — children need far less of it to learn a language than machines do.
"(Machine translation) takes huge quantities of computation and data; it doesn't make sense," Bengio says. But there's promise in the neural network method. "It has the potential to reach human-level performance. It's focusing on the meaning of words, of dialogue."
This method builds on previous approaches to machine translation.
Early on, scientists taught computers to translate by manually inputting every rule for every language pair they wanted translated. If an adjective, for example, came after a noun in Russian, the computer would know to flip it so that the adjective came before the noun in English.
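A hand-written rule of that kind can be sketched in a few lines of Python. This is an illustration, not the 1954 system's actual logic: the function name `reorder_noun_adjective`, the part-of-speech tags and the word-by-word English glosses are all assumptions made for the example.

```python
def reorder_noun_adjective(tagged_words):
    """Apply one hand-written rule: if a noun is immediately followed by
    an adjective, swap them so the adjective comes first, as in English.

    tagged_words is a list of (word, tag) pairs. The tags "NOUN" and
    "ADJ" are hypothetical labels chosen for this sketch.
    """
    result = list(tagged_words)
    i = 0
    while i < len(result) - 1:
        if result[i][1] == "NOUN" and result[i + 1][1] == "ADJ":
            result[i], result[i + 1] = result[i + 1], result[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return result


if __name__ == "__main__":
    # Word-by-word glosses of a source phrase whose adjective follows the noun.
    glossed = [("the", "DET"), ("house", "NOUN"), ("white", "ADJ")]
    print(" ".join(word for word, _ in reorder_noun_adjective(glossed)))
```

This covers a single rule for a single pattern. A full system needed a separate rule like it for every construction, and every exception, in every language pair.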
A press release detailing the 1954 Georgetown-IBM experiment said that translating between two languages required more computer instructions than were needed "to simulate the flight of a guided missile."
In the face of multitudes of rules and exceptions in every language pair, the manual input approach quickly became tedious.
In the 1980s, scientists began moving toward a statistical model. The machines were fed lots of human-translated materials (for example, from the United Nations) and identified language patterns and rules themselves.
Words that came up multiple times within one text were a common focus, says Kevin Knight, a natural languages research professor at the University of Southern California. "For example, by studying a large collection of English-Spanish documents, every time a computer sees 'banco' on the Spanish side, you see either the (English) word 'bank' or 'bench.' "
The computer would eventually deduce that every time it finds a "banco de" on the Spanish side, it can eliminate "bench" from its English options, because typically "the bank of" indicates the name of a financial institution.
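The "banco" example can be sketched as simple counting. The Python below is a toy under stated assumptions: the five sentence pairs are invented for illustration (not drawn from United Nations documents), the names `TOY_CORPUS`, `CANDIDATES` and `count_translations` are hypothetical, and real statistical systems used far more elaborate probability models than raw counts.

```python
from collections import Counter, defaultdict

# Invented Spanish-English sentence pairs, for illustration only.
TOY_CORPUS = [
    ("el banco de españa sube los tipos", "the bank of spain raises rates"),
    ("el banco de inglaterra publica un informe", "the bank of england publishes a report"),
    ("me senté en el banco del parque", "i sat on the bench in the park"),
    ("el banco cerró mi cuenta", "the bank closed my account"),
    ("pintaron el banco verde", "they painted the bench green"),
]

CANDIDATES = ("bank", "bench")


def count_translations(corpus):
    """For each occurrence of 'banco', record which English candidate
    appears in the paired sentence, split by whether 'de' follows."""
    counts = defaultdict(Counter)
    for spanish, english in corpus:
        es_words = spanish.split()
        en_words = set(english.split())
        for i, word in enumerate(es_words):
            if word != "banco":
                continue
            next_word = es_words[i + 1] if i + 1 < len(es_words) else ""
            context = "banco de" if next_word == "de" else "banco (other)"
            for candidate in CANDIDATES:
                if candidate in en_words:
                    counts[context][candidate] += 1
    return counts


if __name__ == "__main__":
    for context, tally in count_translations(TOY_CORPUS).items():
        print(context, dict(tally))
```

In this toy corpus, "bench" never appears alongside "banco de," which is the kind of evidence that lets a statistical system rule it out in that context.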
Testing The Neural Networks
Neural networks, which became a popular tool for machine translation researchers in the 21st century, improved the quality of translations. Machines collect more information about each word and perform better probability analysis to avoid translations that sound unnatural.
How well does the approach work? I decided to take it for a test drive by trying Microsoft's Skype Translator, which is powered by neural networks.
I connected with Microsoft's Olivier Fontana over a Skype video chat. Fontana greeted me in French; after a few seconds, a male robot voice began translating his words into English. I brought NPR's resident French pro, Caroline Kelly, along for reinforcement. She commented that Skype appeared to be more fluent at English-to-French translations than vice versa.
Ultimately, the results were surprisingly accurate, especially when we talked about the subjects one would typically discuss with relatives, like travel plans for the holiday.
As with any video conferencing, this translation chat depended on a strong Internet connection, which helped with its ability to pick up laughter and weed out repetitions or "ums" and "ahs." Where the translation became muddled was when we discussed (or rather, attempted to discuss) the science and technology behind Skype Translator. The machine refused to distinguish between French words for "hip-hop" and "iPhone."
Dealing with the spoken word in a voice-to-voice translation adds another layer of complexity to machine translation because, in addition to producing accurate results, the computer also needs to detect laughter, stutters, repeats and accents. But, as the scientists say, the more you use machine translators, the better they become.
The neural network became a "momentum creator," says Microsoft's Dendi. "Without it, (Skype Translator) would still be a science fiction dream," Dendi says.
In other words, there's no saying where machine translation can go when the electronic brain meets the human one.