Machines That Think: The Rise of Neural Machine Translation
The evolution of neural machine translation (NMT) from its origins to today: How did NMT become a game changer for doing business globally?
When Charles Babbage first proposed the idea of a programmable computing machine in 1834, he imagined it being used to translate the languages of other nations.
Little did he know that 120 years later, during the 1954 Georgetown-IBM experiment, New York would witness the first demonstration of an automatic language translation machine that converted brief statements about fields such as politics, law, chemistry, and military affairs from Russian into English.
Imagine the reaction of the audience of scientists and engineers when they were told that a computer had just translated language in real time! Babbage’s imagination wasn’t too far off, after all. It didn’t stop there, though.
Machine translation (MT) has changed greatly over the course of its development: from systems that required hours or days of computing time to produce a translation of dubious quality to the current neural machine translation (NMT) systems that can process the same content in mere seconds and with much more accuracy.
In this guide, we’ll explore the origin of machine translation, how it has evolved from statistical models to neural machine translation, why neural machine translation is so good at what it does, implementation considerations, and more. Let’s get started.
What is neural machine translation?
Neural machine translation is a form of end-to-end learning that may be used to automate translation. In neural machine translation, the program’s neural network is responsible for encoding and decoding the source text, as opposed to running a set of predefined rules from the start.
NMT, therefore, has the potential to address many of the problems of traditional phrase-based translation systems and has been shown to produce better quality translations.
Before delving into what makes neural machine translation unique, let’s define machine translation in general and take a look at the other forms of machine translation. Without going into too much detail, a short description of each will help you understand what neural machine translation brings to the table.
Definition of machine translation
Machine translation is the process of using artificial intelligence (AI) to automatically translate content from one language to another without human input. In other words, a computer program translates text without the need for a human translator to intervene.
While the first experiments in automatic translation took place in the early 1950s, it was only in the early 2000s—with the use of statistical methods—that machine translation began to take off.
The quality of translation in the early days was very basic, and building the systems required a lot of effort. Unlike modern deep learning systems, which learn translation patterns from data, early machine translation models required developers to manually define and program what was effectively a large set of rules.
Types of machine translation
Aside from neural machine translation, there are two main types of machine translation:
- Rule-based machine translation: This form of machine translation, now mostly obsolete, relies on linguistic information about the source and target languages. Using grammar structures, human linguists establish rules for sentence structure, word order, and phraseology for the input and output language. Next, after retrieving the necessary information from dictionaries, the system maps each source-language word to an adequate translation in the target language.
- Statistical machine translation: Statistical models work by analyzing enormous amounts of existing translations and multilingual corpora and looking for statistical patterns in this input. These patterns allow the program to generate a hypothesis about how it should translate other similarly constructed texts in the future. The resources needed for training the models are large—you need millions of words to train the engine in one particular domain—but the results can be quite good, especially in more technical or scientific texts. The statistical translation models were initially word-based but later evolved into phrase-based systems that capture word context.
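To make the statistical approach more concrete, here is a minimal sketch of a phrase-based translator. It assumes a tiny, hand-made list of aligned phrase pairs (real systems extract these from millions of words of parallel text), and the names `ALIGNED_PHRASES`, `build_phrase_table`, and `translate` are hypothetical, chosen only for this illustration. It estimates translation probabilities by counting and then translates greedily, phrase by phrase.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (source phrase, target phrase) pairs, as if they had
# been extracted from a parallel corpus. Real systems use millions of words.
ALIGNED_PHRASES = [
    ("la maison", "the house"),
    ("la maison", "the house"),
    ("la maison", "the home"),
    ("est", "is"),
    ("bleue", "blue"),
    ("bleue", "blue"),
]


def build_phrase_table(pairs):
    """Estimate P(target | source) by relative frequency of the observed pairs."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    table = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        table[source] = {t: c / total for t, c in targets.items()}
    return table


def translate(sentence, table, max_phrase_len=3):
    """Greedy decoding: match the longest known phrase, pick its likeliest target."""
    words = sentence.split()
    output = []
    i = 0
    while i < len(words):
        for length in range(min(max_phrase_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + length])
            if phrase in table:
                output.append(max(table[phrase], key=table[phrase].get))
                i += length
                break
        else:
            output.append(words[i])  # unknown word: copy it through unchanged
            i += 1
    return " ".join(output)


table = build_phrase_table(ALIGNED_PHRASES)
print(translate("la maison est bleue", table))  # prints: the house is blue
```

Each phrase is translated on its own, with no knowledge of the rest of the sentence. That is the limitation the neural approach is meant to address.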
How is neural machine translation different?
Neural network models are very different from phrase-based systems. Whereas the latter breaks an input sentence into a set of words and phrases and maps each to a word or phrase in the target language, neural networks take into account the whole input sentence at each step when generating the output sentence.
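One common way for a network to take the whole input sentence into account at each step is an attention mechanism. The text above does not name the mechanism, so treat this as an illustrative assumption rather than a description of any particular engine. In the sketch below, the vectors are made-up two-dimensional values and `attend` is a hypothetical helper. At one decoding step, every source position receives a weight, and the weights sum to 1.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(decoder_state, encoder_states):
    """Weight every source position, then average the encoder states."""
    weights = softmax([dot(decoder_state, h) for h in encoder_states])
    dim = len(encoder_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context


# Made-up encoder states for a three-word source sentence (assumption).
source_tokens = ["das", "blaue", "Haus"]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]  # made-up decoder state for one output step

weights, context = attend(decoder_state, encoder_states)
for token, weight in zip(source_tokens, weights):
    print(f"{token}: {weight:.2f}")
```

No source word is ever dropped from consideration. The model only shifts how much weight each one gets from one output word to the next.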
In 2016, Google Translate switched from phrase-based statistical translation to its Google Neural Machine Translation system (GNMT). Google stated that the new approach required fewer engineering and design choices while increasing accuracy and speed.
How does neural machine translation work?
Neural machine translation uses neural networks to translate source text to target text. These networks can learn from very large datasets and require little manual rule-writing. Neural machine translation systems have two main components: an encoder network, which reads the source sentence and turns it into a numerical representation, and a decoder network, which generates the target sentence from that representation. Both are neural networks.
What is a neural network?
A neural network is an interconnected series of nodes, loosely modeled on the human brain. It’s an information-processing system in which input data is passed through these nodes to produce output. The encoder-decoder architecture used in NMT is called a sequence-to-sequence (Seq2Seq) neural network, which works by reading one source-language sentence and producing a corresponding target-language sentence.
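The encoder-decoder data flow can be sketched in a few dozen lines. This is a toy under loud assumptions: the weights are random and untrained, so the output words are meaningless. The vocabularies, the hidden size, and the simple recurrent `step` function are all made up for the illustration. Only the control flow is the point: the encoder reads the source tokens into a state, and the decoder emits target tokens one at a time until it produces an end marker.

```python
import math
import random

random.seed(0)  # fixed seed so the untrained toy behaves the same on every run

HIDDEN = 4
SOURCE_VOCAB = ["<eos>", "hola", "mundo"]   # hypothetical toy vocabularies
TARGET_VOCAB = ["<eos>", "hello", "world"]


def random_vector(size):
    return [random.uniform(-1, 1) for _ in range(size)]


def random_matrix(rows, cols):
    return [random_vector(cols) for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def step(w_in, w_rec, embedding, state):
    """One recurrent step: new_state = tanh(W_in @ embedding + W_rec @ state)."""
    a = matvec(w_in, embedding)
    b = matvec(w_rec, state)
    return [math.tanh(x + y) for x, y in zip(a, b)]


# Randomly initialised parameters; a real system learns these from data.
src_embed = {tok: random_vector(HIDDEN) for tok in SOURCE_VOCAB}
tgt_embed = {tok: random_vector(HIDDEN) for tok in TARGET_VOCAB}
enc_in, enc_rec = random_matrix(HIDDEN, HIDDEN), random_matrix(HIDDEN, HIDDEN)
dec_in, dec_rec = random_matrix(HIDDEN, HIDDEN), random_matrix(HIDDEN, HIDDEN)
out_proj = random_matrix(len(TARGET_VOCAB), HIDDEN)


def encode(tokens):
    """Encoder: fold the source sentence into a single state vector."""
    state = [0.0] * HIDDEN
    for tok in tokens:
        state = step(enc_in, enc_rec, src_embed[tok], state)
    return state


def decode(state, max_len=5):
    """Decoder: emit the highest-scoring target token until <eos> or max_len."""
    output = []
    prev = "<eos>"  # reused here as the start-of-sentence symbol
    for _ in range(max_len):
        state = step(dec_in, dec_rec, tgt_embed[prev], state)
        scores = matvec(out_proj, state)
        prev = TARGET_VOCAB[scores.index(max(scores))]
        if prev == "<eos>":
            break
        output.append(prev)
    return output


result = decode(encode(["hola", "mundo"]))
print(result)  # arbitrary tokens: the weights are untrained
```

Training would adjust all of these matrices so that the decoder's highest-scoring token at each step matches the human reference translation.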
What are the benefits of neural machine translation?
The power of NMT lies in its neural network architecture, which allows it to learn from vast amounts of data and adapt to new contexts. This makes neural machine translation an ideal technology for companies that need to translate lots of content quickly, accurately, and flexibly.
In a nutshell, the benefits of neural machine translation can be summed up as:
- High accuracy: Drawing from ever-growing data sets and using language modeling, NMT engines can take the broader context of words and phrases into account to produce more accurate and fluent translations, and they improve as more data becomes available. By contrast, conventional phrase-based MT only considers the context of a few words on either side of the translated word.
- Fast learning: Neural networks can be trained quickly through automated processes, unlike the costly and largely manual methods required for rule-based MT.
- Simple integration and flexibility: A benefit that NMT carries over from its statistical predecessor is that it can be integrated via APIs and SDKs into any software and applied to many content file formats.
- Customization: You are usually able to customize the output of NMT and update the model through terminology databases, brand-specific glossaries, and other data sources to improve results.
- Cost efficiency: Human translation can be costly, especially in projects that involve a lot of words and many languages. NMT enables you to take advantage of highly accurate and fast systems to produce translations at a fraction of the cost. Whenever needed, you can rely on human translators to take care of machine translation post-editing.
- Scalability: When your translation needs to scale up, neural machine translation can help to quickly and easily meet increased demand.
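As a small illustration of the customization point, here is a sketch of a glossary check that runs on machine output after translation. How an engine actually accepts glossaries differs by vendor and is not covered here. The `GLOSSARY` entries, the sample sentences, and the `glossary_violations` helper are all hypothetical.

```python
import re

# Hypothetical brand glossary: source term -> required target term.
GLOSSARY = {
    "cloud storage": "stockage cloud",
    "dashboard": "tableau de bord",
}


def contains_term(term, text):
    """Case-insensitive whole-word search for a glossary term."""
    return re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE) is not None


def glossary_violations(source, translation, glossary):
    """Return entries whose source term occurs but whose target term is missing."""
    return [
        (src_term, tgt_term)
        for src_term, tgt_term in glossary.items()
        if contains_term(src_term, source) and not contains_term(tgt_term, translation)
    ]


source = "Open the dashboard to manage your cloud storage."
machine_output = "Ouvrez le panneau pour gérer votre stockage cloud."

print(glossary_violations(source, machine_output, GLOSSARY))
# prints: [('dashboard', 'tableau de bord')]
```

A check like this can flag segments for post-editing, so human reviewers only need to look at output that breaks the agreed terminology.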
The list above makes it clear that neural machine translation is a powerhouse technology capable of revolutionizing your organization’s translation capabilities. However, it doesn’t fit all use cases or content types. Let’s explore when neural machine translation works best.
What are some use cases for neural machine translation?
While some content types are best left in the hands of human translators, such as creative advertising copy designed for maximum impact, neural machine translation excels at other types of scenarios, including:
Translation of large amounts of content in extremely short timeframes
When an NMT engine has been trained on large amounts of high-quality data, it can produce accurate translations of large volumes of content in seconds, without human intervention.
A good example of the need to translate large amounts of content in very short timeframes might be the immediate aftermath of a natural disaster. An organization such as the French Red Cross often needs to translate content within hours in order to inform people across borders about the latest developments.
Another example is the proliferation of online customer reviews. The sheer volume of this content presents a huge challenge for companies that rely on user-generated content for marketing purposes because they must translate it quickly and accurately before customers turn to their competitors.
Translation of highly repetitive content
NMT is especially effective for content that requires high accuracy but is also very repetitive, such as manuals, user guides, or other types of reference materials.
For example, a neural machine translation engine can gather a vast amount of data from existing human translations of a product’s manual in different languages, and draw from it to produce high-quality translations of other related product manuals on future occasions.
Translation of user-generated content (UGC) for social sentiment analysis
Neural machine translation can process hundreds of thousands of user-generated comments overnight and deliver accurate, actionable results in record time.
This offers a compelling option for brands to take advantage of the huge amount of content that people post every day and translate it for their own purposes. This will enable them to interact more meaningfully with customers around the world, which is critical for an effective global marketing strategy.
For example, NMT can help you identify which sentiments are most prevalent among your customer base, so you can make quick adjustments to your product’s marketing messaging strategy.
Online customer service
Neural machine translation can be very useful for helpdesk or customer service operations, where staff members need to quickly and accurately translate requests from customers around the world.
Organizations that want to scale up live chat without hiring more personnel can turn to neural machine translation to increase their agents’ capacity while also improving their customers’ experience. This is because neural machine translation enables agents around the world to quickly communicate with customers who speak different languages.
How accurate is neural machine translation?
Neural machine translation performance depends on a variety of factors, including the chosen engine, the language pair at hand, the amount of training data available, and even the type of text being translated.
The more data an engine has been trained on for a specific domain or language, the higher the quality of the output it will be able to produce. With the volume of translated content continually increasing, selecting the optimal engine for the content being translated is key for accuracy.
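If you have human reference translations for a sample of your own content, one rough way to compare engines is to measure how many word sequences each engine's output shares with the reference. The sketch below uses a simplified clipped n-gram precision in the spirit of the BLEU metric. It is not BLEU itself, and the engine names and sentences are made up.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n=2):
    """Share of candidate n-grams that also occur in the reference (clipped)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Hypothetical reference and engine outputs for a single sentence.
reference = "the house is blue"
engine_outputs = {
    "engine_a": "the house is blue",
    "engine_b": "the home is blue",
}

scores = {name: clipped_precision(text, reference) for name, text in engine_outputs.items()}
best = max(scores, key=scores.get)
print(best)  # prints: engine_a
```

A single sentence proves very little. In practice you would average a score like this over a representative sample for each language pair and domain, and treat it as a rough signal alongside human review.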
The differences between the most popular MT engines can be hard to pin down. For a more detailed and up-to-date breakdown, Memsource’s quarterly Machine Translation Report can help: it analyzes the translation engines from Google, Amazon, Microsoft, and DeepL to show how each is performing for specific language pairs and domains.
What is the future of neural machine translation?
The future of neural machine translation is very bright, and its capabilities are only going to grow over time as artificial intelligence continues to evolve and neural networks become larger and more complex.
However, human translators’ jobs are unlikely to disappear. There are certain types of content and communication styles that neural machine translation has traditionally struggled with and likely will for the foreseeable future. Human translators, on the other hand, excel at translating these kinds of nuanced expressions.
Most experts agree, therefore, that the future of translation will combine NMT and human capabilities—a future where machines will bring scalable capacity to translation while humans will provide creativity, critical thinking, and nuanced interpretation.