Machine Translation: A Comprehensive Guide
Machine translation is an essential part of most localization tech stacks. It helps translate at scale and reduces translation spend. More and more resources are being put into the development of machine translation, resulting in constantly improving engines and better-quality translation output. But how does it work? How do you implement machine translation into your global strategy? Is the quality any good? We’re here to answer all of your machine translation questions.
1. What is machine translation?
Machine translation (MT) is automated translation by computer software. Users input text in their source language and select their target language. The MT engine then generates the desired translation. Machine translation can be used to translate large volumes of text quickly, which would be close to impossible using traditional translation methods. It can be used to translate entire texts without any human input (raw MT), or alongside human translators, an approach known as machine translation post-editing.
2. How does machine translation work?
Machine translation has a very long history: the very first translation engines demonstrated in the 1950s were closer to actual machines than computers, often relying on the input of physical punch cards. Today, translation technology is advanced and continuously improving.
There are several types of MT approaches, such as rule-based, statistical, and example-based. As the technology has progressed, the older systems have been replaced by newer, more effective technologies. The most significant developments of the past decade have been the advent of neural machine translation and artificial intelligence.
Neural machine translation
Neural machine translation (NMT) is an approach built on deep neural networks. A variety of network architectures are used in NMT, but typically the network can be divided into two components: an encoder, which reads the input sentence and generates a representation suitable for translation, and a decoder, which generates the actual translation. Words and even whole sentences are represented as vectors of real numbers in NMT. Compared to the previous generation of MT, statistical machine translation (SMT), NMT generates outputs which tend to be more fluent and grammatically accurate. SMT only evaluates the fluency of a sentence a couple of words at a time, whereas NMT evaluates fluency for the entire sentence.
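To see why looking at only a couple of words at a time can miss errors, here is a minimal Python sketch. It is not an SMT or NMT implementation: it only lists the word windows that an n-gram language model (the kind commonly used in statistical MT) would get to look at. The example sentence and the function names are our own, chosen for illustration.

```python
def ngram_windows(tokens, n):
    """Return every run of n consecutive tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def seen_together(tokens, n, first, second):
    """True if at least one n-token window contains both words."""
    return any(first in w and second in w for w in ngram_windows(tokens, n))


# Illustrative sentence with an agreement error: "keys ... is" should be "keys ... are".
sentence = "the keys to the old cabinet is on the table".split()

# With 3-word windows, the subject and the verb never appear side by side,
# so a model limited to such windows has no chance to notice the mismatch.
local_view = seen_together(sentence, 3, "keys", "is")

# A window spanning the whole sentence does contain both words.
full_view = seen_together(sentence, len(sentence), "keys", "is")
```

Every short window here ("the old cabinet", "cabinet is on") looks perfectly fluent on its own, which is why sentence-level context tends to help with this kind of error.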
Machine translation data
Machine translation works on training data. The data can be generic or custom depending on your needs. Generic MT engines, like Google Translate, Microsoft Translator, and Amazon Translate, are general-purpose and are not trained with data for a specific domain or topic. Data is continuously collected and used to improve the output. Custom MT engines, on the other hand, are more fine-tuned as they are trained with specific data, resulting in more accurate MT output (and more data protection), but they also come with a higher price tag.
3. What are the benefits of machine translation?
Machine translation has several significant advantages over traditional translation that make it a very attractive proposition for businesses:
- It’s fast: MT engines today are able to handle large volumes of content and translate them near-instantaneously.
- It’s scalable: MT engines can just as easily handle one document or a thousand.
- It’s cost-effective: Some estimates suggest MT is roughly a thousand times cheaper than human translation.
However, you always need to bear in mind that machine translation doesn’t fit all use cases and all content types. Human translation, or post-editing, is still the gold standard for translations that demand perfect quality.
How does MT handle various content types? Check out our MT Report series.
4. When should you use machine translation?
Whether or not you should use machine translation depends on several factors.
Machine translation works for many different content types depending on the strategy you choose. However, for creative content such as marketing copy, machine translation may not be ideal. It can be used as a starting point, but it is best to have human translators who are able to get creative with the text.
Who is it for? Regardless of your use case, you need to be sure that the machine translation output is going to meet the expectations of the reader. Are you translating web pages that drive revenue or internal documentation for your employees? Content that is showcasing your company or product should certainly be reviewed by a human. If the content is for internal purposes, MT is a suitable solution.
Translating small segments here and there? Sure, you can use MT. The true value of MT, however, lies in being able to translate large volumes of text.
If you have a tight deadline and don’t have the manpower to complete a translation job, machine translation is a great option.
Content that is low-priority, such as internal documentation, or has a short life cycle is a perfect candidate for machine translation.
5. So why isn’t everyone using MT?
The main stumbling block is, of course, the quality of the output. While improvements in MT have significantly increased output quality over time, MT engines are still nowhere close to consistently achieving parity with human translators.
At Memsource we track how different engines perform across different language pairs and domains on a quarterly basis. This can help you make an educated decision whether to use machine translation. You can find out more in our latest MT report.
6. How do I start with MT?
Using MT effectively requires more than just picking the right MT engine. Businesses succeed with MT if they prepare an effective strategy that works for them.
Picking the right machine translation strategy
Depending on your translation needs, there are some different machine translation strategies you can adopt.
Raw machine translation
Raw MT is machine translation output that has not been reviewed by a human translator. When you use Google Translate to translate a web page, that’s raw MT. The output won’t be perfect, but for the most part it is passable. Publishing raw MT is not recommended for customer-facing content, but it could be a simple solution for translating user-generated content, internal documentation, or in cases where fast translations are needed but accuracy is not important.
Machine translation post-editing
Machine translation post-editing combines MT with human translation, pairing the ability of MT engines to handle large volumes of text quickly with the skill and sensitivity of trained linguists. Post-editing (PE) is a relatively new phenomenon, with the ISO standard for post-editing of machine translation output only being codified in 2017. PE is the process of reviewing and adapting raw MT output to help finalize a translation.
You can choose the optimal level of MTPE depending on your translation needs:
- Light post-editing
With LPE, raw MT is only modified where absolutely necessary to ensure that the output is legible and accurately conveys the meaning of the source document.
- Full post-editing
With FPE, raw MT is thoroughly reviewed and modified to ensure that there are no errors whatsoever while also taking style, tone, and cultural nuances into consideration.
Want some more machine translation post-editing tips? We’ve got just the thing! Watch our introduction to Machine Translation Post-editing webinar.
Of course there are cases when MT is not suitable for your content. When branding and cultural context come into play it may take post-editors more time to “fix” the translations than if they were translated by a human.
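The guidance in this guide on choosing a strategy can be sketched as a small decision function. This is only an illustration in Python: the `Content` fields, the order of the checks, and the fallback to light post-editing are our own assumptions, not a formal rule set.

```python
from dataclasses import dataclass


@dataclass
class Content:
    """Hypothetical description of a piece of content to be translated."""
    creative: bool = False          # e.g. marketing copy
    customer_facing: bool = False   # showcases the company or product
    internal: bool = False          # e.g. internal documentation
    user_generated: bool = False
    short_lived: bool = False       # short life cycle


def choose_strategy(c: Content) -> str:
    # Creative content is best left to human translators
    # (MT can still serve as a starting point).
    if c.creative:
        return "human translation"
    # Content that showcases the company or product should be reviewed by a human.
    if c.customer_facing:
        return "full post-editing"
    # Low-stakes content is a good candidate for unreviewed output.
    if c.internal or c.user_generated or c.short_lived:
        return "raw MT"
    # Assumption: when nothing else applies, fall back to a light review.
    return "light post-editing"


strategy = choose_strategy(Content(internal=True, short_lived=True))
```

Because the checks run from the most to the least demanding content, a text that is both customer-facing and short-lived still gets a human review.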
7. How do I implement machine translation?
Once you have your strategy in place, you need to start thinking about implementation. Adding machine translation to your localization workflow doesn’t have to be a daunting task. There are several machine translation implementation steps that you can follow in order to ensure success.
- Pick the right content for machine translation.
- Train the engine with your data if possible to increase the output quality.
- If you go for a machine translation post-editing strategy, select a team that has training or experience with post-editing, or at least make sure that its members are open to the idea.
- Run samples before deployment to get an idea of the quality and to identify areas that could be improved.
- Agree on a pricing model and be sure to involve all stakeholders in the decision.
- Deploy! Keep in mind that the results may not meet your expectations right away but the output will get better over time.
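For the sample run, one simple way to put a number on quality is to compare the raw MT output with its post-edited version and count how many word-level edits were needed. The sketch below uses a plain word-level edit distance in Python; the sample sentences are made up, and this is a rough proxy for post-editing effort rather than an official metric.

```python
def word_edit_distance(hyp, ref):
    """Levenshtein distance counted in words (insert, delete, substitute)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a word from the MT output
                            curr[j - 1] + 1,      # insert a missing word
                            prev[j - 1] + cost))  # keep or substitute a word
        prev = curr
    return prev[-1]


def edit_rate(mt_output, post_edited):
    """Edits per post-edited word; 0.0 means the MT output was left untouched."""
    hyp, ref = mt_output.split(), post_edited.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    return word_edit_distance(hyp, ref) / len(ref)


# Made-up sample: (raw MT output, post-edited version)
samples = [
    ("the cat sat on mat", "the cat sat on the mat"),
    ("he go to school", "he goes to school"),
]
rates = [edit_rate(mt, pe) for mt, pe in samples]
average_rate = sum(rates) / len(rates)
```

A lower average suggests the engine needs less fixing on that type of content. Comparing averages across content types can help with the first step on the list, picking the right content.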
If you’re looking for tips on setting up machine translation in Memsource, be sure to check out our webinar.
8. What does MT mean for translators?
One common misconception in the machine translation narrative is that it will render translators redundant. Luckily, it’s not machine translation versus human translation. Although some MT evangelists predict that eventually machine translation will achieve parity with human translators, it is unlikely to happen any time soon. Human linguists will continue to play a significant role in MT-driven translation workflows as post-editors, helping adapt the MT output to ensure consistent quality.
9. Machine translation quality: Is MT good enough? How do I know?
Although significant advances have been made with MT, there are still some doubts about the quality of the translations, which make some users hesitant to invest more into MT. Many users are concerned that the translations will not be good enough, requiring expensive post-editing, while others are unsure of how to effectively scale their evaluations to match the volume of output that can be produced by MT.
Fortunately, there are steps that you can take to ensure that you are always achieving the best possible results and evaluating your outputs effectively. One of the key steps, listed in our article on Managing Machine Translation Engine Quality, is to make the most of existing technologies, whether they be customized engines, MT management solutions, or AI-driven quality estimation.
10. The future of machine translation
Machine translation is continuously improving at a remarkable pace by taking advantage of the latest hardware and software developments. MT engines are not only getting better, but there are more of them now than ever before (Memsource alone supports over 30 unique MT engines). The combination of rapid development and intense competition is sure to be beneficial for machine translation in the long run. Currently, however, the large number of ever-changing options can make machine translation more difficult for newcomers to access and for existing users to leverage optimally.
Much of Memsource’s development has been aimed at making sure our users always have the best machine translation. To help our users find the perfect engine, we’ve recently launched Memsource Translate, a dynamic engine management solution that automates and optimizes the process of choosing an MT engine. A sophisticated AI algorithm monitors engine performance in real time and always recommends the optimal engine for your content. As Memsource’s CTO Dalibor Frivaldsky said in a recent interview, “We wanted to make using MT technology as simple as possible, without the need to go through the complex process of choosing a single MT provider”.
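To give a feel for what engine selection involves, here is a deliberately simple Python sketch that recommends, for each language pair and domain, the engine with the highest average quality score. It is not a description of how Memsource Translate works: the engine names, the scores, and the 0 to 1 scale are all invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def best_engines(records):
    """Map each (language_pair, domain) to the engine with the highest mean score."""
    scores = defaultdict(lambda: defaultdict(list))
    for engine, pair, domain, score in records:
        scores[(pair, domain)][engine].append(score)
    return {
        key: max(by_engine, key=lambda e: mean(by_engine[e]))
        for key, by_engine in scores.items()
    }


# Made-up scores (higher is better); engine names are placeholders.
records = [
    ("engine_a", "en-de", "legal", 0.71),
    ("engine_b", "en-de", "legal", 0.78),
    ("engine_a", "en-de", "legal", 0.69),
    ("engine_a", "en-ja", "software", 0.64),
    ("engine_b", "en-ja", "software", 0.58),
]
recommended = best_engines(records)
```

Even in this toy data the best engine differs by language pair and domain, which is why a single provider is rarely the best choice for all content.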
To learn more about Memsource Translate and how you can use it to get more from MT, you can tune in to our webinar on Memsource Translate - Getting the most out of MT.