Integrating MT into L10n Workflows: 5 Key Takeaways

Machine translation adoption is on the rise, but questions still swirl around its benefits. Our latest Machine Translation Workshop brought together key MT experts to deliver answers—and much more.

The adoption rate of machine translation is increasing, but questions about the benefits of MT still remain. How do you ensure efficiency gains without sacrificing quality? How can you work MT into your localization workflow? How do you get all stakeholders on board? We put together a group of MT experts to tackle these MT integration questions and more at our latest Machine Translation Workshop.

Our star-studded panel included:

  • Adam LaMontagne—Machine Translation Manager at RWS
  • Paula Manzur—Machine Translation Specialist at Vistatec
  • Jordi Macias—VP, Operations at Lionbridge
  • Elaine O’Curran—Senior MT Program Manager on the AI Innovation Team at Welocalize
  • Lamis Mhedhbi—Machine Translation Team Lead at Acolad

With the panelists’ 87 years of collective experience in the localization industry, the event was jam-packed with tips for integrating MT into localization workflows. Keep reading to discover our key takeaways.

Always understand quality expectations

Machine translation is not one-size-fits-all. You cannot throw all of your content in and expect the quality of the output to meet all of your needs. Before starting with MT, you have to know exactly what outcome you require for each of the content types you wish to translate.

Setting quality expectations from the get-go will not only help in the MT evaluation process, but also help you measure the success of your MT program and keep costs, quality, and timelines under control. You’ll need to communicate these expectations clearly to your MT evaluators and linguists. This will help reduce bias and prevent preferential or unnecessary changes to the MT output, which can increase the post-edit distance and skew efficiency metrics.
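Post-edit distance itself is straightforward to instrument. Below is a minimal, illustrative sketch in Python (not Memsource’s actual metric): a character-level Levenshtein distance normalized by segment length, where 0.0 means the editor left the MT output untouched and 1.0 means a full rewrite.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def post_edit_distance(mt_output: str, post_edited: str) -> float:
    """Normalized edit distance: 0.0 = untouched, 1.0 = fully rewritten."""
    if not mt_output and not post_edited:
        return 0.0
    return levenshtein(mt_output, post_edited) / max(len(mt_output), len(post_edited))
```

Tracking this value per segment over time is one way to spot the preferential edits mentioned above: consistently high distances on segments linguists rate as acceptable can signal over-editing.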

Pro tip: Don’t deploy machine translation without a content type analysis. Create a matrix of your different types of content and what your expectations are for each. Memsource’s MT Quality Estimation feature can also help you understand what quality you can expect by providing quality scores for MT at segment level, reducing the uncertainties related to MT results on new content types/languages.
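As an illustration of such a content-type matrix, here is a hedged sketch in Python; the content types, workflow names, and quality bars are invented examples, not recommendations:

```python
# Illustrative content-type matrix -- all categories and thresholds are made up.
content_matrix = {
    "legal contracts":  {"mt_use": "none",               "quality_bar": "publishable"},
    "product UI":       {"mt_use": "full post-editing",  "quality_bar": "publishable"},
    "support articles": {"mt_use": "light post-editing", "quality_bar": "understandable"},
    "user reviews":     {"mt_use": "raw MT",             "quality_bar": "gisting"},
}

def workflow_for(content_type: str) -> str:
    """Look up the agreed MT workflow for a content type; unknown types
    fall back to human translation as the safe default."""
    return content_matrix.get(content_type, {"mt_use": "human translation"})["mt_use"]
```

However you encode it, the point is that the decision of how much MT to apply is made per content type, before deployment, rather than segment by segment during production.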

While quality expectations can differ from project to project, it’s also important to note that improving MT engine quality could affect your overall strategy in the future. Light post-editing (LPE), i.e. raw MT that is only modified where absolutely necessary to ensure the output is legible and conveys the meaning of the source, may no longer make sense. “The quality of MT is going up, and on the other hand, the cost of doing post-editing is going down. The space in which you can play is becoming narrower and narrower,” said Jordi. LPE may eventually give way to a lighter model in which native-language reviewers, rather than professional translators, check the output only for critical errors.

A deployment checklist is essential

We’ve provided an MT integration checklist here, but the panelists highlighted some additional dos and don’ts:

  • Do: When evaluating MT engines, develop your testing to align with your customer’s needs. You can spend a lot of time testing something the end user cares little about.
  • Do: Test a generic engine first. People don’t always consider the volume of source content that will actually go through MT. When you’re leveraging translation memory for a high volume of words, and only a small percentage of new words are processed by MT, you can end up spending far more on customizing and maintaining your engines than you gain in actual savings. “Training an MT engine requires an investment, not only for the cost of the MT provider but also the cost of cleaning up and optimizing the training data,” added Paula.
  • Do: Understand the integration. “If you can’t deliver the engine into your processes in an efficient, ergonomic way, then the quality and the suitability of the engine towards your customer’s needs no longer matters,” said Adam. If you invest too much time in MT providers that aren’t supported by your tech stack, you’ll spend extra effort trying to make the MT work for you and push productivity savings to other parts of the supply chain.
  • Don’t: Simply upload your glossaries to your engine. Enlist the help of a linguist to clean your glossaries and get them ready for MT. Glossaries only take you so far when training an engine, and poorly prepared glossaries can create a lot of noise and ruin the output.

MT is not a “set it and forget it” solution

There is a common misconception that once you have set up your MT workflow, you can sit back and relax and marvel at your efficiency gains. This is not true. At the start of your MT journey, you’ll spend time and resources on set-up and engine training but it’s easy to forget that MT engines are constantly interacting with new data. The results are going to change over time so you need to understand how your engines are handling the input and how your human editors are interacting with the output. Capturing this data, as well as post-editing data, will help identify possible under-editing and over-editing and enable you to continuously fine-tune your processes, keeping the MT lifecycle healthy.

Pro tip: One way to make it easier to adapt your workflows to new performance data is to use advanced MT management features. For example, our MT management hub, Memsource Translate, can ensure that you always use the best performing engine for your content using its AI-powered MT Autoselect feature.

Presenting MT ROI is all about use cases

For a long time, MT had a bad reputation resulting from poor quality MT output. But thanks to neural MT, there has been a seismic shift in the perception of MT quality. If you are trying to get stakeholders on board who are not localization savvy, you may still have to work on removing the fear that MT is synonymous with poor quality. From there, presenting MT ROI is all about use cases and the particular needs of each stakeholder. Whether it’s light post-editing or full post-editing, custom engines or generic, you need to find the best use of the budget and get the most out of it. “Essentially, it goes back to data, understanding what your stakeholders’ goals are, and then presenting the best workflow that covers the full spectrum of their needs,” added Jordi.

Pro tip: Basing your MT ROI on translation volumes is a common faux pas. Machine translation is not the only technology contributing to higher productivity. Translation memories produce the bulk of translation output, so the volume actually translated by MT can end up smaller, and perhaps seemingly insignificant.

Capture all the data

Data is integral to a successful localization workflow. This was made especially clear as, regardless of the question, data was part of almost every answer given by the panelists. Evaluating an MT engine before implementation? You need to collect data. Want to improve the quality of the MT output? Dig into the data. How do you present ROI to stakeholders? Data will come in handy. When should you retrain your engine? Look at the data.

Qualitative data is also important. “Collect post-editors’ feedback regarding the engine quality and do this continuously to adapt to meet productivity goals,” said Lamis. Feedback from clients is also vital for improving engine quality.

When asked if you should track MT data for all projects or rather do spot checks, the panelists all agreed capturing all the data would be ideal, as long as there is an efficient way of doing it. “You don’t want to add huge overhead by manually downloading files and scoring them, but if you do it automatically, yes, measure everything,” said Elaine.

Pro tip: For advanced users, Memsource offers full access to your performance data through our Snowflake connector. You can track your editing time, post-editing analysis, LQA results, and on-time deliveries. On top of that, we automatically calculate key MT metrics, like BLEU, TER, or chrF3, providing you with multiple ways to measure your performance.
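For a feel of how one of these metrics works, here is a simplified chrF-style score in Python. It is a sketch only: production implementations (such as sacreBLEU’s chrF3) also handle whitespace, word n-grams, and tokenization details that are omitted here.

```python
from collections import Counter

def char_ngrams(text: str, n: int) -> Counter:
    """Count the character n-grams of a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 3.0) -> float:
    """Character n-gram F-score on a 0-100 scale.

    beta=3 weights recall three times as heavily as precision,
    matching the '3' in chrF3.
    """
    scores = []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue
        overlap = sum((hyp & ref).values())
        precision = overlap / sum(hyp.values())
        recall = overlap / sum(ref.values())
        if precision + recall == 0:
            scores.append(0.0)
            continue
        scores.append((1 + beta**2) * precision * recall
                      / (beta**2 * precision + recall))
    return 100 * sum(scores) / len(scores) if scores else 0.0
```

An identical hypothesis and reference score 100, while strings sharing no character n-grams score 0; real MT output falls somewhere in between, which is what makes the metric useful for tracking engine quality over time.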

Watch the recording of the workshop to get all the insights from the MT experts, and be sure to sign up to hear about future workshops.

Don’t miss the next edition of our Machine Translation Workshop Series. Sign up to be the first to know!

Sign up