Tuesday, 15 November 2016

Lack of Data security and other issues in (free) online Machine translation


Translation of text by a computer, without human involvement

In our conversations with clients these days, we increasingly hear, “We use free online translation regularly for translating emails, letters and documents.” It is a quick fix that costs nothing, but most people don’t know the downsides of using free online MACHINE translation. Yes, it is a machine that handles your text!

What is machine translation?

Normally, text is fed into the machine, which runs a software program that scans the text and translates it using an algorithm. That algorithm could be as simple as a word-by-word dictionary lookup, or somewhat more complex, with statistical analysis thrown in.
Still, the output retains an 'artificial' feel that would not appear in a human translation.
Machine translation (MT) is useful for getting a general idea of the meaning of a text written in a foreign language. However, a "general idea" is not always accurate: the program often translates the text literally (word for word), which results in unprofessional, inaccurate and grammatically incorrect output, or sometimes in completely incoherent text.
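
To make the simplest case concrete, here is a minimal sketch of a word-by-word dictionary lookup. The tiny English-to-French glossary and the function name are our own assumptions for illustration; no real MT product is this simple, but the sketch shows why literal output keeps the source word order and leaves unknown words untranslated.

```python
# Toy English-to-French glossary; the entries are illustrative assumptions,
# not taken from any real MT system.
GLOSSARY = {
    "the": "le",
    "cat": "chat",
    "black": "noir",
    "sleeps": "dort",
}


def word_by_word(sentence: str, glossary: dict[str, str]) -> str:
    """Translate each word independently; unknown words pass through unchanged."""
    return " ".join(glossary.get(word, word) for word in sentence.lower().split())


print(word_by_word("The black cat sleeps", GLOSSARY))
# prints: le noir chat dort   (natural French would be "le chat noir dort")
print(word_by_word("The black cat purrs", GLOSSARY))
# prints: le noir chat purrs  (the unknown word stays in English)
```

The first result keeps the English adjective-noun order, and the second leaves "purrs" untouched because it is not in the glossary.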

How and for what is machine translation used?

It is common practice to use free translation programs available online to read a document in a foreign language, or to translate a document into a foreign language. MT is also sometimes used casually for Indian languages. See the Hindi example in Image 1! This is an actual screenshot, taken by our team, of an ad appearing on a website!

Image 1 – Machine translation from English to Hindi for an advertisement

Indian language perspective –

Image 1 is itself proof of how poorly free machine translation works for Indian languages, language pairs for which the technology is still at a nascent stage of development.

Foreign language perspective -

However, the situation gets worse when MT is used for translating into a language one does not know. In such cases, the translated output cannot be read and therefore cannot be verified by the person. He/she then just blindly uses the output for whatever purpose, blissfully unaware of the problems that could arise.

Multiple issues of using MT in the corporate world

Now consider a scenario where a Project Manager (PM) uses a free online translation portal to translate a tender-related document from Portuguese into English, just to understand it roughly. Here the problems are manifold.

Even if the PM understands English, how can he be sure that:

i. All data has been translated? He does not have the time to check against the original whether all text, numerals, etc. have been translated and transferred in full.

ii. The machine has interpreted the meaning correctly?

iii. No linguistic nuances have been ignored?

iv. AND MOST IMPORTANTLY, the sensitive information stays confidential? It has been passed on to the MT provider, and this data will be stored by the program! This means that, though it may not immediately affect your business, the data loses the confidentiality you require! MT providers claim rights to use that information for their own purposes.
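
For the first point in the list above, a rough first-pass check of numerals can be automated. The sketch below is our own illustration, not a feature of any MT portal: the function names and the Portuguese and English sample sentences are invented. It only flags figures that are missing or altered; it says nothing about meaning, and it does not handle locale-specific number formats.

```python
import re
from collections import Counter


def numerals(text: str) -> Counter:
    """Count every run of digits in the text."""
    return Counter(re.findall(r"\d+", text))


def missing_numerals(source: str, target: str) -> list[str]:
    """Return digit runs that occur more often in the source than in the target."""
    diff = numerals(source) - numerals(target)
    return sorted(diff.elements())


# Invented sample: the translation has dropped a digit from the amount.
source = "Prazo: 30 dias. Valor: 4500 euros."
target = "Deadline: 30 days. Value: 450 euros."
print(missing_numerals(source, target))  # prints ['4500']
```

An empty list only means that the digits match; the remaining points in the list still need a human reader.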

Data leakage through MT – a real threat:

a.    Data leakage:

      It is well known how critical information can leak: by sending or accessing it over unencrypted connections, through unsecured Wi-Fi networks, or by storing it on cloud servers. These are risks that most of us are aware of.

What most people are unaware of is what online machine translation providers do with the data that users input.

b.   Rights to your data:

      These sites exercise the right to use your data in ways you may never even have imagined. When you enter text for “Free translation”, you inadvertently provide the MT companies a worldwide license to use, host, store, reproduce, modify, create derivative works, communicate, publish, publicly perform, publicly display and distribute such content.

c.    Types of data divulged:

       Millions of people use MT services daily to translate text from emails, text messages, project proposals, legal contracts, merger and acquisition documents, and other sensitive content.

d.   Risk:

      Organizations worldwide are realizing that confidential information, trade secrets, and intellectual property (IP) are thus exposed to eavesdroppers, to use by the free MT providers, and potentially to the world. The bottom line: data leakage via MT is a real and present danger to enterprises.


If MT is required to produce professional, high-quality translation, a trained MT engine is necessary, together with specialist translators and post-editors (proofreaders who review the text after machine translation!) who can check, edit and validate the MT output. However, this is only feasible for large volumes of data. When it comes to a one-time translation requirement, especially for your commercial or technical data, it is always better to have it translated by a human being!

Difference between human and machine translation:

   Artificiality in the translated language: problems in syntax (sentence structure) influenced by the source language make the translation seem unnatural/artificial.


a.     Untranslatability: The vagaries and origins of different languages mean that some things cannot be expressed in another language, a concept known as untranslatability. E.g. Ushta (in Marathi) or Jootha (in Hindi) [food that has been partly eaten by someone] cannot be translated directly into English or any other European language, as those languages have no equivalent word for the concept. In such a case, a human translator inserts a translator’s note to explain the meaning.

b.      Choice of words: You can get the gist of a draft or document through automatic translation, but machine translation only translates word for word, without comprehending the information. So the words placed in the translation don’t necessarily mean what is said in the original document. A human translator, on the other hand, can re-express the content in his/her own words in the target language. Moreover, since the words in machine translation are selected statistically, if a word is used incorrectly by a large number of people, the same incorrect word is used in the machine translation.

c.       Unknown words, incorrect language: Words that the machine is not trained for or does not know are not translated and are retained in the source language as is, even after translation! Sometimes, even basic problems like spelling mistakes and grammar errors are left in a machine translation! A professional human translator can avoid such issues thanks to his/her long-standing experience and knowledge of the language, and by having the translation proofread by another translator.

d.      Context: A word can have many different meanings and connotations depending on the context in which it is used, and it is still difficult for a machine to comprehend these minute differences.

E.g. “order” can be related to:

a. something you place in a restaurant,

b. a government order,

c. a superior’s instruction to you,

d. law and order,

e. order of different things in a system, etc.

Depending on these meanings, a human translator translates the word differently in the target language. For the program, however, such cases are complex, and it chooses the statistically most frequent word. Things get worse when no equivalent word is available in the target language at all. In such cases, a human translator will normally use more words to make the term comprehensible, whereas machine translation will just replace the word with whatever translation it has, depending on how often that translation occurs in the corpus on which it was trained.
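
The statistical choice described above can be sketched in a few lines. This is a deliberately simplified illustration: the frequency counts are invented, the function name is hypothetical, and real statistical systems also weigh the surrounding words to some degree. It shows what happens when only the overall frequency decides.

```python
from collections import Counter

# Hypothetical counts of how often "order" was rendered by each French word
# in a training corpus; the numbers are invented for illustration.
OBSERVED = Counter({"commande": 120, "ordre": 80})


def most_frequent_translation(counts: Counter) -> str:
    """Pick the single most frequent rendering, ignoring context entirely."""
    return counts.most_common(1)[0][0]


for phrase in ("place an order in a restaurant", "law and order"):
    print(phrase, "->", most_frequent_translation(OBSERVED))
# Both phrases get "commande", although "law and order" needs "ordre".
```

A human translator would choose "commande" for the restaurant and "ordre" for "law and order"; the frequency-only rule cannot tell the two apart.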


e.     Mental outlook: Machine translation follows systematic and formal rules, so it cannot focus on a context to resolve ambiguity, nor can it draw on experience or a mental outlook as a human translator can.

f.    Literary ‘machine’ translation not possible: As human thoughts are not predictable or mechanical, a computerised system cannot translate most literary works, or even regular general texts, the way humans can.

g.    Emotions/nuances/double meanings: Software programs have no “soul”, no rational or emotional faculty that could translate hidden meanings, irony, subtle humour and all those linguistic details that make a language what it is.

Repercussions of using Machine Translation on your business:

Poor product quality -

Think of the consequences of such irresponsible use of MT in business situations! The results can be catastrophic. We have been thinking of writing to the marketing department of a large FMCG group in India whose food labels have been translated into painfully incorrect French and printed on their packs. This not only annoys the buyers concerned (as most Europeans are very particular about, and proud of, their language) but also leads them to question the quality of the products (from India).


It is clear that automated translation software is here to stay, and there is no turning back. What should you do? First of all, know that automated translation software has its place, but only for certain applications. Since it is free, you can always use it for general purposes, such as chatting with a friend abroad, expanding your vocabulary, etc.

Apart from that, you should use translation services provided by trained professionals to achieve the desired results; otherwise, problems are bound to arise. If, however, you cannot avoid using a software application, make sure to hire a professional translator/editor to give your document a second look. This kind of final editing and quality check is always recommended, even for a translation done by a human, because it reduces the risk of mistranslation and errors, and thus can turn a translated piece into something worthwhile.

At this point in time, at least, what is written by a human can be properly translated into another language only by another human. Computers with translation software can get very close indeed, but excellence is not about getting close. It’s about delivering accurate and professional translation results.

A good translation is a basic requirement for any company selling a product or service worldwide. It makes good business sense to have brochures, websites, promotional literature and contracts translated into the language of the target country. At times, it seems easier to use free translation software through a search engine or other websites. However, one shouldn’t forget that incorrect translations of documents can be disastrous for both the company and its clients.

