DeepL is as good as human translators?

How good is automated translation in 2017? There is a new company called DeepL that seems to have “cracked” the translation problem in the sense that it makes fewer errors than non-professional human translators. That’s my claim, not theirs, but since their system is online for anybody to try, you can run your own experiments.

I decided to take an extract from a Michel Houellebecq novel to illustrate the point objectively. It is in French, but I do not assume you can read it:

À Charny il ne reste qu’une épicerie ; mais la camionnette du boucher passe le mercredi, celle du poissonnier le vendredi ; souvent, le samedi midi, sa grand-mère fait de la morue à la crème. Michel est en train de vivre son dernier été à Charny, mais il ne le sait pas encore. En début d’année, sa grand-mère a eu une attaque. Ses deux filles, qui vivent en banlieue parisienne sont en train de lui chercher une maison pas trop loin de chez elles. Elle n’est plus en état de vivre seule toute l’année, de s’occuper de son jardin.

Let us first try Bing Translator by Microsoft:

In Charny there is only one grocery store; But the butcher’s van spends Wednesday, the fishmonger’s truck on Fridays; Often on Saturday lunchtime, his grandmother makes cod with cream. Michel is living his last summer in Charny, but he doesn’t know it yet. At the beginning of the year, his grandmother had a seizure. His two daughters, who live in the suburbs of Paris, are now looking for a house not too far from home. She is no longer in a position to live alone all year, to look after her garden.

It is good, but Bing gets the capitalization wrong, and “the butcher’s van spends Wednesday” is a bad translation. I think it is slightly worse than what a non-professional human translator would produce.

How well does Google Translate do on this passage?

At Charny there is only a grocery store; but the butcher’s van passes on Wednesday, that of the fishmonger on Friday; often, on Saturday noon, her grandmother made cod in the cream. Michel is living his last summer at Charny, but he does not know it yet. At the beginning of the year, his grandmother had an attack. His two daughters, who live in the Paris suburbs, are looking for a house not too far from their home. She is no longer in a condition to live alone all year round, to take care of her garden.

Unlike Bing, Google does not randomly capitalize the sentences, but there are awkward bits. The “that of the fishmonger on Friday” passage is hard to parse. And then Google gets the verb tense wrong in “her grandmother made cod in the cream”. I don’t like “his grandmother had an attack”. Like Bing, Google does not reach the “non-professional human level” threshold.

What about DeepL?

In Charny, there is only one grocery store left; but the butcher’s truck passes on Wednesdays, the fishmonger’s truck passes on Fridays; often, on Saturdays at noon, his grandmother makes codfish with cream. Michel is living his last summer in Charny, but he doesn’t know it yet. Earlier this year, his grandmother had a stroke. His two daughters, who live in the suburbs of Paris, are looking for a house not too far from their homes. She is no longer able to live alone all year round, to look after her garden.

DeepL is the only one to get “his grandmother had a stroke” right. It is as good as what most human beings could do.

All translation engines fail to attribute the daughters to the grandmother. No professional translator would make such a mistake, but I think many of us would.

Yes, I know that judging a system based on a single passage is methodologically problematic, but I ran many more tests that support my claim that DeepL is far above Google and Bing. I could elaborate further, but I’d encourage you instead to try it out.

Credit: I found out about DeepL via Peter Turney.

Note: My wife is a professional translator. I’m not claiming that she is about to become obsolete. Not by a long shot. But she is a professional, and she is much better at translation than 99% of us.

4 thoughts on “DeepL is as good as human translators?”

  1. I had to re-read the French text myself twice, but my understanding is that the daughters are the grandmother’s, not Michel’s. Like Bing and Google, DeepL translated it to “His two daughters” instead of “Her two daughters”. It should be possible to infer that from the following sentence, “Elle n’est plus en état de vivre seule”, but I’m not convinced all cases can be resolved without semantic analysis to infer family structure, which I myself needed to confirm the sense of it.

    DeepL sure looks like a step forward, but it’s probably still just a well-trained text robot.

    1. DeepL sure looks like a step forward, but it’s probably still just a well-trained text robot.

      I think we have not passed the Turing test yet, for sure.

      However, please consider that most human beings are terrible at translation. For example, I only saw the mistake once you pointed it out.

      Most of us make plenty of mistakes while translating. You have to put the bar at a reasonable level.

      I’m also explicit in my post that I do not think we are close to the “professional level” of translation. That’s going to be much harder.

  2. You shouldn’t test on things that have already been translated by humans and potentially used as training data. Ideally, write something completely new.

  3. I tried the sentence: “Toujours tiré à quatre épingles, il connaît bien les impératifs de sa profession.” and none of the automated translators, including DeepL, recognized the expression “tirer à quatre épingles.” They all gave literal translations.

    None of the idioms I tried were translated correctly. These engines still have a ways to go compared even to a non-professional translator.
