Google unveils translation in isiXhosa, isiZulu

Johannesburg, 28 Jun 2017

Blaise Aguero y Arcas, principal scientist at Google.

Google South Africa announced in Johannesburg yesterday that its translation service is now available in isiXhosa, isiZulu and Swahili.

These languages are the latest that Google can translate - now 90 in all.

This is all made possible using neural machine translation, says Blaise Aguero y Arcas, principal scientist at Google, who leads a team of 300 employees distributed throughout the world.

Aguero y Arcas says machine learning is now being deployed in every Google product, and research and machine intelligence teams are now working in every part of the company. Neural networks find and analyse patterns, which has proved useful in, among other areas, translating language.

Better engines

According to Aguero y Arcas, there have been 'three ages' of machine translation. The first was natural language processing, which involved programming the grammar itself into a machine. This immediately ran into problems, such as the fact that one typically needs to combine syntax and semantics in order to fully understand a sentence. The method produced poor translations, such as "Where will the toilet?" A better translation engine would render this as "Excuse me, where is the toilet?"

The second was a statistical approach, in which Google looked for places on the Web where text had been translated from one language into another, and built up giant tables of these instances.

And while this method returns better results, it's easy, says Aguero y Arcas, to construct a sentence that has never been uttered before; when it is looked up, a 'nonsense' response is offered as a translation.
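
The weakness of table lookup can be shown with a toy sketch. This is not Google's system: the table contents, the `normalise` helper and the `lookup_translation` function are all hypothetical, and a real statistical engine works on phrase fragments and probabilities rather than whole sentences. The point is only that a sentence absent from the table has nothing to fall back on.

```python
# Toy stand-in for the "giant tables" of previously seen translations.
# Entries are illustrative only, not real training data.
PHRASE_TABLE = {
    "where is the toilet": "choo kiko wapi",
    "thank you": "asante",
}


def normalise(sentence):
    """Lower-case, drop punctuation and collapse whitespace."""
    kept = "".join(ch for ch in sentence.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def lookup_translation(sentence):
    """Return the stored translation, or None if the sentence was never seen."""
    return PHRASE_TABLE.get(normalise(sentence))


if __name__ == "__main__":
    print(lookup_translation("Where is the toilet?"))
    # A sentence nobody has translated before has no entry at all.
    print(lookup_translation("Where is the purple toilet?"))
```

A sentence that is in the table comes back translated; one that has never been seen returns nothing, which is the gap a system that generalises is meant to close.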

The new technique involves what he calls LSTMs, or long short-term memories. This involves taking 'a lot' of training data - for example, sentences in both Swahili and English which mean the same thing. Once the neural net has this information, a sentence can be mapped into what he calls a 'concept space'.

"Machine learning is hard, because it's not based on rules (as traditional computing would be).

"The things that these conceptual spaces give you that ordinary statistics of languages - or the more old fashioned techniques - cannot give you are, for example, metaphor."
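
The idea of a shared 'concept space' can be illustrated with a small sketch. Everything here is an assumption made for illustration: the three-dimensional vectors are hand-picked rather than learned by an LSTM, and the `cosine` and `nearest` helpers are hypothetical. In a real system the network would produce the vectors from training data; the sketch only shows how sentences with similar meanings end up close together regardless of language.

```python
import math

# Hand-picked vectors standing in for learned sentence representations.
# Sentences that mean the same thing are placed close together.
ENGLISH = {
    "where is the toilet": (0.90, 0.10, 0.00),
    "thank you": (0.00, 0.90, 0.10),
}
SWAHILI = {
    "choo kiko wapi": (0.88, 0.12, 0.02),
    "asante": (0.05, 0.85, 0.10),
}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(vector, candidates):
    """Return the candidate sentence whose vector is most similar."""
    return max(candidates, key=lambda s: cosine(vector, candidates[s]))


if __name__ == "__main__":
    print(nearest(ENGLISH["thank you"], SWAHILI))
    # A point that matches no stored sentence exactly still has a neighbour.
    print(nearest((0.80, 0.20, 0.00), SWAHILI))
```

Unlike a table lookup, a point that was never seen exactly still lands near something meaningful, so an unfamiliar sentence is not automatically a dead end.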

To the device

Aguero y Arcas also says Google - now that it is shifting to becoming a hardware maker - sees great benefit in running neural networks locally on devices such as cellphones.

Aside from the efficiencies of not relying on data or a patchy network, results would also be returned instantaneously.

Other than language translation, Aguero y Arcas demonstrated a neural net running on a cellphone that could identify bird species, just by pointing the device at the bird.

"We'd like to build that capability into phones so that just looking through the viewfinder at anything in nature, the phone will be able to tell you what species you're looking at. It would be like having a little naturalist in your device."