The man behind Google's voice tech

Johan Schalkwyk is responsible for driving Google's heavy commitment to voice-activated search.

By Paul Vecchiatto, ITWeb Cape Town correspondent
Cape Town, 09 Nov 2010

Johan Schalkwyk is the man who has struck a chord with Google, being the brains behind the development of its voice search engine.

He says the US search engine giant is serious about its investment in this technology.

Schalkwyk, who is visiting SA for the first Google Developers' Conference, in Cape Town, has been a senior staff engineer at the company's New York development centre since 2005. He is responsible for developing the core technology that makes voice search work.

Born in Port Elizabeth, raised in Kempton Park and holding a master's degree in electronic engineering from the University of Pretoria, Schalkwyk has spent the last 17 years in the US. He initially worked at the Oregon Graduate Institute as a researcher and then for a private firm, in Boston, before being recruited by Google.

“When Google asked me to join them, I leapt at the chance. It is great to be working for an organisation that is really serious about making voice technology work. Especially in the search environment where there is real application. There are also other applications where voice technology can really be used to great effect,” Schalkwyk says.

Switching from a long academic career to a research and development role in a commercial company has been easy for Schalkwyk.

“It is very gratifying to work for an organisation that can make real use of what you are developing. Academic research for the sake of research can be a bit pointless,” he says.

Hopes and expectations for voice technology have been around for decades. The convenience of using the spoken word over written or other forms of input is a key driving force. So far, voice technology has failed to live up to those expectations, but Schalkwyk is determined this will change.

Voice hopes

Interactive voice recognition technology, while first seen as a means to this end, has not set the market alight, as it means a person has to be online continuously to complete the transaction. The process is often long and laborious.

Science fiction films from the past 50 years have illustrated this hope for voice. These have included everything from using spoken commands to open doors, to interacting with some form of artificial intelligence, to being able to translate languages.

He says Google has not only committed itself to developing voice as a mainstream and relevant technology, but is also putting the resources behind it.

“Cloud computing has been really important,” Schalkwyk says. “We are able to gather the data, number-crunch it using massive computing power, work out what the most relevant search phrases are, how people say things, and then develop the algorithms that make the technology possible.”

Schalkwyk also heads a staff of about 40 people at Google who develop the core technology. Other teams use that technology to build the search function for individual languages.

The techniques used in developing the technology include data mining on a massive scale, drawing on the search strings that millions of people use every day to execute their searches.

The algorithms that are developed have to take into account a number of factors, such as accents, slang, key phrases and different variants of the language as used in various countries.

Googlish rises

A search of Google Trends shows various words and phrases peak in popularity and, therefore, common use, before tailing off again. Google's analytical tools help Schalkwyk and his team keep track of this ever-evolving and shifting language landscape.

So far, 16 languages are fully supported in voice search, and three of these are South African: English, Afrikaans, and Zulu. Other languages include Mandarin Chinese, Italian, German, French, and Spanish.

“Many (native) Zulu speakers switch easily to English when speaking, so our technology not only has to keep track of Zulu as a language, but also of English being spoken with a Zulu accent.”

Schalkwyk explains that Google's research into voice search has shown that people will speak into their handsets very similarly to the way they type in a search phrase.

“Normally they would say: 'What is the weather in Johannesburg today?' However, in search they will say: 'Johannesburg weather.' So they speak in a far more abbreviated form. This is giving rise to what we call 'Googlish'.”

Voice search technology will become increasingly important in the African context, Schalkwyk notes, because it overcomes the need for a person to be fully literate to access the Internet.

“Most people in Africa will access the Internet via their mobile phones. Essentially, it will be voice that will bridge the digital divide.”

Schalkwyk sees the future of voice technology as one that is continuously evolving, and believes greater applications will be found for it.

“We see many other applications that can be created using voice; these include translations of standard signs and directions to eventually having devices, more than likely Android mobile phones, that will be able to translate direct conversations.”
