Amazon Web Services goes big on AI

By Paula Gilbert, ITWeb telecoms editor.

Las Vegas, 01 Dec 2016

Amazon Web Services has thousands of people dedicated solely to artificial intelligence technology.

Amazon Web Services (AWS) has announced a big push into the artificial intelligence (AI) space, with three new services that enable image recognition, text to speech and conversational applications.

Amazon Lex, Amazon Polly and Amazon Rekognition were unveiled this week by AWS CEO Andy Jassy at the AWS re:Invent Conference in Las Vegas.

The new services enable developers to build apps that can understand natural language, turn text into lifelike speech, have conversations using voice or text, analyse images and recognise faces, objects and scenes.

"A lot of companies don't realise the heritage that Amazon has in the machine learning space, but actually AWS has a very deep heritage in this space. We have thousands of people dedicated solely to AI in our business," Jassy said.

"Until now, very few developers have been able to build, deploy and broadly scale apps with AI capabilities because doing so required access to vast amounts of data, and specialised expertise in machine learning and neural networks," AWS explains in a statement.

New AI services

Amazon Lex is a new service for building conversational interfaces using voice and text. It is built on the same automatic speech recognition and natural language understanding technology that powers Amazon's virtual assistant, Alexa.

AWS says the service allows developers to build and test bots or conversational apps that perform automated tasks like checking the weather or booking flights. Bots built using Amazon Lex can be used anywhere: from Web applications, to chat and messenger apps like Slack and Facebook Messenger, or through voice in apps on mobile or connected devices.

Amazon Polly uses text-to-speech technology that will allow developers to add natural-sounding speech capabilities to existing applications like newsreaders and e-learning platforms, or create entirely new categories of speech-enabled products. Amazon Polly has 47 lifelike voices, both male and female, and works in 24 languages, with a variety of accents available so that applications can appeal to users around the globe.

The Washington Post has already expressed interest in using Amazon Polly to provide readers with audio versions of its stories. GoAnimate, a cloud-based animated video creation platform, says its customers can also use the technology to create voices for the animated characters they build on its platform.

Amazon Rekognition can be used to add image analysis to applications, using deep learning-based image and face recognition. It can locate faces within images and detect attributes, such as whether the face is smiling or the eyes are open. It also supports advanced facial analysis functionalities, such as face comparison and facial search, which can be useful for security applications.

"Using Rekognition, developers can build an application that measures the likelihood that faces in two images are of the same person, thereby being able to verify a user against a reference photo in near real-time," AWS explains.
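As a rough illustration of the verification step AWS describes, the Python sketch below decides whether a face comparison result counts as a match. It assumes the result is a dictionary with a "FaceMatches" list whose entries carry a "Similarity" score from 0 to 100, which is how the Rekognition face comparison response is generally documented. The sample data, the 90% threshold and the helper name is_same_person are hypothetical, and no AWS call is made.

```python
def is_same_person(response, min_similarity=90.0):
    """Return True if any matched face meets the similarity threshold.

    `response` is assumed to look like a face comparison result:
    {"FaceMatches": [{"Similarity": 97.3, ...}, ...], ...}
    """
    matches = response.get("FaceMatches", [])
    return any(match["Similarity"] >= min_similarity for match in matches)


# Hypothetical sample results, for illustration only.
SAMPLE_MATCH = {"FaceMatches": [{"Similarity": 97.3}], "UnmatchedFaces": []}
SAMPLE_NO_MATCH = {"FaceMatches": [], "UnmatchedFaces": [{}]}

print(is_same_person(SAMPLE_MATCH))
print(is_same_person(SAMPLE_NO_MATCH))
```

The right threshold depends on the application: a security check would typically demand a higher similarity than a photo-sorting feature.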

The service can also be used to automatically identify objects and scenes, such as vehicles, pets or furniture, and provides a confidence score that lets developers tag images so that application users can search for specific images using keywords.
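To make the tagging idea concrete, the Python sketch below builds a keyword index from label results. It assumes each result is a dictionary with a "Labels" list whose entries carry a "Name" and a "Confidence" from 0 to 100, which is how the Rekognition label detection response is generally documented. The sample data, file names, 80% threshold and helper name build_keyword_index are hypothetical, and no AWS call is made.

```python
from collections import defaultdict


def build_keyword_index(results, min_confidence=80.0):
    """Map each lowercase label name to the sorted images tagged with it.

    Labels whose confidence falls below `min_confidence` are ignored.
    """
    index = defaultdict(set)
    for image_name, response in results.items():
        for label in response.get("Labels", []):
            if label["Confidence"] >= min_confidence:
                index[label["Name"].lower()].add(image_name)
    return {keyword: sorted(names) for keyword, names in index.items()}


# Hypothetical label results for two images, for illustration only.
SAMPLE_RESULTS = {
    "garage.jpg": {"Labels": [
        {"Name": "Vehicle", "Confidence": 98.1},
        {"Name": "Furniture", "Confidence": 41.5},
    ]},
    "lounge.jpg": {"Labels": [
        {"Name": "Furniture", "Confidence": 93.4},
        {"Name": "Pet", "Confidence": 88.0},
    ]},
}

keyword_index = build_keyword_index(SAMPLE_RESULTS)
print(keyword_index.get("furniture", []))
```

With the default threshold, a search for "furniture" returns only lounge.jpg, because the low-confidence furniture label on garage.jpg is discarded.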

In addition to the new AI services, AWS recently announced it is investing significantly in MXNet, an open source distributed deep learning framework initially developed by Carnegie Mellon University and other top universities. The investment takes the form of contributing code and improving the developer experience.

MXNet will enable machine learning scientists to build scalable deep learning models that can significantly reduce the training time for their applications.