Amazon Web Services (AWS) yesterday announced it will start deploying artificial intelligence (AI) factories for organisations.
An AI factory is the integrated systems, processes and infrastructure that enable an organisation to build, deploy, scale and continuously improve AI models and AI-powered applications – much like a traditional factory produces physical goods.
According to AWS, its AI factories combine dedicated infrastructure built on NVIDIA’s computing platform with AWS Trainium chips, AWS AI services and AWS networking.
To deploy the AI factories, organisations provide the data centre, network connectivity and power, while AWS handles the deployment, integration and ongoing management of the infrastructure.
The company says the factories are being pitched at governments and public sector entities that have strict data sovereignty and security requirements. These organisations can train and run their large models on their own proprietary data.
The announcement on Tuesday came after AWS and Humain, a Saudi AI company owned by the kingdom’s Public Investment Fund, entered a partnership in November for the deployment of 150 000 AI accelerators.
These include both NVIDIA’s GB300 AI infrastructure and AWS’s Trainium chips, which will be deployed in an AI zone in Riyadh. The companies said they will together spend $5 billion on AI infrastructure, AWS services, and AI training and development in Saudi Arabia.
The companies intend to deliver AI compute and services from Saudi Arabia to organisations in the rest of the world.
AWS CEO Matt Garman, delivering his keynote address at the AWS re:Invent conference in Las Vegas on Tuesday, said other governments and public sector organisations had also expressed an interest in the concept.
“And so, we sat back and asked ourselves, could we deliver this type of AI zone to a broader set of customers, maybe even something that could leverage customers’ existing data centres? This led to the launch of the AWS AI factories.
“This enables customers to deploy dedicated AI infrastructure from AWS in their own data centres for exclusive use. AWS AI factories operate like a private AWS region, letting customers leverage their own data centre space and the power capacity they’ve already acquired.”
It also gives organisations access to AWS AI infrastructure, such as Trainium UltraServers and NVIDIA GPUs, as well as services like SageMaker for building and training AI models, and Bedrock for building generative AI (GenAI) applications.
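For illustration, here is a minimal sketch of what building on Bedrock looks like from a developer’s perspective, using boto3’s Converse API in Python. The model ID, region and prompt are placeholders, not details from the announcement:

    import boto3

    # Minimal sketch: invoking a foundation model on Amazon Bedrock via
    # boto3's Converse API. In an AI factory deployment, such calls would
    # be served from the customer's dedicated environment; the model ID
    # and region below are illustrative placeholders.
    client = boto3.client("bedrock-runtime", region_name="us-east-1")

    response = client.converse(
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # placeholder model ID
        messages=[
            {"role": "user", "content": [{"text": "Summarise this document."}]}
        ],
        inferenceConfig={"maxTokens": 512, "temperature": 0.2},
    )

    print(response["output"]["message"]["content"][0]["text"])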
Garman said all the traffic for the latest-generation Anthropic Claude models on Amazon Bedrock was running on Trainium chips. AWS has deployed over one million Trainium chips, he said, and Trainium now represents a multibillion-dollar business on its own.
AWS has also built systems purpose-designed around Trainium, and announced the general availability of its Trainium 3 UltraServers.
The servers use the first 3nm AI chip in the AWS Cloud. Garman said the UltraServers deliver 4.4 times more compute, 3.9 times the memory bandwidth and five times more AI tokens per megawatt of power than the previous generation.
The company’s largest Trainium 3 UltraServers combine 144 Trainium 3 chips connected by the company’s custom neuron switches, delivering 362 petaflops of compute and over 700TB per second of aggregate bandwidth.
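To put those aggregate figures in per-chip terms, a quick back-of-the-envelope calculation (illustrative arithmetic, not figures quoted by AWS):

    # Back-of-the-envelope arithmetic for the quoted UltraServer figures.
    chips = 144            # Trainium 3 chips per UltraServer
    pflops = 362           # quoted aggregate compute, petaflops
    bandwidth_tbps = 700   # quoted aggregate bandwidth, TB per second

    print(f"~{pflops / chips:.2f} petaflops per chip")     # ~2.51
    print(f"~{bandwidth_tbps / chips:.2f} TB/s per chip")  # ~4.86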