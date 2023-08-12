

Chips as 'true differentiation'

In the long run, Dekate said, Amazon's custom silicon could give it an edge in generative AI. "I think the true differentiation is the technical capabilities that they're bringing to bear," he said. "Because guess what? Microsoft does not have Trainium or Inferentia."

AWS quietly started production of custom silicon back in 2013 with a piece of specialized hardware called Nitro. It's now the highest-volume AWS chip. Amazon told CNBC there is at least one in every AWS server, with a total of more than 20 million in use.

AWS started production of custom silicon back in 2013 with this piece of specialized hardware called Nitro. Amazon told CNBC in August that Nitro is now the highest volume AWS chip, with at least one in every AWS server and a total of more than 20 million in use. Courtesy Amazon

In 2015, Amazon bought Israeli chip startup Annapurna Labs. Then in 2018, Amazon launched its Arm-based server chip, Graviton, a rival to x86 CPUs from giants like AMD and Intel.

"Probably high single-digit to maybe 10% of total server sales are Arm, and a good chunk of those are going to be Amazon. So on the CPU side, they've done quite well," said Stacy Rasgon, senior analyst at Bernstein Research.

Also in 2018, Amazon launched its AI-focused chips. That came two years after Google announced its first Tensor Processing Unit, or TPU. Microsoft has yet to announce the Athena AI chip it's been working on, reportedly in partnership with AMD.

CNBC got a behind-the-scenes tour of Amazon's chip lab in Austin, Texas, where Trainium and Inferentia are developed and tested. VP of product Matt Wood explained what both chips are for.

"Machine learning breaks down into these two different stages. So you train the machine learning models and then you run inference against those trained models," Wood said. "Trainium provides about 50% improvement in terms of price performance relative to any other way of training machine learning models on AWS."

Trainium first came on the market in 2021, following the 2019 release of Inferentia, which is now on its second generation. Inferentia allows customers "to deliver very, very low-cost, high-throughput, low-latency, machine learning inference, which is all the predictions of when you type in a prompt into your generative AI model, that's where all that gets processed to give you the response," Wood said.

For now, however, Nvidia's GPUs are still king when it comes to training models. In July, AWS launched new AI acceleration hardware powered by Nvidia H100s.

"Nvidia chips have a massive software ecosystem that's been built up around them over the last like 15 years that nobody else has," Rasgon said. "The big winner from AI right now is Nvidia."

Amazon's custom chips, from left to right, Inferentia, Trainium and Graviton are shown at Amazon's Seattle headquarters on July 13, 2023. Joseph Huerta

Leveraging cloud dominance

AWS' cloud dominance, however, is a big differentiator for Amazon.

"Amazon does not need to win headlines. Amazon already has a really strong cloud install base. All they need to do is to figure out how to enable their existing customers to expand into value creation motions using generative AI," Dekate said.

When choosing between Amazon, Google, and Microsoft for generative AI, there are millions of AWS customers who may be drawn to Amazon because they're already familiar with it, running other applications and storing their data there.

"It's a question of velocity. How quickly can these companies move to develop these generative AI applications is driven by starting first on the data they have in AWS and using compute and machine learning tools that we provide," explained Mai-Lan Tomsen Bukovec, VP of technology at AWS.

AWS is the world's biggest cloud computing provider, with 40% of the market share in 2022, according to technology industry researcher Gartner. Although operating income has been down year-over-year for three quarters in a row, AWS still accounted for 70% of Amazon's overall $7.7 billion operating profit in the second quarter. AWS' operating margins have historically been far wider than those at Google Cloud.

AWS also has a growing portfolio of developer tools focused on generative AI.

"Let's rewind the clock even before ChatGPT. It's not like after that happened, suddenly we hurried and came up with a plan because you can't engineer a chip in that quick a time, let alone you can't build a Bedrock service in a matter of 2 to 3 months," said Swami Sivasubramanian, AWS' VP of database, analytics and machine learning.

Bedrock gives AWS customers access to large language models made by Anthropic, Stability AI, AI21 Labs and Amazon's own Titan.

"We don't believe that one model is going to rule the world, and we want our customers to have the state-of-the-art models from multiple providers because they are going to pick the right tool for the right job," Sivasubramanian said.

An Amazon employee works on custom AI chips, in a jacket branded with AWS' chip Inferentia, at the AWS chip lab in Austin, Texas, on July 25, 2023. Katie Tarasov