At AWS’ annual re:Invent conference this week, CEO Adam Selipsky and other top executives announced new services and updates to attract burgeoning enterprise interest in generative AI systems and take on rivals including Microsoft, Oracle, Google, and IBM.
AWS, the largest cloud service provider by market share, is looking to capitalize on growing interest in generative AI. Enterprises are expected to invest $16 billion globally in generative AI and related technologies in 2023, according to a report from market research firm IDC.
This spending, which includes generative AI software as well as related infrastructure hardware and IT and business services, is expected to reach $143 billion in 2027, with a compound annual growth rate (CAGR) of 73.3%.
This exponential growth, according to IDC, is almost 13 times greater than the CAGR for worldwide IT spending over the same period.
Selipsky revealed that AWS’ generative AI strategy, like that of most of its rivals, particularly Oracle, is divided into three tiers: the first, or infrastructure, layer for training or developing large language models (LLMs); a middle layer, consisting of the foundation models required to build applications; and a third layer of applications that use the other two layers.
AWS beefs up infrastructure for generative AI
The cloud services provider, which has been adding infrastructure capabilities and chips since last year to support high-performance computing with enhanced energy efficiency, announced the latest iterations of its Graviton and Trainium chips this week.
The Graviton4 processor, according to AWS, provides up to 30% better compute performance, 50% more cores, and 75% more memory bandwidth than the current generation Graviton3 processors.
Trainium2, on the other hand, is designed to deliver up to four times faster training than first-generation Trainium chips.
These chips can be deployed in EC2 UltraClusters of up to 100,000 chips, making it possible to train foundation models (FMs) and LLMs in a fraction of the time it takes today, while delivering up to twice the energy efficiency of the previous generation, the company said.
Rivals Microsoft, Oracle, Google, and IBM all have been making their own chips for high-performance computing, including generative AI workloads.
While Microsoft recently released its Maia AI Accelerator and Azure Cobalt CPUs for model training workloads, Oracle has partnered with Ampere to produce its own chips, such as the Oracle Ampere A1; earlier, Oracle used Graviton chips for its AI infrastructure. Google’s cloud computing arm, Google Cloud, makes its own AI chips in the form of Tensor Processing Units (TPUs); its latest chip is the TPU v5e, which can be combined using Multislice technology. IBM, via its research division, has also been working on a chip, dubbed NorthPole, that can efficiently support generative…
2023-12-06 18:41:03
Original from www.infoworld.com