As demand for generative AI grows, cloud service providers such as Microsoft, Google, and AWS, along with large language model (LLM) providers such as OpenAI, have all reportedly considered developing their own custom chips for AI workloads.
Speculation about some of those plans has intensified. While OpenAI is rumored to be looking to acquire a firm to further its chip-design plans, Microsoft is reportedly working with AMD to produce a custom AI processor of its own.
Google and AWS have both already developed their own chips for AI workloads: Google's Tensor Processing Units (TPUs) and AWS's Trainium and Inferentia chips.
But what factors are driving these companies to make their own chips? The answer, according to analysts and experts, lies in the cost of processing generative AI queries and the efficiency of currently available chips, mainly graphics processing units (GPUs).
“GPUs are probably not the most efficient processor for generative AI workloads and custom silicon might help their cause,” said Nina Turner, research manager at IDC.
GPUs are general-purpose devices that happen to be hyper-efficient at matrix inversion, the essential math of AI, noted Dan Hutcheson, vice chairman of TechInsights.
“They are very expensive to run. I would think these companies are going after a silicon processor architecture that’s optimized for their workloads, which would attack the cost issues,” Hutcheson said.
Using custom silicon tailored to generative AI could help bring down the cost of serving those queries. OpenAI reportedly spends approximately $694,444 per day, or about 36 cents per query, to run its generative AI workloads.
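Taken at face value, those two reported figures imply a rough daily query volume and annual run rate. The snippet below is only a back-of-the-envelope sketch built from the numbers cited above; they are analyst estimates, not official OpenAI figures.

```python
# Back-of-the-envelope check using only the reported estimates cited above.
DAILY_SPEND_USD = 694_444      # reported daily compute spend (estimate)
COST_PER_QUERY_USD = 0.36      # reported cost per query (estimate)

queries_per_day = DAILY_SPEND_USD / COST_PER_QUERY_USD
annual_spend_usd = DAILY_SPEND_USD * 365

print(f"Implied queries per day: {queries_per_day:,.0f}")          # ~1.9 million
print(f"Implied annual compute spend: ${annual_spend_usd:,.0f}")   # ~$253 million
```

Even at these rough numbers, the annualized compute bill runs into the hundreds of millions of dollars, which is the cost pressure analysts say custom silicon is meant to ease.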
“AI workloads don’t exclusively require GPUs,” Turner said.
Other advantages of custom silicon include control over access to chips and designing elements specifically for LLMs to improve query speed, Turner said.
Developing custom chips is not easy
Some analysts also likened the move to design custom silicon to Apple’s strategy of producing chips for its own devices. Just as Apple switched from general-purpose processors to custom silicon to improve the performance of its devices, generative AI service providers are looking to specialize their chip architectures, said Glenn O’Donnell, research director at Forrester.
“Despite Nvidia’s GPUs being so wildly popular right now, they too are…
Source: www.networkworld.com