AI Chip Shortages Continue, But There May Be An End In Sight

While GPUs are in high demand, they still need high-performance memory chips for AI apps. The market is tight for both - for now.

Credit: Shutterstock/Javier Pardina

As the adoption of generative artificial intelligence (genAI) continues to soar, the infrastructure to support that growth is currently running into a supply and demand bottleneck.

Sixty-six percent of enterprises worldwide said they would be investing in genAI over the next 18 months, according to IDC research. Among organizations indicating genAI will see increased IT spending in 2024, infrastructure will account for 46% of the total spend. The problem: a key piece of hardware needed to build out that AI infrastructure is in short supply.

The breakneck pace of AI adoption over the past two years has strained the industry's ability to supply the special high-performance chips needed to run the process-intensive operations of genAI and AI in general. Most of the focus on processor shortages has been on the exploding demand for Nvidia GPUs and alternatives from various chip designers such as AMD, Intel, and the hyperscale datacenter operators, according to Benjamin Lee, a professor in the Department of Computer and Information Science at the University of Pennsylvania.

"There has been much less attention focused on exploding demand for high-bandwidth memory chips, which are fabricated in Korea-based foundries run by SK Hynix," Lee said.

Last week, SK Hynix said its high-bandwidth memory (HBM) products, which are needed in combination with high-performance GPUs to handle AI processing requirements, are almost fully booked through 2025 because of high demand. The price of HBMs has also recently increased by 5% to 10%, driven by significant premiums and increased capacity needs for AI chips, according to market research firm TrendForce.

HBM chips are expected to account for more than 20% of the total DRAM market value starting in 2024, potentially exceeding 30% by 2025, according to TrendForce Senior Research Vice President Avril Wu. "Not all major suppliers have passed customer qualifications for [high-performance HBM], leading buyers to accept higher prices to secure stable and quality supplies," Wu said in a research report.

Why GPUs need high-bandwidth memory

Without HBM chips, a data center server's memory system would be unable to keep up with a high-performance processor, such as a GPU, according to Lee. HBMs are what supply GPUs with the data they process. "Anyone who purchases a GPU for AI computation will also need high-bandwidth memory," Lee said.

"In other words, high-performance GPUs would be poorly utilized and often sit idle waiting for data transfers. In summary, high demand for SK Hynix memory chips is caused by high demand for Nvidia GPU chips and, to a lesser extent, associated with demand for alternative AI chips such as those from AMD, Intel, and others," he said.
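Lee's point about idle GPUs can be illustrated with a minimal roofline-style estimate, in which usable throughput is capped either by the processor's peak compute rate or by how fast memory can feed it. The sketch below is illustrative only: the peak compute figure, the two bandwidth figures, and the workload's operations-per-byte ratio are hypothetical round numbers, not specifications for any product named in this article.

```python
def attainable_tflops(peak_tflops, mem_bandwidth_tbps, flops_per_byte):
    """Roofline estimate: throughput is capped by compute or by memory traffic."""
    # TB/s multiplied by FLOP/byte gives TFLOP/s that memory can sustain.
    memory_bound_tflops = mem_bandwidth_tbps * flops_per_byte
    return min(peak_tflops, memory_bound_tflops)


def utilization(peak_tflops, mem_bandwidth_tbps, flops_per_byte):
    """Fraction of peak compute the processor can actually use."""
    return attainable_tflops(peak_tflops, mem_bandwidth_tbps, flops_per_byte) / peak_tflops


# Hypothetical accelerator: 1,000 TFLOP/s peak compute (assumed, illustrative).
PEAK_TFLOPS = 1000.0
# Hypothetical workload: 100 floating-point operations per byte read from memory.
FLOPS_PER_BYTE = 100.0

# Two hypothetical memory systems, in terabytes per second (assumed values).
slow_memory = utilization(PEAK_TFLOPS, 0.5, FLOPS_PER_BYTE)  # lower-bandwidth memory
fast_memory = utilization(PEAK_TFLOPS, 3.0, FLOPS_PER_BYTE)  # higher-bandwidth memory

print(f"lower-bandwidth memory:  {slow_memory:.0%} of peak compute usable")
print(f"higher-bandwidth memory: {fast_memory:.0%} of peak compute usable")
```

Under these assumed numbers, the same processor goes from using about 5% of its peak compute to about 30% purely because memory delivers data faster, which is the dynamic driving HBM demand alongside GPU demand.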

"HBM is relatively new and picking up a strong momentum because of what HBM offers - more bandwidth and capacity," said Gartner analyst Gaurav Gupta. "It is different than what Nvidia and Intel sell. Other than SK Hynix, the situation for HBM is similar for other memory players. For Nvidia, I believe there are constraints, but more associated with packaging capacity for their chips with foundries."

While SK Hynix is reaching its supply limits, Samsung and Micron are ramping up HBM production and should be able to support the demand as the market becomes more distributed, according to Lee.

The current HBM shortages stem primarily from packaging capacity at TSMC, the exclusive supplier of chip-on-wafer-on-substrate (CoWoS) technology. According to Lee, TSMC is more than doubling its SoIC capacity and boosting capacity for CoWoS by more than 60%. "I expect the shortages to ease by the end of this year," he said.

At the same time, more packaging and foundry suppliers are coming online and qualifying their technology to support Nvidia, AMD, Broadcom, Amazon, and others using TSMC's chip packaging technology, according to Lee.

Nvidia, whose production represents about 70% of the global supply of AI server chips, is expected to generate $40 billion in revenue from GPU sales this year, according to Bloomberg analysts. By comparison, competitors Intel and AMD are expected to generate $500 million and $3.5 billion, respectively. But all three are ramping production as quickly as possible.

Nvidia is tackling the GPU supply shortage by increasing its CoWoS and HBM production capacities, according to TrendForce. "This proactive approach is expected to cut the current average delivery time of 40 weeks in half by the second quarter [of 2024], as new capacities start to come online," TrendForce said in its report. "This expansion aims to alleviate the supply chain bottlenecks that have hindered AI server availability due to GPU shortages."

Shane Rau, IDC's research vice president for computing semiconductors, said that while demand for AI chip capacity is very high, markets are adapting. "In the case of server-class GPUs, they're increasing supply of wafers, packaging, and memories. The increased supply is key because, due to their performance and programmability, server-class GPUs will remain the platform of choice for training and running large AI models."

Chipmakers scramble to meet the demand for AI

Global spending on AI-focused chips is expected to hit $53 billion this year - and to more than double over the next four years, according to Gartner Research. So it's no surprise that chipmakers are rolling out new processors as quickly as they can.

Intel has announced plans for chips aimed at powering AI functions, led by its Gaudi 3 processors, and has said its Xeon 6 processors, which can run retrieval-augmented generation (RAG) processes, will also be key. The Gaudi 3 accelerator was purpose-built for training and running the massive large language models (LLMs) that underpin genAI in data centers.

Meanwhile, AMD, in its most recent earnings call, touted its MI300 GPU for AI data center workloads. The chip also has good market traction, according to IDC Group Vice President Mario Morales, who added that the research firm is tracking more than 80 semiconductor vendors developing specialized chips for AI.

On the software side of the equation, LLM creators are also developing smaller models tailored for specific tasks; they require fewer processing resources and rely on local, proprietary data - unlike the massive, amorphous algorithms that boast hundreds of billions or even more than a trillion parameters.

Intel's strategy going forward is similar: it wants to enable genAI on every type of computing device, from laptops to smartphones. Intel's Xeon 6 processors will include some versions with onboard neural processing units (NPUs or "AI accelerators") for use in workstations, PCs and edge devices. Intel also claims its Xeon 6 processors will be good enough to run smaller, more customized LLMs.

Even so, without HBMs, those processors would likely struggle to keep up with genAI's high performance demands.
