At the start of the global COVID-19 pandemic in 2020, Microsoft officials were proactive in explaining how the company was trying to head off cloud-computing capacity issues in spite of increased demand. Two years later, Microsoft customers are still encountering capacity problems thanks to continuing supply-chain limitations. The difference now is officials aren't saying as much publicly about what's going on. Privately, however, they may be.
A report by The Information from last week indicates as much. Capacity issues are affecting Azure datacenters in Washington State, Europe and Asia due to supply chain issues. More than two dozen Azure datacenters in countries around the world are operating with limited server capacity, the report says. And capacity is expected to remain limited until at least early next year, it adds.
Among the customers and partners I've talked to, the reality that the Cloud isn't limitless and is bound by how rapidly big companies can scale out their datacenters has become almost common knowledge. Two or three years ago, the idea that the Cloud could be out of room seemed crazy. Now it's the new norm, not just for Microsoft but for AWS and Google.
Microsoft notifies at least some customers that there are quotas on how much cloud capacity the company can provide. Not surprisingly, given that its customer base is largely enterprise customers, Microsoft prioritizes its capacity for existing business users, which means newer and pending customers are more likely to hit quotas and hard barriers.
"I didn't think that Azure having capacity issues was news. I thought that thanks to all the jokes on social media, everyone took it for granted that the effects of the pandemic and cargo crises had crippled electronics supplies the world over," blogged Aidan Finn, a Microsoft Most Valuable Professional (MVP) and Principal Consultant for Innofactor Norway.
Microsoft officials said a couple of years ago that the company would be limited in adding more datacenters because of shortages of chips and servers -- a situation that it seemingly rectified later. A year ago, officials said the company was on pace to build 50 to 100 new datacenters annually (not datacenter regions, but actual physical datacenters). But supply chain uncertainties have continued to make fulfilling this goal challenging. It's not just chips and servers that are in short supply. Everything from power supplies to concrete also is hard to come by.
Couple these shortages with Microsoft deciding to prioritize spinning up cloud security services when needed to help with the war in Ukraine, and you've got a situation where even the best workload balancing plans can't produce capacity miracles.
Microsoft officials won't comment on the extent to which Azure's capacity is limited. But they do acknowledge that existing customers get first dibs. The official statement from a company spokesperson:
"Across the globe, we have seen unprecedented growth in the Cloud. With this surge, coupled with macro trends impacting the whole industry, we've taken steps to address customer increases in capacity while also expediting server deployment in our datacenters. Our priority remains ensuring business continuity for customers. In addition to managing and planning for growth, we actively load balance as needed. If it does become necessary to put capacity restrictions in place, we will first restrict trials and internal workloads to prioritize growth of existing customers."
So what can customers do to try to head off the impact of capacity issues?
For one, customers should be looking at least two to three years out in planning their needs and Cloud spending. And they should talk to their Microsoft representatives about what Azure will be able to handle during that time period. They also should keep their eyes on quotas and anticipate any scale-out needs six months in advance, according to people I've asked about this.
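To make the six-month rule of thumb concrete, here is a minimal sketch of how a team might estimate when it will hit a quota. It assumes compound monthly growth, and every name and number in it (the function names, the vCPU figures, the 5% growth rate) is hypothetical and for illustration only. It does not call any Azure API; current usage and quota values would have to come from whatever reporting a customer already has.

```python
import math


def months_until_quota(current_usage: float, quota: float, monthly_growth: float) -> float:
    """Estimate months until usage reaches the quota.

    Assumes usage compounds at a fixed monthly rate (an illustrative
    assumption, not a model anyone in the article endorses).
    """
    if current_usage >= quota:
        return 0.0  # already at or over the quota
    if current_usage <= 0 or monthly_growth <= 0:
        return math.inf  # no usage or no growth: the quota is never reached
    return math.log(quota / current_usage) / math.log(1 + monthly_growth)


def needs_quota_request(
    current_usage: float,
    quota: float,
    monthly_growth: float,
    lead_time_months: float = 6.0,  # the six-month horizon mentioned above
) -> bool:
    """Return True if the quota would be hit within the planning lead time."""
    return months_until_quota(current_usage, quota, monthly_growth) <= lead_time_months


if __name__ == "__main__":
    # Hypothetical figures: 800 of 1,000 vCPUs in use, growing 5% a month.
    print(needs_quota_request(800, 1000, 0.05))
    # Hypothetical figures: 400 of 1,000 vCPUs in use, same growth rate.
    print(needs_quota_request(400, 1000, 0.05))
```

With these made-up figures, the first workload would cross its quota in under five months, so it falls inside the six-month window and is worth raising with a Microsoft representative now. The second has well over a year of headroom.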
While some Microsoft partners advise clients to look at legacy virtual machine SKUs as a possible way to get around capacity issues, others say customers should try to use the latest SKUs even though they require newer and possibly less readily available hardware. And in cases where customers are big enough, multiregional architectures spanning multiple availability zones may help with supply -- as long as companies can afford to pay for it.
Finn also suggested possibly disabling auto-scaling and reserving VM capacity in advance.