The Mechanics of GPU Depreciation
Analytics India Magazine (Supreeth Koundinya)

One may forgive, but will never forget what happened in 2008 — the housing collapse that blindsided Wall Street and made Michael Burry, the contrarian hedge-fund manager who predicted it, impossible to ignore.
Recently, Burry flagged that hyperscalers may be overstating the useful life of their GPU fleets, effectively understating depreciation and inflating operating earnings.
To evaluate that claim, we need to understand how GPU lifespan is actually determined in practice.
How Hyperscalers Depreciate GPUs
To begin with, when hyperscalers like Amazon, Google, Microsoft, Meta or Oracle purchase GPUs, they don’t treat the full cost as an immediate expense.
They spread that cost over the number of years they expect the hardware to remain economically sound. A $5 billion GPU investment depreciated over three years hits the income statement much harder than the same $5 billion spread over six.
The shorter the assumed life, the larger the annual depreciation charge — and the lower the reported operating profit.
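To make the arithmetic concrete, here is a minimal straight-line depreciation sketch in Python. The $5 billion figure mirrors the example above; the zero salvage value and the simple straight-line schedule are simplifying assumptions, not any hyperscaler's actual accounting policy.

```python
# A minimal sketch of straight-line depreciation for a hypothetical $5B GPU
# purchase. Salvage value is assumed to be zero for simplicity; real
# schedules vary by company and asset class.

def annual_depreciation(purchase_cost: float, useful_life_years: int,
                        salvage_value: float = 0.0) -> float:
    """Straight-line depreciation: spread the cost evenly over the useful life."""
    return (purchase_cost - salvage_value) / useful_life_years

cost = 5_000_000_000  # $5B GPU investment
for life_years in (3, 6):
    charge = annual_depreciation(cost, life_years)
    print(f"{life_years}-year life: ${charge / 1e9:.2f}B depreciation per year")

# 3-year life: roughly $1.67B per year vs 6-year life: roughly $0.83B per year.
# The longer the assumed life, the smaller the annual hit to operating profit.
```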
In a conversation with AIM, Jordan Nanos, a member of technical staff at the market research firm SemiAnalysis, shed light on the mechanics of GPU depreciation.
For one, there is a prevailing narrative in the industry that when newer GPUs become available for intensive workloads, such as training, older GPUs are relegated to less demanding tasks, like inference. That narrative isn't inaccurate, but it is incomplete.
Yes, older hardware often moves into inference clusters. But inference itself is no longer a low-intensity workload. “There are many companies running inference on GB200 NVL72 and GB300 NVL72 [NVIDIA’s latest high-density Blackwell inference and training systems] right now,” noted Nanos, pointing out that real-time, high-volume inference often prefers the newest accelerators.
At the same time, older GPUs remain economically useful for far longer than critics assume. Many organisations still train models on previous-generation NVIDIA hardware such as the H100 (2022) and H200 (2023).
These chips no longer lead on frontier performance, but their overall efficiency relative to cost remains strong when compared with the newest Blackwell-class GPUs.
Why Older GPUs Still Earn Money
Ultimately, the decision isn’t about owning the fastest chip — it’s about which GPU delivers the best performance per dollar.
On that metric, older hardware often stays competitive enough to justify years of continued use in training clusters.
SemiAnalysis notes that the marginal cost of running a modern GPU, once you account for power, cooling and maintenance, bottoms out around $0.30–$0.40 per GPU-hour in many hyperscale environments.
As long as a GPU can earn more than that, it stays alive in the cluster. Once its achievable revenue per hour drops below that threshold, it becomes economically “dead,” even if it is still fully functional.
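A hedged sketch of that threshold logic, assuming the midpoint of SemiAnalysis's $0.30–$0.40 range as the operating cost; the rental rates in the example are illustrative placeholders, not market quotes.

```python
# Illustrative "economic death" test: a GPU stays in the cluster only while
# its achievable revenue per hour exceeds its marginal operating cost.
# The $0.35/GPU-hour cost and the sample rental rates are assumptions.

def is_economically_alive(revenue_per_gpu_hour: float,
                          operating_cost_per_gpu_hour: float = 0.35) -> bool:
    """Keep running the GPU while it earns more than it costs to operate."""
    return revenue_per_gpu_hour > operating_cost_per_gpu_hour

for gpu, hourly_rate in [("H100", 1.80), ("A100", 0.90), ("V100", 0.25)]:
    status = "keep running" if is_economically_alive(hourly_rate) else "economically dead"
    print(f"{gpu} at ${hourly_rate:.2f}/hr: {status}")
```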
Nanos lays out the range of workloads where older silicon still delivers strong returns: pre-training, post-training and random experiments; online inference with high interactivity; and offline inference with high throughput.
When GPUs Actually Get Retired
In short, as long as a GPU earns more than it costs to operate, operators keep using it.
So, they aren’t retired because they “wear out” or because a new model appears. “GPUs get replaced when the power and floorspace in the datacentre can be used for something else,” said Nanos.
But apart from “economic death,” what actually triggers a cloud provider to retire or repurpose a GPU? “The biggest signal is that the contract ends,” said Nanos.
He stated that a huge portion of high-end GPUs, such as NVIDIA's GB200 and GB300 NVL72, are tied to fixed-term rental agreements.
Many run on five-year contracts with enterprise customers, AI labs, or model-training startups.
“If that contract expires, and there is no one interested in renting the GPUs at the current price (which once again needs to be greater than the cost to keep the lights on), a cloud provider will decommission the GPU systems and replace them with something else,” said Nanos.
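The retirement trigger Nanos describes can be summarised as a simple decision rule. The data structure and field names below are hypothetical, used only to illustrate the two conditions: contract expiry and a market rate below the cost of keeping the lights on.

```python
# Illustrative decommissioning rule: retire a fleet only when its contract
# has ended AND nobody will rent it at a rate above its operating cost.
from dataclasses import dataclass

@dataclass
class GpuFleet:
    contract_active: bool        # is a fixed-term rental agreement still running?
    market_rate_per_hour: float  # best achievable rental price today, per GPU-hour
    opex_per_hour: float         # power, cooling and maintenance, per GPU-hour

def should_decommission(fleet: GpuFleet) -> bool:
    """Retire only when the contract has ended and the market rate is at or below opex."""
    if fleet.contract_active:
        return False
    return fleet.market_rate_per_hour <= fleet.opex_per_hour

print(should_decommission(GpuFleet(False, 0.30, 0.35)))  # True: decommission
print(should_decommission(GpuFleet(False, 0.60, 0.35)))  # False: keep renting it out
```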
Another factor extending GPU lifetimes is warranty coverage. Hyperscalers routinely negotiate five-year server warranties, with options that stretch even longer.
As long as support contracts remain active, OEMs continue supplying replacement parts, making it financially sensible to operate older accelerators well past the headline upgrade cycle.
Networking and storage vendors normalised this long ago: if customers keep paying for support, hardware keeps getting serviced. The same logic now applies to GPU fleets.
“Think of it like a car: the high end of the market might lease and upgrade their Benz every 2 years, while others drive their 20-year-old beaters for the price of gas and insurance,” said SemiAnalysis.
A clear example is NVIDIA’s V100 GPU. Launched in 2017, it still runs in production today — including in AWS p3.16xlarge instances and in secondary GPU marketplaces. NVIDIA shipped spare parts for over five years, giving hyperscalers enough stock to keep these systems alive for nearly a decade.
And eventually, these GPUs are used until the space they occupy can be given over to assets that generate even more revenue, which will most likely be newer-generation GPUs.
There is also a flip side to the thesis. “If performance improvements are not realised for new GPUs, the market will demand the old ones for longer,” said Nanos, which points to an even longer life cycle for existing GPUs.
However, this is under the assumption that the underlying workload patterns don’t fundamentally change.
Exceptions That Can Shorten Lifetimes
Software evolution can also shorten the lifespan of older clusters — assuming the hardware roadmap stays broadly constant and it’s the software that moves first.
If new model architectures begin exploiting features that only exist in the latest GPUs, then older systems are simply left behind.
Certain parallelism schemes in both training and inference would be technically impossible on legacy clusters, effectively forcing an upgrade even if the hardware is still functional.
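As a rough illustration of that kind of hardware gating, a training stack might probe the GPU's compute capability before enabling a newer numeric format. The FP8 threshold used here (compute capability 8.9 and above) is an assumption for the sketch, and the FP16 fallback is likewise illustrative.

```python
# Hedged sketch: newer model recipes may rely on low-precision formats (e.g.
# FP8) that only recent GPU generations expose. Older clusters fall back to a
# slower path, or cannot run the recipe at all if it assumes the new hardware.
import torch

def supports_fp8() -> bool:
    """Assumed threshold: compute capability 8.9+ for FP8 tensor-core support."""
    if not torch.cuda.is_available():
        return False
    major, minor = torch.cuda.get_device_capability()
    return (major, minor) >= (8, 9)

precision = "fp8" if supports_fp8() else "fp16"
print(f"Selected training precision: {precision}")
```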
Citing another factor, Nanos said, “As GPU servers get older, more parts start failing, which leads to higher operational expense. At some point the cost of maintenance + power is more than the market price for the GPUs.”
Higher-density hardware also raises cooling demands, adding pressure to replace older clusters when newer systems can extract more revenue from the same power and cooling footprint.
Burry’s warning matters, but the reality is simpler: GPUs die when their economics do. If older chips still out-earn their operating costs, hyperscalers will keep them running; accounting debates won’t change that.