General

@AGS
5 months ago

TensorRT-LLM: The Software Update That Breaks the ‘Accounting Fraud’ Narrative

A lot of AI bear theses right now sound sophisticated… until you realize they’re using old mental models on a very new ecosystem.

The argument:

“AI infrastructure companies are extending GPU useful lives just to boost earnings. That’s basically accounting fraud.”

That would be a strong point, if GPUs were static bricks.

They’re not.

Nvidia’s real moat is hardware + CUDA + a ruthless software stack on top.

Software like TensorRT-LLM can roughly double inference performance on existing GPUs. Same chip, same rack, same power, just a smarter stack.

And Nvidia designs this to be backward compatible.

So “old” GPUs don’t just sit there decaying; they keep getting pulled forward by software updates.

If those updates:

  • Keep GPUs economically productive for 5 years instead of 3,

  • Increase revenue per GPU by boosting throughput,

  • And delay the point where a chip becomes uncompetitive,

then longer depreciation schedules can actually reflect reality, not hide it.
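The accounting point above can be sketched with toy numbers. Every figure here (GPU cost, revenue per year, the timing and size of the speedup) is a hypothetical assumption for illustration, not reported data:

```python
# Illustrative only: all dollar figures and the 2x speedup are
# hypothetical assumptions, not reported numbers.

GPU_COST = 30_000           # assumed purchase price per GPU, USD
BASE_REV_PER_YEAR = 15_000  # assumed inference revenue per GPU-year at launch throughput

def annual_depreciation(cost: float, useful_life_years: int) -> float:
    """Straight-line depreciation: equal expense in each year of useful life."""
    return cost / useful_life_years

def lifetime_revenue(years: int, speedup_year: int = 3, speedup: float = 2.0) -> float:
    """Revenue per GPU over its life, assuming a software update
    (a TensorRT-LLM-style optimization) roughly doubles throughput,
    and hence revenue, from `speedup_year` onward."""
    return sum(
        BASE_REV_PER_YEAR * (speedup if year >= speedup_year else 1.0)
        for year in range(1, years + 1)
    )

for life in (3, 5):
    dep = annual_depreciation(GPU_COST, life)
    rev = lifetime_revenue(life)
    print(f"{life}-year life: ${dep:,.0f}/yr depreciation, "
          f"${rev:,.0f} lifetime revenue per GPU")
# → 3-year life: $10,000/yr depreciation, $60,000 lifetime revenue per GPU
# → 5-year life: $6,000/yr depreciation, $120,000 lifetime revenue per GPU
```

With these toy inputs, the 5-year schedule isn't just a lower annual expense; the GPU keeps earning (at doubled throughput) for two extra years, which is exactly the scenario a flat 3-year model never prices in.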

Calling that “fraud” without modeling the software is like calling Tesla’s OTA updates a crime because they make older cars better for longer.

Before we write off the whole AI infra space as “accounting games,” it’s worth asking:

  • Are we explicitly modeling software-driven performance gains on existing GPU fleets?

  • Or are we assuming a flat, 3-year life in a world where performance can jump 2× via a code push?

In an AI cycle where demand is exploding and software keeps stretching the life of installed GPUs, some “aggressive” depreciation assumptions might not be aggressive at all.

They might just be up to date.