Chris's Wiki :: blog/tech/RISCVServersNotSoon
I don't expect to see competitive RISC-V servers any time soon
June 22, 2023
Recently on the Fediverse, Dan Luu was dubious about a prediction
that RISC-V would take over in datacenters in the next 5 to 10 years (here's the
EETimes article being quoted from).
Much like Dan Luu, I was skeptical, considering that under nearly
ideal circumstances ARM didn't make much of a dent. But let's take
this from the top, and ask what RISC-V would need and when if it's
going to do this.
(This is implicitly 64-bit RISC-V. No one is going to put 32-bit
RISC-V into datacenters, much less have it take over.)
Obviously if RISC-V is going to take over in datacenters, there
need to be RISC-V servers that people can buy, including off the
shelf. This is especially the case for non-cloud datacenter usage
of servers; only the cloud players and a few other big places design
and manufacture their own servers. These servers need suitably good
RISC-V CPUs and chipsets (either as systems on a chip or separately).
Apart from performance, these systems need multi-socket support,
lots of PCIe lanes, support for large amounts of ECC RAM in modern
memory standards, and so on.
Given that moving to RISC-V will make people's lives harder, these servers and their CPUs
need to be unambiguously better than the x86 (and ARM) server systems
available at the same time. Given that domination has a lead time, these servers need to be available in
quantity and with proven quality before that five (or ten) year deadline,
probably years before.
(Realistically the first generation of RISC-V datacenter servers
would probably not take over, unless they were amazing marvels that
utterly eclipsed the competition. I would expect it to need two or three
generations, just to prove things, shake issues out, and convince people
that these servers really are sufficiently better than the competition.)
These RISC-V datacenter servers will also need proven operating
systems and other software to run, and that software will need
proven and good compilers and other tools to build it. Shaking the
architecture specific bugs out of compilers and operating systems
takes time, probably years of increasingly serious usage. The
developers of all of this software will need RISC-V hardware to use
for this, and this hardware mostly can't be early versions of those
datacenter servers (datacenter servers are too loud, too large, and
too expensive for many people). Some developers will want to use
RISC-V hardware as their daily desktop, but I suspect many others
will want a quiet mini-sized box they can put in the corner (and
use over the network). There will also need to be early servers
that can be used to set up the infrastructure of open source (Linux)
development, for things like dedicated builders for Debian and other
large projects (GCC, clang, Rust, the Linux kernel, etc), CI/CD
build servers that smaller open source projects can use, and so on.
(As a practical matter, the quality of compiler optimization, kernel
tuning, and so on has a significant effect on the realized CPU
performance of anything. Bringing all of this optimization up to
speed to take advantage of the raw capabilities of good RISC-V CPUs
will take (more) time.)
All of this will take money both literally, for hardware, and
possibly figuratively, for people's time. The amount of time this
RISC-V bringup takes will be influenced by how much actual money
is spent on it. If interested companies wait for Linux developers
and other parties to spend their own money and time on buying
developer hardware and working on RISC-V kernels, software, and
Linux distributions, it's probably going to take quite a while. If
interested companies spend money, they can to some extent accelerate
this process.
At the moment, RISC-V has very little of this as far as I know
(based partly on replies to my Fediverse post about this). RISC-V is
probably in a somewhat better place than ARM64 was a decade ago
(partly because RISC-V people have learned lessons from ARM's
experiences), but it's not all that far along. On top of that, even
ARM is not doing all that well in competition with x86. I believe
that the only competitive ARM64 servers available today are the
proprietary ones Amazon made for AWS, and while those see real usage
(as covered in comments on my earlier entry), they haven't exactly taken
over even AWS.
Given all of the steps between current reality and the prediction,
I believe there's no way it can be reached in five years. Ten years
might be possible, but it feels like an aggressive timeline that
needs a lot of fast development. I'd want to see the first generation
of RISC-V datacenter servers in five years, which means we need
high-performance RISC-V CPUs in only a couple of years, along with
developer hardware (probably in large quantity in order to kickstart
a lot of development that will be necessary if those first generation
datacenter servers are going to sell to anyone in any quantity).
(If we have the first generation datacenter servers in five years, that
gives two years to get a better second or even third generation out,
a year for people to come to trust those servers, and then two years
to ramp up purchases to take over the installed base at year ten. If
people keep datacenter servers long enough that RISC-V servers need to
be dominating sales well before year eight, the timeline gets worse and
thus less plausible.)
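
To make the arithmetic in that timeline explicit, here is a minimal sketch
in Python that adds up the phases. The phase lengths are the ones given in
the paragraph above; the names PHASES, timeline and total_years are just
illustrative, and the phases are assumed to run strictly one after another
with no overlap.

```python
# Phase lengths in years, taken from the timeline paragraph above.
# Assumption: phases run back to back, with no overlap between them.
PHASES = [
    ("first generation datacenter servers available", 5),
    ("better second or third generation out", 2),
    ("people come to trust those servers", 1),
    ("purchases ramp up to take over the installed base", 2),
]


def timeline(phases):
    """Return (label, year_reached) pairs by accumulating phase lengths."""
    year = 0
    reached = []
    for label, length in phases:
        year += length
        reached.append((label, year))
    return reached


def total_years(phases):
    """Total length of the timeline in years."""
    return sum(length for _, length in phases)


if __name__ == "__main__":
    for label, year in timeline(PHASES):
        print(f"by year {year:2d}: {label}")
    print(f"total: {total_years(PHASES)} years")
```

Under these assumptions the ramp-up in purchases only starts at year eight,
which is why anything that requires RISC-V to dominate sales before then
pushes the whole schedule earlier.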