AWS vs Azure vs GCP: How hyperscaler performance stacks up across 1000 benchmark tests.
Network throughput, CPU, storage read, and more tested.
AWS vs Azure vs GCP: cloud comparisons come and go, but Cockroach Labs' annual cloud report is arguably one of the few to offer a robust test against benchmarks that reflect critical workloads. (For the cynical, reproduction steps are fully open source and available in this GitHub repo.) In Q4 2020 the database company tested 54 machine configurations across nearly 1,000 benchmark runs to assess CPU, network throughput, network latency, storage read performance, storage write performance and more.
Here's how they stacked up. (There were some surprises.)
Throughput? Google Cloud unexpectedly stands out
GCP, somewhat unexpectedly, delivered the most throughput (i.e. the fastest processing rates) on all four of the Cloud Report’s throughput benchmarks.
These were network throughput, storage I/O read throughput, storage I/O write throughput, and maximum tpm throughput, where tpm is transactions per minute as defined by the Cockroach Labs Derivative of TPC-C.
On network throughput, GCP’s worst-performing machine outpaced both AWS’s and Azure’s best-performing machines, and GCP’s top-performing machine delivered 165% and 237% more throughput than AWS’s and Azure’s respectively. Cockroach Labs notes: "[These results] were consistent with how each of the clouds provisioned 16-core VMs: GCP simply made more bandwidth available (up to 32 Gbps) than AWS (up to 10 Gbps, or in some cases up to 25 Gbps) or Azure (up to 8 Gbps).
"When choosing, we recommend looking at the application and workload that you plan to run, and in particular how much data your application needs to send or receive, before determining whether having less available bandwidth is a disqualifying factor."
Throughput (often loosely called bandwidth) is the quantity of data sent and received per unit of time. Latency is the total “round-trip” response time for a request.
A critical factor for latency, needless to say, is where machines are physically placed. Cockroach Labs says it provisioned machines in the same availability zone for all of its network tests (US-East) but didn't try to control the tests using more granular placement policies; data centre location will have had an impact.
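To put the bandwidth caps quoted above in practical terms, here is a rough back-of-the-envelope sketch. It assumes the advertised "up to" figures are actually sustained and ignores protocol overhead, which real workloads will not; the 100 GB payload and the function name are illustrative assumptions, not anything from the report.

```python
# Advertised per-VM bandwidth caps quoted in the report, in gigabits per second.
# Treating these "up to" figures as sustained rates is an assumption.
BANDWIDTH_GBPS = {
    "GCP": 32,
    "AWS (some instances)": 25,
    "AWS (typical)": 10,
    "Azure": 8,
}


def transfer_seconds(payload_gigabytes: float, link_gbps: float) -> float:
    """Ideal time to move a payload over a link, ignoring all overhead."""
    if link_gbps <= 0:
        raise ValueError("link speed must be positive")
    payload_gigabits = payload_gigabytes * 8  # 1 byte = 8 bits
    return payload_gigabits / link_gbps


if __name__ == "__main__":
    payload_gb = 100  # hypothetical payload size in gigabytes
    for cloud, gbps in BANDWIDTH_GBPS.items():
        print(f"{cloud}: {transfer_seconds(payload_gb, gbps):.1f} s")
```

If an application rarely moves more than a few gigabytes at a time, the gap between these caps may matter far less than the headline numbers suggest, which is the point of Cockroach Labs' recommendation.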
Multi-core CPU performance? AWS's custom chips are punching above their weight.
AWS's homegrown Graviton2 chips made a splash when unveiled at re:Invent in December 2019.
The Arm-based chips were designed in-house by Amazon subsidiary Annapurna Labs, and use 64-bit Arm Neoverse N1 cores. They've been available to power cloud instances since May 2020 and they are performing very well indeed. Well, certainly on multi-core CPU tests.
As Cockroach Labs notes: "For the single-core runs, all the winning machines ran Intel processors. When we looked at performance on the 16-core CoreMark benchmark, none of the winning machines ran Intel processors. AWS’s Graviton2 came in just ahead of GCP and Azure’s winning machines, both of which ran AMD processors."
Network latency: AWS wins again.
AWS has performed best in network latency for three years running. The 99th percentile network latency of its top-performing machine was 28% and 37% lower than that of Azure's and GCP's top performers, respectively.
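For readers unfamiliar with the metric, the sketch below shows one common way to compute a 99th percentile (the nearest-rank method) and a "percent lower" comparison. The report's exact percentile method is not stated here, so treat the method as an assumption; the sample values are invented purely for illustration and are not Cockroach Labs' measurements.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def percent_lower(candidate, baseline):
    """How much lower candidate is than baseline, as a percentage of baseline."""
    return (baseline - candidate) * 100 / baseline


if __name__ == "__main__":
    # Hypothetical round-trip times in microseconds (illustrative only).
    latencies_us = [40, 42, 41, 39, 250, 43, 40, 41, 42, 44]
    print("p99:", percentile_nearest_rank(latencies_us, 99))
    print("percent lower:", percent_lower(72, 100))
```

Tail percentiles such as p99 are used because a handful of slow round trips, like the single outlier in the hypothetical list, can dominate user-visible delay even when the average looks healthy.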
AWS vs Azure vs GCP: advanced disks
Each of the clouds offers an “advanced disk” – a more expensive disk for applications and workloads that require higher performance: AWS’s io2, Azure’s ultra disk, and GCP’s extreme-pd.
Azure's disks stood out as worth the money, delivering "performance improvements commensurate to or better than their estimated price increase", but overall, advanced disks may not be necessary, according to Cockroach Labs' testing.
The company noted: "Cockroach Labs Derivative of [benchmark] TPC-C is a compute and memory intensive workload, and while it values storage I/O performance, we found the benchmark did not drive sufficient load at the storage I/O level to prove the value of io2 and ultra disks. As a result, memory- and compute-optimized machines thrived, and storage-optimized machines with advanced disks were underutilized."
"You're all winners!"
Every hyperscaler had its highs and lows. AWS provided the most cost-efficient machine, at $0.813 per tpm. On transactional throughput, however, GCP gave the highest throughput of any of the clouds, running 37,048 transactions per minute (tpm) on a three-node cluster. Azure’s ultra disk, meanwhile, delivered 16% more tpm while being priced only 11% higher than Azure’s less expensive “premium” disks.
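The ultra disk claim can be checked with simple arithmetic: if performance rises faster than price, performance per dollar improves. A minimal sketch follows; the function name is an illustrative assumption, and the only inputs are the two percentages quoted above.

```python
def value_ratio(perf_gain_pct: float, price_increase_pct: float) -> float:
    """Performance per dollar of the upgraded option relative to the
    baseline. 1.0 means break-even; above 1.0 means the upgrade pays off."""
    if price_increase_pct <= -100:
        raise ValueError("price cannot fall by 100% or more")
    return (100 + perf_gain_pct) / (100 + price_increase_pct)


if __name__ == "__main__":
    # Azure ultra disk vs premium disk: 16% more tpm for 11% more money.
    ratio = value_ratio(16, 11)
    print(f"tpm per dollar vs premium disk: {ratio:.3f}x")
```

On those figures the ultra disk works out to roughly 4.5% more tpm per dollar than the premium disk, which is what "commensurate to or better than" the price increase amounts to in practice.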
Read the full report here.
p.s. Q: Funny name. What's Cockroach Labs, exactly? A: It's a company that provides a cloud-native, distributed SQL database. It's based in New York, used by Equifax, Bose, Comcast and many large banks, and just raised $160 million in a Series E funding round.