Meta SuperCluster expands to 16,000 NVIDIA GPUs

Meta's new "SuperCluster" supercomputer will scale to 16,000 GPUs this year, NVIDIA's CEO Jensen Huang said late Wednesday. Meta revealed its AI Research SuperCluster (RSC) in January, saying the supercomputer would be used to analyse images, text, and video together; develop new augmented reality (AR) tools; and, in future, "power real-time voice translations to large groups of people, each speaking a different language."

The project is already running on 6,000 NVIDIA A100 GPUs, Huang said on an earnings call. The comments came as the chipmaker reported "outstanding" hyperscale and cloud demand for GPUs, with revenue more than doubling year-on-year amid huge appetite for hardware to power AI workloads across sectors.

(The Meta SuperCluster is currently running a storage tier that includes 175 PB of Pure Storage FlashArray and 46 PB of cache storage in Penguin Computing Altus systems. Meta believes that when fully built out the supercomputer will be the world's fastest for AI, performing at five exaflops of mixed precision compute.)

NVIDIA's earnings calls, like those of other GPU providers, can be an informative way to get a snapshot of key growth areas in enterprise AI deployments. Huang emphasised "large language models: conversational AI used for customer service, chatbots, a whole bunch of customer service applications... deep learning-based recommender systems... simulations up in the cloud, Android cloud gaming" among the sources of the robust demand.

The company also revealed a new multi-year strategic partnership with Jaguar Land Rover "to jointly develop and deliver next-generation automated driving systems plus AI-enabled services and experiences for its customers". The partnership will span delivery of "active safety, automated driving and parking systems as well as driver assistance systems. Inside the vehicle, the system will deliver AI features, including driver and occupant monitoring, as well as advanced visualisation of the vehicle’s environment," the two said in a February 16 release.

See also: Simulated city built to teach AIs counterfactual reasoning

Looking ahead, Huang suggested that "customer service will be heavily, heavily supported by artificial intelligence in the future... Almost every point of sales, I think, whether it's a fast food or a quick service, businesses are going to have chatbots and AI-based customer service. Retail checkouts will be supported by AI agents."

He added on the earnings call: "All of this is made possible by a couple of breakthroughs: computer vision, of course, because the AIs have to make eye contact and recognize your posture and such, recognize speech, understand the context and what is being spoken about and have a reasonable conversation with people so that you could provide good customer service..."

The company's AI inference-focused revenue more than tripled year on year. Of the company's torpedoed takeover of the UK's Arm, NVIDIA's CEO said: "We gave it our best shot, but the headwinds were too strong, and we could not give regulators the comfort they needed to approve our deal." The company is meanwhile pressing on with its own Arm-based CPU, which it aims to launch in H1 2023, targeting AI and HPC workloads.

NVIDIA reported record fiscal-year revenue of $26.91 billion, up 61%, and net income of $9.7 billion.

A 2021 report by McKinsey on AI adoption found that the top three use cases were service-operations optimisation, AI-based enhancement of products, and contact-center automation, with the biggest percentage-point increase in the use of AI being in companies’ marketing-budget allocation and spending effectiveness.
