Llama 3 is in training – as Meta vows to add another 350,000 Nvidia H100 GPUs to its infrastructure this year
Goldman Sachs CIO says "there’s a great opportunity for capital to move towards the application layer, the toolset layer. I think we will see that shift happening..."
Meta is adding a colossal 350,000 Nvidia H100 GPUs to its AI infrastructure this year alone – and is currently training its “next-gen” foundation model, Llama 3, CEO Mark Zuckerberg has revealed.
Meta released Llama in February 2023 and Llama 2 in July 2023, making both freely available for research and commercial use, with some modest restrictions. “We're building massive compute infrastructure to support our future roadmap… overall almost 600k H100s equivalents of compute,” Zuckerberg said – a figure that would equate to some $20 billion of chips.
The dizzying release cadence of open source generative AI foundation models slowed somewhat in late 2023, but Meta said it will look to “build general intelligence, open source it responsibly, and make it widely available so everyone can benefit” as it converges two of its major AI research efforts, known as “FAIR” and “GenAI”, in support of this mission.
See also: Docker and friends package up free "GenAI" stack
Enterprise users are exploring generative AI for services as diverse as chatbots, content delivery for product SKUs on ecommerce sites, analysis of engine noises for predictive maintenance, code analysis, and more.
GSK, for example, has created a “proprietary LLM-based operating system [that] uses our internally developed LLMs that possess [state of the art] reasoning for science,” the pharmaceutical heavyweight said in late 2023.
GSK was releasing “a community of agents, from specialized AI/ML models to more general code creation and analysis tools” alongside it, according to the company’s Global Head of AI, Kim Branson.
See also: The Future of AI: From the world’s "most powerful cat" to transformative enterprise apps
Many are learning hard lessons along the way, as customers use their chatbots to extract “legally binding” commitments to sell new cars for $1, write bad poetry about their customer service, and generally prompt-abuse them – with legal departments in many organisations still scrambling to understand and assess the risks of public generative AI deployment.
Not everyone can afford to spend Meta-level sums on training foundation models like Llama 3. Most will never need to. As Goldman Sachs CIO Marco Argenti recently put it: “At the beginning, everybody was thinking that if they didn’t have their own pre-trained models, they wouldn’t be able to leverage the power of AI. Now, appropriate techniques such as retrieval-augmented generation, vectorization of content, and prompt engineering offer comparable if not superior performance to pre-trained models in something like 95% of the use cases — at a fraction of the cost.
“I think it will be harder to raise money for any company creating foundational models. It’s so capital intensive you can’t really have more than a handful. But if you think of those as operating systems or platforms, there’s a whole world of applications that haven’t really emerged yet around those models. And there it’s more about innovation, more about agility, great ideas, and great user experience — rather than having to amass tens of thousands of GPUs for months of training.
“There’s a great opportunity for capital to move towards the application layer, the toolset layer. I think we will see that shift happening…”
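The retrieval-augmented generation (RAG) approach Argenti describes can be illustrated with a minimal sketch: vectorize a document corpus, retrieve the passages most similar to a user's question, and splice them into the prompt sent to an off-the-shelf model. Everything below is illustrative – the toy corpus, the bag-of-words "embedding" and the stopword list are stand-ins for the real embedding models and vector databases a production system would use, and the assembled prompt would ultimately be passed to an LLM.

```python
import math
import re
from collections import Counter

# Toy knowledge base standing in for enterprise documents (hypothetical data).
DOCUMENTS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "The H100 GPU is used for training large language models.",
    "Support tickets are answered within one business day.",
]

# Tiny stopword list so similarity reflects content words, not filler.
STOPWORDS = {"the", "is", "a", "an", "of", "our", "for", "to", "are", "what", "within"}

def embed(text):
    """'Vectorize' text as a term-frequency bag of words -- a toy stand-in
    for a real embedding model."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Index step: embed every document once, up front.
INDEX = [(doc, embed(doc)) for doc in DOCUMENTS]

def retrieve(query, k=1):
    """Return the k documents most similar to the query."""
    qv = embed(query)
    ranked = sorted(INDEX, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query):
    """Augment the prompt with retrieved context before calling an LLM."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is the refund policy?"))
```

No model training is involved at any point – which is Argenti's argument: the heavy capital expenditure stays with the foundation-model providers, while the application layer only needs indexing, retrieval and prompt assembly.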