Llama 3 is in training – as Meta vows to add another 350,000 Nvidia H100 GPUs to its infrastructure this year
Goldman Sachs CIO says "there’s a great opportunity for capital to move towards the application layer, the toolset layer. I think we will see that shift happening..."
Meta is adding a colossal 350,000 Nvidia H100 GPUs to its AI infrastructure this year alone – and is currently training its “next-gen” foundation model, Llama 3, CEO Mark Zuckerberg has revealed.
Meta released Llama in February 2023 and Llama 2 in July 2023, making both freely available for research and commercial use, with some modest restrictions. “We're building massive compute infrastructure to support our future roadmap… overall almost 600k H100s equivalents of compute,” Zuckerberg said – a figure that would equate to some $20 billion of chips.
The dizzying release cadence of open source generative AI foundation models slowed somewhat in late 2023, but Meta said it will look to “build general intelligence, open source it responsibly, and make it widely available so everyone can benefit” as it converges two of its major AI research efforts, known as “FAIR” and “GenAI”, in support of this mission.
See also: Docker and friends package up free "GenAI" stack
Enterprise users are exploring generative AI for services as diverse as chatbots, content delivery for product SKUs on ecommerce sites, analysis of engine noises for predictive maintenance, code analysis, and more.
GSK, for example, has created a “proprietary LLM-based operating system [that] uses our internally developed LLMs that possess [state of the art] reasoning for science,” the pharmaceutical heavyweight said in late 2023.
GSK was releasing “a community of agents, from specialized AI/ML models to more general code creation and analysis tools” alongside it, according to the company’s Global Head of AI, Kim Branson.
See also: The Future of AI: From the world’s "most powerful cat" to transformative enterprise apps
Many are learning hard lessons along the way, as customers use their chatbots to extract “legally binding” commitments to sell new cars for $1, write bad poetry about their customer service, and generally prompt-abuse them – with legal departments in many organisations still scrambling to understand and assess the risks of public generative AI deployment.
Not everyone can afford to spend Meta-level sums on training foundation models like Llama 3. Most will never need to. As Goldman Sachs CIO Marco Argenti recently put it: “At the beginning, everybody was thinking that if they didn’t have their own pre-trained models, they wouldn’t be able to leverage the power of AI. Now, appropriate techniques such as retrieval-augmented generation, vectorization of content, and prompt engineering offer comparable if not superior performance to pre-trained models in something like 95% of the use cases — at a fraction of the cost.
“I think it will be harder to raise money for any company creating foundational models. It’s so capital intensive you can’t really have more than a handful. But if you think of those as operating systems or platforms, there’s a whole world of applications that haven’t really emerged yet around those models. And there it’s more about innovation, more about agility, great ideas, and great user experience — rather than having to amass tens of thousands of GPUs for months of training.
“There’s a great opportunity for capital to move towards the application layer, the toolset layer. I think we will see that shift happening…”
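The retrieval-augmented generation (RAG) approach Argenti describes can be illustrated with a minimal sketch: vectorize a document corpus, retrieve the passages most similar to a user's question, and splice them into the prompt sent to an off-the-shelf model. Everything below is illustrative – the toy corpus, the bag-of-words "embedding" and the stopword list are stand-ins for the real embedding models and vector databases a production system would use, and the assembled prompt would ultimately be passed to an LLM.

```python
import math
import re
from collections import Counter

# Toy knowledge base standing in for enterprise documents (hypothetical data).
DOCUMENTS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "The H100 GPU is used for training large language models.",
    "Support tickets are answered within one business day.",
]

# Tiny stopword list so similarity reflects content words, not filler.
STOPWORDS = {"the", "is", "a", "an", "of", "our", "for", "to", "are", "what", "within"}

def embed(text):
    """'Vectorize' text as a term-frequency bag of words -- a toy stand-in
    for a real embedding model."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Index step: embed every document once, up front.
INDEX = [(doc, embed(doc)) for doc in DOCUMENTS]

def retrieve(query, k=1):
    """Return the k documents most similar to the query."""
    qv = embed(query)
    ranked = sorted(INDEX, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query):
    """Augment the prompt with retrieved context before calling an LLM."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is the refund policy?"))
```

No model training is involved at any point – which is Argenti's argument: the heavy capital expenditure stays with the foundation-model providers, while the application layer only needs indexing, retrieval and prompt assembly.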