In 2025, the tides may turn for companies hoping to compete with the $3 trillion gorilla in AI computing.
Nvidia holds an estimated 90% of the AI computing market. Still, as the use of AI grows, workloads are expected to change, and this evolution may give companies with competitive hardware an opening.
In 2024, the majority of AI compute spend shifted to inference, Thomas Sohmers, CEO of chip startup Positron AI, told BI. This will "continue to grow on what looks like an exponential curve," he added.
In AI, inference is the computation needed to produce the response to a user's query or request. The computing required to teach the model the knowledge needed to answer is called "training." Creating OpenAI's video generation platform Sora, for example, represents training. Each user who instructs it to create a video represents an inference workload.
OpenAI's other models have Sohmers and others excited about the growth in computing needs in 2025.
OpenAI's o1 and o3, Google's Gemini 2.0 Flash Thinking, and a handful of other AI models use more compute-intensive strategies to improve results after training. These strategies are often called inference-time computing, chain-of-thought, chain-of-reasoning, or reasoning models.
Simply put, if the models think more before they answer, the responses are better. That thinking comes at a cost of time and money.
The startups vying for some of Nvidia's market share are attempting to optimize one or both of those costs.
Nvidia already benefits from these innovations, CEO Jensen Huang said on the company's November earnings call. Huang's wannabe competitors are betting that in 2025, new post-training strategies for AI will benefit all purveyors of inference chips.
Business Insider spoke to three challengers about their hopes and expectations for 2025. Here are their New Year's resolutions.
"Execution, execution, execution. Right now, everybody at Groq has decided not to take a holiday break this year. Everyone is executing and building the systems. We are all making sure that we deliver to the opportunity that we've got because that is in our control.
I tell everyone our funnel right now is carbonated and bubbling over. It's unbelievable, the amount of customer interest. We have to build more systems, and we have to stand up those systems so we can serve the demand that we've got. We want to serve all those customers. We want to increase rate limits for everybody."
"For SambaNova, the most critical factor is executing on the shift from training to inference. The industry is moving rapidly toward real-time applications, and inference workloads are becoming the lion's share of AI demand. Our focus is on ensuring our technology enables enterprises to scale efficiently and sustainably."
"My belief is if we can actually deploy enough compute — which thankfully I think we can from a supply chain perspective — by deploying significantly more inference-specific compute, we're going to be able to grow the adoption rate of 'chain of thoughts' and other inference-additional compute."
"It's about customers recognizing that there are novel advancements against incumbent technologies. There's a lot of folks that have told us, 'We like what you have, but to use the old adage and rephrase it: No one ever got fired for buying from — insert incumbent.'
But we know that it's starting to boil up. People are realizing it's hard for them to get chips from the incumbent, and it's also not as performant as Groq is. So my wish would be that people are willing to take that chance and actually look to some of these new technologies."
"If I had a magic wand, I'd address the power challenges around deploying AI. Today, most of the market is stuck using power-hungry hardware that wasn't designed for inference at scale. The result is an unsustainable approach — economically and environmentally.
At SambaNova, we've proven there's a better way. Our architecture consumes 10 times less power, making it possible for enterprises to deploy AI systems that meet their goals without blowing past their power budgets or carbon targets. I'd like to see the market move faster toward adopting technologies that prioritize efficiency and sustainability — because that's how we ensure AI can scale globally without overwhelming the infrastructure that supports it."
"I would like people to actually adopt these chain of thought capabilities at the fastest rate possible. I think that is a huge shift — from a capabilities perspective. You have 8 billion parameter models surpassing 70 billion parameter models. So I'm trying to do everything I can to make that happen."
"In the last six months, I've gone to a number of hackathons, and I've met developers. It's deeply inspiring. So my New Year's resolution is to try to amplify the signal of the good that people are doing with AI."
"Making time for music. Playing guitar is something I've always loved, and I would love to get back into it. Music has this incredible way of clearing the mind and sparking creativity, which I find invaluable as we work to bring SambaNova's AI to new corners of the globe."
"I want to do as much to encourage the usage of these new tools to help, you know, my mom. Part of the reason I got into technology was because I wanted to see these tools lift up people to be able to do more with their time — to learn everything that they want beyond whatever job they're in. I think that bringing the cost down of these things will enable that proliferation.
I also personally want to see and try to use more of these things outside of my just work context because I've been obsessively using the o1 Pro model for the past few weeks, and it's been amazing for my personal work. But when I gave access to my mom what she would do with it was pretty interesting — those sort of normal, everyday person tasks for these things where it truly is being an assistant."