Yes, local LLMs are ready to ease the compute strain

Anthropic might be thinking about space to ease its computing burden, but Claude Code on your laptop is way more practical

KETTLE We’ve been experimenting with LLMs for a while here at The Register, and if you ask our systems editor Tobias Mann and senior reporter Tom Claburn, locally installed coding assistants have actually become so good they could relieve some of the compute load that’s pushing AI companies to raise their prices.

This week on The Kettle, host Brandon Vigliarolo is joined by Mann and Claburn to discuss their work with locally-hosted LLMs, why we’re revisiting the topic at all, how to do local LLMs safely, and whether there’s orbital relief coming for the compute crunch.

You can listen to The Kettle here, as well as on Spotify and Apple Music, or read the full transcript of this episode below. ®

—

Brandon (00:01)

Welcome back to another episode of The Register’s Kettle podcast. I’m Reg reporter Brandon Vigliarolo and with me this week are systems editor Tobias Mann and senior reporter Tom Claburn to talk about some experiments they’ve been doing with AI coding assistants, but not just any AI coding assistant mind you, we’re talking about local ones that live right on your own machine. Guys, thanks for joining me this week.

Tobias Mann (00:24)

Good to be here.

Thomas Claburn (00:25)

Thank you.

Brandon (00:29)

So before we jump into what we learned during these experiments and how effective local large language models actually are as coding assistants, let’s talk a bit about why we’re having this discussion in the first place. I understand that AI coding assistants are about to become way more expensive. And I think, Tom, these were stories that you wrote recently. So can you walk us through a bit of what’s going on with the current cloud-hosted ones?

Thomas Claburn (00:52)

Back in November, I think around Opus 4.5, pretty much all the developers started to realize that these models were actually getting pretty good, and vibe coding was less of a joke and more like, you know, maybe this will work. And then around February, with the OpenClaw craze, there was a lot more demand for coding agents, and people would start running these for long periods of time. And it sort of caught Anthropic and others unaware, Google and OpenAI as well. There were a lot of capacity constraints, a lot more people were trying these things out, and they ended up having to find ways to limit demand through session limits. That made a lot of people unhappy, but they basically just didn’t have the compute available to serve that demand.

And on top of that, they’re serving a lot of these at a price that is loss-leading. They’re trying to get people into the business, but these are unprofitable workloads for them. And if you look at something like Mythos, the big security model they put out, it was too good for anybody but large companies with expensive payrolls to run.

Brandon (02:08)

Right, right.

Thomas Claburn (02:10)

It’s clear that they’re looking for ways to increase their revenue because they’re investing a lot in the infrastructure to make this run, but they don’t yet have the recurring revenue that justifies all this. The ramps look good. They’re bringing more people on, but they invested a lot of money in this.

Brandon (02:29)

OpenAI famously has never actually turned a profit in its history. I don’t know about Anthropic personally, but I can’t imagine they’re doing a whole lot better. And so I understand one of the two specific examples you had was that Anthropic recently yanked Claude Code from Pro plans, but only for some people. Is that correct?

Thomas Claburn (02:49)

Yeah, and they wrote that off as an A/B test. Basically they were doing live A/B testing and people noticed, and they were saying, oh, well, no, that doesn’t apply to everyone. We’re not going to change or take away from existing Pro users. But clearly there is someone there saying, hey, can we get away with charging this much but providing less service? And that doesn’t happen unless you’re trying to figure out a way to increase your revenue and reduce the demand on your services.

Brandon (02:53)

Okay.

Totally. Did they backtrack on that at all or is that still, is that A/B test still going on?

Thomas Claburn (03:23)

I don’t think it’s still going.

Tobias Mann (03:24)

They really do do a lot of A/B testing. I think I have a Claude Code Max subscription that has a 50% discount on it right now. So I’m a little hesitant to give it up because, yeah, it’s a hundred bucks a month and I don’t use it nearly enough to justify that. But also if I cancel and decide I want it back, it’d be 200.

Brandon (03:46)

Yes, the reason I’m still an Nvidia GeForce Now gaming cloud subscriber, right? Because I was there in the beta test and I’ve never given that discount up, even if I haven’t used it in a while. So I understand. Anthropic did that, and then GitHub also has just straight up jumped to metered billing for AI, I think. Correct?

Thomas Claburn (04:05)

Yeah, and they were taking a huge loss on things because they would give you a flat rate, but then people would use the most expensive models. And of course, those things are billed at different rates, and they were offering a flat rate against these very inflated Opus 4.7 models, which also take a lot longer to process stuff. Even if they’re a little bit more efficient, they’ll think for longer periods. They’re just losing money. So everyone has to go to metered billing. And once that happens, it’s going to cost people a lot of money.

You can look at it now, even on a subscription plan, you’ll write up a little widget and you look at the thing and it’s, you know, $2 worth of whatever. You think, well, is that worth it? Maybe. And then if it’s a more substantial project, you know, people spend, you know, hundreds of thousands of dollars on stuff. And if that’s not returning you any revenue, are you still going to do that? So it’s going to be interesting to see how this goes.

Brandon (04:59)

Maybe local LLMs like what we’re here to talk about today are kind of the market control, right? I’m sure there are gonna be people who are using these paid services,

» …