Raghav Sethi began his tech writing journey in 2022, contributing to his college’s open-source community blog. Later that year, he joined MakeUseOf, and since then has written extensively about Apple, Android, and AI. His work ranges from hands-on experiments to opinion pieces that explore the bigger picture behind emerging tech trends.
Alongside his work at MUO, you can also find Raghav’s articles at XDA Developers, where he mainly focuses on Linux and the world of open-source software.
Outside of writing, Raghav enjoys working on coding projects, playing the guitar, and living life on the edge by installing the latest beta software on his daily devices.
If you've been using Claude Code, you know it's an amazing tool, and something you cannot live without these days. But one complaint keeps coming up again and again: the cost. While you can run Claude Code completely for free, the harness by itself is not enough.
What makes it a complete package is when you integrate it with your editor (duh), and I've finally come to a setup that just gets work done.
Claude Code is not the problem, the cloud is
If only you could buy some RAM
Most people who complain about Claude Code being expensive are actually complaining about something else. Claude Code by itself is free, and you can install it right now without paying a thing. What is not free is the language model sitting behind it. Every task you give it, every file it reads, every change it makes gets routed through Sonnet or Opus by default, depending on your configuration, and those API calls are what show up on your bill at the end of the month.
Think of Claude Code as the layer that coordinates everything. It decides which files are relevant, figures out what needs to change, and runs the terminal commands. The actual thinking, the reasoning, and the code generation all happen inside the model. And the model is what costs money.
At $20 a month, the Pro plan is fairly priced if you are deep in it every day. If you are not, it is harder to justify. The thing is, nothing about the setup requires you to use Anthropic's models at all. You can swap the endpoint entirely.
Setting it up is less painful than it sounds
Ollama to the rescue
Ollama is a tool that lets you run open-weight models locally on your own hardware. No API, no subscription, no usage bill. You download it, pull a model, and it runs a local server that applications can talk to just as they would a remote API.
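To make the "local server" part concrete, here is a minimal sketch that checks whether an Ollama server is answering on your machine. It assumes Ollama's default port (11434) and uses its model-listing endpoint. OLLAMA_HOST_URL and ollama_status are hypothetical names made up for this example, not something Ollama defines.

```shell
# Default address of a local Ollama server (11434 is Ollama's default port).
# OLLAMA_HOST_URL is a hypothetical variable name used only in this sketch.
OLLAMA_HOST_URL="${OLLAMA_HOST_URL:-http://localhost:11434}"

# Print "running" if the server answers on its model-listing endpoint,
# "not reachable" otherwise (including when curl is not installed).
ollama_status() {
  if command -v curl >/dev/null 2>&1 &&
     curl --silent --fail --max-time 2 "$OLLAMA_HOST_URL/api/tags" >/dev/null 2>&1; then
    echo "running"
  else
    echo "not reachable"
  fi
}

ollama_status
```

If it reports "not reachable", start the Ollama app or run ollama serve in another terminal and try again.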
Claude Code has an environment variable called ANTHROPIC_BASE_URL that lets you redirect it to a different endpoint. That means you can point it towards your Ollama instance instead of Anthropic's servers!
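If you would rather wire this up by hand, the redirect might look something like the sketch below. It assumes Ollama is listening on its default local address and that your Ollama version can speak Anthropic's API format. The placeholder token value and the model name are assumptions for illustration, so swap in whatever your setup actually uses.

```shell
# Point Claude Code at a local Ollama server instead of Anthropic's servers.
# Assumption: Ollama is running on its default address.
export ANTHROPIC_BASE_URL="http://localhost:11434"

# Assumption: a local server does not check the token, so a placeholder is enough.
export ANTHROPIC_AUTH_TOKEN="ollama"

echo "Claude Code will talk to: $ANTHROPIC_BASE_URL"

# Then start Claude Code with a model you have already pulled.
# "your-model-name" is a placeholder, not a real model.
# claude --model your-model-name
```

These exports only last for the current terminal session, so closing the terminal puts you back on the default endpoint.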
That's all you need to get a functional agentic coding setup running on your own machine. What I usually do is just open VS Code and run the harness inside the integrated terminal.
To start running your own instance, just run this command:
ollama launch claude
You'll now be prompted to pick a model. You can either use open-weight models you host locally or Ollama's cloud service, which may be cheaper than paying for a Claude subscription.
After that, Claude Code works exactly as you would expect. Just pull up the integrated terminal, run that command, and the whole thing lives inside your terminal.
I personally run Ollama on a Mac Mini with 24GB of unified memory. I've primarily been experimenting with the Qwen 3.6 and Gemma family of models, and it's been running pretty well.
Choosing your first LLM, to put it bluntly, takes a lot of trial and error. For all you know, a smaller 4B model might be just fine for you, or you might actually need a massive model that you can't realistically run on consumer hardware.
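Before downloading anything, a bit of back-of-the-envelope math can narrow the field. A model's weights take roughly its parameter count times the bits per weight, divided by eight. The sketch below applies that rule of thumb. The 25% of memory held back for the OS, your editor, and the model's context is my own assumption, and both function names are made up for this example.

```shell
# Approximate size of the model weights in MB.
# $1 = parameters in billions, $2 = bits per weight (4 for 4-bit quantization)
estimate_weights_mb() {
  echo $(( $1 * $2 * 1000 / 8 ))
}

# Succeeds if the weights fit within three quarters of total memory.
# Assumption: the remaining quarter is left for the OS, the editor, and context.
# $1 = parameters in billions, $2 = bits per weight, $3 = total memory in GB
fits_in_memory() {
  need_mb=$(estimate_weights_mb "$1" "$2")
  budget_mb=$(( $3 * 1000 * 3 / 4 ))
  [ "$need_mb" -le "$budget_mb" ]
}

# Try a few model sizes against a 24GB machine at 4-bit quantization.
for params in 4 27 70; do
  if fits_in_memory "$params" 4 24; then
    echo "${params}B at 4-bit: likely fits in 24GB"
  else
    echo "${params}B at 4-bit: likely too large for 24GB"
  fi
done
```

This is only a first filter. Long context windows and other running apps eat into that budget, so real-world results still come down to trying the model.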
There is a ceiling, and you should know where it sits
It's basically David versus Goliath
The models are not as good. That is just the reality, and it is worth being straight about rather than pretending this is a free lunch with no trade-offs. The closest thing you can get today is running DeepSeek V4 via OpenCode, and even that falls a tiny bit behind.
For the everyday 80% of coding work, a well-chosen local model gets you further than you might expect. But for multi-file refactoring, subtle architectural decisions, or anything that requires holding a lot of context together across a complex codebase, open models fall short of Sonnet or Opus in ways you will notice. Not on benchmarks, where the numbers are often close, but in practice: on tasks that require deep reasoning, Claude still handles them better.
Most people end up using this as a hybrid. Local models for the routine work, and a proper Claude Pro subscription kept in reserve for when you actually need the best model available. The two together cost less than Claude Max on its own, and for a lot of workflows, that is the more sensible setup anyway.
It's definitely worth trying out
If you are already paying for Claude Code and barely using it, try this first. If you are not subscribed to anything yet, start here and add a proper Claude subscription only when you actually hit the ceiling.
Free and pretty good is a completely different value proposition from expensive and great, and for most everyday coding tasks, pretty good is more than enough.
Claude Code
Claude Code is an agentic coding tool built by Anthropic that works directly inside your terminal. It can read, edit, and manage files across your entire project, run commands, and work through multi-step coding tasks on its own.
