Hello from Solo AI Guy
What this is
Solo AI Guy is a blog about doing real AI work without enterprise budgets. No K8s clusters. No $100/day API bills. No team of ML engineers.
Just one developer, one consumer GPU, and a stack tuned to keep dollars where they belong: in your pocket.
Who it’s for
- Solo founders building AI features into their product
- Indie hackers who want agents to do real work for them
- Devs curious about local LLMs but tired of toy benchmarks
- Anyone whose API bill last month made them flinch
What I’m running
The stack this site documents:
- Hardware: RTX 3070 (8GB VRAM), WSL2 Ubuntu, normal desktop PC
- Local models: Ollama serving qwen2.5-coder:7b and llama3.1:8b
- Local agent: Aider, configured for non-interactive task delegation
- Cloud assist: Claude Code for planning, debugging, and judgment calls
- Hybrid rule: simple mechanical work goes to local. Hard reasoning goes to cloud.
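The hybrid rule above can be sketched as a tiny router. The task categories and the difficulty threshold here are illustrative placeholders, not the actual config this site uses:

```python
# Sketch of the hybrid routing rule: simple mechanical work stays local,
# hard reasoning escalates to the cloud. The task set and threshold are
# made-up assumptions for illustration.

# Task types assumed cheap enough for a local 7B coder model
LOCAL_TASKS = {"rename", "format", "docstring", "boilerplate", "test-scaffold"}

def route(task_type: str, difficulty: int) -> str:
    """Return 'local' or 'cloud' for a given task."""
    if task_type in LOCAL_TASKS and difficulty <= 3:
        return "local"   # handled by Aider + Ollama (e.g. qwen2.5-coder:7b)
    return "cloud"       # escalated to Claude Code

print(route("docstring", 2))      # local
print(route("plan-refactor", 5))  # cloud
```

The point of keeping the router dumb and explicit is that every escalation is a visible, billable decision rather than a default.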
That setup runs me about $5–15/month in API credits plus a few cents in electricity. The same workload on a pure-cloud setup would cost 10–20× more.
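A back-of-envelope version of that 10–20× claim. The token volumes and the blended per-token price below are invented for illustration, not measured figures:

```python
# Toy cost model for hybrid vs. pure-cloud. The monthly token volumes and
# the $1 per million blended price are assumptions, not real billing data.

def api_cost(tokens_millions: float, usd_per_million: float) -> float:
    return tokens_millions * usd_per_million

mechanical_m = 10.0  # assumed monthly tokens of simple mechanical work
reasoning_m = 1.0    # assumed monthly tokens of hard reasoning
price = 1.0          # hypothetical blended $ per 1M tokens

pure_cloud = api_cost(mechanical_m + reasoning_m, price)  # everything to cloud
hybrid = api_cost(reasoning_m, price)  # mechanical work runs locally for ~$0

print(f"pure cloud ${pure_cloud:.2f} vs hybrid ${hybrid:.2f}")
print(f"ratio: {pure_cloud / hybrid:.0f}x")
```

Under these assumptions the ratio lands at 11×; shift the mechanical/reasoning split and it moves around inside that 10–20× band.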
What you’ll see here
Working configs. Real benchmarks. Token-cost breakdowns. Honest writeups of what broke. Tools I built that paid for themselves.
If something I tried saved me $50/month, I’ll show you exactly how. If it cost me $50/month and didn’t work, I’ll show you that too — those posts tend to save readers more than the wins do.
What I’m not going to do
- Pretend the local stack matches Claude or GPT-5 quality. It doesn’t, and the gap matters.
- Sell you on a single tool. The right answer is usually a hybrid.
- Bury you in theory. Every post will end with something you can run today.
Subscribe
RSS feed here. Email list coming soon.
First real post drops this week.