> The enterprise mindset dictates that you need an out-of-process database server. But the truth is, a local SQLite file communicating over the C-interface or memory is orders of magnitude faster than making a TCP network hop to a remote Postgres server.
I don't want to diss SQLite because it is awesome and more than adequate for many/most web apps but you can connect to Postgres (or any DB really) on localhost over a Unix domain socket and avoid nearly all of the overhead.
It's not much harder to use than SQLite, you get all of the Postgres features, it's easier to run reports or whatever on the live db from a different box, and much easier if it comes time to setup a read replica, HA, or run the DB on a different box from the app.
I don't think running Postgres on the same box as your app is the same class of optimistic over provisioning as setting up a kubernetes cluster.
SQLite smokes Postgres on the same machine, even with domain sockets [1]. And that's before you get into using multiple SQLite databases.
What features does Postgres offer over SQLite in the context of running on a single machine with a monolithic app? Application-defined functions [2] mean you can extend it however you need with the same language you use to build your application. It also has a much better backup and replication story thanks to Litestream [3].
- [1] https://andersmurphy.com/2025/12/02/100000-tps-over-a-billio...
- [2] https://sqlite.org/appfunc.html
- [3] https://litestream.io/
The main problem with SQLite is that the defaults are not great, and you should really use it with separate read and write connections, where the application manages the write queue rather than letting SQLite handle it.
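A minimal Go sketch of that split, assuming the mattn/go-sqlite3 driver and illustrative pragma values (none of this is from the article):

    package store

    import (
        "database/sql"

        _ "github.com/mattn/go-sqlite3" // driver choice is illustrative
    )

    // OpenSplit opens one pool for reads and a single-connection pool for
    // writes, so the application serialises writes itself instead of relying
    // on SQLite's default busy handling.
    func OpenSplit(path string) (readDB, writeDB *sql.DB, err error) {
        // WAL, a busy timeout, and relaxed fsync are the usual non-default
        // settings people mean when they say "the defaults are not great".
        dsn := "file:" + path + "?_journal_mode=WAL&_busy_timeout=5000&_synchronous=NORMAL"

        readDB, err = sql.Open("sqlite3", dsn)
        if err != nil {
            return nil, nil, err
        }
        readDB.SetMaxOpenConns(4) // readers can run concurrently under WAL

        writeDB, err = sql.Open("sqlite3", dsn)
        if err != nil {
            readDB.Close()
            return nil, nil, err
        }
        writeDB.SetMaxOpenConns(1) // every write funnels through one connection
        return readDB, writeDB, nil
    }

The write pool capped at one connection is the application-managed write queue: every write goes through it, so busy errors largely disappear.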
Looks like the overhead is not insignificant:
(https://gist.github.com/leifkb/1ad16a741fd061216f074aedf1eca...)

I love them both too but that might not be the best metric unless you’re planning to run lots of little read queries. If you’re doing CRUD, simulating that workflow may favor Postgres given the transactional read/write work that needs to take place across multiple concurrent connections.
This is mostly about thread communication. With SQLite you can guarantee no context switching. Postgres running on the same box gets you close but not all the way. It's still in a different process.
What a useful "my hello-world script is faster than your hello-world script" example.
Most important is that the local SQLite database gets proper backups, so a restore goes without issues.
A total performance delta of <3s on ~300k transactions is indeed the definition of irrelevant.
Also:
> PostgreSQL (localhost): (. .) SQLite (in-memory):
This is a rather silly example. What do you expect to happen to your data when your node restarts?
Your example makes as much sense as comparing Valkey with Postgres and proceeding to proclaim that the performance difference is not insignificant.
Would be nice to see PGLite[1] compared too
1: https://pglite.dev/
Why are you comparing PostgreSQL to an in-memory SQLite instead of a file-based one? Wow, memory is faster than disk, who would have thought?
Because it doesn't make a difference, because `SELECT 1` doesn't need to touch the database:
(https://gist.github.com/leifkb/d8778422d450d9a3f103ed43258cc...)

Why are you doing meaningless microbenchmarks?
> It's not much harder to use than SQLite, you get all of the Postgres features, it's easier to run reports or whatever on the live db from a different box, and much easier if it comes time to setup a read replica, HA, or run the DB on a different box from the app.
Isn't this idea to spend a bit more effort and overhead to get YAGNI features exactly what TFA argues against?
I have used SQLite with extensions in extreme throughput scenarios. We’re talking running through it millions of documents per second in order to do disambiguation. I won’t say this wouldn’t have been possible with a remote server, but it would have been a significant technical challenge. Instead we packed up the database on S3, and each instance got a fresh copy and hammered away at the task. SQLite is the time tested alternative for when you need performance, not features
I've been doing that for decades. People seem to simply not know about Unix architecture.
What I like about SQLite is that it's simply one file.
The author's own 'auth' project works with SQLite and Postgres.
I mean, you’re not wrong about the facts, but it’s also pretty trivial to migrate the data from SQLite into a separate Postgres server later, if it turns out you do need those features after all. But most of the time, you don’t.
I bet that takes more time than the 5 extra minutes it takes to set up Postgres on the same box upfront.
If this sounds like basic advice, consider there are a lot of people out there that believe they have to start with serverless, kubernetes, fleets of servers, planet-scale databases, multi-zone high-availability setups, and many other "best practices".
Saying "you can just run things on a cheap VPS" sounds amateurish: people are immediately out with "Yeah but scaling", "Yeah but high availability", "Yeah but backups", "Yeah but now you have to maintain it" arguments, that are basically regurgitated sales pitches for various cloud platforms. It's learned helplessness.
I don't know what to say. People keep saying these engineers exist, and here I am not having seen a single one, and I follow many indie hacker communities.
Apparently the phrase "cargo cult software engineering" is not common anymore. It explains these things perfectly.
I end up explaining this term to every junior developer that doesn't know it sooner or later, the same way I explain bike shedding to all PMs that don't know it... often sooner, rather than later.
It seems to really help if you can put a term to it.
There are zero reasons to limit yourself to 1GB of RAM. By paying $20 instead of $5 you can get at least 8GB of RAM. You can use it for caches or a database that supports concurrent writes. The $15 difference won’t make any financial difference if you are trying to run a small business.
Thinking about how to fit everything on a $5 VPS does not help your business.
$15 is not exactly zero, is it? If you don't need more than 1GB, why pay anything for more than 1GB?
I recall running LAMP stacks on something like 128MB about 20 years ago and not really having problems with memory. Most current website backends are not really much more complicated than they were back then if you don't haul in bloat.
Saving 15 USD on 10k+ USD MRR is ridiculous.
It is. With 10k MRR it represents 0.15% of the revenue. Having the whole backend costing that much for a company selling web apps is like it’s costing zero.
There’s a happy medium and $5 for 1GB RAM just isn’t it.
Not a very strong argument now is it?
if the project already has positive revenue then arguably the ability to capture new users is worth a lot, which requires acceptable performance even when a big traffic surge is happening (like a HN hug of attention)
if the scalability is in the number of "zero cost" projects to start, then 5 vs 15 is a 3x factor.
Or better yet, go with a euro provider like Hetzner and get 8GB of RAM for $10 or so. :)
Even their $5 plan gives 4GB.
I think we have to re-think and re-evaluate RAM usage on modern systems that use swapping with CPU-assisted page compression and fast, modern NVMe drives.
The MacBook Neo with 8GB of RAM is a showcase of how people underestimated its capabilities due to the low amount of RAM before launch, yet after release all the reviewers pointed to a larger set of capabilities, without the issues people had predicted pre-launch.
$5 VPS disks are nowhere near MacBooks; they are shared between users and often attached over the network. They don't sit close to the CPU.
Also, macOS is generally exceptional at caching and making efficient use of the fast solid state chips.
NVMe read latency is around 100 µs, and a SQLite3 database in the low terabytes needs somewhere between 3-5 random IOs per point lookup, so for an already meaningful amount of data you're talking worst case about 0.5 ms per cold lookup. Say your app is complex and makes 10 of these per request: 5 ms. That leaves you serving 200 requests/sec before ever needing any kind of cache.
That's 17 million hits per day in about 3.9 MiB/sec of sustained disk IO, before factoring in the parallelism that almost any bargain-bucket NVMe drive already offers (allowing you to at least 4x these numbers). But already you're talking about quadrupling the infrastructure spend before serving a single request, which is the entire point of the article.
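Spelled out, the back-of-the-envelope math works out like this (same assumptions as above, just made explicit):

    package main

    import "fmt"

    func main() {
        const (
            readLatency   = 100e-6 // seconds per random NVMe read (~100 µs)
            iosPerLookup  = 5      // worst case for a multi-terabyte B-tree
            lookupsPerReq = 10     // "complex app" assumption from the comment
        )
        perLookup := readLatency * iosPerLookup // ~0.5 ms per cold point lookup
        perRequest := perLookup * lookupsPerReq // ~5 ms per request
        reqPerSec := 1 / perRequest             // ~200 req/s, single-threaded, no cache
        fmt.Printf("%.1f ms/request, %.0f req/s, %.1fM hits/day\n",
            perRequest*1000, reqPerSec, reqPerSec*86400/1e6) // ~17.3M hits/day
    }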
You won't get such numbers on a $5 VPS; the SSDs used there are network-attached and shared between users.
It doesn't look like they think about how to make it fit, though. They just use a known-good Go template.
Hetzner, OVH and others offer 4-8GB and 2-4 cores for the same ~$5.
Nice list! I'd say the SQLite with WAL is the biggest money saver mentioned.
One note: you can absolutely use Python or Node just as well as Go. Hetzner offers machines with 4GB RAM, 10TB of traffic (then $1/TB egress), and 2 CPUs for $5.
Two disclaimers for VPS:
If you're using a dedicated server instead of a cloud server, just don't forget to back up the DB to a Storage Box often ($3/mo for 1TB, use rsync). It's good practice either way, but cloud instances seem more resilient to hardware faults. Also avoid their object store.
You are responsible for security. I've seen good devs skip basic SSH hardening and get infected by bots in under an hour. My go-to move when I spin up servers is a two-stage Terraform setup: first I set up SSH with only my IP allowed, then I set up Tailscale and shut down the public SSH entry point completely.
Take care and have fun!
Does WAL really offer multiple concurrent writers? I know little about DBs and I've done a couple of Google searches and people say it allows concurrent reads while a write is happening, but no concurrent writers?
Not everybody says so... So, can anyone explain what's the right way to think about WAL?
No it doesn't - it allows a single writer and concurrent READs at the same time.
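A quick way to convince yourself is a minimal sketch like this (not from the article; driver and file names are just illustrative): hold a write transaction open on one connection and read from another.

    package main

    import (
        "database/sql"
        "fmt"
        "log"

        _ "github.com/mattn/go-sqlite3" // driver choice is illustrative
    )

    func main() {
        dsn := "file:demo.db?_journal_mode=WAL"
        writer, err := sql.Open("sqlite3", dsn)
        if err != nil {
            log.Fatal(err)
        }
        reader, err := sql.Open("sqlite3", dsn)
        if err != nil {
            log.Fatal(err)
        }
        if _, err := writer.Exec(`CREATE TABLE IF NOT EXISTS t(x INTEGER)`); err != nil {
            log.Fatal(err)
        }

        // Open a write transaction and leave it uncommitted for a moment.
        tx, err := writer.Begin()
        if err != nil {
            log.Fatal(err)
        }
        if _, err := tx.Exec(`INSERT INTO t VALUES (1)`); err != nil {
            log.Fatal(err)
        }

        // Reads still succeed while the write is in flight; they see the
        // snapshot from before the uncommitted insert.
        var n int
        if err := reader.QueryRow(`SELECT count(*) FROM t`).Scan(&n); err != nil {
            log.Fatal(err)
        }
        fmt.Println("rows visible during the write tx:", n)

        // A second write transaction started now would get SQLITE_BUSY
        // (or wait, if a busy timeout is set): WAL still means one writer.
        if err := tx.Commit(); err != nil {
            log.Fatal(err)
        }
    }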
About security, a wall-of-shame story:
Once I had a PostgreSQL db with the default password on a new VPS, and forgot to disable password-based login, on a server with no domain. It got hacked within a day and was being used as a bot server. And that was 10 years ago.
Recently I deployed a server and was getting SSH login attempts within an hour, and it didn't even have a domain. Fortunately, I've learned my lesson and turned off password-based login as soon as the server was up and running.
Similar attempts once bogged my desktop down to a halt.
Having a machine open to the world is now very scary. Thank God services like Tailscale exist.
> Nice list! I'd say the SQLite with WAL is the biggest money saver mentioned.
Funny you said that. I migrated an old Django web site to a slightly more modern architecture (Docker Compose with uvicorn instead of bare-metal uWSGI) the other day, and while doing that I noticed that it doesn't need PostgreSQL at all. The old server had it already installed, so it was the lazy choice.
I just dumped all data and loaded it into an SQLite database with WAL and it's much easier to maintain and back up now.
Yep, it literally is a one-file backup. And at runtime it's so much faster for apps where write serialisation is acceptable.
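One caveat: copying the raw file of a live database mid-write can hand you a torn snapshot. SQLite's VACUUM INTO gives you a consistent one-file copy even while the app keeps running; a rough Go sketch, with made-up file paths:

    package main

    import (
        "database/sql"
        "log"
        "time"

        _ "github.com/mattn/go-sqlite3" // driver choice is illustrative
    )

    func main() {
        db, err := sql.Open("sqlite3", "file:app.db?_journal_mode=WAL")
        if err != nil {
            log.Fatal(err)
        }
        // VACUUM INTO (SQLite 3.27+) writes a consistent snapshot of the whole
        // database to a new file; under WAL the app can keep reading and writing
        // while the copy is taken. The destination name is made up for the example.
        dest := "app-backup-" + time.Now().Format("2006-01-02") + ".db"
        if _, err := db.Exec("VACUUM INTO '" + dest + "'"); err != nil {
            log.Fatal(err)
        }
        // From here, rsync the snapshot to a storage box or object store.
    }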
> Also avoid their object store.
Curious as to why you say this. I’m using litestream to backup to Hetzner object storage, and it’s been working well so far.
I guess it's probably more expensive than just a storage box?
Not sure but I also don’t have to set up cron jobs and the like.
> I use Linode or DigitalOcean. Pay no more than $5 to $10 a month. 1GB of RAM sounds terrifying to modern web developers, but it is plenty if you know what you are doing.
If you get one dedicated server for multiple separate projects, you can still keep the costs down but relax those constraints.
For example, look at the Hetzner server auction: https://www.hetzner.com/sb/
I pay about 40 EUR a month for this:
I put Proxmox on it and can have as many VMs as the IO pressure of the OSes will permit: https://www.proxmox.com/en/ (I cared mostly about storage so got HDDs in RAID 0, others might just get a server with SSDs)

You could have 15 VMs each with 4 GB of RAM and it would still come out to around 2.66 EUR per month per VM. It's just way more cost efficient at any sort of scale (number of projects) when compared to regular VMs, and as long as you don't put any trash on it, Proxmox itself is fairly stable, being a single point of failure aside.
Of course, with refurbished gear you'd want backups, but you really need those anyways.
Aside from that, Hetzner and Contabo (opinions vary about that one though) are going to be more affordable even when it comes to regular VPS hosting. I think Scaleway also had those small Stardust instances if you want something really cheap, but they go out of stock pretty quickly as well.
SQLite is fine, but I have run PostgreSQL on a $20 server without any issues, and I would suggest that if you have to deal with concurrent users and tasks, PostgreSQL is the way to go. SQLite with WAL works, but it sometimes caused issues when I had a lot of concurrent tasks running continuously.
And, not sure I'm correct, but I felt PostgreSQL has more optimized storage for large text data than SQLite; at least for me, storage filled up with SQLite, while the same application on PostgreSQL never had this issue.
Just in case there are others like me who were wondering what "MRR" means: it seems to be "monthly recurring revenue".
There is also ARR which is "annual recurring revenue" and you should know that when people use ARR they usually are just making up numbers based on their current MRR (so lying). I've seen people announce their ARR after running their business for two whole months!
I learned nothing. Most of this seems like common basic advice, wrapped up in AI written paragraphs...
Initially from the title, I thought it would be about brainstorming and launching a successful idea, and that sort of thing.
Usually when there's "on a [low] $/mo" you'll hear basic advice. You'd be surprised to find out many folks are not aware of this!
Well, there's also the "How we saved $10M/mo by actually paying attention to indexes" trope.
If you feel like it: start a blog! You have knowledge that you consider basic and a certain other subset of the population is interested in it and doesn't know it exists.
> Sometimes you need the absolute cutting-edge reasoning of Claude 3.5 Sonnet or GPT-4o
Dead giveaway
Maybe it's tongue-in-cheek.
I think it's good. I've definitely seen exactly the resource inflation OP is alluding to in enterprise: a desire to have some huge cloud-based solution with AWS, Spark, bla bla, when a Python script with pandas in a cron job was faster.
Not only that, his whole business model seems to be "profit off the AI bubble and get the big techs to indirectly subsidize you"
Which obviously works, it's not like there aren't tons of multi-million startups ultimately doing the exact same thing, and yet. It feels a bit... trite?
$20 vs $300 does not really matter if you have multiple $10K MRR projects.
Great stack! I'm doing a similar approach for my latest project (kavla.dev) but using fly.io and their suspend feature.
Scaling to zero with database persistence using litestream has cut my bill down to $0.1 per month for my backend+database.
Granted I still don't have that many users, and they get 200ms of extra latency if the backend needs to wake up. But it's nice to never have to worry about accidental costs!
This is a really nice setup for side projects and random ideas too. Thanks for sharing!
I know this article is about the stack, but I'd like to point out that the success of the author has probably more to do with their marketing/sales strategy than their choice of technical infrastructure.
Something to remind to many tech folks on HN
True. But he’s able to do marketing because he has the money, time and sense of priorities to do so.
The moral of the story is: Don’t be (another) fool, your tech stack is not your priority.
When he switches from Kubernetes in the cloud to Nginx -> App Binary -> Sqlite he trades operations functionality for cost.
But, actually you can run Kubernetes and Postgres etc on a VPS.
See https://stack-cli.com/ where you can specify a Supabase style infra on a low cost VPS on top of K3s.
I think his argument is that the functionality is unnecessary. You don’t need dynamic service scaling because your single-instance service has such high capacity to begin with.
I guess it’s all about knowing when to re-engineer the solution for scale. And the answer is rarely ”up front”.
Dynamic scaling is not really even available on a single node kubernetes.
I was thinking more of:
Running multiple websites, i.e. one application per namespace. Tooling, i.e. k9s for looking at logs etc. Upgrading applications, etc.
The basic premise, try to be lean, is a good one. The implementation will clearly be debated with everyone having their own opinion on it but the core point is sound. I'd argue a different version of this though: keeping things lean forces simplicity and focus which is incredibly important early on. I have stepped into several startups and seen a mess of old/broken/I don't know what it does so leave it/etc etc. All of that, beyond the cost, slows you down because of the complexity. Regular gardening of your tech stack matters and has a lot of benefits.
The text feels incoherent to me and lacks some nuance.
It starts out about cutting costs through the choice of infrastructure and goes on to less resource-hungry tools and cheaper services, but never compares the cost of these things. Do I actually save the upgrade to a bigger server by using Go and SQLite over, let's say, Python and Postgres? Or does it not even matter when you have just n users? Then I don't understand why at one point the convenience of using OpenRouter is preferred over managing multiple API keys, when the latter should be cheaper and is a cost point that could increase faster than your infrastructure costs.
There are some more points, but I do not want to write a long comment.
It actually starts with a completely unrelated anecdote:
"What do you even need funding for?"
I agree. The author claims to have multiple $10K MRR websites running on $20 costs. I also don't understand what he needs money for — shouldn't the $x0,000 be able to fund the $20 for the next project? It doesn't make any sense at all.
Then the author trails off and tells us how he runs on $20/month.
Well, why did you apply for funding? Hello?
I read it as an article in defence of boring tech with a fancier/clickbaity title.
Here’s the more honest one i wrote a while back:
https://aazar.me/posts/in-defense-of-boring-technology
While I agree with your points, this one could be more nuanced:
> Infrastructure: Bare Server > Containers > Kubernetes
The problem with recommending a bare server first is that bare metal fails. Usually every couple of years a component fails - a PSU, a controller, a drive. Also, a bare metal server is more expensive than VPS.
Paradoxically, a k3s distro with 3 small nodes and a load balancer at Hetzner may cost you less than a bare metal server and will definitely give you much better availability in the long run, albeit with less performance for the same money.
I want to know how he’s identifying and monetizing businesses
It always makes me both roll my eyes and smile a little when I see someone daft enough to think they need some obscene setup - you don't. You never have. You are not Amazon, Microsoft, Google, etc. If you get to the point where you need that kind of setup you're already employing a DevOps team that's telling you that.
Stick whatever you're working on onto a ~$5/mo cheapo VPS from someone like Hetzner, DigitalOcean, etc. and just get on with building your thing.
Pretty sure this is just written by AI... Why else would someone call "Claude 3.5 Sonnet and GPT-4o" high-end models?
Yep. It made me go check the publication date, thinking it was published in 2023.
The most interesting thing in here is https://github.com/smhanov/laconic which is the author's "agentic research orchestrator for Go that is optimized to use free search & low-cost limited context window llms".
I have been doing this kind of thing with Cursor and Codex subscriptions, but they do have annoying rate limits, and Cursor on the Auto model seems to perform poorly if you ask it to do too much work, so I am keen to try out laconic on my local GPU.
EDIT:
Having tried it out, this may be a false economy.
The way it works is it has a bunch of different prompts for the LLMs (Planner, Synthesizer, Finalizer).
The "Planner" is given your input question and the "scratchpad" and has to come up with DuckDuckGo search terms.
Then the harness runs the DuckDuckGo search and gives the question, results, and scratchpad to the Synthesizer. The Synthesizer updates the scratchpad with new information that is learnt.
This continues in a loop, with the Planner coming up with new search queries and the Synthesizer updating the scratchpad, until eventually the Planner decides to give a final answer, at which point the Finalizer summarises the information in a user-friendly final answer.
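Roughly, the loop looks like this (a minimal sketch with my own function and type names, not laconic's actual API):

    package research

    import "strings"

    // LLM and Search are supplied by the caller; the names are mine,
    // not laconic's.
    type LLM func(prompt string) (string, error)
    type Search func(query string) (string, error)

    // Run drives the Planner -> search -> Synthesizer loop described above,
    // keeping only a small scratchpad in the model's context.
    func Run(question string, llm LLM, search Search, maxTurns int) (string, error) {
        scratchpad := ""
        for turn := 0; turn < maxTurns; turn++ {
            // Planner: propose the next search query, or declare it is done.
            plan, err := llm("PLANNER\nQuestion: " + question +
                "\nScratchpad:\n" + scratchpad +
                "\nReply with one search query, or FINAL if enough is known.")
            if err != nil {
                return "", err
            }
            if strings.HasPrefix(strings.TrimSpace(plan), "FINAL") {
                break
            }
            // The harness runs the search outside the model.
            results, err := search(strings.TrimSpace(plan))
            if err != nil {
                return "", err
            }
            // Synthesizer: fold any new facts into the bounded scratchpad.
            scratchpad, err = llm("SYNTHESIZER\nQuestion: " + question +
                "\nSearch results:\n" + results +
                "\nOld scratchpad:\n" + scratchpad +
                "\nReturn the updated scratchpad only.")
            if err != nil {
                return "", err
            }
        }
        // Finalizer: turn the scratchpad into a user-facing answer.
        return llm("FINALIZER\nQuestion: " + question + "\nScratchpad:\n" + scratchpad)
    }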
That is a pretty clever design! It allows you to do relatively complex research with only a very small amount of context window. So I love that.
However I have found that the Synthesizer step is extremely slow on my RTX3060, and also I think it would cost me about £1/day extra to run the RTX3060 flat out vs idle. For the amount of work laconic can do in a day (not a lot!), I think I am better off just sending the money to OpenAI and getting the results more quickly.
But I still love the design, this is a very creative way to use a very small context window. And has the obvious privacy and freedom advantages over depending on OpenAI.
Yeah, came here to mention that too!
From the article:
>To manage all this, I built laconic, an agentic researcher specifically optimized for running in a constrained 8K context window. It manages the LLM context like an operating system's virtual memory manager—it "pages out" the irrelevant baggage of a conversation, keeping only the absolute most critical facts in the active LLM context window.
The 8K part is the most startling to me. Is that still a thing? I worked under that constraint in 2023 in the early GPT-4 days. I believe Ollama still has the default context window set to 8K for some reason. But the model mentioned on laconic GitHub (Qwen3:4B) should support 32K. (Still pretty small, but.. ;)
I'll have to take a proper look at the architecture, extreme context engineering is a special interest of mine :) Back when Auto-GPT was a thing (think OpenClaw but in 2023), I realized that what most people were using it for was just internet research, and that you could get better results, cheaper, faster, and deterministically, by just writing a 30 line Python script.
Google search (or DDG) -> Scrape top N results -> Shove into LLM for summarization (with optional user query) -> Meta-summary.
In such straightforward, specialized scenarios, letting the LLM drive was, and still is, "swatting a fly with a plasma cannon."
(The analog these days would be that many people would be better off asking Claw to write a scraper for them, than having it drive Chromium 24/7...)
This is the sort of stuff that makes all the work a student does in college (you know, the many hassles of university) seem completely banal and amateurish.
Always good to challenge the narrative - but I don't pay for RDS Postgres because of the WAL, replication, and all the beauty of pg. I pay for RDS because it's largely set-and-forget. I am gladly paying AWS to think about it for me. I think at a certain scale this is a really good tradeoff. At the very beginning it could be overkill, and at the top end it's obviously unsuitable - but for most of us those tradeoffs are why it's successful.
Do these things actually work? I've seen way too many gurus on Twitter claiming to make $10K+ MRR every month. And then they quietly start applying for jobs, or selling courses instead of cashing in.
> If you need a little breathing room, just use a swapfile.
You should always use a swap file/partition, even if you don't want any swapping. That's because there are always cold pages and if you have no swap space that memory cannot be used for apps or buffers, it's just wasted.
I always thought I had to add a swap file to avoid crashing with OOM. I wasn't aware of the cold pages overhead.
Sometimes that crashing is what I want: a dedicated server running one (micro)service in a system that'll restart new servers on such crashes (e.g. something Kubernetes-like). I'd rather have it crash immediately than chug along in a degraded state.
But on a shared setup like OP shows, or the old LAMP-on-a-VPS, I'd prefer the system to start swapping and have a chance to recover. IME it quite often does. It will take a few minutes (of near downtime) but will avoid data corruption or crash loops much more easily.
Basically, letting Linux handle recovery vs letting a monitoring system handle recovery
So what's the $10K MRR product, exactly? The lede is buried into nonexistence. Is it this one: https://www.websequencediagrams.com/ ...?
> Here is the trick that you might have missed: somehow, Microsoft is able to charge per request, not per token. And a "request" is simply what I type into the chat box. Even if the agent spends the next 30 minutes chewing through my entire codebase, mapping dependencies, and changing hundreds of files, I still pay roughly $0.04.
Really? Lol. If it's true why would you publish it? To ensure Microsoft will patch it up and fuck up your workflow?
>Really? Lol. If it's true why would you publish it? To ensure Microsoft will patch it up and fuck up your workflow?
It's true and it's their official pricing, so talking about it won't change anything.
People are spending way too much money with Claude Code while they could simply pay for GitHub Copilot and fire up OpenCode to get the same results but way cheaper.
I think newer developers really need to learn that you can actually do production stuff using bare tools. It is not crazy, especially in the beginning, and it will save you a ton of money and time.
I was writing about this recently [0]. In the 2000s, we were bragging about how cheap our services were and how they kept getting cheaper. Today, a graduate with an idea is paying AWS bills in the $200 range even after the student discounts. They break the bank and go broke before they have tested the idea. Programming is literally free today.
[0]: https://idiallo.com/blog/programming-tools-are-free
Does anybody know a good service to self-host AI? My graphics card is shit; I want to rent hardware to run my own models.
AI has solved the "code problem", but it hasn't solved the "marketing problem"…
This is how every website used to be run before everyone fell for the cloud trap.
While I applaud the acumen, this reads like watching a kid standing on the 3rd floor balcony shouting "look what I can do!"
$20/month. Yeah. Great, but why? You get a lot of peace of mind with "real" HA setup with real backups and real recovery, for not much more than $20, if you are careful.
Another half of article is about running "free, unlimited" local AI on a GPU (Santa brought it) with, apparently, free electricity (Santa pays for it).
I think making is the easiest part, would be really cool if you also reveal how you distribute what you are making for $20/mo.
well, the guy runs what he runs and can't complain
Nice tech read, but without information about which companies, doing what, just feels way too click-baity.
Can anybody validate this GitHub Copilot trick for accessing Opus 4.6? It sounds too good to be true.
Longtime happy Copilot user here. It's true.
The pricing is so good that it's the only way I do agentic coding now. I've never spent more than $40 in a month on Opus, and I give it large specs to work on. I usually spend $20 or so.
I'm not what I'd call a heavy user, but I've also mainly been using Copilot in VS Code on the basic sub.
You do get Opus 4.6, and it's really affordable. I usually go over my limits, but I'm yet to spend more than 5 USD on the surcharges.
Not seen a reason to switch, but YMMV depending on what you're doing and how you work.
It is true, it's the official pricing of GitHub Copilot.
Why is GitHub sticking to per-request pricing when other providers switched to per-token for the high performing models?
>The feedback was simply: "What do you even need funding for?"
Not clear from the text, but what was your plan for using the funding? If you did not have a plan, what did you expect? VCs want to see how adding more money results in asymmetric returns.
Is infra where investors money is going? I imagined salaries would be it. Marketing costs maybe.
For single-person companies infra can be the single largest expense (especially if you aren't paying yourself yet!). The day you bring a full-time employee onboard, I have a hard time seeing infra costs ever exceeding salaries for most shops
I decided to look at their website halfway through the post,
https://imgur.com/a/7M4PdO6
This is really what 10k MRR can get you? A badly designed AI-slop website that isn't even properly mobile-compatible. The logo is a white background on a black website, like a university project.
I can't believe that people are willingly spending money on this.
A lot of this advice is good or at least interesting. A lot of it is questionable. Python is completely fine for the backend. And using SQLite for your prod database is a bad idea, just use Postgres or similar.
Why is SQLite bad for production database?
Yes, it has some things that behave differently than PostgreSQL but I am curious about why you think that.
For read-only use it can be a great option. But even then I would choose D1, which has an amazing free tier and is SQLite under the hood.
But then you don't get the benefits of having the DB locally, with in-process access.
It's local to the worker? I don't understand what you mean.
There’s a lot to be said about his approach with go for simplicity. Python needs virtual environments, package managers, dependencies on disk, a wsgi/asgi server to run forked copies of the server, and all of that uses 4x-20x the ram usage of go. Docker usually gets involved around here and before you know it you’re neck deep in helm charts and cursing CNI configs in an EKS cluster.
The Go equivalent of just copying one file across to a server and restarting its process has a lot of appeal, and clearly works well for him.
Yes. It strikes me as odd how many people will put forward Python with the argument of "simplicity".
It is not. Simple. It may be "easy" but easy != simple (simple is hard, I tend to say).
I'm currently involved in a project that was initially laid out as microservices in Rust and some Go, to slowly replace a monolithic Django monstrosity of 12+ years of tech debt.
But the new hires are pushing back and re-introducing Python, with that argument of simplicity. Sure, Python is much easier than a Rust equivalent, especially in the early phases. But to me, a 25+ year developer/engineer, yet new to Python, it's unbelievably complex. Yes, uv solves some of it. As do ty and ruff. But, my goodness, what a mess to set up simple CI pipelines and a local development machine (that doesn't break my OS or other software on that machine). Hell, even the Dockerfiles are magnitudes more complex than most others I've encountered.
I am not following the difficulties you have mentioned. Setting up a local dev environment in Python is trivial with uv.
The only major downside of Python is its somewhat poor module system; nothing is as seamless as Cargo.
Beyond that the code is a million times easier to understand for a web app.
Python will take you a long way, but its ceiling (both typical and absolute) is far lower than the likes of Go and Rust. For typical implementations, the difference may be a factor of ten. For careful implementations (of both), it can be a lot more than that.
Does the difference matter? You must decide that.
As for your dismissing SQLite: please justify why it’s a bad idea. Because I strongly disagree.
What a load of nonsense.
Why is it nonsense? Sounds reasonable to me.
> its ceiling (both typical and absolute) is far lower
If you plan to remaining smaller than instagram, the ceiling is comfortably above you.
I plan to remain smaller than two VMs
I think the point is that your Python webapp will have more problems scaling to, let's say, 10,000 customers on a $5 VPS than Go. Of course you can always get beefier servers, but then that adds up for every project.
At 10,000 paying customers I don't think it is frivolous to move to a 10/month vps, or maybe a second 5/month one for fail-over.
So is the slopocalypse gonna destroy HN too? A second-from-the-top AI-written, non-proofread article.
You already have and had everything you need to scale the business to max and it hasn’t happened so more money won’t help.
What do you want VC to do?
You didn’t bring a plan.
I was wondering this as well: Why did OP look for VC?
In my case, I've used a similar strategy of keeping costs under €100/month. (But have sold, or stopped my ventures before hitting such MRRs as OP reports).
I raised some capital to pay my own bills during development. But mostly to hire freelancers to work on parts that I'm bad at, or didn't have time for: advertising, a specific feature, a library, rewrite-in-rust (wink) or deep research into functional improvements.
If you can’t articulate what you need funding for, don’t be surprised if nobody will give it to you?
You can get all the advantages and almost none of the constraints by buying a bigger base server for $50/m
Not my website. I found this interesting.
Nice article, it validates some of the things I already thought. Although I'm sure things like AWS and database servers etc. are still useful for big companies.
LMFAO at Linode / Digital Ocean as lean servers.
Hetzner / Contabo maybe. Cloudflare workers definitely.
This guy is not at my level and multiple $10k MRR is possible but unlikely.
What a fascinating article. I especially love the part about writing extremely detailed requests which only cost $0.04 versus the token approach most “vibe code” devs use. Fortunately his tactic is almost impossible to emulate for 90% of the YCombinator audience / HN commentators.
Why do I know this? Because there had to be a declaration here to stop using ChatGPT and other Agents to write YOUR OWN GODDAMN POSTS. Thinking isn’t your strong suit, Greed is, and taking the time to learn the power of English doesn’t satisfy the latter, so you minimize it to your own detriment.
Don’t get mad at me. Go punch a mirror.
Cool but missing the Claude Code or Coding Agent part imo
He specifically mentions that he is using GitHub Copilot because of how Microsoft bills per request instead of per token.