SQLite in Production: Lessons from Running a Store on a Single File

A deep dive into running a real-world e-commerce store on SQLite, exploring the benefits of Rails 8, the mechanics of WAL mode, and the hidden risks of rapid deployments.

We run a production e-commerce store on SQLite. Not as a proof of concept. Not for a side project with three users. A real store, processing real Stripe payments, serving real customers.

Rails 8 made this a first-class choice. And for most of our operation, it's been excellent — simpler deploys, zero connection pooling headaches, no database server to manage. But "most of our operation" isn't all of it. Here's the part nobody warns you about.

The Setup: Four Databases, One Volume

Our database.yml defines four SQLite databases in production: Primary, Cache, Queue, and Cable. All four live in a storage/ directory that maps to a named Docker volume. One Docker volume. Four database files. Every container that mounts this volume shares the same data. This is both the feature and the footgun.

WAL Mode: Why It Works at All

SQLite's default journal mode locks the entire database on writes. WAL (Write-Ahead Logging) changes the model. Writers append to a separate -wal file instead of modifying the database directly. Readers continue reading from the main file. Multiple readers and a single writer can operate concurrently. Rails 8 enables WAL by default for SQLite databases.

The Day We Lost Two Orders

February 4th. We pushed 11 commits to main in two hours. Each push triggers a deploy via GitHub Actions. Kamal runs blue-green deploys — it starts a new container, health-checks it, then stops the old one. During the switchover, both containers are running. Both mount the storage. Both have the SQLite files open.

Eleven deploys in two hours? The overlap windows started overlapping. Three processes with the same WAL file open, all trying to write. Orders 16 and 17 completed successfully in Stripe, but the order records never made it to the database. Somewhere in the WAL file contention, the writes were lost.

We diagnosed it through sqlite_sequence: The auto-increment counter said 17 IDs had been assigned, but only 15 rows existed.

The Fix: Stop Deploying So Fast

The fix was procedural, not technical: stop pushing to main every ten minutes. This isn't a SQLite problem. It's a deployment pipeline problem that SQLite makes visible. Postgres handles this fine because connections go through a TCP socket. SQLite's write ordering depends on filesystem-level locking on a shared Docker volume, and that breaks down when containers overlap.

The Gotchas Nobody Mentions

No ILIKE: Use WHERE LOWER(name) LIKE '%term%' instead.
JSON types: Always CAST(json_extract(...) AS TEXT) when you need string comparison.
RAM usage: Spawning an exec container for a quick database query adds significant RAM overhead. On small machines, this can trigger the OOM killer during deploys.
Timeout 5000: If you're hitting SQLITE_BUSY regularly, you have a concurrency problem that configuration can't fix.

Would We Choose SQLite Again?

Yes. For a single-server deployment with moderate write volume, SQLite eliminates an entire category of infrastructure complexity. The constraint is real: one server, and careful deploy pacing. The day we need horizontal scaling, we'll migrate to Postgres. Until then, a single file handles everything.

Source: Hacker News