
I always use SQLite as the first database when I start a new project. It is a great way to get started quickly and not waste time while you are figuring out what problem you are solving and how you want to structure your data. On some projects I've been able to make do with SQLite only for over a year before having to bother with other databases.

However, I always design with the intent to eventually use PostgreSQL or some other database. Sometimes with the intent of using two different data stores in the scaled-up version: for instance, an SQL database for the stuff that has low volumes, and perhaps a time-series database for the high-volume stuff.

This is why I always design a persistence interface. An internal persistence API layer inside the application. I also write the test suite to this interface. This makes adding support for other databases much easier later.
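A minimal sketch of what such an internal persistence interface can look like in Go. The names here (`UserStore`, `memStore`) are made up for illustration; the point is that the application and its tests depend only on the interface, with SQLite and PostgreSQL implementations slotting in behind it:

```go
package main

import (
	"errors"
	"fmt"
)

// User is a hypothetical domain type.
type User struct {
	ID   int64
	Name string
}

// UserStore is the persistence interface the application (and the
// test suite) is written against. Concrete implementations exist for
// SQLite, PostgreSQL, etc.
type UserStore interface {
	Put(u User) error
	Get(id int64) (User, error)
}

var ErrNotFound = errors.New("not found")

// memStore is a trivial map-backed implementation, handy in unit tests.
type memStore struct {
	users map[int64]User
}

func newMemStore() *memStore {
	return &memStore{users: make(map[int64]User)}
}

func (m *memStore) Put(u User) error {
	m.users[u.ID] = u
	return nil
}

func (m *memStore) Get(id int64) (User, error) {
	u, ok := m.users[id]
	if !ok {
		return User{}, ErrNotFound
	}
	return u, nil
}

func main() {
	var store UserStore = newMemStore() // swap in a SQLite-backed impl here later
	store.Put(User{ID: 1, Name: "alice"})
	u, err := store.Get(1)
	fmt.Println(u.Name, err)
}
```

Because the test suite is written against `UserStore`, the same tests run unchanged against every backend you add later.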

Note that I say "add", because I usually keep the SQLite support after I've added support for the databases I want to use in production. Keeping SQLite is extremely useful when people want to do integration testing, or even when they are learning to use your server/service. Rather than having to fire up a database, people can just run with the built-in SQLite. You download and run the binary and stuff just works. It also makes cleanup a lot easier, not least if you run an in-memory database (use ":memory:" as the DB spec).
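With SQLite the only thing that has to change between a file-backed database and the self-cleaning in-memory one is the connection string, so the switch can live in one small helper. A sketch (the `dsnFor` helper and the file path are illustrative assumptions):

```go
package main

import "fmt"

// dsnFor is a hypothetical helper: test runs get the in-memory
// database, everything else gets a file on disk. SQLite treats the
// special DSN ":memory:" as a database that vanishes on close, so
// there is nothing to clean up after an integration test.
func dsnFor(env string) string {
	if env == "test" {
		return ":memory:"
	}
	return "/var/lib/myapp/app.db"
}

func main() {
	fmt.Println(dsnFor("test")) // → :memory:
	fmt.Println(dsnFor("prod"))
}
```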

In the persistence layer I tend to avoid using too much abstraction. The interface is my persistence abstraction; I don't need more layers. I think ORMs are incredibly limiting and unnecessary. However, I do use sqlx (Golang) to make the SQLite implementation of the persistence layer less verbose (faster turnaround when figuring stuff out). I usually use it for PostgreSQL too, since the performance penalty usually isn't significant enough to matter. (If you haven't used sqlx before: it is essentially like using the DB driver directly, but with a lot less boilerplate.) sqlx is also trivial to rip out later if you think it introduces too much overhead.

Usually you will know ahead of time when you potentially need more than one data store. I tend to use SQLite for storing everything during initial development, but the API is usually designed in a way that doesn't make it awkward to split the store in two or more domains.
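One way to keep that later split painless is to carve the persistence API into per-domain interfaces from the start, even while a single SQLite-backed type implements all of them. A sketch under assumed names (`EventStore`, `MetricStore` are not from the comment):

```go
package main

import "fmt"

// Per-domain interfaces. During early development one store
// implements both; later the high-volume MetricStore can move to a
// time-series database without touching any callers.
type EventStore interface {
	SaveEvent(name string) error
}

type MetricStore interface {
	RecordMetric(name string, value float64) error
}

// Store is what the application wires together at startup.
type Store struct {
	Events  EventStore
	Metrics MetricStore
}

// singleStore stands in for the initial single SQLite implementation.
type singleStore struct{ log []string }

func (s *singleStore) SaveEvent(name string) error {
	s.log = append(s.log, "event:"+name)
	return nil
}

func (s *singleStore) RecordMetric(name string, v float64) error {
	s.log = append(s.log, fmt.Sprintf("metric:%s=%g", name, v))
	return nil
}

func main() {
	one := &singleStore{}
	st := Store{Events: one, Metrics: one} // same backend for both, for now
	st.Events.SaveEvent("signup")
	st.Metrics.RecordMetric("latency_ms", 12.5)
	fmt.Println(one.log)
}
```

Swapping `Metrics` to a different backend later is then a one-line change in the wiring, not an API redesign.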

I sometimes write storage API middleware: for instance statistics, adapters that combine different permutations of storage domains, sharding, throttling, specialized logging, etc. I've written ACL, failover, and housekeeping middleware as well, but those usually don't belong in the persistence layer.
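Middleware like that can be plain decorators over the same persistence interface: a wrapper that implements the interface by delegating to another implementation. A sketch with a call-counting statistics wrapper (the tiny `KV` interface and names are assumptions made for the example):

```go
package main

import "fmt"

// KV is a deliberately tiny persistence interface for this sketch.
type KV interface {
	Put(key, value string) error
	Get(key string) (string, bool)
}

// memKV is the underlying store (stands in for SQLite/PostgreSQL).
type memKV struct{ m map[string]string }

func newMemKV() *memKV { return &memKV{m: make(map[string]string)} }

func (s *memKV) Put(k, v string) error          { s.m[k] = v; return nil }
func (s *memKV) Get(k string) (string, bool)    { v, ok := s.m[k]; return v, ok }

// statsKV is storage middleware: it implements KV by delegating to
// another KV and counting calls on the way through. Logging,
// throttling, and sharding wrappers follow the same shape.
type statsKV struct {
	next       KV
	puts, gets int
}

func (s *statsKV) Put(k, v string) error       { s.puts++; return s.next.Put(k, v) }
func (s *statsKV) Get(k string) (string, bool) { s.gets++; return s.next.Get(k) }

func main() {
	var store KV = &statsKV{next: newMemKV()} // callers only ever see KV
	store.Put("a", "1")
	store.Get("a")
	store.Get("b")
	st := store.(*statsKV)
	fmt.Println(st.puts, st.gets) // → 1 2
}
```

Because each wrapper implements the same interface it wraps, middleware can be stacked in any order without callers noticing.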

I never put business logic or "cleverness" in the persistence layer. It should worry about storing and retrieving data.

You can probably use SQLite as your database for much longer than you think. And some projects honestly will never need more than SQLite. However, I think that once you start doing sharding or replication, or dealing with situations where throughput and concurrency are high, you really want something a bit more beefy. You have to weigh how much time you'd spend getting SQLite to do these things against how much effort it is to just fire up database instance(s).



