From experience: before you sprinkle big-ass string literals throughout your Go code that communicate with Sqlite, toss all of your persistence into a separate package if you can. I'm not saying some crazy abstraction; just put it some place common if you don't have a lot of advanced SQL needs, and put in the minor extra work of serializing to and from structs. You'll be happy when you want to add query logging, find certain usages, reuse code, or ever change your persistence approach.
Also, devs should be aware of some of the concurrency limitations of Sqlite. Devs need to essentially share a single connection across the app, and serialize the writes.
Also, side note, I've found one really neat use case for Sqlite over other DBs: simple disk encryption. I wrote a Go wrapper for one of the popular ones [0], but in general if you distribute your app in other ways you may want to offer your users encryption that you can't easily with other DBs. Sure you lose some performance, but sometimes it's worth it (again, less in the server case where you control it and/or can do fs-level encryption, but definitely in the distributed app case).
FYI, the author talks about using multiple readers and serializing all writes through a single connection. This is how you get really stellar update/write performance out of sqlite. In my test I saw 3x the inserts per second compared to Postgres, for example.
"Most problems do fit in one computer, up to a point. Spend some time determining where that point is. If it is years away there is a good chance one computer will do."
I've been following this principle for a decade, and 98% of the services I've built have never needed to grow beyond one instance. The ones that did were solved by breaking out specific functions into their own managed services, not by applying a wide horizontal scaling solution.
If you build for scaling before you have anything running that can be tested against, it's highly likely that you're actually scaling the wrong thing. It's like optimisation: to get any meaningful benefit, you'll want to surgically solve the actual bottleneck rather than sprinkle hope-based gestures all over the place. (I'm reminded of the article that was posted the other day, where the writer experienced a Firebase slowdown and tried to solve it by upgrading hundreds of npm packages to their latest versions. That's what I mean by hope-based gestures. Premature "web scale" architecture is essentially the same kind of shotgun approach.)
Sure, I'd say this applies to the vast majority of websites, and technologies like docker, kubernetes or multi-region deploys simply are not required for most projects. There is no need for massive scale and multiple nodes for most services/websites and if you find there is, well you'll be rewriting a bunch of things anyway unless you've done a lot of work on very large websites before.
I do think there's something to be said for habitually separating the database though - if you store data elsewhere, you can have 1 instance or 2 or 3 pretty easily, and sometimes having more than one is nice for redundancy, handling big spikes or seasonal traffic etc. It is pretty easy to scale horizontally nowadays on cloud services where they take care of load balancing and sometimes the data store too. You also don't have to worry about wiping a server and starting again, as your data is elsewhere.
Embedding a db like sqlite introduces other problems (the cgo requirement, for one). It is really handy for things like mobile development, though, as then you don't depend on a separate api or db for data. It's also handy if you don't want potential users to have to worry about setting up a db when trying out your project, so an embedded db is really nice for that. I wish golang had a pure-Go clone of sqlite, as it would make this approach even more attractive for a broader range of uses.
"One process programming" is a really under-rated philosophy. Communication overhead can be substantial and concurrency issues are harder to debug. A single process that you can just stop and inspect in the debugger is easier to work on. Especially if you're only one developer as well.
Two big pieces of software I worked on used this philosophy. One was https://en.wikipedia.org/wiki/Zeus_Web_Server - nowadays its niche is occupied by nginx, but back then its advantage was being a one-process-per-core select()-based solution, compared to Apache's process-per-concurrent-request solution. It was also absurdly, unnecessarily portable, being written in vanilla POSIX. It ran on a dozen different flavours and architectures of UNIX, almost all of which are now dead.
Another was chip design software for a startup. Eventually it would have been useful to parallelise computation, but in the meantime it was so much faster to develop it and just buy big machines and leave it running overnight.
Unsurprisingly, comments start pointing the way to added complexity like Postgres and containers and whatnot, but this is exactly the right approach for probably the large majority of webapps out there. One executable, one db-file, up and running anywhere within a very few minutes. With a modicum of intelligence thrown into the design, you're never ever going to see SQLite acting as a bottleneck. Few sites have request rates on the scale of tens of thousands per second.
Developers of would-be federated services to decentralize the web, please take note. This is how you go about ensuring wide deployment. Simplicity, if anyone remembers the word.
If you're using AWS anyway, I think using their managed Postgres / RDS could make this a lot easier operationally. No need for custom syncing/backup scripts, easier to scale the storage up.
To the larger point, while I totally agree with the premise -- most apps will never need to scale beyond one large instance -- I'm not exactly sure what the actual tradeoff is. If you're writing a simple CRUD app, it is not really controversial that it shouldn't need any complex distributed stuff anyway; it is just a simple python/Go app and a data store.
Most "fancy" things outside that single process paradigm, such as avoiding using local disk and using S3/RDS/document stores, using containers for deployment, etc. usually have more to do with making the app operationally simpler and easy to recover from instance failures than scaling per se.
> operationally simpler and easy to recover from instance failures than scaling per se.
I'd say architecturally making the app easier to perform this function, rather than rely on infrastructure to do it for you (and hoping doing so will make the app "easier" to code).
For example, use an append only data structure (like event sourcing), with snapshotting. Then your app recovery process is just "restart".
"Typical concerns about the impact on software quality of adding C code to Go do not apply to SQLite as it has an extraordinary degree of testing. The quality of the code is exceptional."
It could be worth noting the thread concurrency limit imposed by cgo calls locking the OS thread to the calling goroutine. For example, 128 concurrent requests running long statements would take up 128 OS threads.
Of course, any language without a lightweight threading model has this issue anyway if each request is mapped to one thread, but it seems worth pointing out that you may want to limit your own concurrency at the request level, or by properly sizing a connection pool (like the one present in the sqlite driver).
Edit: whatever the solution to concurrency is, benchmarking it is a good idea for any sized product going into a production environment. There's a good article about this by the people at Cockroach labs: https://www.cockroachlabs.com/blog/the-cost-and-complexity-o... (see Cgoroutines != Goroutines)
This is a good article, but I feel like using Postgres instead of SQLite on the server side would achieve his goals even better:
- No need to link with C code. There's a pure go driver for Postgres.
- You get the 'psql' command to inspect and change the database, even while the web app is running.
- You get a lot more database features.
- You can keep the deployment simple by running Postgres on the web app server until you need to scale up.
- Conversion to a full blown cloud database and horizontal scaling is just a matter of configuration.
I understand the desire to keep the stack as simple as possible, but it looks to me like Postgres would actually be simpler than SQLite for the server side.
> You can keep the deployment simple by running Postgres on the web app server until you need to scale up.
This doesn't simplify deployment. Deployment includes setting up, upgrading, and potentially backing up Postgres, all of which are easier with Sqlite. While in general I agree with you, putting one file on a server and running it has real value over installing an entirely separate software product.
> Deployment includes setting up and upgrading and potentially backing up Postgres.
But I find this rather trivial to do with cloud database services like RDS. Why not just use that? At the end of the day it gives you tons of benefits (e.g. easy backups) with no real lock-in because you'd still just be using a normal postgres driver.
No particular reason; it depends on the use case. Every choice has tradeoffs, and there are cloud-specific ones that can hinder mobility even if the service uses a common tech. Lock-in is about more than code reuse, especially as dependence grows. Granted, for many use cases it might be the best option, but we shouldn't pretend it has the exact same mobility profile as copying a DB file.
the desire to keep the stack as simple as possible
It's a somewhat deceptive simplicity, to boot. The fact that you can use SQLite like this is a testament to its versatility and quality but it's not really for this. There's a conceptual overhead in trying to wedge it into this role and it can also easily become a practical one.
No you can't. But you can deploy the PostgreSQL executable alongside your app executable and run it as a "sidecar" process, using a Unix-domain socket to let your app communicate with the PostgreSQL server, and set up PostgreSQL to store the database files in a location specific to your app instead of a system-global location. It's a bit unusual, but it's possible, and PostgreSQL's command-line tools make this relatively easy.
As far as I know, there is no way to link PostgreSQL as a library in another executable. At least, nothing supported by the official PostgreSQL project. Even MySQL, which used to be available as an embedded library, got rid of this feature a long time ago. Do you have any link to this "underdocumented" feature?
While reading this, please keep in mind this is for indie developers, who need to do all kinds of jobs. Chances are SQLite is the one thing that everyone must learn to use if your business involves building apps. So if you can use it on the server side, it saves a lot of needless learning. Same reason nodejs got started.
Some may say, well, sql servers are not that hard to learn. Well, that is because you live with them day and night. Try going back to one after a year-long break.
I've done something similar to keep it small with potential to go multi region for one of my new projects using Bitbucket and Google Cloud.
Bitbucket pipelines that create an image of the app, create an instance template with a container ("gcloud beta compute instance-templates create-with-container"), create the instance group, DNS, a global load balancer, and SSL certs, and do a rolling update of the instance groups with a new version (if applicable).
Using preemptible g1-small to keep costs down but deployed in at least two zones per region.
Just today, I finished writing my own low level go sqlite driver. I haven't published it yet. I really thought there was a need for it. I hate the database/sql integration that the other drivers have. Now I'll have to see if my driver has any need to exist.
This always seems to happen when I write something.
Your sqlite package totally fits my needs. My biggest problem is that I hate the database/sql layer. Connection pooling is completely unnecessary and surprising in a sqlite context. I honestly have no idea how anyone is using it with sqlite. The concept of a sqlite "connection" goes out the window. You can no longer use sqlite transactions in any normal way. You get locking issues if you don't force it into shared cache mode.
I've always wanted just a pure sqlite package with a one-to-one mapping of functions. Your driver is totally what I've always wanted to exist.
I've put up a preview version of my package if you wanted to look at it.
I ended up removing a ton of features, removing the database/sql driver, getting rid of all the "caching" of information that it does, removing callbacks (which I plan to re-add later), and renaming the methods to match sqlite function names.
It still has some niceties that I plan to keep, such as the Scan method to extract several columns with a single method call, and a RowData type to extract rows into a go map.
I found your project on GitHub and I'm going to suggest some things. There's a particularly nasty footgun you can hit if HAVE_USLEEP isn't defined on UNIX. I'll open an issue.
Last I checked mattn's wrapper was for an older version and was missing some things, like registering an update hook (I could be wrong, been a while). I welcome a current, lower level wrapper. If you publish your code, I'd take a look.
And most people discount the learning experience. You don't really know how to do something until you've done it, with all its warts and edge cases and all.
0 - https://github.com/cretz/go-sqleet