This might be more of a "what's old is new again", but here's what I use:
- 100% server side rendered
- Progressively enhanced (fully works without JS, but having JS enabled makes it nicer)
- In select places where I need SPA-like snappiness I use a ~100 line long DOM diffing algorithm (I request full HTML through AJAX and diff it into the current page; if the user has no JS enabled then the page just refreshes normally)
- Running on a single, cheap VPS; no Docker, no Kubernetes, no serverless
- No SQL database; I just keep everything in memory in native data structures and flush to disk periodically; I have a big swap file set up so that I can keep a working set bigger than the RAM
- Written in Rust, so it's fast; I can support tens of thousands of connections per second and thousands of concurrent users on my cheap VPS
> request full HTML through AJAX and diff it into the current page
These first three points are actually starting to be "in" again. They're called HTML-over-the-wire[1, 2]. And I am currently looking into implementing them for my site (which uses ASP.NET Core Blazor), because I think this approach is awesome.
No SQL database is the one thing I disagree with. What if you want to write a SQL query/report? Do you have transactions? As the app gets bigger you might want that.
> No SQL database is the one thing I disagree with.
I'm not saying this approach is a good idea in every case. There are many situations where I would use a proper database. It really depends on what kind of app you're building, what kind of queries you want to run, how big your dataset is, your failover and availability requirements, your programming language, etc.
In my case it was a great decision. The code is simple, and the data is convenient to access since it is already packed in native data structures. And it's really, really fast since all of the "queries" are basically LLVM-optimized machine code.
> What if you want to write a SQL query/report?
I just write a few lines of normal Rust code to do that. Sure, it's not as convenient. But I don't need to do that as often, so I don't care.
> Do you have transactions?
The concept of 'transaction' doesn't really make sense in this approach, since I'm directly modifying the data in memory, every change I make is atomic and can't fail, and I flush the data to the disk atomically as well.
>The concept of 'transaction' doesn't really make sense in this approach, since I'm directly modifying the data in memory, every change I make is atomic and can't fail, and I flush the data to the disk atomically as well.
So you don't have mutable data shared between threads then?
> So you don't have mutable data shared between threads then?
I could have worded that sentence a little better, sorry.
I do technically need to share data between threads, since my event loop runs on multiple threads, so if I want to mutably modify something I do need to wrap it either in a `Mutex` or an `RwLock` and lock it.
What I meant was that it is not the same as a "transaction" in the SQL sense. It can't fail, there is no rollback, I decide on my own what level of granularity I need, and I bake the necessary atomicity into the data structures themselves. So there are no SQL-style transactions anywhere - I just use a lock where it's appropriate to directly modify the memory, just as you'd do in any other situation. (Either by wrapping the whole structure in a lock, or wrapping only a single field, or by simply making the field atomic by itself.)
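To make that concrete, here's roughly what picking the granularity looks like (a sketch; the names are made up, not my actual code):

use std::sync::RwLock;
use std::sync::atomic::{AtomicU64, Ordering};

struct User { name: String }

struct State {
    users: RwLock<Vec<User>>, // whole structure behind one lock
    motd: RwLock<String>,     // or a single field behind its own lock
    page_views: AtomicU64,    // or just an atomic field, no lock at all
}

fn bump_views(state: &State) {
    // An atomic increment can't fail and needs no rollback.
    state.page_views.fetch_add(1, Ordering::Relaxed);
}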
So it's a simple mmap database like early versions of MongoDB? I like using atomic commands, but I imagine it would quickly become tedious to do the transaction bookkeeping manually every time you want to perform a non-atomic update.
> So it's a simple mmap database like early versions of MongoDB?
Even simpler. I use mmap only for things that I build offline and that never change during the runtime. Everything else is just kept as your garden variety objects in memory.
(Well, I guess you could say it is an mmap database; it's just that the mmapped file is the swap, and the OS manages it for me.)
> I like using atomic commands, but I imagine it would quickly become tedious to do the transaction bookkeeping manually every time you want to perform a non-atomic update.
In my case it's not really tedious at all. Usually when I want to mutate something it looks something like this:
let user_ref = state.require_login()?; // look up the logged-in user's record
let mut user = user_ref.write();       // take the write lock
user.set_field(value);                 // mutate directly in memory
drop(user);                            // release the lock
Of course this depends on your requirements. If you'd have 20 different objects of 10 different types that you'd have to lock separately and you'd want to update all of them atomically it might become hairy (SQL would definitely be better in such a case), but I don't have such cases.
I do similar stuff to what you do, writing my backend code in Rust and not using a database but opting instead to keep things in memory and flushing to disk on my own. And running servers that I manage with SSH.
I would love to talk with you and exchange knowledge. Your HN profile has no publicly visible contact info though.
Do you have an email address I can reach you on? Can you put it in your HN profile description?
(Just please note that if I apparently don't reply and you're using Gmail, then my reply probably went to spam or Gmail's /dev/null, so email me again; lately Gmail doesn't like the emails I send and either puts them into spam or just sends them to /dev/null, even when I'm replying to someone who emailed me first! So much for Google having no monopoly, sigh...)
Rust will solve the basic stuff for you, but it is easy to forget logical dependencies and end up with an inconsistent state anyway... Even the ordinary "close a session, send an error message if it's not alive" flow has bitten us in a 20-person project.
Yeah, this seems problematic. What if you hard power off your server? How do you know your data didn't get corrupted? As someone who has written databases, and knows how difficult this is to get right, it makes my skin crawl.
Backups and restore will also have to be DIY. Point in time restore probably won't be possible - and that's a life saver when you need it.
> What if you hard power off your server? How do you know your data didn't get corrupted?
Atomically replacing the whole file instead of updating the data in-place, and using a checksumming filesystem like ZFS should give you such a guarantee, no?
> Backups and restore will also have to be DIY.
Well, yeah. A daily cron job that `scp`s the data to an offsite location. Personally I don't really see much problem with that? It's easy to backup, and easy to restore.
> Atomically replacing the whole file instead of updating the data in-place, and using a checksumming filesystem like ZFS should give you such a guarantee, no?
No. You need to use fsync to ensure the file is flushed to disk. Also careful not to rename the file across mount points (/tmp is often a different mount and the most common location for people trying this.)
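A minimal sketch of the safe sequence (assuming a Unix-like system, with the temp file in the same directory so the rename stays on one filesystem):

use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

fn atomic_replace(path: &Path, data: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp"); // same directory => same mount
    let mut file = File::create(&tmp)?;
    file.write_all(data)?;
    file.sync_all()?;                     // fsync the new file's contents
    fs::rename(&tmp, path)?;              // atomic replace on POSIX
    // fsync the directory too, so the rename itself survives a power cut
    File::open(path.parent().unwrap())?.sync_all()?;
    Ok(())
}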
> Well, yeah. A daily cron job that `scp`s the data to an offsite location. Personally I don't really see much problem with that? It's easy to backup, and easy to restore.
That's fine, but your backup and restore granularity is daily. If you introduce a bug that writes the wrong / partial data, you can't recover anything that isn't in the previous day's backup.
With a typical database you have a transaction log and you could restore back to any point during the day.
This has been necessary about a dozen times during my career (once every two years or so on average.)
> No. You need to use fsync to ensure the file is flushed to disk.
Are you sure? At least in the case of ZFS, AFAIK the file wouldn't get corrupted, since it uses copy-on-write internally. Sure, you might get the old data back, but you wouldn't get corruption. And it automatically checksums your files anyway, so it will detect any data that did get corrupted, and if you're using RAID-1 (as you should for any important data) it will automatically repair it.
> Also careful not to rename the file across mount points (/tmp is often a different mount and the most common location for people trying this.)
You can't rename across mount points. (Linux's `rename` syscall and the corresponding C API will return an EXDEV error if you try.)
> If you introduce a bug that writes the wrong / partial data, you can't recover anything that isn't in the previous day's backup.
Sure. If I was working in a bank not being able to do that would be totally unacceptable. But in my case that's the tradeoff I'm willing to live with.
Adding in a service like a SQL database because "you might want that" in the future is not a good idea. Apps should do what they need to do and no more. "Future proofing" is almost always a waste of time - even if you definitely want the feature in the future it's usually better to add it in when it's needed than adding it early and have to support something that adds no value yet.
As for reports and transactions specifically, NoSQL databases support both.
OP isn't talking about NoSQL databases, but about no database at all (if I understand correctly - OP says '- No SQL database; I just keep everything in memory in native data structures').
In this case, I don't think OP is even thinking about NoSQL databases.
> "Future proofing" is almost always a waste of time
Not really, unless you lack the experience to know what to future proof and what not to future proof.
I've saved days/weeks of development by future proofing certain things, I've also saved days/weeks of development by taking pragmatic YAGNI in the right places.
Obviously this will only surface on projects that last longer than a couple months ;)
Since this is not vDOM diffing (which most JS frameworks nowadays provide), can you share the algorithm if it's already out in the open / non-proprietary?
It's not really a general purpose diffing algorithm. It probably doesn't handle every corner case as it only does what I need it to do in the few places I actually need it.
Basically, I use diffing because it's simple. I don't want to mess around with manually updating whatever content I want to update, and I don't want to be forced to generate a specially crafted piece of HTML just so that it's updatable. This way I have only one function that updates the current page in-place by diffing, and I can use it everywhere I need it.
I meant that the old tree and the new tree both originate from the same source (i.e., your code), and if both were annotated with some kind of hashes of their contents, you wouldn't need any diffing to find the changes. You could simply compare the annotations on corresponding pairs of nodes and if they're the same, you don't need to descend any further and compare their contents - you already know that anything below that node is going to be identical.
Also, have you compared the performance of whatever you're doing with a simple PJAX-like approach? Considering that browsers are very good at parsing HTML, I wonder how fast that would be compared to doing quite a lot of DOM calls, which you seem to be doing (?). I'd definitely be interested in that comparison.
> if both were annotated with some kind of hashes of their contents
That's exactly what I don't want to do, since it's extra complexity and extra work on my part. (:
> Also, have you compared the performance of whatever you're doing with a simple PJAX-like approach? Considering that browsers are very good at parsing HTML, I wonder how fast that would be compared to doing quite a lot of DOM calls, which you seem to be doing (?).
Nope, sorry, I haven't done any benchmarks. From what I've tested the performance is great. If the pages where I use it were heavier then maybe I'd have performance problems, but so far I don't.
I'm really impressed at how many people have come back with critiques or suggestions about your architecture without actually even knowing what your app does!
Yes! I am a big fan of "keep data hot" approaches to things. I first came across this approach with the Prevayler library years ago and I love it.
People who haven't tried it will be excited to discover how much faster it is. The only trick is to store your data in a format where you can write the changes easily. E.g., logging changes and then playing back the log on startup.
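For example, a rough sketch of the log-and-replay idea (the Change type and its text encoding here are made up):

use std::fs::File;
use std::io::{BufRead, BufReader, Write};

enum Change {
    Deposit(u64),
    Withdraw(u64),
}

struct Account { balance: u64 }

impl Account {
    fn apply(&mut self, change: &Change) {
        match change {
            Change::Deposit(n) => self.balance += n,
            Change::Withdraw(n) => self.balance -= n,
        }
    }
}

// Every mutation gets appended to the log...
fn record(log: &mut impl Write, change: &Change) -> std::io::Result<()> {
    let line = match change {
        Change::Deposit(n) => format!("D {n}\n"),
        Change::Withdraw(n) => format!("W {n}\n"),
    };
    log.write_all(line.as_bytes())
}

// ...and on startup the log is replayed to rebuild the in-memory state.
fn replay(path: &str) -> std::io::Result<Account> {
    let mut account = Account { balance: 0 };
    for line in BufReader::new(File::open(path)?).lines() {
        let line = line?;
        let (kind, amount) = line.split_once(' ').unwrap();
        let amount: u64 = amount.parse().unwrap();
        let change = match kind {
            "D" => Change::Deposit(amount),
            _ => Change::Withdraw(amount),
        };
        account.apply(&change);
    }
    Ok(account)
}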
I do this a lot these days with prototypes and PoCs, but usually with Go since I'm more familiar with it. Sure, it only works on one machine, but it's usually so fast that if you ever need to scale, it means the project was a success and now you have a war chest.
I do the same thing with "no SQL db". The only thing I do differently is that I don't snapshot the state, but rather replay events to arrive at the same state as before (or a different one if we migrate to different logic). Also, all the possible changes are typed events. It won't be a problem to save the state like you do, but right now our application changes so often that we have to make sure we can replicate the state even if the logic changes.
How do you protect yourself against incomplete writes? What about when there's a critical update that you don't want lost?
I've done similar stuff in the past, but I handled things slightly differently. For tiny data structures I just spat out a short XML file. For larger data structures I synchronized all changes with a SQLite database.
> How do you protect yourself against incomplete writes?
I don't update files on disk in-place. I write to a new file, and once it's done I do an atomic move and replace the old file with the new one.
> What about when there's a critical update that you don't want lost?
I immediately write it to the disk? (:
I don't think it's possible to completely avoid this problem. In a traditional database, if you suddenly lose power you'll also lose that critical update if it hasn't yet been committed to disk.
I like the approach of tailoring the app to the use case, so I'm not arguing you should do something differently.
The database has commit for exactly this reason. If you lose power, you will not acknowledge the commit to the client, so the client may retry. For example, the end user may get an error page. This is different from a case when the app would show modified values, but the next day, the modifications would be gone.
With the database, you still have the problem that after a user sees an error, the transaction might have succeeded. E.g. if the connection drops at the moment the DB receives the commit command.
The old version of the application that was written before my version always wrote the entire file like you do. But when the file got very large it was too slow, which was why we used SQLite in newer versions.
> Progressively enhanced (fully works without JS, but having JS enabled makes it nicer)
This is something I've struggled to get developers to understand forever. It doesn't help that no browser really allows users to switch off JS. The number of developers who think doing client-side validation on forms is enough is far too high!
Currently a 4.29 EUR/month VPS on Hetzner. (Although I have some fun features planned and might need some more processing power, so I might upgrade in the future.)
They have great AMD-based VMs, 2x perf/$ over Intel. (I've benchmarked.)
I don't use a DB. But sure, you could run both your app and the DB on a single VPS.
Yes, you can host multiple domains. You can either point multiple domains to a single IP of your VPS, and configure your web server to use name-based virtual hosting, or you could buy an extra floating IP, assign it to your VM, and use IP-based virtual hosting.
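Name-based virtual hosting boils down to dispatching on the HTTP Host header; one IP, many domains. Conceptually (the domains here are made up):

fn site_for(host: &str) -> &'static str {
    match host {
        "blog.example.com" => "blog site",
        "app.example.com" => "app site",
        _ => "default site",
    }
}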
You have to do it yourself. But it's not that hard. The easiest way of hosting multiple sites is probably to just assign multiple IPs to a single VM and simply run multiple servers simultaneously, each listening only on a single IP.
As far as the backup is concerned, Hetzner has the option of handling it for you for 20% extra money.
In my opinion this question seems like a good reason to get your feet wet just a bit. Just gaining some understanding of how things work under the hood is worth it even if you end up never using it professionally.
The basics of operating a (virtual) server to run your apps on could be seen as similar to the basics of cooking food. You can eat fairly well for the rest of your life without knowing a thing about it, but if you just spend a single hour every week preparing your own food, you will become more familiar with the food itself. You'll learn which individual flavors you prefer and where they come from, you'll be able to survive on less money if you ever need to, and you'll even get by more easily in a potential crisis where home delivery isn't an option.
If you spend an hour every week to learn some system administration (what people call "ops" nowadays) you'll be able to realize when you're being overcharged for simple services, feel the power of building a larger part of your stack yourself, play with new things, etc. Running multiple apps on a single host with proper backups is a trivial problem, especially if you know how to code, and in my opinion these things shouldn't be "left to the devops team" or seen as "IT stuff".
Not that long ago most developers had to know what environment their software ran in as part of their job, and despite many attempts to move away from it, this is still mostly true even if we have switched from the Linux CLI to cloud dashboards, CI/CD pipelines and Terraform scripts...
I am currently using a bunch of AWS technologies, e.g. AWS EBS, RDS, etc. I am also thinking of consolidating all projects onto a single AWS Lightsail instance, which offers pretty cheap plans, e.g. 512 MB RAM, 1 vCPU, and a 20 GB SSD for $3.50/month.
It was quite some time ago so I don't remember exactly, but I've basically used one of the usual benchmarks that everyone uses to measure the number of requests per second that a server can sustain.
(I've rented another VPS in the same data center and ran the benchmark from there.)
> I just keep everything in memory in native data structures and flush to disk periodically;
Can you expand on what you mean by this? I am interpreting that to mean that you are actually writing to a file and reading from a file rather than using SQL?
They're storing their data in native Rust objects and reading/writing them in the server application's RAM. In case the data becomes larger than the available RAM, they have set up a large swap, which is a file or disk partition that the OS uses when no more real RAM is available, while acting like nothing is happening (and everything gets 1000x slower). They periodically serialize these in-memory objects to a file on disk, and presumably (they didn't specify, but I feel it's implied) they read and deserialize this file during the initialization/startup phase.
Correct. Although I don't read everything on startup. Some of the data (e.g. per user data) is only loaded once it's needed.
Also, the swap doesn't make everything slower as long as my active working set is smaller than the available RAM. That is, I have more data loaded into "memory" than I have RAM, but I'm not using all of it all the time, so things are still fast. It's basically an OS managed database. (If I were using a normal SQL database I'd get a similar behavior where actively used data would sit in RAM and less used data would be read from the disk.)
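The lazy per-user loading looks roughly like this (a sketch; the types and the load function are made up, not my actual code):

use std::collections::HashMap;
use std::sync::{Arc, RwLock};

struct UserData { /* ... */ }

struct Users {
    loaded: RwLock<HashMap<u64, Arc<UserData>>>,
}

impl Users {
    fn get_or_load(&self, id: u64) -> Arc<UserData> {
        // Fast path: the user's data is already in memory.
        if let Some(user) = self.loaded.read().unwrap().get(&id) {
            return user.clone();
        }
        // Slow path: deserialize it from disk on first access.
        let user = Arc::new(load_from_disk(id));
        self.loaded.write().unwrap().entry(id).or_insert(user).clone()
    }
}

fn load_from_disk(_id: u64) -> UserData { UserData {} }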
The obvious difference is that swap files are not as optimized for disk I/O. There is a serious performance issue if you ever have to hit the disk often. It all depends on how you structure your in-memory/on-disk data. But it might never be an issue; having scaled to tenish million users in memory, there is so much else that can go wrong first.
Why not use something like LMDB instead of writing raw files? It's a performant k/v store and memory-mapped. It beats file access syscalls most of the time.
Okay, so this is indeed very interesting. I just launched the same version of Firefox in a Win10 VM, and it works there too. So it might be one of your extensions, maybe? (I use uBlock Origin too, so that's not it.)
Could you try these things out please?
1. Try disabling all of your extensions and try again. (IIRC by default private mode disables all of the extensions, so that should be an easy way to do it.)
2. Can you open the dev console and try running something like this in it?
document.querySelector("audio").play()
Based on your original error, either the `previousElementSibling` is not an `<audio>` tag, or for some reason audio tags don't have a play method, so it'd be good to know which one it is.
I have my own crate for template rendering. I tried various existing ones but I ended up not liking any of them for one reason or another, so I've made my own.
Some of the features are:
* Fully static. It's compiled into Rust code instead of being dynamic. (I wanted to be able to use normal Rust code in my templates instead of a special template-only language, and I wanted the extra speed.)
* Built-in automatic HTML whitespace minification at compile time.
* Built-in XSS-safe string interpolation. You have to explicitly use the unsafe raw interpolation.
* It can interpolate any Rust value which implements the `Display` trait.
* Doesn't use Jinja's syntax.
* Doesn't have any extra unnecessary dependencies.
* Doesn't use inheritance for template composition, instead it uses simple textual inclusion plus late binding to achieve composition. Since it's hard to explain let me give a simple example. Let's assume I have a skeleton.html which looks like this:
<html>
<head>
<title><~ @paste title ~></title>
</head>
<body>
~~ @paste body
</body>
</html>
And now I have index.html which is the actual template I render, and it looks like this:
~~ @include "skeleton.html"
~~ @snippet title
Index
~~ @end
~~ @snippet body
<h1>Hello world!</h1>
~~ @end
As you can see I first include the skeleton.html, and then I define the "title" and the "body" snippets after the skeleton has already pasted them. Normally this wouldn't work, but since the `@paste` directive is lazy it only gets resolved once the whole template is processed.
This makes it really simple to customize the templates, since you can just add `@paste`s wherever you'd like to have a customization point, and you don't have to think about doing things in the right order since the templating engine will take care of it for you.
(If you forget to define a snippet that was `@paste`d you'll get a compile time error; I also have other directives that allow you to set a default if there was no snippet defined.)
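So, rendering the index.html from the example above effectively produces (before minification):

<html>
<head>
<title>Index</title>
</head>
<body>
<h1>Hello world!</h1>
</body>
</html>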
That looks a lot like ructe [0], a templating language that I really love. It compiles templates to functions that you can call from Rust using a build script. The function's parameters are accessible from your templates. You can even import and run arbitrary Rust code from your app:
It's the other way around. I have no requirements that would call for a normal database.
The two biggest advantages of not having a database are simplicity (I don't have to query the database to access the data, as it's already in memory) and speed (my average response times are well under 1ms, and that is while compressing every reply, having more data in memory than I have RAM, running on a cheap dual-core VPS, and being flooded with requests from my site being linked on HN).
In general I was a little frustrated with the state of Rust's web server ecosystem, since it has what I personally call the NPM syndrome: pulling in a single crate will often pull along with it hundreds of dependencies and murder your compile times. I know that those dependencies are often necessary, but during development I do not need all of that.
So I made my own.
But of course it's generally a bad idea to run your own HTTP stack in production. So my framework works like this - it can be compiled either in development mode, or in production mode.
In development mode it uses my own crappy HTTP stack and my own crappy async executor and has *zero* dependencies, and compiles super fast. In production mode it uses `hyper` and `tokio`. In both cases the API is exactly the same. So I can have my cake and eat it too.
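The switch itself is conceptually just this (a sketch, assuming a hypothetical Cargo feature named `production`):

#[cfg(feature = "production")]
mod backend {
    pub struct Server; // would wrap hyper + tokio
}

#[cfg(not(feature = "production"))]
mod backend {
    pub struct Server; // would wrap my own stack and executor
}

pub use backend::Server; // callers see the same API either way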
Not OP, but we had a similar project a while back. All data is in maps and structs. We didn't have free-text search; all filters had a corresponding map of key to []ids, plus a primary map from id to object. It needs lots of RAM, but that is the cheapest thing you can add.
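In Rust terms that layout looks roughly like this (field names are made up):

use std::collections::HashMap;

struct User { id: u64, country: String }

struct Store {
    by_id: HashMap<u64, User>,             // primary map: id -> object
    by_country: HashMap<String, Vec<u64>>, // filter index: key -> ids
}

impl Store {
    fn users_in(&self, country: &str) -> Vec<&User> {
        self.by_country
            .get(country)
            .map(|ids| ids.iter().filter_map(|id| self.by_id.get(id)).collect())
            .unwrap_or_default()
    }
}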
Linux's memory management subsystem is really good. It's almost magic. And modern SSDs are really fast. So whatever swapping is going on in the background I really don't feel it.
It's basically an OS managed database. Instead of the SQL server doing the swapping to-and-from the disk, Linux does it for me.
> I'd like to hear about different approaches others are using to write web sites or apps today. What languages, frameworks, or libraries are you using?
So the question as I understood it was not "what truly novel and unique approach you're using that no one else has before?" but more of a "what non-mainstream approach you're using?". And what I'm doing is definitely non-mainstream.
Actually, the title of the post specifically says "novel tools". That apparently didn't make it into the body of the post, but almost everyone is primed by the title to expect novel solutions in the answers.
That said, I don't think there's any harm in reminding people that there are plenty of non-novel solutions already that they might not have fully considered.
He is using a novel templating library, a novel HTTP server, a novel JS diffing algorithm... and this approach has generated more questions than any other on this thread, so clearly it is unfamiliar to much of the audience.