Hacker News
Neon – Serverless Postgres (neon.tech)
667 points by nikolay on May 28, 2022 | 330 comments



This is the missing piece on cloud for masses

  * We already have compute scale-to-zero (Cloud Run, Lambda, fly.io).

  * Network is pay-for-use by default. Storage (S3) is pay-for-use by default.

  * The only piece in the stack that was always-on was the database (the only serverless DBs thus far were Firestore, or something like sqlite+litestream).

With something like this we get a solid RDBMS engineered to scale to zero, and with a good developer experience.

This opens up a world of try-out mini applications that cost cents to host: serverless DB (Postgres) + serverless compute (Cloud Run) + pay-as-you-go storage and network. This is a paradigm-shift stack. Exciting days ahead.


There are plenty of serverless database options already: Firestore, DynamoDB, CosmosDB, FaunaDB, even MongoDB, and there are "newsql" distributed relational systems like CockroachDB and Planetscale with serverless plans.


Many teams would prefer a PostgreSQL-compatible database with full SQL support, without compromises, missing features, etc. So this could break that market open a little. Both AWS and Google are unreasonably expensive for this stuff. Most teams don't need a huge database and could run PostgreSQL on a tiny instance and get away with it. Have two of those with failover and backups, and it's good enough for a lot of small shops.

Most managed/serverless options begin at hundreds of dollars per month. So you get lots of companies either just handing over the cash or jumping through hoops to get something more reasonable. The latter is a stupid waste of time if you can afford the former. That's how Google and Amazon make money: they make the expensive option more tempting and the cheap option needlessly hard. They are not interested in supporting frugal teams. The whole point is squeezing their customers hard.

So, this is potentially very nice if it offers some competition on the cost front. I'd certainly consider using this if it proves reliable. In fact, the whole reason I opted out of a relational database is the above. What I'd need is something that is reasonable in cost relative to the modest data I store and retrieve.


We have a single big bare-metal machine. We run Postgres with a ~1TB DB, moderate load, on a Hetzner AX101 (16C/128GB RAM). It has 2× 3.84TB NVMe drives (ZFS mirror with hourly snapshots) used for Postgres storage only, and a separate pair of mirrored SATA drives for system/boot (we had to request the extra drive, ask support to change the boot option in the BIOS, and reinstall the OS using the rescue system). It's about 100 EUR/mo with unlimited data.

We bounce all incoming requests from clients (the machine also runs a Node backend) through a Digital Ocean machine (NGINX proxy), as their peering agreements are better; without this, some users in Brazil, Turkey, etc. have very slow access. OVH, I think, would be even better for this use (better peering and, IIRC, cheaper data). ZFS snapshots are backed up with sanoid to a machine under my desk with spinning disks.

The AX101 can be fitted with up to six 3.84TB drives; that's almost 12TB of mirrored storage, so we should be good for a while. You can (should) use at least lz4 compression on ZFS; consider zstd-1, a bit slower, which could double the effective space. The compression also applies to the in-RAM ZFS cache, which can be 100GB+.
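For anyone reproducing a setup like this, the ZFS side comes down to a few commands (a sketch; the pool/dataset names and the recordsize tuning are assumptions, adjust for your own drives and workload):

```sh
# Mirror the two NVMe drives into a pool dedicated to Postgres
zpool create tank mirror /dev/nvme0n1 /dev/nvme1n1

# Dataset for the Postgres data directory: lz4 (or zstd-1) compression,
# no atime updates; recordsize is often tuned toward Postgres' 8K pages
zfs create -o compression=lz4 -o atime=off -o recordsize=16K tank/pgdata

# Hourly snapshots (sanoid can automate this and the retention policy)
zfs snapshot tank/pgdata@$(date +%Y-%m-%d_%H:%M)
```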

We used Firestore before and got a bit tired of some of the limitations (latency, indexing). Cost-wise I don't think it's that different actually, since we aren't using much bandwidth; if you are, self-hosted can be dramatically cheaper. You have to manage some details of course (ZFS filesystem parameters, setting up backups, configuring Postgres, etc.), but I found that stuff quite interesting, and it's knowledge that will always be useful.


How do you measure the 1TB DB estimate? Is that with indexes, as a CSV, or back-of-the-napkin math based on estimates?

I've always struggled to estimate my DB size, so I'm curious what you mean by your estimate.
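For what it's worth, Postgres will report its own on-disk size, which is usually what such estimates mean (a sketch; 'mydb' is a placeholder):

```sh
# Total on-disk size of one database (tables + indexes + TOAST)
psql -d mydb -c "SELECT pg_size_pretty(pg_database_size('mydb'));"

# Top relations by total size, to see where the space actually goes
psql -d mydb -c "SELECT relname, pg_size_pretty(pg_total_relation_size(oid))
                 FROM pg_class ORDER BY pg_total_relation_size(oid) DESC LIMIT 10;"
```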


Sounds neat. What’s the weak link here though?


No read replicas that can be promoted to master in an outage?


Ah, nope. If it goes down, we'll figure it out; a 24-hour outage wouldn't be the end of the world. It would have been nice to have a second (read replica) machine in East Asia to reduce latency for users there, but we didn't find a provider like Hetzner. Maybe we could take a server in a suitcase and install it in a colocation center there. Bit of a hassle though.


I have a similar setup to OP's, and I use ZFS snapshots to be able to quickly rebuild the whole machine on another host. Of course, it still requires manual intervention in an outage (it could be automated, theoretically), and I may lose up to 5 minutes of data due to my snapshot schedule.


Can you take that ZFS snapshot under a running database? Is that OK?


Yes, ZFS snapshots are atomic, so as long as the database uses sane semantics to do disk I/O, you're fine (AFAIK, PostgreSQL and MySQL/InnoDB are OK; MySQL/MyISAM is not, though).

A snapshot is effectively identical to a killed database server (due to OOM, or power outage), and database servers should be crash-resistant.

With PostgreSQL, checkpointing the database before taking the snapshot can make crash recovery quicker when loading the snapshotted database.
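The checkpoint-then-snapshot sequence is short enough to script (a sketch; the dataset name is a placeholder):

```sh
# Shorten WAL replay on restore by checkpointing first
psql -c "CHECKPOINT;"

# Atomic, crash-consistent snapshot of the Postgres dataset
zfs snapshot tank/pgdata@backup-$(date +%s)
```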


Hi Mani! For sure there are many serverless options - fewer that separate storage and compute, and fewer that are open source end-to-end. Neon is also 100% compatible with Postgres (unlike CockroachDB) because the compute is Postgres.

Our intention is to standardize the separation of storage and compute cloud architecture - that's why it's open source under the Apache 2.0 license.


You should update your HN profile :). Neon looks really great. Far more interesting to me and my team than S2.

I noticed you mention Azure Blob Storage in your RFCs as a potential backend. Have you done much work towards that yet?


For now we've focused on making the product production-ready on AWS, so that we can get users to try it out. Once the model has been proven we'll likely branch out to other clouds.

However, if you really can't wait to run Neon on Azure, you could contribute the integration yourself: the code is available under Apache 2 at https://github.com/neondatabase/neon/


Exact reason why I asked :) Thanks for the response & all the work you're putting in, it looks really promising.


Haven't done anything with Azure yet. Shouldn't be hard to add though, we don't rely on any special cloud storage features, just simple upload/download of files.


(Some of these are options I've not looked deeper into - Fauna, Planetscale)

This sentiment is perhaps right, but I was careful about calling out scale-to-zero. We do have options that are zero cost (or pay as you use), but there's a fundamental difference between that and something that may be zero cost only because a cloud provider is using it as a customer-acquisition ploy.

Options like litestream+sqlite+s3, or what Neon seems to be, are verifiably pay-only-while-the-db-is-up; otherwise the cost is storage only.

So the trifecta that will be very productive for the masses is a database that 1) has compute that scales to zero, 2) is open source or commoditised, and 3) is an RDBMS.


There is a big difference in architecture between Neon and PlanetScale, CockroachDB, and Yugabyte. Neon is shared storage (storage is distributed but shared) and the others are shared-nothing. Shared-nothing systems are hard to build while supporting all the features of the base system. E.g. https://vitess.io/docs/13.0/reference/compatibility/mysql-co....

Neon is 100% compatible with Postgres because we didn't (or almost didn't) change the Postgres engine.


What is the effective difference to you? Technically all compute can enter hibernation by dumping RAM (or just using virtualized memory backed only by SSDs).

CockroachDB already does true scale-to-zero if that's your requirement: https://www.cockroachlabs.com/blog/how-we-built-cockroachdb-...


The effective difference is commoditization of the paradigm - there's confidence in a commodity technology that a proprietary offering cannot give (standardized use, wide support and community, multiple providers, self-hosting).

E.g. of innovation -> commodity: EC2 was innovative, but is today commonplace. S3 was the same. DBs that scale to zero have not reached that state yet.

Thanks for the link on CockroachDB - it sounds promising. I wonder what the minimum self-deployable unit of CockroachDB is - will google around a bit.


I think it is more "market leadership" than commoditization that's working in favour of the MySQL-Postgres duopoly.


> "minimum self-deployable unit of cockroachdb"

If you host it yourself, or use their dedicated enterprise clusters then you still have instances which are individual (virtual) servers.

The serverless model has no instances exposed to you and instead is a multitenant architecture that uses a pool of shared compute nodes with a routing layer that intercepts and introspects your connection to load up your specific database context for query processing.


Most stuff you mentioned:

1 - Is not an actual relational DB

2 - Doesn't really scale to zero

Planetscale does scale to zero, but has a ridiculous billing model.


Both CRDB and Planetscale "scale" to zero - but neither expose any concept of individual instances so I'm still not sure what difference it makes.


What's the billing model / how is it ridiculous?


Per-row read/write billing -> how do you estimate that?


Either you have an existing db running and check the metrics, or you don't have a workload to move and you can just try it out from scratch. You can always migrate away if the billing model doesn't work for you.

Alternatively, it's likely that Planetscale will eventually provide a provisioned billing model, similar to DynamoDB. Then you can choose between scalable usage based pricing or less flexible flat rate pricing. Either way, innovating different billing models for RDBMS is a win.
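One way to do that estimate is to pull row-throughput numbers from your existing database's metrics (e.g. tup_returned in pg_stat_database) and multiply by the advertised rate. A sketch; every number below is a placeholder, not PlanetScale's actual pricing:

```python
# Back-of-the-envelope estimate for a per-row-read billing model.
# All inputs are assumptions; plug in your own metrics and rate card.

rows_read_per_second = 500            # observed average from your metrics
seconds_per_month = 30 * 24 * 3600    # ~2.6M seconds
price_per_million_rows = 1.50         # hypothetical $ per 1M row reads

rows_per_month = rows_read_per_second * seconds_per_month
monthly_cost = rows_per_month / 1_000_000 * price_per_million_rows

print(f"{rows_per_month:,} rows/month -> ~${monthly_cost:,.2f}/month")
```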


MongoDB & CockroachDB are the only open source ones, the only ones we can hack on & improve & grow.

Neon seems like a vast vast improvement & great & desperately needed potential leap for mankind.


Planetscale is Vitess, which is also open source: https://vitess.io/

> "great & desperately needed potential leap for mankind"

Are you being serious? That's very hyperbolic if so.


> Are you being serious? That's very hyperbolic if so.

Yes, I'm serious. This is one of the most foundational & key levels of computing: storing & querying data. Without this, computing isn't good for much.

Getting good at this is a huge task for humanity. Right now that task is almost entirely being fulfilled by far-off hyperscalers. Aurora, BigTable, Firebase, DynamoDB, CosmosDB, more special works like Datadog's Husky... the world is running off vast, super-awesome data engines. But ones that are not ours, that we can't hack on, that we can't run ourselves. They might as well be the Martians' (little green men's) databases as far as I'm concerned: these are not humanity's heritage & humanity is cut off from active participation with them.

Right now there are so few scalable data systems available to the world that we actually have. This seems like a great & novel effort to radically open up the range of human capabilities in one of the most important sectors of computing: handling data.

We have lots of other cloudware for humankind but data has seemingly been much slower & un-scaled. I agree with this @anilgulecha comment[1]:

> This is the missing piece on cloud for masses

[1] https://news.ycombinator.com/item?id=31537313


There are tons of databases; I just named several, and there are hundreds more for every possible niche.

> "But ones that are not ours, that we can't hack on, that we can't run ourselves."

Who is "we"? Is your entire issue that not everything is open-source?

> "these are not humanity's heritage & humanity is cut off from active participation with them."

This is still incredibly exaggerated. Everything humanity does is humanity's heritage. There are plenty of open-source databases of all types and sizes if you look: https://db-engines.com/en/ranking


You're free to your opinion, but I respectfully do not share your values & sense of collective ownership over remote, far-off software I can't see, shape, control, or rework.

The most notable feature of Neon, what makes it so exceptionally smart, is that it takes humanity's existing first-choice, default-go-to database, by a country mile, and layers in really smart decoupling of storage to help it scale. Reinventing things from scratch can be a win, but that this is already a well-known quantity, much loved & cherished & used by all, hacked on by all, grown by all, is a planet-sized plus.

I think you vastly overrate the broad availability of alternatives to this, of potential starting places others might explore. Most of your list is dominated by proprietary whatever. Even skipping that, yes, there is a ton of novelty & promise, the potential for breakthroughs. But this is starting with the best, and pushing it into the cloud-layer, into the troposphere.


> "I think you vastly overrate the broad availability of alternatives to this"

There are dozens of "scalable postgres" alternatives, from Redshift to Citus to CockroachDB to Yugabyte. I find it strange that you say it's overrating the availability when you're also comparing it to something that isn't even a full product yet.


[flagged]


I've met my share of developers without perspective, and hey, I was probably one of them for my first 5-6 years as a developer, so I don't think one needs to assume corporate toadying when hyperbolic importance for mankind is ascribed to a corporate offering.


rektide is not a Neon employee, nor was he or she asked to post this.

Thank you for the comment rektide. We of course would love to see constructive comments and criticism.

- Neon CEO


[flagged]


Can you please stop posting these off-topic comments (which are against the site guidelines, as you'll see if you review https://news.ycombinator.com/newsguidelines.html)? Also, can you please stop posting unsubstantive comments generally? You've been doing that in other threads as well.

I appreciate your concern for the integrity of HN discussions, but the thing to do if you suspect abuse or manipulation is to email hn@ycombinator.com so we can look into it. Posting about it in the threads themselves is explicitly against the rules.


Understood, dang. Sent you an email.

But please take a look at the flagging attack on all of my comments outside this thread. The timing is too peculiar here.

Somebody really didn't like what I wrote above ;)


Moderators did that—your comments have been breaking the site guidelines quite badly!


That explains why my criticisms are getting flagged, seemingly on YC-money-related threads.


Users can flag comments too, and yours get flagged because they break the guidelines in overt and obvious ways, which is pretty easy to avoid.


Doesn't seem like an employee due to another comment they posted. https://news.ycombinator.com/item?id=31537613


Minor correction: both of them are source-available, but not open source.

Cockroach DB license - https://github.com/cockroachdb/cockroach/blob/2c4e2c6/LICENS...

Mongo license - https://github.com/mongodb/mongo/blob/39e4b70/LICENSE-Commun...


Cockroach is eventually open source, which I rather appreciate. Any code that is 3 years old goes Apache, if I recall correctly.


That only applies to the core. The "enterprise" features are proprietary until the copyright expires.

IIRC, backup was enterprise but is now part of the core. However, restore is still enterprise.


There is also gigabyte as open source. In our tests (which means nothing in general as it’s specific to our business case), it outperforms cockroachdb.


Sorry, cannot edit; Yugabyte (ios spellcheck).


Yes, but they are not "scale-to-zero"


Both CRDB and Planetscale are.


> This opens up a world of try-out mini applications that cost cents to host

Given how much performance you can squeeze out of a $5/month VPS (I've been spinning them up and indeed down regularly over the last couple of years), is this really a paradigm shift?


On that $5/month VPS there's management overhead - you need to have at least basic Linux knowledge, and ideally more than that to know not to do stupid things like chmod 777 and database exposed on the public internet. You also need to do your updates, etc.

I'm a (former) SRE and run my own Kubernetes cluster for fun, but still use serverless (containers as a service, static website hosting) depending on the project.
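For the "database exposed on the public internet" failure mode specifically, the safe defaults amount to two lines of configuration (a fragment; stock postgresql.conf/pg_hba.conf assumed):

```
# postgresql.conf - listen on loopback only (this is the default)
listen_addresses = 'localhost'

# pg_hba.conf - local TCP connections only, password-authenticated
host  all  all  127.0.0.1/32  scram-sha-256
```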


> On that $5/month VPS there's management overhead - you need to have at least basic Linux knowledge

Don't you need that as well as cloud-specific knowledge if you go serverless?

> and ideally more than that to know not to do stupid things like chmod 777 and database exposed on the public internet.

You still need some arcane knowledge to make sure your serverless doesn't experience cost overruns, right?

IME (yours obviously differs), the amount of cloud-specific + vendor-specific knowledge needed to avoid using a $5/m VM is a lot more in volume and a lot less in stability[1] than learning basic Linux once and using VMs everywhere[2].

[1] How the different cloud providers bill, when they bill, how to control your limits, etc. changes much more often than knowing how to keep your server patched. Knowing how to get your serverless DB going on AWS doesn't help when you want to use Azure. And each cloud vendor regularly requires you to update your knowledge. Knowing how to keep a PostgreSQL-on-Linux setup up to date can be learned once and used for years. Even if running a managed DB, you'll still need to gain some of that knowledge anyway.

[2] Once you get to a scale where you treat your machines like cattle rather than pets, you'll obviously have the team required to use cloud stuff optimally.


> You still need some arcane knowledge to make sure your serverless doesn't experience cost overruns, right?

I know people who have literally written their own analysis tooling just to figure out what's impacting their AWS spend. It's gotten better, but I could have retired many times over on what I've seen clients overpay to cloud providers because they didn't understand what would drive cost.

> [2] Once you get to a scale where you treat your machines like cattle rather than pets, you'll obviously have the team required to use cloud stuff optimally.

The "problem" with the cloud story is that at that point you also have a team that could save you a fortune with a hybrid setup. Cloud providers get their margins off those who don't understand how much they're overpaying or who are too small to a) care or b) have leverage. Those big enough to have leverage who understand either negotiate hefty discounts or build out cheaper setups (basically at the point you're spending 7 figures a year, if you're paying anywhere near list prices for cloud services you're a chump; below that it's hit and miss)

I'm not at all against using cloud services, but I wish more people actually understood their costs and picked based on merits rather than cargo-culting. Some teams benefit greatly from cloud services, but usually if they're not cost sensitive. In my current job we have everything on AWS because we're never going to scale to somewhere where it'll get expensive and it's convenient. We'd save money if I moved it to, say, Hetzner, but the hosting bill is too small to matter. For that use it's fine.

The moment the bill starts to bite people ought to at least price out alternatives, and consider hybrid setups. E.g. I've had setups where even just putting a caching proxy in front of AWS to cache images to cut the egress bill would have paid for a team to keep it running. Their egress cost is still bad, but at the time it was just pure highway robbery.


With every cloud service there's a management overhead too. Just different skills you need to learn.

I've done SRE/devops work in various capacities including consulting longer than cloud services have existed, and my experience is that I've consistently earned more from clients who insisted on cloud services because they consistently need more help. Nothing is driving more demand for devops consulting services than cloud providers.


The issue with a full Linux system’s overhead is that if there are any new security vulnerabilities the situation could blow up in your face (e.g. the system is used to send spam, or host malware), so you need to maintain it at least minimally. With a serverless cloud architecture at worst it’ll stop working.


Or you just use Flatcar (a derivative of CoreOS), and don't worry about anything more than rebooting once a new image has been (auto-)installed, and run everything else in app containers where you have to worry about nothing more than what you would in your regular cloud setups.

This is not hard to get right. Yes, you need to learn how to do it, but the amount of money I've made from clients who thought cloud was simple and proceeded to create massive security holes for themselves is fairly substantial. People who think they're reducing their attack surface by using these services need to reevaluate - they're large, complex architectures that very few users understand properly. You need to learn the skills either way.


We don't know. But we built it anyway because it may well be.


Millions of students and enthusiasts around the world would find that cost sufficiently friction-ful to not try out things.


> Millions of students and enthusiasts around the world would find that cost sufficiently friction-ful to not try out things

I appreciate there are indeed billions of people for whom $5 is a lot of money, but just how many of them are "students and enthusiasts" itching to get started with Postgres?

I realise that - perhaps particularly here - a $5/month VPS is a deeply unsexy thing. You can, however, achieve (and learn) an awful lot with one.


Nowadays there's a ton of idling compute capacity running client-side that could be used to spin up a personal cloud environment with all core services, if one has the proper knowledge. For anyone that has a core to spare and a few GBs of memory, which should be easy on most modern midrange hardware, I would recommend they save that $5 paying for a third-party VPS, deploy an open source hypervisor such as KVM, and run the virtual server on their own hardware.
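Getting a VM up on local KVM is a short exercise these days (a sketch; the name, sizes, and ISO path are placeholders):

```sh
# Create and boot a small VM on KVM/libvirt
virt-install --name devbox --memory 2048 --vcpus 2 \
  --disk size=20 --os-variant debian11 \
  --cdrom /path/to/installer.iso
```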


Who absolutely would not want to give a credit card to AWS, where the bill is dynamic? If $5/mo is bad, they definitely can't handle the screw-up that scales up and runs overnight for $500.


Free-tier instances on various providers provide an option. And if you can't afford $5 a month, you really shouldn't be playing with services where there's a risk of huge overages if you face a sudden spike in users.


You cannot sign up for the AWS Free Tier without credit card info, and many students end up with a huge bill. The subreddit reddit.com/r/aws has many such posts.

Some links and discussions:

- https://cloudirregular.substack.com/p/please-fix-the-aws-fre...

- https://twitter.com/alexwlchan/status/1399095011178958851

- https://news.ycombinator.com/item?id=27044371


Well, and that is just as much of an issue if you use most serverless offerings. But if you're not prepared to carefully manage your use, cloud services should not even be on your radar.



Fly.io does not scale to zero.

Lambda has many limitations.

In particular, for some reason AWS is allergic to providing a container deployment service that actually scales to zero.


> AWS is allergic to providing a container deployment service that actually scales to zero

Isn't this what Fargate is?


No.


Not yet, but soon.


AWS Aurora Serverless v1 (in MySQL and Postgres flavors) has had serverless, scale-to-zero for quite a while.


Aurora Serverless v1 has cold boot times of ~30 seconds when scaling up from zero, which precludes it from being a viable option for most use cases.


Unfortunately, V1 is getting very little love from AWS, and the new one, V2, does not scale to zero.


What's the cold start time for something using sqlite+litestream on scale-to-zero compute? I think you'd need to pull the DB out of storage, so it would be slow to go from 0 -> 1 instance. Anyone know if that's right?

Is there any cold start delay for Neon?
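For the sqlite+litestream case, the cold start is essentially one restore pulling the latest snapshot plus WAL from S3, so the 0 -> 1 delay scales with DB size (a sketch; bucket and paths are placeholders):

```sh
# On cold start: rebuild the local DB from the S3 replica
litestream restore -o /tmp/app.db s3://my-bucket/app.db

# Then serve traffic while replicating changes back to S3
litestream replicate /tmp/app.db s3://my-bucket/app.db
```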


Right now it is 2 seconds. We are working on improving it.


> This is the missing piece on cloud for masses

I like this perspective a lot & think it's absolutely key here.

We - the world - still pick single-node-writer Postgres & read replicas when we have to store & query data. There are great Kubernetes Postgres operators, but it's still a distinctly pre-cloud, pre-scale type of technology, & this decoupling & shared storage sounds ultra promising: it allows independent & radical scale-up & scale-down, and sounds principally much more manageable.


If you can scale your app to zero, couldn’t you also just scale your database to zero once no more app servers are running?

Or for try-out apps, as you mention, you could just run Postgres next to your app in the same container.

This might be possible with fly.io, or will soon, I think.

I’m not sure how comfortable I am using a custom flavor of Postgres (even if it’s just the storage layer).


We've already had a serverless DB for ages, and it's called... Google Sheets. You can even query it with a simple SQL-like language.

The problem with most other "serverless" databases is that they don't offer an HTTP API to query them from restricted environments like serverless functions.



I virtually never self-promote, but that exact article got me to investigate the offering:

https://ldoughty.com/2022/05/exploring-aws-aurora-serverless...

Short answer if you don't want to read my post: it constantly uses CPU; it's always on. After creation, waiting 2 days, never logging into it, never running a script against it, never giving it access to any networks, the minimum cost is $43/month, because it can't actually scale down to 0.5 units unless you cap it at 0.5, which makes it unusable, because it consumes all of that capacity just to exist.

It sounds like this Neon offering is exactly what I hoped AWS was offering... or they are using language to suggest it and mislead the customer just the same. If it's the former, I'd probably sign up and try it out. If it's the latter, I'll probably never touch it for the false hope.
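The $43/month floor follows directly from the ACU math (assuming the commonly cited us-east-1 rate of $0.12 per ACU-hour; check current pricing):

```python
# Minimum monthly cost of Aurora Serverless v2 at its 0.5 ACU floor.
# The $0.12/ACU-hour rate is an assumption based on us-east-1 list pricing.

min_acu = 0.5
price_per_acu_hour = 0.12
hours_per_month = 730  # average hours in a month

monthly_floor = min_acu * price_per_acu_hour * hours_per_month
print(f"~${monthly_floor:.2f}/month just for an idle instance")
```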

Edit: lots of typos from phone keyboard


I tried recreating your experiment: created an Aurora Serverless v2 DB with 0.5-2 ACUs in us-east-1. Since you said it was for WordPress, I disabled the multi-AZ replication, since AFAIK WP can't use separate reader/writer connections (mentioning this because you didn't say anything about it in your article)... then I let it sit overnight so it had time to create everything.

It's sitting at 23% of 0.5 ACU.

So it's either the replication setting (haven't tested with it on yet), or... AWS is a shared service... I wonder if it's similar to EC2, in that sometimes you get an instance on a machine that's more overloaded and the instance doesn't perform as well, and you have to destroy it and try again. Might want to try it again.

Edit: I don't think it's the replication setting... tried that with a new DB and it's at 25% on each replica after an hour.


I am using Serverless v2 with min/max ACU of 0.5/8 and it spends most nights at 0.5.


No scale to zero unfortunately


It does scale to zero, no?

> It automatically starts up, shuts down, and scales capacity up or down based on your application's needs.

The only tradeoff is the additional latency someone will hit when connecting to the DB after it has shut down, waiting for it to spin back up and become ready.


AWS aurora serverless says:

> You pay only for the capacity your application consumes.

> Scales down to 0.5

But it actually can't scale down to 0.5 or the DB falls over just existing.. auto scaling won't let you go down that low unless you set 0.5 as the max, which literally makes it not scale up, and it's dead, because the DB can't run with that little CPU.

So it's fair to ask if neon can scale to 0, both in marketing, and in practice.


We do scale the compute part down to zero after 5 minutes of inactivity now (no active transactions). This 5-minute threshold is an arbitrary pick; it could become 1 minute or 30 minutes later, or even customizable by the end user. The storage part is heavily multi-tenant, so it's always running, and our main objective there is to make resource utilization as effective as possible.

It still has significant latency on the first connection attempt after suspend (1-2 seconds), but we are working on that, and it seems realistic to get the startup time under 1 sec.

The pricing model is still work-in-progress, so I cannot say much about it. Yet my personal intention is to make it cost-effective for both the end user and us. I'd prefer not to build a service with claims like "here is your free-tier serverless Postgres with zero latency on connect" which actually means that under the hood there is an always-running compute burning the investors' money. Hope it's realistic to achieve :)

-- Cloud engineer @ Neon


That's interesting to hear. That probably works great for my use cases, which are typically waking up to refresh a CDN for guests, but being ready to work for a bit if a content creator logs in (e.g. a WordPress instance without comments or non-author logins).

Looking forward to seeing how this works out. I have no issue paying for services; I just hate that the minimum entry-level cost is $20... I can't imagine why, at scale, it can't be more affordable for hobby/fun-level projects.


How do you plan to start a PostgreSQL instance in less than 1 sec? Sounds interesting.

I tried fast-booting PostgreSQL instances and it always took multiple seconds. So I am really curious!


That's where the separation of storage and compute kicks in, I guess. Startup process of our Postgres instance (compute node) is a bit different from vanilla Postgres. We need to go to the network storage service (pageserver and safekeepers) to get the last known commit LSN, but we don't need to perform any sort of recovery on the compute node side. That way, compute is mostly stateless.

Basically, to start we need to know this LSN and to bootstrap the Postgres processes. This is really that quick. After that compute is ready to accept connections and serve requests, as it's able to get any missing pages from pageserver with GetPage@LSN request.

We do have a whole bunch of problems to solve: query latency after a cold start; startup after an unexpected exit of a heavily loaded Postgres instance could be slower; etc.
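The startup sequence described above can be caricatured in a few lines (toy pseudocode, not Neon's actual code; the class and method names are stand-ins):

```python
# Toy model of a stateless compute node booting against shared storage.
# All names here are illustrative, not Neon's real interfaces.

class ComputeNode:
    def __init__(self, storage):
        self.storage = storage
        self.cache = {}  # local page cache starts empty
        # 1. Ask storage for the last known commit LSN.
        #    No WAL replay happens here; storage already handles it.
        self.lsn = storage.fetch_last_commit_lsn()
        # 2. That's it: the node can accept connections immediately.

    def read_page(self, page_id):
        # 3. Missing pages are fetched lazily, on demand
        #    (the GetPage@LSN request in Neon's terms).
        if page_id not in self.cache:
            self.cache[page_id] = self.storage.get_page_at_lsn(page_id, self.lsn)
        return self.cache[page_id]


class DemoStorage:
    """Stand-in for the pageserver/safekeeper services."""
    def fetch_last_commit_lsn(self):
        return 0x1000
    def get_page_at_lsn(self, page_id, lsn):
        return f"page {page_id} @ LSN {lsn:#x}"


node = ComputeNode(DemoStorage())   # "boots" instantly: one LSN fetch
print(node.read_page(7))            # pages arrive only when first read
```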


Some parts of the PostgreSQL start-up sequence take a long time:

- Initializing shared memory -> We, for now, have only small instances, so that doesn't hit us as hard

- Reading data directories -> We don't have to do that at all

- Replaying WAL from a previous unclean shutdown -> We don't need to do that, PageServer is responsible for that

- When initializing a whole new database: Initializing the data directory -> We have a copy that each instance gets initialized from, which makes the process "copy those ~16MB in the background", which saves us from having to do the costly initialization process.

And there's several more infrastructural optimizations, such as pre-loading the docker images onto the hosts.


It seems Aurora v1 used to scale to zero but v2 has a minimum of 0.5 ACU.


Snowflake is probably the closest comparison.


Some users called us Snowflake for OLTP, others Snowflake for Postgres. Obviously because of the separation of storage and compute.


uh... AWS Aurora? Azure CosmosDB? GCP BigQuery?

All serverless, scale-to-zero or pay for demand...


GCP BigQuery is unusably slow for small datasets and is more like redshift/athena than postgres.


I don't see the benefits over a $5 VPS. Even granting that I'll save a few cents over a VPS (which is absolutely not guaranteed), the cost saving is so minimal that I won't bother rewriting everything under the serverless paradigm just for it. That is, of course, if your cloud doesn't cost an arm and a leg. I can understand people being excited to save on expensive AWS instances, but maybe you should just consider dumping AWS.

Scale doesn't matter for mini applications, and scaling vertically (= throwing money at a bigger server) will work for 99% of companies. The 1% who need horizontal scaling will have custom everything regardless and will need to hire experts; not a good niche for releasing a product.


I do. There are real benefits to using a hosted server for large projects (think hundreds of gigabytes to terabytes of data) because getting sharding, fallbacks, and downtimeless maintenance right is difficult, risky, and expensive.

These are reasons why one might go for a hosted database at AWS/GCE/Azure. There are tons of good servers out there for small to medium projects and I don't think this service is right for those, unless they can make do with the free tier. The real benefit is in the larger cloud application space.

A system that does this type of scaling automatically while also reducing the dependency on a single cloud provider's service can be a gamechanger for some companies with huge database servers that risk getting locked in.

I think using this service for most existing applications will introduce a performance drop, a rise in expenses, and a complicated migration path, but on the other hand I think that developing against this system for new projects that are very likely to grow in scale will end up with some major data control benefits.

The open source nature also allows for competitors in markets like the EU to start serving databases that don't break privacy laws (although most companies don't care until they receive a fine).

I'm not sure how this company will become profitable while giving away its special sauce that can be modified to run at competing companies relatively easily, but that's a whole different story.


> Neon allows to instantly branch your Postgres database to support a modern development workflow. You can create a branch for your test environments for every code deployment in your CI/CD pipeline.

> Branches are virtually free and implemented using the "copy on write" technique.

Unless I've missed that everyone supports this, this could be a killer feature and should be advertised more prominently.


Agreed, this direction is underestimated and deserves more development. We (Postgres.ai) do it for any Postgres with our Database Lab Engine [1], and Neon would bring even more power if it's installed in production.

[1] https://github.com/postgres-ai/database-lab-engine


You can get that feature on any Postgres server by installing Citus


Does Citus provide any such storage-level multi-cluster features? I can't seem to find any documentation on that...


I don't understand what you are looking for. Care to explain?


AWS Aurora Postgres supports this to an extent with "clones". You can even clone cross-account. The same copy-on-write stuff applies, so they're relatively cheap and fast. I hope that Google's new AlloyDB will also support it.

https://aws.amazon.com/about-aws/whats-new/2019/07/amazon_au...

There are some annoying restrictions, though. You can only have a single cross-account clone of a particular db per account.


The problem with Aurora's thin clones is the extra cost each clone adds.

For CI/CD, you want multiple clones running on the same compute, in a shared environment, to keep the budget constant.


It sounds like something you might be able to accomplish with a copy-on-write VFS on top of a Firebird database file. (Not sure about PostgreSQL, but with Firebird, you only deal with one file, so with Firebird, this should definitely work.)


There is an enterprise company called Delphix that does this on top of ZFS, so the idea was in the air.

Instead of duct-taping this together with a filesystem, we purpose-built database storage. The advantage is that we can control execution paths much more tightly and profile them end-to-end. Additionally, this allows us to integrate with S3 and makes it much, much cheaper to run.


Technically Firebird just requires a block device, so you might not even need a filesystem.


What are the intended usecases for "branching" a database? Currently, I use separate databases for different environments, are branches better?


Right now the most common setup is to copy the production database to staging once in a while and test migrations against staging. With branching, you can test each PR against its own branch of the production database: just put branch creation in your CI config. Hence it has fewer moving parts, is a bit easier to set up, and reduces the lag between prod and staging.


Have a staging / qa env, then fork it for a branch for testing. Much faster than reseeding / restoring.


It's a great feature on heroku for branches, it shares data between review apps. Quite nice.


Really interesting. I've seen so much disaggregated database work, and so, so much of it exposes Postgres interfaces. But all the good stuff has been closed source!

I'm very very excited to hear about a team taking this effort to postgres itself, in an open source fashion! From the Architecture[1] section of the README:

> A Neon installation consists of compute nodes and Neon storage engine.

> Compute nodes are stateless PostgreSQL nodes, backed by Neon storage engine.

> Neon storage engine consists of two major components: A) Pageserver. Scalable storage backend for compute nodes. B) WAL service. The service that receives WAL from compute node and ensures that it is stored durably.

Sounds like a very reasonable disaggregation strategy. Really hope to hear about this wonderful effort for many more years. Ticks the boxes: open-source with a great service offering: nice. Rust: nice.


We are committed to building a durable company and we are well funded. So yes, you will hear from us for years to come as we will be shipping more and more features.


I could not find funding information on the Neon site. Is that information not public?

edit: I found the info here: https://boards.greenhouse.io/neondatabase/jobs/4506003004


We will announce in a few weeks. Top tier Silicon Valley investors.



Oops thanks!


Just yesterday I was comparing managed serverless Postgres offerings and was sad to temporarily end my investigation with a compromise: using managed AWS RDS for development, hoping that a fully serverless Postgres with a nice free tier would pop up before going to production. And here we are!

Congrats to the team for what feels like an amazing product. Signed up for the early access, can't wait to get my hands on this!

For anyone interested, these are the DB offerings I looked into:

* DO managed Postgres: no free tier, but price scaling was not too aggressive. The issue is that it's not natively serverless and we're gonna have hundreds of ephemeral connections.

* Cockroach was the best option for our use case, but it doesn't support triggers and stored procedures, so we can't use it right now (closely following https://github.com/cockroachdb/cockroach/issues/28296)

* Fly.io price scaling is too aggressive ($6 -> $33 -> $154 -> $1000s a month) and there's no free tier that I could find.

* Aurora Serverless v2 is only for AWS-internal access, and we are using GCP.

* Aurora v1 was what we were gonna go with, but a lot of people online have shared negative opinions about its slow scaling. I didn't investigate enough, but I'm thinking we'd need to set up RDS Proxy for it to handle all our connections, which would've bumped up the price by a good amount. Also no free tier.

* AlloyDB looked promising, but also no free tier, and the starting price is a bit much for our current phase of development; it's definitely something we'd look into in the future.

And now Neon, natively serverless with a (hopefully) good free tier to test things out and some hints about cross region data replication, amazing stuff!


If CockroachDB was fitting your use case the best, you should have a look at YugabyteDB. It does triggers, stored procedures, extensions, almost everything. Some alter table features aren’t working yet but it’s getting there.

Not associated with the company but a very happy user.

Bonus point: YugabyteDB is full Apache 2-licensed so you can roll your own.


Just took a look and it seems pretty nice!

But I found their pricing page (which was very hard to find; everything else leads to the generic "contact sales" page) and it seems the starting price is 360 USD/month; that's not something we're comfortable with right now.

USD 0.25/vCPU/hour, minimum 2 vCPU = 0.25*2*24*30 = 360

https://www.yugabyte.com/yugabytedb-managed-standard-price-l...


> Fly.io price scaling is too aggressive ($6 -> $33 -> $154 -> $1000s a month) and there's no free tier that I could find.

Fly has a general purpose free tier of 3 of their smallest instances. You can use that to run their 2-node Postgres cluster plus an app server.

The pricing you pulled is examples of various compute + storage configurations, not the exhaustive list of options. It should look like $4 (or free tier) -> $11 -> $21 -> $62 -> $82 ... + storage, since it's just 2x their VM price (for the two nodes) + any storage above free tier.


Last I used them (last year) their postgres offering, even scaled up to larger nodes, was significantly slower than the cheapest DO offering. I filed a few issues but haven’t checked back since.


Ah nice!

So the prices I mentioned were just example configurations. That's pretty cool then, especially with that free tier.

Will put fly.io back on the list and do some benchmarking in the future.

Thanks a lot!


Curious why a free tier is so important?

I think a free tier encourages bad behaviors on both sides. I don't think pricing should be linear at all. But even for development one is using resources, though most of the time they can be minuscule for individual devs.

Aside from production reliability, Postgres is one of the easiest things to get running on a VM and runs fine on a $5 a month instance.


Free tier, like all things, is pretty bad if misused.

The reason we want a free tier is to try things out before we can actually commit to something. We don't know if what we are doing is actually gonna make money and sometimes we go a few months without working on it. So it's kind of a pain to pay for something we don't use.

That's why serverless is also nice to have at our current stage: things can just scale to 0 and there's no wasted resource.

> running on a VM and runs fine on a 5$ a month instance

Easier said than done, unfortunately.


"Aurora Serverless v2 is only for AWS-internal access and we are using GCP." You can have public access to Serverless v2; I'm using it with Retool, for example. That said, I moved a Postgres DB to Aurora, and the process was hilarious in how crazy it was. Also, they haven't implemented scaling to 0 yet!!!! And the minimum 0.5 compute units is actually pretty expensive.


Nice point about the minimum 0.5 ACU, forgot about that one. From what I've read, 0.5 on v2 is the same price as 1 on v1, which seems pretty dumb coming from AWS.

Could you elaborate more on this:

> You can have public access to serverless v2

Because the docs mention the following:

> You can’t give an Aurora Serverless DB cluster a public IP address; you can only access it from within a VPC based on the Amazon VPC service.

Potentially I could set up an RDS proxy or a VPN inside the VPC and give that public access, but that seems a bit of a roundabout way of handling this. https://aws.amazon.com/blogs/database/best-practices-for-wor...


I can 100% confirm public access works :)


Could you elaborate on how you got it to work without a public IP?


It's a completely crazy process, but it works roughly like this (going from RDS Postgres on 12.6 to Aurora Serverless v2):

1. You create a snapshot of your original db

2. You update that snapshot to 13.4 (NOT 13.6!!)

3. You have to use the AWS CLI (because the online migration doesn't work) to create a cluster from the snapshot

4. Remember this cluster is 13.4, serverless only works with 13.6, so we have to upgrade it later again

5. However we can only upgrade it when there is a running system

6. So we create a non-serverless Aurora instance. Most instance types returned an error, but t3.medium worked; you can create that with public access.

7. when they are both created you can already try to ping your DB (it should work)

8. upgrade to 13.6

9. Now you are able to change the DB instance type to serverless

Edit: BTW I am available for hire as an external contractor for this kind of stuff ;)


AlloyDB is free during its preview phase (not sure how long that is).

https://cloud.google.com/alloydb/pricing#fair-usage-limits


I read GCP's policy and they say preview periods can last around 6 months, and I'm not sure when the AlloyDB preview started.

But even if there's a free period, it'd be complicated to develop stuff around the DB for free just for it to turn into hundreds of dollars after 6 months; that's not something we want to see happen. So an indefinite free tier with limited resources would be better, like AWS Lambda's 1M or Firebase Functions' 2M request free tiers.


Did you look at Crunchy Bridge? Not sure if they support that use case.


Took a look just now and they start at $35/month. They have some nice points around support, backup, and disaster recovery. But if that's the starting point, I'd prefer something like DigitalOcean, which has a similar product starting at $15.

Thanks for the tip though!


Postgres is mind-boggling, coming from SQLite. In a good way; both are amazing tools.

   with ordinality

   jsonb_*

   '3 minutes'::interval

   create index on (my_json ->> 'a key')
It's amazing how much stuff there is available. All the toys!


Just a quick point in defense of SQLite: that last one is almost verbatim possible in SQLite, and it is possible to calculate ordinals, although it's done with standard SQL rather than custom syntax. The SQLite docs mention that they never found a use case for jsonb that ended up being faster or more efficient than json, so they left it out, although they do reserve the BLOB data type for jsonb in case such a use case is discovered.
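For instance, the expression-index-on-JSON case works in SQLite through the built-in JSON1 functions, which are available in the SQLite bundled with recent Python releases (and SQLite 3.38+ even accepts the `->>` operator directly):

```python
import sqlite3

# Expression index over a JSON field in SQLite, roughly equivalent to the
# Postgres example above. Assumes a SQLite build with the JSON1 functions
# compiled in (the default in recent distributions).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE docs (data TEXT)")

# Index on the extracted JSON value rather than the whole document.
con.execute("CREATE INDEX idx_docs_key ON docs (json_extract(data, '$.key'))")

con.execute("""INSERT INTO docs VALUES ('{"key": "a"}'), ('{"key": "b"}')""")

# This predicate matches the indexed expression, so the planner can use it.
row = con.execute(
    "SELECT data FROM docs WHERE json_extract(data, '$.key') = 'a'"
).fetchone()
```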


Well this is a doozy: so you’re saying they are both equally awesome as opposed to being individually awesome in different ways.

What a time to be a developer.


From the teams page, the CEO of Neon is the cofounder of MemSQL/Singlestore which is one of the best database products I've used. Looks like a solid team to get this done. Very similar approach to Yugabyte (real postgres compute layer + custom scale out data layer) and many others in the OLAP space.


Manish Jain of dgraph.io noted that building on top of Postgres or betting on Postgres seems like a necessary condition for database startups to be successful.

Some are commoditizing Postgres' wire format but implementing their own query and storage layers (like CockroachDB / Aurora / AlloyDB), while others are modifying parts of Postgres (like Timescale / EdgeDB / YugaByte), and others still are building atop it (Supabase).

https://twitter.com/manishrjain/status/1496174276474732544


Interesting note, but that seems to be recency bias from the news more than anything concrete. Companies from MongoDB to Snowflake to FaunaDB have been successful. Manish himself is from Dgraph, which is a brand-new graph database with no relation to Postgres.


Thank you for the kind words Mani! Singlestore is indeed an amazing product and company. I'm really proud of it!


Nikita, CEO of Neon here. We intended to post this at the launch next month, but since it's here, I'm happy to answer any questions.

We have been hard at work and looking to open the service to the public soon.


Hey Nikita, could you maybe put some more legal information on the webpage?

I'm trying to find out if you're a company and where you are located. Is there no legal entity behind this? Do you have a privacy policy?


The company is Neon, Inc., which is registered in the USA. We're a remote company, with a significant portion of the developers being located in Europe.

Privacy policy and related stuff will be ready when we publish the public beta, which we expect to happen soon.


Yep, Delaware corp with top tier US investors. I'm in the Bay Area. Heikki is in Finland. Stas is in Cyprus. Majority of engineering is in Europe, some in the US and Canada.

Postgres is a global phenomenon.


How “cheap” is it to create new db instances?

I can imagine a world where it might be practical to have one master db for all of your customers/accounts. But a separate db instance for each customer’s data.

Is that the kind of architecture you think might be workable with your system?


It's cheap. The storage footprint is 15 MB and will be shrunk further. The minimum compute footprint is a 1-core container that shuts down when not used.

We are already working with customers that do that. This is for sure a great use case for Neon.


Won't the tiny compute units on AWS have relatively slow storage? (No NVMe allowed for them, I think.) Fine for small datasets that fit in RAM, but benchmarks are needed to show the bigger picture.


This is really exciting, and thank you for making it open source. I am still trying to wrap my head around Neon, but is there any design document or architecture description? I want to learn more about the Neon storage engine and how it all fits together.

Also, how do I get an invite code to try?

edit: found this to get started - https://neon.tech/docs/storage-engine/architecture-overview/


We will send you an invite code soon. This is a good start, and there are also RFCs on GitHub. We will be publishing more and more.


Would love one as well.


Hey Nikita! I was just looking at the docs but I was a bit confused about what the various compute instances were doing. Do they all serve reads and writes? If so, is there data partitioning or does this support distributed transactions?


Various compute instances are different endpoints to separate databases. So for now it's a single-writer system. You can get a lot of power out of a 128-core compute node. In the future we will also spin up extra compute to scale reads.

In the future after that future, we will introduce data partitioning. We have a cool design for it, but one step at a time.


Ah got it thanks! And what's the consistency on the instances that serve reads?

Super interested in this space since we're always looking for ways to evolve our pg!


Ignore this. I misread your previous reply ( ̄ー ̄;


Do you plan to solve for global data-at-the-edge availability? That to me is the killer feature for databases and one I’m direly in need of at work.


Yes, we are discussing either simply using Postgres replication to move data to other regions and using our proxy to route reads to the datacenter closest to the user (like fly.io). This will have issues supporting more than ~5 regions.

OR we can separate storage from replication and purpose build a multi-tenant replication service. This will support as many regions as you want (over 200) but it's more work. We will publish an RFC for that.
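The proxy-routing idea in the first option is simple enough to sketch (a toy, with hypothetical names): given measured latencies from the client to each region that holds a replica, reads go to the closest one.

```python
# Toy sketch of latency-based read routing: pick the replica region with
# the lowest measured latency to the client. Region names and latency
# numbers are made up for illustration.
def route_read(latency_ms, replica_regions):
    candidates = {r: latency_ms[r] for r in replica_regions if r in latency_ms}
    if not candidates:
        raise ValueError("no reachable replica region")
    return min(candidates, key=candidates.get)
```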


Cool stuff! Is PostGIS support difficult?


It's supported. The beauty of the architecture is that it doesn't break plugins.


Could you include it in the tech preview? https://neon.tech/docs/cloud/compatibility/


Sure, we will


Seems like this might implement database branching in the way most people would assume: branching both the data and schema? I remember being a bit disappointed to learn that PlanetScale's database "branching" was only for the schema [1], which is still quite useful, but this would be so much cooler!

I couldn't find much info about the replication models available/planned however. I would consider this to be table stakes at this point for a serverless database with the recent trend of pushing compute to the edge. This is much more interesting to me than scaling to 0, which is only really useful during the prototyping phase.

PlanetScale is single-primary with eventually consistent read replicas; Fauna has strongly consistent global writes (or regional if you choose, but no replication between regions if you do) with a write-latency cost; Dynamo/Cosmos are active-active with eventually consistent replication and fast writes globally. All useful in different scenarios, but I'd love to have one DB tech that can operate in all of these modes for different use cases within the same app, using the same programming model to interact with data across the board.

I think the decoupled storage engine here would open up some really interesting strategies around replication. What are the team's plans here?

[1] https://docs.planetscale.com/concepts/branching


Great questions!

1. Yes schema and data via "copy on write". This will let you instantly create test environments, backups, and run CI/CD. There is a long video here that shows a prototype with GitLab: https://www.youtube.com/watch?v=JVCN9X-vO1g&t=1s.

2. We don't have this feature at launch, but Matthias van de Meent is already working on it. We will publish an RFC and solicit comments from the community.

3. We are working on two: regional read replicas, and consistent multi-region writes (together with Dan Abadi, who helped design FaunaDB). The former is much, MUCH easier.

4. An obvious one is a time machine: we want to allow you to query at an LSN (or timestamp). A less obvious one is templates: you can start your project with a pre-populated database. We will allow you to create and publish such "templates". Disclaimer: it might not be called templates when we ship it.
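The copy-on-write branching in point 1 can be sketched in a few lines (a toy model, not Neon's storage code): a branch starts as an empty overlay over its parent, so creating it copies no data, reads fall through to the parent, and writes stay on the branch.

```python
# Minimal copy-on-write branching sketch: branch creation is O(1)
# regardless of database size, because only pages written after the
# branch point are stored on the branch itself.
class Branch:
    def __init__(self, parent=None):
        self.parent = parent
        self.pages = {}                 # only pages written on this branch

    def write(self, page_id, data):
        self.pages[page_id] = data

    def read(self, page_id):
        branch = self
        while branch is not None:       # fall through the ancestor chain
            if page_id in branch.pages:
                return branch.pages[page_id]
            branch = branch.parent
        raise KeyError(page_id)

main = Branch()
main.write("users", "v1")
ci = Branch(parent=main)    # "instant" branch for a CI run: nothing copied
ci.write("users", "v2")     # e.g. test a migration on the branch
```

The key property: the migration on `ci` never touches `main`, which is why per-PR test branches are cheap.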


For those unfamiliar, LSN is "Log Sequence Number", a pointer to a location in the WAL (Write-Ahead Log).

https://www.postgresql.org/docs/current/datatype-pg-lsn.html
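The textual form is two 32-bit hex halves of a 64-bit WAL position, e.g. '16/B374D848'. A small helper to convert between the text form and the underlying byte offset:

```python
# Convert between Postgres pg_lsn text form ('hi/lo' in hex) and the
# 64-bit WAL byte position it represents.
def lsn_to_int(lsn: str) -> int:
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def int_to_lsn(pos: int) -> str:
    return f"{pos >> 32:X}/{pos & 0xFFFFFFFF:X}"
```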


Amazing work by the team. Congrats, y'all. It was one of the best presentations at PGCon 2022.

I emailed Heikki the following questions, in case someone from Neon is around here.

a) How does Neon compare to polardb https://github.com/ApsaraDB/PolarDB-for-PostgreSQL.

b) The readme mentions a component "Repository - Neon storage implementation". Does it use any special FileSystem? Any links to read more about it?

c) Heard the cold start is a second (IIRC), how does that value differ if one runs Neon on bare metal instead of k8s?


Thank you!

a. PolarDB is based on a similar idea. https://www.cs.utah.edu/~lifeifei/papers/polardbserverless-s.... This paper describes it. The biggest difference I see glancing through the paper is that we really integrated S3 into the storage. In the Neon architecture, branches, backups, and checkpoints are all the same thing and are instant to run. This simplifies a good amount of database management AND delivers better costs. S3 is cheap.

b. Neon doesn't need a special filesystem. Neon storage is, in a way, a filesystem, but it doesn't expose a filesystem API. It's a key-value store that serves 8K pages to Postgres, plus a consensus-based update API into that key-value store. Pages are organized in LSM trees, and background processes upload layers of the LSM trees to S3.

c. The cold start is 2 sec right now. There is a dependency on K8s. A bare-metal implementation will require new code to orchestrate starts and stops.


> S3 is cheap.

S3 has its limitations, though: with too many small files, GET/DELETE/LIST ops get very expensive. There's also an upper limit on throughput per S3 bucket partition. I guess the SSTables that the pageserver flushes periodically help work around these issues?

> Neon storage is in a way a filesystem, however it doesn't expose filesystem API.

Genuinely curious: when would anyone consider using a filesystem like Amazon FSx for Lustre (which is backed by S3 anyway) over implementing a filesystem-esque abstraction of their own (like neon.tech does, and other solutions like rockset.com, tiledb.com, xata.io, and quickwit.io do)?

> Pages are organized in LSM trees and background processes put layers of the LSM trees to S3.

Curious how merges are handled? Also, are you using RocksDB / some other engine underneath?

> Bare metal implementation will require new code to orchestrate starts and stops.

Speaking of new code... SingleStore started as a very high-throughput OLTP database and eventually evolved into an HTAP (?) database. Do you see Neon evolving in a similar manner, too?

Thanks!


1. Yes. Our first attempt at a storage implementation had a problem with many small files. Then the team rearchitected it around LSM trees and it got a LOT better. Our benchmarks show that we are very close in performance to vanilla Postgres and Aurora. There are some "worst case" scenarios where Neon is worse than vanilla Postgres. Aurora has similar problems too.

2. It's best to custom build a storage system here. External distributed filesystems introduce complexity, cost, and bottlenecks that you don't control.

3. Purpose built. LSM trees also have a temporal dimension - LSN. You can fetch a page by pageId and LSN. This is what allows time machine and branching.

4. I call it convergence, when OLTP and OLAP are one system: the ultimate dream for a database systems engineer. Since I spent 10 years building that, I have both scars and aspirations. I think it will come, but it will take a long time. HTAP is in a way a subset of convergence; most systems will have some HTAP. Neon will have some too, but for now it's squarely focused on OLTP and helping developers build apps.
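The temporal dimension in point 3 can be illustrated with a toy versioned store (hypothetical names, not Neon's code): each page keeps versions keyed by LSN, and a read at LSN L returns the newest version written at or before L. That one lookup primitive is what makes time travel and branching possible.

```python
import bisect

# Toy versioned page store: GetPage@LSN returns the newest page version
# whose LSN is <= the requested LSN.
class VersionedStore:
    def __init__(self):
        self.lsns = {}    # page_id -> sorted list of LSNs with versions
        self.data = {}    # (page_id, lsn) -> page image

    def put(self, page_id, lsn, page):
        bisect.insort(self.lsns.setdefault(page_id, []), lsn)
        self.data[(page_id, lsn)] = page

    def get_page_at_lsn(self, page_id, lsn):
        lsns = self.lsns.get(page_id, [])
        # rightmost version with version_lsn <= lsn
        i = bisect.bisect_right(lsns, lsn)
        if i == 0:
            raise KeyError("no version of page at or before this LSN")
        return self.data[(page_id, lsns[i - 1])]
```

Reading "as of" an old LSN is the time machine; a branch is just a new line of writes starting from some LSN.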


The way you describe it, to me, is one of those “this sounds obvious in retrospect”. Sounds completely elegant and “right”. Congratulations on a great idea. I really hope you pull it off!


Thank you! We are super hard at work. You can see our velocity here: https://github.com/neondatabase/neon


> we really integrated S3 into the storage

Will it be possible to use something else in place of S3? I'm thinking on-premise or what some would call a private cloud.


Right now, it should be possible to use anything that is compatible with the S3 API, as our current focus is on getting the product to the market. Once the business model is proven, we'll likely branch out to other clouds, with their storage providers.

If you can't wait that long to run Neon on your own cloud, feel free to contribute an integration to your persistent blob storage: the code is available under APLv2 here: https://github.com/neondatabase/neon/


> serves 8k pages to Postgres

will page size be tunable on neon cloud for larger datasets?


No, Postgres only uses 8K pages. One can imagine adapting Neon storage to other engines; then, of course, this could be extended.


> It was one of the best presentations in the PGcon22.

I can't find it on Youtube, do you have the link?

edit: I found the link, seems it is not on the Youtube yet: https://www.pgcon.org/events/pgcon_2022/schedule/session/236...


I can't recommend this presentation enough!


> c) Heard the cold start is a second (IIRC), how does that value differ if one runs Neon on bare metal instead of k8s?

Yeah, as Nikita mentioned, it's 2 seconds now. We did some tests and measurements, and on bare metal it's usually sub-500 ms, so the remaining part is the k8s (+ our own control plane) orchestration overhead. For example, with plain Docker (which we use in CI in addition to k8s) it's already around 1 second.

K8s provides a convenient abstraction layer, though. So I think we'll continue using it; optimization will come with a pods pool / over-provisioning, and it'll be realistic to bring the startup time closer to bare metal.

-- Cloud engineer @ Neon
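The pods-pool / over-provisioning idea can be sketched like this (a toy model, names hypothetical): keep a few compute pods already started so "starting" a database is just handing one out, with a replacement spawned behind the scenes.

```python
from collections import deque

# Warm pool sketch: spawn() is the slow path (starting a pod); acquire()
# is the fast path users see. Refill is synchronous here for simplicity;
# in a real system it would happen in the background.
class WarmPool:
    def __init__(self, size, spawn):
        self.spawn = spawn
        self.pool = deque(spawn() for _ in range(size))

    def acquire(self):
        pod = self.pool.popleft() if self.pool else self.spawn()
        self.pool.append(self.spawn())      # keep the pool topped up
        return pod
```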


Why is this a good idea? In my experience, getting Postgres up and running is trivial. Docker, anyone? And in many cases your data is your business, so why hand it off? And if you are going to offer this product, why not just call it what it is, "Postgres as a service", instead of serverless, which seems a bit misleading? Really, it is simply Postgres running on a server.


Not everyone can manage a database properly and, sorry to say this, but Docker is a terrible idea for a database in general. Setting up your own database somewhere still puts your data on someone else's server, more or less.

All these "serverless" keywords pretty much mean you don't have to spin up servers (cloud) or set up and maintain one. Nothing is "serverless" per se, so it's time to move on from picking on this, I agree, bad choice of words.


> Docker is a terrible idea for a database in general

Why? Genuine question. My gut feeling is that there's something wrong with it too, but I can't put words to it, nor have I found a benchmark that convinced me. It's worth noting I'm not sure what I'm looking for.


I manage about 200 servers, and Docker crashing accounts for 20%+ of my issues so far. The servers are brought back up easily on crash, so that's not an issue for my services. For a database, Docker is nothing more than an extra layer of complications on top, with iptables, the volume system, and all the layers it brings. It's just a bad wrapper for a production database, which needs stability.


To be fair, that sounds like a problem with your infra. There are plenty of enterprises running Postgres in Docker/Kubernetes successfully. A lot of technical problems are not about the stack or the engineer's capability, but about company culture and a lack of funding, time, or resources. If you had more of all 3, I'm sure your problem could be "solved" to an acceptable margin of error. Likely by you.


I knew my bet to sticking with Postgres would pay off! This looks super exciting.

I thought of doing something similar for our data warehouse with AWS Fargate and Postgres but the cold starts and limited disk space required too much engineering on top to make it work.

Moving to Snowflake comes at the cost of losing so many Postgres features in exchange for speed: things like foreign keys, constraints, extensions, etc., which require so much engineering to replace in Snowflake. I would be happy to pay 25x the price for a 10x speed increase on a specific query.


Thank you!

Snowflake is a better cloud data warehouse than Postgres, but of course Postgres is very versatile. Neon will give you some of the Snowflake features: time machine, cloning (we call this branching), data sharing.


Snowflake is a data warehouse though. Completely different use case.

If your data can be handled by PG, I highly recommend that over SF. Especially with this concept.

Snowflake is great when you use a tool like dbt; their modern SQL approach and functions are fantastic. The downside is that it's pretty pricey and can catch you out.


We already use dbt and the data is less than 10 TB, something Postgres can handle well. And most of the data is concentrated in a few tables. With a serverless approach, I'd be happy to allocate 10x resources for just a query or two, and for the rest a minimal server is fine.

I manage the data warehouse mostly alone because Postgres offers guarantees: unique constraints, triggers, and relationships between columns of different tables. It does the work of two engineers. Snowflake is fast but not Postgres-compatible. To move to Snowflake, I'd have to write tests and maintain them, which Postgres does for me for free.

I'd stick with Postgres at least until 20TB before considering Snowflake.


Snowflake is an OLAP system. It's an entirely different kind of "speed" designed for analyzing vast amounts of data through scans and aggregations.


Neon now opens the door (at least in my mind) for Postgres to be used for analytics or as a data warehouse for almost an order of magnitude more data before having to consider Snowflake.

Basically if someone is already using Postgres as a warehouse, then they can prolong their migration to Snowflake by at least a year by using something like Neon.


Sure but there are plenty of OLAP solutions like Greenplum, and extension-based offerings like Citus and Timescale, that can all partition and scale across nodes to massive datasets with column-oriented storage.

AWS Redshift is also built on Postgres (although a much older and customized version).


I hope y’all have a plan for when AWS decides to pick up your open source project and turn it into a managed cloud solution. It’s a pattern of theirs. And with the way egress charges are structured they’re likely to snap up any clients straddling their cloud and yours.


AWS already has Aurora, which is their own in-house closed-source variant that does very similar things.

We think we'll be able to provide a better experience at lower cost for smaller developers, while having some very useful quality-of-life features like zero-cost branching and instant PITR.


AWS already has this; search for Aurora Serverless. It is not cheap though, and this might well be cheaper.


AWS already has Aurora Serverless v2 for PostgreSQL, along with RDS for PostgreSQL. It doesn't scale to zero, though; the floor is around $40/month.


I am trying to understand how it works without digging into the code. It sounds like the disk-backed storage here uses S3 which would introduce some severe latency as well as orders of magnitude more access errors (S3 is not going to be more reliable than EBS, let alone physical disk arrays on a day to day basis). Also how do they mitigate latency from their network to mine? In other words why would I run this over a local install if performance mattered at all to me?


How it works is:

PostgreSQL WAL is sent to 3 'Safekeeper' nodes, which provide temporary persistence of WAL on their local disks. This allows us to provide low commit latencies.

After Safekeepers acknowledge the WAL, a PageServer will receive the WAL from these Safekeepers and transform it into LSM-tree "Layers" - blocks of lookup-optimized changelogs, which (when complete) are sent to S3. At that point, the data is considered fully persisted against most, if not all, outages.

The PageServer (which serves as the long-term data server for the running compute nodes) maintains a local cache of Layers. Still, by design, that is only a cache -- it allows for fast responses but is not strictly necessary for the persistence model.
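The write path described above can be caricatured like this (a toy sketch with invented names, not Neon's actual code): a commit is acknowledged once a quorum of Safekeepers has the WAL on local disk, while the upload of layer files to S3 happens asynchronously afterwards.

```python
# Toy model of the write path: WAL -> safekeepers (quorum ack) -> S3.
# All names are invented for illustration; this is not Neon's code.

QUORUM = 2  # 2 out of 3 safekeepers must persist the record

class Safekeeper:
    def __init__(self):
        self.wal = []            # stands in for WAL on a local disk

    def append(self, record):
        self.wal.append(record)  # ack after a (simulated) local fsync
        return True

def commit(record, safekeepers):
    # Low commit latency: acknowledged once a quorum has it on disk.
    acks = sum(1 for sk in safekeepers if sk.append(record))
    return acks >= QUORUM

def upload_layer(wal_buffer, s3):
    # Later, accumulated WAL is turned into a lookup-optimized layer
    # and shipped to S3; only then is it fully persisted long-term.
    s3.append(list(wal_buffer))
    wal_buffer.clear()

safekeepers = [Safekeeper() for _ in range(3)]
assert commit({"lsn": 1, "op": "insert"}, safekeepers)

# The pageserver consumes acknowledged WAL and ships layers to S3.
pageserver_buffer = list(safekeepers[0].wal)
s3 = []
upload_layer(pageserver_buffer, s3)
```

The key design point this illustrates: the expensive, slow durability tier (S3) is kept off the commit path entirely; only the quorum of fast local disks gates the client's latency.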


Again, I am super impressed with the technology involved, but I do want to clarify: in order to have the D in ACID, the update must be sent to S3, right? Is there a mode which makes it so that an INSERT, UPDATE, or DELETE does not return until this happens? What kind of latency does that introduce, and is that latency affected by throughput at all?


Kind of. S3 is the long-term low-cost durability guarantee, while our Safekeepers (3, each in a different zone) provide a high-cost short-term durability guarantee with their local persistent disks.

Latency from PostgreSQL WAL to S3 depends on WAL throughput and the configured pageserver checkpoint distance (default 256MB, and this config field is not equal to that of PostgreSQL).
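A rough back-of-envelope (with hypothetical throughput numbers) of what that dependence looks like: the worst-case time before a given WAL record reaches S3 is bounded by the checkpoint distance divided by the WAL throughput.

```python
# Back-of-envelope: with a 256 MB checkpoint distance, the worst-case
# lag before a WAL record lands in an S3 layer scales inversely with
# WAL throughput. The throughput numbers below are hypothetical.

CHECKPOINT_DISTANCE_MB = 256

def worst_case_lag_seconds(wal_mb_per_sec):
    return CHECKPOINT_DISTANCE_MB / wal_mb_per_sec

heavy = worst_case_lag_seconds(32)   # busy system: a layer fills in ~8 s
light = worst_case_lag_seconds(0.5)  # quiet system: up to ~512 s of lag
```

This is why the Safekeepers matter: during that window the data's durability rests on the three-zone quorum, not on S3.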


When you say short term, do you mean for hot data or that the guarantee is short term? As in, once it is written to the Safekeepers, is there any chance that the data will disappear?


We keep it there for a short duration, until the changes are confirmed to also be written to S3.

Writing to 3 instances in 3 availability zones is considered persistent enough while still maintaining high performance. Even though it does not provide the 11 9s of durability that S3 has, losing 3 availability zones along with all instance-local storage is considered rare enough that we do not think it will impact our availability and durability guarantees.


That makes sense, thank you! Sounds pretty damn robust.


Many distributed systems offer ACID by using distribution + replication for the initial write commit.

It's much faster and cheaper to just have your data on multiple nodes (RAM or local disk) and provides better reliability against crashes. Data can then be compacted and streamed out in an async fashion to more durable storage.


Can we set our own logical replication? I'm listening to the WAL the same way Supabase does it: https://github.com/cpursley/walex#logical-replication


In our cloud offering logical replication is not yet supported. There's an open epic to support receiving a logical replication stream [0], but we've not yet checked on sending logical or physical replication streams.

[0] https://github.com/neondatabase/neon/issues/1632


Everything you've described sounds unbelievably awesome - this is the only thing I can think of that would make it better. If we can get all of these features (I've been dreaming of branching for years, scale to zero!, unlimited storage - just not having to think ahead about it!) plus easy, even on by default logical replication, you will have found the holy grail of cloud databases imo.


Well, you can also run this as a local install if you want to eliminate network latency. :-)


Short answer: page servers and cache on compute nodes hide S3 latency. An architecture description is here: https://neon.tech/docs/storage-engine/architecture-overview/. We will be publishing more on that.


How fast can a "scaled to zero" database start up? Does Neon use an "uninitialized hot spare" strategy to reduce startup latency like CRDB?

How much memory do they expect a typical single postgresql compute instance to take? I saw that Neon is targeting 'thousands' of postgresql processes per server, though with giant multi-TB servers these days that doesn't really narrow it down.

Are the postgresql processes multi-tenant as well, or is multitenancy isolated to the storage layer?

---

Heikki from the Neon team presented a talk at Rust Finland 2022 about why they chose to develop Neon in Rust and what their experience was: https://www.youtube.com/watch?v=kAQeout-mh8


Postgres processes are single tenant. Right now, both provisioning a new Postgres instance (we call it a project) and cold start take 2 seconds. We will be improving on that.


https://youtu.be/kAQeout-mh8?t=700

Nobody knew Rust, so they started out by hiring someone who did. Good move.

Business idea: consultancy that hires out competent Rust devs to new projects.


Are they going to stay up-to-date with the latest version of Postgres? One problem with Yugabyte, TimescaleDB, Aurora etc. is that they are stuck on older versions of Postgres, which makes it feel like an entirely different product after a few years.


"One of these things is not like the other"

TimescaleDB, because it is packaged as a PostgreSQL extension (and not a fork, unlike the others), stays compatible with mainline PostgreSQL, especially as PostgreSQL improves. This is one of the key advantages of our approach.

(Timescale co-founder)


We are planning on supporting the latest stable version of PostgreSQL. Right now, we're a bit behind (we're at 14.1, latest PostgreSQL is 14.3) but that shouldn't be much of an issue.

We don't yet know how we're going to do major version migrations, as the product is still not even out of private beta.


"how will this handle major version upgrades" was one of my first/biggest questions when reading through the homepage fwiw


Major versions don't come out suddenly. We have time to test and make sure they work well with our storage.

Customers don't always want to upgrade to a new version when the old version just works. However, this can lead to version creep. Ideally we want to always run the latest version of Postgres. We will hold this line as long as possible.


Sorry, I meant less "how long until they catch up to trunk" and more "what is the downtime / operational burden of performing major upgrades like".

With separate storage and compute it seems tantalizingly possible to have low-effort zero downtime upgrades for the user, which would be a huge selling point.


Having a relational database where you're charged purely for the calls you make is a game-changer.

All of the relational databases I looked at in the past required you to have a gateway node on at all times, which is far too expensive for a simple hobby project.


In case anyone else is wondering, it was called Zenith / ZenithDB before launch: https://github.com/neondatabase/zenith


Neon is a better name!


This is really amazing, super excited to try it out.

I read the docs and noticed you can run it locally, but have the Kubernetes bits been made available? I see https://github.com/neondatabase/helm-charts and https://github.com/neondatabase/neon/tree/main/.circleci/hel... but I think there are some charts missing?


Correct: we do not yet use k8s for provisioning the Safekeepers and PageServers for our closed-beta cloud offering, and the PostgreSQL instances are managed in k8s by our closed-source console. As such, there's little we can open-source at the orchestration level at this point in time.


Is there a timeline on releasing orchestration things?


> Branches are virtually free and implemented using the "copy of write" technique.

Copy on write, presumably.


Fixed, thank you!


This sounds really interesting! I wonder what kind of scaling use cases neon is good for. Is it e.g. good for custom scenarios like a geospatial timeseries database on top of postgres?

We admittedly don't really have a clue about current database cluster tech, as we are IoT/ML researchers, but we are running a custom TimescaleDB cluster that receives constant non-chunked write load from a lot of devices and may encounter long-running queries on an around 500GB DB filled with geodata (even timing out if users are too creative). That is why we split it into a single ingress master and multiple WAL read-only replicated query clients, to relax the consistency and sync that seemed to be killing us (we need Postgres because of PostGIS and have no capacity to rewrite the front-end). I wonder if Neon would be good for such a use case, and whether it easily supports Postgres extensions like TimescaleDB hypertables and PostGIS. Most of the time our system just ingests measurements, but sometimes we really need to scale up for PoCs, which makes dimensioning really hard (for us).


Neon has the ability to make a copy-on-write replica of a database, which would allow you to create an instant read-only copy of the data that has been ingested up to that point, without significant storage overhead. The new data would still be written to the primary database, and long-running queries would only see the snapshot that the instance was started with (using their own pool of CPU and memory resources).

Assuming that the extensions that you use are compatible (that is, they don't access database files in a way that PostgreSQL doesn't, and the licence is compatible) then Neon could be a good solution to that issue.
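A minimal sketch of the copy-on-write idea behind such a replica (the structure here is invented for illustration): the branch shares the parent's pages until it writes its own, so creating it costs almost nothing, and the parent's snapshot stays stable under long-running queries.

```python
# Minimal copy-on-write branching sketch (invented structure, not
# Neon's implementation). A branch reads through to its parent's
# pages until it has written its own copy of a page.

class Branch:
    def __init__(self, parent_pages):
        self.parent = parent_pages   # shared, treated as read-only
        self.delta = {}              # pages this branch has rewritten

    def read(self, page_no):
        # Prefer the branch's own copy; fall back to the parent.
        return self.delta.get(page_no, self.parent.get(page_no))

    def write(self, page_no, data):
        self.delta[page_no] = data   # copy on write: parent untouched

main = {0: "row v1", 1: "row v2"}
replica = Branch(main)
replica.write(0, "row v1'")
assert replica.read(0) == "row v1'"  # branch sees its own write
assert main[0] == "row v1"           # parent snapshot is unchanged
```

The cost of creating a branch is independent of database size, which is why such branches can be described as "virtually free."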


Nikita, I am a huge fan of SingleStore - amazing what you’ve accomplished there and now what you’re building at Neon.

Do you have plans to build a columnstore index on top of Postgres that supports insert/update/delete?

Love how MSSQL has a columnstore index for a subset of columns on a row store table.

Always wondered why nobody has built something like that for postgres yet.

Citus is nice but it’s append only, which is a huge restriction.


We are not building it in-house right now; we are focusing on OLTP. But we have plenty of ideas for how to do it in the future. In the first iteration we are thinking about an integration with SingleStore, Snowflake, and BigQuery. SingleStore will be best due to low-latency updates, but the other two are very popular, so integration with Snowflake and BigQuery just makes sense.

In the future we can integrate a columnstore right into the engine to make a smooth single-system experience. There are some awesome open source implementations: Arrow and DuckDB. Updatability is tricky but doable, as proven by SingleStore and SQL Server (I'm ex-SQL Server and a huge fan of this feature). Not this year.


Makes sense. I agree.

DuckDB is pretty phenomenal. I enjoyed reading its source code and playing around with it.

Another open source nice C++ codebase is Typesense for Hyperfast text search (algolia competitor).

It’s been on my mind for many months how to build indexing like this as postgres extensions.

I love how versatile postgres is with so many indexing datastructures.


Again, it is a big commitment to do a good job implementing columnstores. If you don't do it all the way, it's not very usable, and you're confusing your users by giving them too many options. The performance expectation is now set by great columnstore implementations, and you just can't afford a half-assed job here.

Here is an example from Google AlloyDB: https://twitter.com/mim_djo/status/1527900193626025984. My understanding is that DuckDB is even faster on the TPC-H benchmark. TPC-DS is much harder, and I doubt AlloyDB can even run it at any reasonable scale.


Well this seems like a big deal. "Needs a database" has been an ongoing pain point for working with serverless apps. Staying in the postgres command library, if it works, means you can now just prototype and assume "postgres will be there".

I have a couple of apps I would've used this exact service for in the recent past. Looking forward to trying it in the future.


This sounds awesome, but one of my first reactions to the notion of separated compute/storage Postgres with copy-on-write is "okay, so… slow Postgres?"

Is there anything the dev team can share on read/write performance compared to RDS, NVMe EC2 instances, EBS-backed EC2, etc? In what situations would this setup perform poorly, and in what situations would it excel?


I am glad someone is taking on this challenge! It's been a few years since I started saying that the last piece of the serverless puzzle is a good serverless Postgres database.

If I was not working on my startup I would apply for sure. It would be nice to present the project on the CMU database group youtube channel at some point to dive into the implementation.


I'll work with Andy Pavlo, who is a friend, to set up a presentation.

And yes, we are hiring! So if and when you are ready let us know.


Can you use extensions with it like normal postgres?


Sure. So far, we have just precompiled a few popular extensions, and they are available for installation. Ultimately we want to provide an option to bring your own extensions, e.g., by specifying a Docker image based on our base image. But that requires some work on the security front: with a custom extension, you have access to the corresponding Unix user and can construct malicious WAL, send requests to the control plane, etc.

Disclaimer: Neon co-founder


I'm guessing no, or at most just have set of extensions that you can use (kind of like it's done in AWS RDS or AWS Aurora).

The claim is "serverless", i.e., you don't have access to the server, and if you could install any extension you'd essentially have full access to the server, as there's no restriction on what you could do in an extension. I don't think that would be allowed.


https://neon.tech/docs/cloud/compatibility/ says "During technical preview Neon has restrictions on user ability to install PostgreSQL extensions. Following PostgreSQL extensions come pre-installed: [..]"


What’s the licensing story? Can I run this in my org and expose this as a service to users outside of my org?

Or does it have strings attached, like CockroachDB?

Asking because github says Apache 2 but the devil is in details.


Yep, for now the storage engine and the modifications to PostgreSQL are licensed under Apache 2.0. Our cloud offering, which orchestrates the scale-to-0 and other glue between the components, is closed-source and not covered by that license.

Do note however that the licensing story is not entirely fleshed out yet, as the product we're building is still in closed beta. As we work towards a paid cloud offering, we'll further flesh out the license model for the code, but for now we're planning on keeping this license.


You can, no strings attached


Awesome, will take it for a spin.


How do I keep my application servers physically close to Neon?


Right now, we have our hosting in AWS (us-west-2).

We're planning on expanding to other regions and cloud providers eventually, though.


Having data in Europe will be important for many users, hope you expand to EU soonish.


I'm a bit concerned that the free trial mentions "compute up to 1 vCPU / 256 MB"

Why would I need to worry about this for a serverless database provider?


We are limiting the amount of compute we are giving for free during the technical preview. We will open it up to much larger computes later; those will be in the paid tier.


That sounds good, but my question was (and I probably phrased it poorly), as a solo developer, why would I need to concern myself with resources associated with servers (compute, CPUs, MBs of RAM) on a serverless platform?

I could maybe see it in the sense if it's only applicable when Postgres is executing complex custom queries or whatever, but for any operation? That's a huge red flag, for me at least.

When I think of serverless, I think of things like Cloudflare Workers, FaunaDB, Ably.io, all of which have pricing based on the usage of their features, not the consumption of their resources. The whole point being that I can way more easily calculate how many messages my users are sending, than how much CPU all those messages are taking to send.

Maybe I'm operating on the wrong definition of serverless, and all of the other features look amazing, but that concept is really a dealbreaker.


FYI, the image at the bottom says "Zenith's prices" instead of "Neon's prices"

https://neon.tech/static/saas-illustration-lg-410ada378df755...


Fixed. Thank you for catching it. Renaming a company is hard, even pre-launch :)


I know the pain :)

Good luck, this is a very exciting project. I'm extremely curious to see how it unfolds…


This is really interesting. Are there restrictions or limitations on the PageServer + Safekeeper design when running OLAP queries on larger datasets?

Phrased another way, would a query that needs to access a relatively large amount of data (10-100 GB) ever need to read from s3, incurring extra latency?


In general, all data of live clusters will also be stored locally at a PageServer.

Only in recovery scenarios will a PageServer not hold the data that is needed to serve the requests of a compute node - but that would recover quickly as the local cache of the PageServer is repopulated with data from S3.


Awesome, thank you!


Does this storage layer still require VACUUMing? The Aurora system that AWS has uses a log based storage system, so along with point in time restore there’s no longer a need to go free up old garbage rows. This system says it’s copy on write, wondering if vacuuming is still necessary.


The difficulty comes from the fact that VACUUM requires most of the same state that is required when processing a user transaction. For example, VACUUM needs the commit log to check whether a given transaction ID is for a transaction that committed or aborted (VACUUM also truncates the commit log to remove information about older XIDs when it is no longer required).

Aurora Postgres does still have VACUUM, which seems to work in the same way -- as it does in Neon. AWS have in the past promoted Aurora as having significantly better performance characteristics when VACUUM is run. That may well be true, but the benefits seem to come from not having to generate full page writes in WAL, which are a way of preventing a low-level problem called torn pages. Theoretically you could just turn off full-page writes in standard Postgres if you had hardware that offered atomic page writes, though I don't think that it's a widespread practice.

I spend most of my time working on problems with VACUUM in Postgres itself (I'm one of the committers). An approach to organizing storage within transactional constraints seems necessary to push vacuuming down to storage, and that would more than likely need plenty of work in Postgres itself to be practical -- since it would cross a few layers of abstraction. Currently heapam doesn't specifically try to keep tuples inserted around the same time together, so it's hard to make anything that VACUUM does work implicitly or logically.
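The commit-log dependency mentioned above can be caricatured like this (a toy model, not Postgres internals): a tuple stamped with a deleting XID is only reclaimable once that transaction is known to have committed and no snapshot can still see it.

```python
# Toy illustration of why VACUUM needs the commit log (clog): a
# delete stamps a tuple's xmax with the deleter's XID, and the tuple
# is only dead if that transaction actually committed AND no active
# snapshot could still see the old version. Not Postgres internals.

clog = {}  # xid -> "committed" | "aborted"

def tuple_is_dead(tup, oldest_active_xid):
    xmax = tup.get("xmax")
    if xmax is None:
        return False                  # never deleted
    if clog.get(xmax) != "committed":
        return False                  # deleter aborted or still running
    return xmax < oldest_active_xid   # no snapshot can still see it

clog[100] = "committed"
clog[101] = "aborted"
assert tuple_is_dead({"xmin": 50, "xmax": 100}, oldest_active_xid=200)
assert not tuple_is_dead({"xmin": 50, "xmax": 101}, oldest_active_xid=200)
```

This is the sense in which VACUUM needs "most of the same state" as a user transaction: the visibility check is the same machinery either way.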

Disclaimer: I work for Neon


Thanks for the explanation. I'm nowhere near your experience level with this, but from what I understand, Aurora is an append-only log with rolling snapshots (this might be wrong). In an append-only log storage system there would be no concept of page writes, right? The VACUUM process might still be necessary to find and mark dead tuples, but this seems very minor compared to the filesystem operations needed to reclaim (sometimes gigabytes of) storage. The reclamation of storage would simply be left to the background snapshotting process and would not affect DB performance in any way.


The most important job of VACUUM is usually to make sure that queries remain responsive -- it's roughly comparable to a "full scan" within generational garbage collectors (for managed language runtimes). VACUUM does reclaim space in various ways, but that's often considered to be of secondary importance.

In Postgres, the on-disk representation is virtually the same thing as the in-memory representation used by pages stored in the buffer cache. In a system like Aurora or Neon, the representation of pages in the buffer cache is identical to that of Postgres (or has very minimal divergence to deal with one or two isolated problems). That part doesn't really change, which makes it possible to have a very deep level of compatibility without much effort. So it's radically different in one narrow, scoped way, but otherwise very similar.

While the storage knows how to materialize pages on demand, these are not transactionally consistent pages -- they often need to be interpreted by using metadata about transaction commit status, just like in Postgres.


Yes, it still requires VACUUM. I believe Aurora does too.

It would be cool to push at least parts of VACUUM down to the storage layer, but it would require more invasive changes to the Postgres code, which we try to avoid. Maybe in the future. Ideally, though, I'd like to improve PostgreSQL itself to reduce the need for VACUUM in the first place.

Disclaimer: Neon co-founder


What are your thoughts on OrioleDB? It's a slightly modified Postgres that does away with the need for VACUUM, amongst other changes, ostensibly done by a dev who is "solving Postgres' wicked problems". Eventually the team behind OrioleDB plans to upstream their minor changes to vanilla Postgres such that OrioleDB becomes simply an extension of Postgres itself.

Would you ever consider supporting the orioleDB extension in the future?


If it becomes more widespread and mature, yes. We are in contact with Alexander Korotkov, a Postgres committer and the author of OrioleDB, and compare notes from time to time. Right now it makes a lot more sense to us to build our storage at the page level and have minimal changes in Postgres itself.


This is exactly what I was looking for for Windmill, an OSS tool for multi-step automation from scripts. By any chance, is documentation for hosting it on top of Nomad on the roadmap? If yes, I would try it immediately to replace the Postgres server we currently use.


No documentation on how to self-host yet. It's not hard; it requires K8s and an S3-compatible object store. We want to harden Neon operationally before documenting and supporting on-prem production installs.


Requires k8s as in - uses k8s APIs? Or is supported on k8s?

Apologies haven’t read the docs but wanted to highlight this. A hard requirement on k8s to the exclusion of other schedulers would be a shame and an odd choice


We've built our current cloud offering on k8s, and that is our only scale-to-0 implementation.

That does not mean that you cannot run Neon outside k8s, but we are not actively maintaining nor supporting other hosting options.


Got it, makes sense. I think we are pursuing similar strategies with OSS/self-hostable products. I would love to get in touch with you about it. Could you drop me an email at ruben@windmill.dev ?


Replied


that's one of the best lookin landing pages i've seen this year!

very well executed! congrats!


Thank you. PixelPoint (https://pixelpoint.io/) designed the landing page. The copy may change; we want to make the story clearer and highlight more features.


Could you document some of your differentiation against Aurora, both on price and architecture? I don't care about scale to 0. I care more about scale to NNN, efficiently and reliably, with minimal devops needs.


We will put more of the Aurora differentiation docs up in a bit. Architecture-wise we are slightly different because of the S3 integration and splitting storage into Safekeepers and PageServers. This is very technical, and we will be highlighting both this and user-visible features, such as branching.


Incredibly good sales copy by someone who understands their target market.


I wonder how well this will work with PostgREST. Looks very interesting.


We will be launching a REST API with PostgREST later this year.


Why not use normal networked storage (nbd/drbd, AWS EBS, etc.) instead of the ad-hoc Pageserver + Safekeeper architecture?

Or even better, use simple local SSD/HDD storage where data is small enough.


Because all of those don't or can't have both low-latency reads and regional persistence guarantees, which is what we're after.


don't know too much about this yet but i've been desperately wanting there to exist a postgres equivalent to planetscale. Bravo guys! I'll keep you guys in mind for my next mvp.

PS: btw why call it Neon? that's already the name of the rust/nodejs interop library, which I assume you know about since your storage layer is written in rust.

PS: imagine a collab between this and fly.io. I wish this stuff was available when I was starting Blinq


We already have a shared Slack channel with Kurt, the CEO of Fly.io, and a number of folks from Fly.io. We are discussing a partnership.


thats awesome. what will the pricing be when its done? my startup currently spends around $400-500/month on aurora. would be curious what the differences would be here in terms of TCO.

the ability to branch databases alone is a pretty compelling killer feature imho


How exciting!

What version of Postgres are they targeting? And do they have a strategy for keeping up to date with new versions?


One idea we have is to allow running future unreleased and beta versions of Postgres. Do people want this?


Right now, we're based on PG14 (14.1 to be precise).

We're looking into supporting PG15 when that comes out too, but I wouldn't hold my breath on that as we still have a lot to do.


Says 14 in the docs


Does this handle sharding? Or is it like ordinary Postgres, which only performs replication?


It is ordinary PostgreSQL, in which only the storage interface with the file system is replaced with our own storage interface.


Anyone got a spare invite code? =D


Same here please!


same here ^_^


We got a lot of "join waitlist" submissions. Thank you, HackerNews, for this!

Please email beta@neon.tech for the invite code. We will ask for feedback in return.


I need some kind of pricing for me to consider this. Did I miss the page describing that?


We haven't launched pricing yet. In fact our official launch is in a few weeks. This HN post "leaked". We will launch, allow people to use it for free and introduce pricing later this year.


The site just invites you to a trial run. Is there any basic doc, tutorial, etc.? I come from DB2 …


The docs are available on https://neon.tech/docs/cloud/about/ . It contains some information, but we're looking to expand that while working towards a public beta that's arriving soon.


In what way is this serverless? Surely it's running on a server?


It refers to management and autoscaling, not to whether physical hardware exists somewhere. From your point of view it has DNS entries, you've defined how high/low it's allowed to scale and some other configs, you've defined network access rules, and it accepts client connections. That's it. You don't manage the server instances. Hence: "serverless"


Thank you for explaining. There must be a better term than serverless though. "Auto-scaled cloud database" would be more accurate, for example


Cool! Can it be used in production already?


We are planning to open public tech preview next month. We'll go to general availability when we see a decent SLO for several months. Tech preview should be okay for hobby projects and staging workloads, though.


We cannot yet recommend that users run their critical workloads on Neon.


"We separated storage and compute"

No. just no.

Compute where the data lives, else you incur traffic cost and latency.

A lesson learned in life, known from the dawn of time..


Do note that many installations of PostgreSQL already have some form of "separation of storage and compute" through a networked storage solution like NFS, EBS, or other SAN-like systems.

The major part of what Neon does is remove the file system abstraction that is between that Storage and Compute, so that we can better utilize the available resources because we can better select what information is or isn't being lost.

A good example of what removing the file system abstraction enables for us is effectively free PITR, (lagging) replicas, and data branching. This is because PostgreSQL's file-system-based storage engine expects to be the only one working on the data directory, which means that any FS attached to a replica cannot be shared. If you remove that file-system based storage engine and plug in a different storage engine, those expectations are removed too, and after some effort integrating into the smgr-APIs, we're now able to provide a storage layer that only needs to contain one copy of the data for N physical replicas, instead of N copies.
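The page-level idea behind all of this can be sketched as follows (names invented for illustration): the storage engine materializes a page at a requested LSN from a base image plus the WAL records that touched it, which is what lets PITR, lagging replicas, and branches all read from a single shared copy of the data.

```python
# Sketch of page materialization at an LSN (invented names, not the
# real smgr API). A page at target_lsn = base image + the WAL records
# up to that LSN; different readers just ask for different LSNs.

def materialize(base_image, wal_records, target_lsn):
    page = list(base_image)
    for rec in wal_records:          # assumed sorted by LSN
        if rec["lsn"] > target_lsn:
            break                    # a lagging replica or branch just
        page[rec["offset"]] = rec["byte"]  # requests an older target_lsn
    return bytes(page)

base = b"\x00" * 4
wal = [{"lsn": 1, "offset": 0, "byte": 0xAA},
       {"lsn": 2, "offset": 1, "byte": 0xBB}]
assert materialize(base, wal, target_lsn=1) == b"\xaa\x00\x00\x00"
assert materialize(base, wal, target_lsn=2) == b"\xaa\xbb\x00\x00"
```

Because the LSN is just a parameter of the read, serving N replicas or a point-in-time view doesn't require N copies of the data directory.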


The fact that some systems do it doesn't mean it is correct or optimal.

The latency of NFS & EBS or EFS is actually the reason many businesses *do not* use them for their databases.

I've seen deployments that had to go bare-metal because the tiny latency of EBS caused their compute times to rise exponentially (doing AI training).


I agree that it does have its limitations, but databases are all about trade-offs.

Talking about the "correctness" of a choice between tradeoffs is weird though. Running your database on hard drives nowadays is not a great choice - yet people still did that because the cost of hard drives was way cheaper than that of memory.

Running your database in a way that doesn't guarantee that a "committed" response actually retains the changes it was responding to is risky, yet people still ran their databases in such configurations to scrape out the last bit of performance.

All Neon does is put up another option: If you don't mind the implications of networked storage, then here's one database system that has zero-cost cloning and does not lose data on single-node failure.


This isn't binary as much as it's a continuum of abstractions already. Cloud servers already use virtualized disks where the actual drives can be in an entirely different building from the compute racks. Do you consider that separate?

Separating storage and compute at the database nodes is just one architecture method that allows for simpler and more efficient scaling in modern cloud-based deployments where you have large pools object storage that's fast and close. I think you'd be surprised by what you can achieve with some metadata and caching.


What? This isn't controversial, Snowflake did this years ago and this is the main reason it became a $40B business.

Amazon came out with Redshift, a cloud OLAP database, but it tied compute to data, so teams couldn't scale compute and data separately and thus had to pay costs disproportionate to their required workloads.

Sure, there's good reasons to keep compute and data together. But there is obviously a massive market for technologies that keep them separate...


Snowflake is a BI data platform, which doesn't require the low latency and fast response times that systems powered by relational databases do.

Apples and oranges.


I believe you were the one who implicitly started the apples and oranges comparison with your original post

> "no. Just no."

Conveys the feeling that there is no scenario where doing this is feasible, yet you yourself just said it's acceptable for BI but not for AI training.

So the "no. Just no." actually means, "my use case does not allow for this, so I believe no one else should use this" which is a fallacy on its own.

In conclusion, this is fairly usable, but like everything else, it's not for all use cases.


An easy trade-off to get scalability. Writing the log off-host is fast with a 10G network and NVMe on the other side (see Amazon Aurora, Microsoft Hyperscale, etc.).


The goal of the "cloud" is to extract maximum $$$ from its users. Once you realise that, it's not hard to see through the marketing propaganda and the decisions that follow from it.

> else you incur traffic cost and latency

Thus, they are separated so they can charge you (more) for them separately.


Zero bandwidth cost within the same zone.


What is “bottom-less” storage? This just seems like a managed hosting of PostgreSQL, nothing else…


We wrote a custom storage layer for Postgres, so in our setup a Postgres node (a k8s pod, actually) doesn't store any data, and it is easy to start/stop/reschedule. So while we actually are a DBaaS, the closest analogy here is Aurora or AlloyDB, not RDS-like setups.

A bit more detail is at https://github.com/neondatabase/neon


Typically it means S3 and the like: not being bounded by the disk size of your server.

Or in their case they might be referring to the horizontal scalability of their storage layer, which is independent of compute.


Yes, you will never run out of storage with Neon. The reason is that in our tiered storage implementation S3 is the cheapest tier, and we offload data to S3 if it gets too big.



Our Postgres is just a cluster of Postgres servers. It's not "serverless"; it's server-y. :)


Typo in footer 'Maid in SF and the World' should be 'Made in SF and the World'


Fixed - sorry, sleepless night.


Please stop this abuse of language.

Here's serverless sqlite: https://www.sqlite.org/serverless.html


People don't say "serverless" to mean "there are no servers involved." They say "serverless" to mean that the data architecture is such that:

1. scaling does not require thinking about servers; and

2. you don't have to pay for committed capacity measured in servers.


Then why not call it “scalable”, “elastic”? All these neologisms are tiring.


Also, simpler answer than I already gave: “elastic” and “scalable” X don’t imply a lack of committed resources, but rather just a control plane that auto-scales your commitment to ensure that you’ve always got more than you need.

E.g. Amazon Elastic File System: an auto-resizing NAS-like abstraction, but with a committed size at any given point, where you're paying for unused space.

Vs. Amazon S3 (which might as well stand for “serverless storage service”) — where you only pay for what you use, with no committed capacity beyond current usage.


Marketing doesn't care about your personal preferences :)

And marketing is king for bottom-up SaaS businesses (businesses selling to individual developers), which companies open-sourcing their core projects usually are.


The other key feature of "serverless" is that the store-of-record for a "serverless" system isn't a part of the running system — i.e. the system's state doesn't live on any particular server / cluster that is run as part of the operations of the system.

Instead, a "serverless" system's store-of-record is some external and semantically-abstracted storage system — e.g. "a remote git repository" (GitOps); "an object-storage bucket" (Snowflake); "a document store" (Lambda); "a blockchain"; etc. Where all that's important about this storage system is its API, such that the design is portable to any backing storage service that supports the same API.

This means that, in serverless systems, the mutable state in the cluster is just an ephemeral cache representation of the externally-managed data-at-rest.

Another way to think about this is by thinking of "a server" as a thing with durable state that you have to worry about — e.g. protect from disk corruption, make backups of, etc; vs. something like "an ephemeral immutable-infrastructure container workload" that can die and be recreated with no problems. Serverless systems are systems without any "servers" in this sense — nothing to back up; nothing to disaster-recover; etc. Nobody operating these systems ever needs to think about individual servers. Nobody ever needs to SSH into a server, upgrade a server, restart a server, etc. The operations for such systems can be handled entirely at the "cluster" and "ephemeral workload" levels. Nodes within the cluster that "go bad" can simply be drained and deleted — this may even be automated.

And further, because of this lack of local durable state, there's no need to worry about "allocating" that state, and thereby allocating customers to particular clusters. Serverless compute clusters are usually just one huge cluster (per region), where customers' individual request workloads just get scheduled onto that cluster wherever they'll fit.

Of course, the external store-of-record for a serverless system must have "servers" — the data-at-rest ultimately has to reside durably on some disk somewhere. But 1. they're not your servers, and their ops problems are not your ops problems; and 2. because they're a much lower-level abstraction, they can be scaled much more robustly, and so have far fewer operational problems in the first place; and 3. because they're a much lower-level abstraction, they benefit from economies of scale in shared tenancy in ways domain-specific compute/DBMS/etc. clusters usually don't; and so your system can benefit from a storage layer that's hyperscaled and hyper-robust from serving millions of tenants' low-level needs.


From your link:

Classic Serverless: The database engine runs within the same process, thread, and address space as the application. There is no message passing or network activity.

Neo-Serverless: The database engine runs in a separate namespace from the application, probably on a separate machine, but the database is provided as a turn-key service by the hosting provider, requires no management or administration by the application owners, and is so easy to use that the developers can think of the database as being serverless even if it really does use a server under the covers.

It sounds like this is neo-serverless but serverless nonetheless.


It “is” only in the generous sense that the quoted section adopts precisely so it can interpret the term generously.


This ship is so sailed already that the stories of its voyages are legendary classics that were told to our great grandparents.


It’s a tale so old that “server” lost meaning before we could even be in a state of “serverless”ness. First there was a cloud, and then there was no clarity of what conditions a process even runs under. And it wasn’t good, but it became stable. And on the next day, there were databases from on high, as if they’d spontaneously burned a bush.


> This ship is so sailed already that the stories of its voyages are legendary classics that were told to our great grandparents.

Specifically, the term is 10 years old this October: http://readwrite.com/2012/10/15/why-the-future-of-software-a...

It has never, ever meant that the software wasn't running on a server. From TFA:

"The phrase 'serverless' doesn’t mean servers are no longer involved. It simply means that developers no longer have to think that much about them. Computing resources get used as services without having to manage around physical capacities or limits."


Ah, the captainless ship (where the captain is behind a locked door).


Don't say server-"less" if you mean in-process (which may well run on a server).

Here's Sqlite without a server, running in the client and backed by static file hosting: https://phiresky.github.io/blog/2021/hosting-sqlite-database...


Modern marketing makes me mad sometimes. They call something "serverless" while in fact it runs on their servers somewhere in the world, where you have even less control than in your own cloud. I believe the term was popularized by Amazon, which doesn't care much about being honest in its expansionism.


You are really nitpicking and taking offense over something you made up in your head. The meaning of "serverless" has shifted to refer to cloud offerings.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: