Hacker News
Neon – Serverless Postgres (neon.tech)
667 points by nikolay on May 28, 2022 | 330 comments



This is the missing piece on cloud for masses

  * We already have compute scale-to-zero (Cloud Run, Lambda, fly.io).

  * Network is pay-for-use by default. Storage (S3) is pay-for-use by default.

  * The only piece in the stack that was always-on was the database (the only serverless DBs thus far were Firestore, or something like sqlite+litestream).

With something like this we get a solid RDBMS engineered to scale to zero, and with a good developer experience.

This opens up a world of try-out mini applications that cost cents to host: serverless DB (Postgres) + serverless compute (Cloud Run) + pay-as-you-go storage and network. This is a paradigm-shift stack. Exciting days ahead.


There are plenty of serverless database options already: Firestore, DynamoDB, CosmosDB, FaunaDB, even MongoDB, and there are "newsql" distributed relational systems like CockroachDB and Planetscale with serverless plans.


Many teams would prefer a PostgreSQL-compatible database with full SQL support, without compromises, missing features, etc. So this could break that market open a little. Both AWS and Google are unreasonably expensive for this stuff. Most teams don't need a huge database and could run PostgreSQL on a tiny instance and get away with it. Have two of those with failover and backups, and it's good enough for a lot of small shops.

Most managed/serverless options begin at hundreds of dollars per month. So you get lots of companies either just handing over the cash or jumping through hoops to get something more reasonable. The latter is a stupid waste of time if you can afford the former. That's how Google and Amazon make money: they make the expensive option more tempting and the cheap option needlessly hard. They are not interested in supporting frugal teams. The whole point is squeezing their customers hard.

So, this is potentially very nice if it offers some competition on the cost front. I'd certainly consider using this if it proves reliable. In fact, the whole reason I opted out of a relational database is the above. What I'd need is something that is reasonable in cost relative to the modest data I store and retrieve.


We have a single big bare-metal machine. We run Postgres with a ~1TB DB, moderate load, on a Hetzner AX101 (16C/128GB RAM). It has 2× 3.84TB NVMe drives (ZFS mirror with hourly snapshots) used for Postgres storage only, and a separate pair of mirrored SATA drives for system/boot (we had to request the extra drive, ask support to change the boot option in the BIOS, and reinstall the OS using the rescue system). It's about 100 EUR/mo with unlimited data.

We bounce all incoming requests from clients (the machine also runs a Node backend) through a Digital Ocean machine (NGINX proxy), as their peering agreements are better; without this, some users in Brazil, Turkey, etc. have very slow access. OVH, I think, would be even better for this use (better peering and, IIRC, cheaper data). ZFS snapshots are backed up with sanoid to a machine under my desk with spinning disks.

The AX101 can be fitted with up to six 3.84TB drives; that's almost 12TB of mirrored storage, so we should be good for a while. You can (should) use at least lz4 compression on ZFS; consider zstd-1, a bit slower, which could double the effective space. The compression also applies to the in-RAM ZFS cache, which can be 100GB+.
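For anyone reproducing a setup like this, the ZFS side comes down to a few commands (a sketch; the pool/dataset names and the recordsize tuning are assumptions, adjust for your own drives and workload):

```sh
# Mirror the two NVMe drives into a pool dedicated to Postgres
zpool create tank mirror /dev/nvme0n1 /dev/nvme1n1

# Dataset for the Postgres data directory: lz4 (or zstd-1) compression,
# no atime updates; recordsize is often tuned toward Postgres' 8K pages
zfs create -o compression=lz4 -o atime=off -o recordsize=16K tank/pgdata

# Hourly snapshots (sanoid can automate this and the retention policy)
zfs snapshot tank/pgdata@$(date +%Y-%m-%d_%H:%M)
```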

We used Firestore before and got a bit tired of some of the limitations (latency, indexing). Cost-wise I don't think it's that different actually, since we aren't using much bandwidth; if you are, self-hosted can be dramatically cheaper. You have to manage some details of course (ZFS filesystem parameters, setting up backups, configuring Postgres, etc.), but I found that stuff quite interesting, and it's knowledge that will always be useful.


How do you measure the 1TB DB estimate? Is that with indexes, as a CSV, or back-of-the-napkin math based on estimates?

I've always struggled to estimate my DB size, so I'm curious what you mean by your estimate.
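For what it's worth, Postgres will report its own on-disk size, which is usually what such estimates mean (a sketch; 'mydb' is a placeholder):

```sh
# Total on-disk size of one database (tables + indexes + TOAST)
psql -d mydb -c "SELECT pg_size_pretty(pg_database_size('mydb'));"

# Top relations by total size, to see where the space actually goes
psql -d mydb -c "SELECT relname, pg_size_pretty(pg_total_relation_size(oid))
                 FROM pg_class ORDER BY pg_total_relation_size(oid) DESC LIMIT 10;"
```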


Sounds neat. What’s the weak link here though?


No read replicas that can be promoted to master in an outage?


Ah, nope. If it goes down, we'll figure it out; a 24-hour outage wouldn't be the end of the world. It would have been nice to have a second (read replica) machine in East Asia to reduce latency for users there, but we didn't find a provider like Hetzner. Maybe we could take a server in a suitcase and install it in a colocation center there. Bit of a hassle though.


I have a similar setup to OP's, and I use ZFS snapshots to be able to quickly rebuild the whole machine on another host. Of course, it still requires manual intervention in an outage (it could be automated, theoretically), and I may lose up to 5 minutes of data due to my snapshot schedule.


Can you take that ZFS snapshot under a running database? Is that OK?


Yes, ZFS snapshots are atomic, so as long as the database uses sane semantics to do disk I/O, you're fine (AFAIK, PostgreSQL and MySQL/InnoDB are OK; MySQL/MyISAM is not, though).

A snapshot is effectively identical to a killed database server (due to OOM, or power outage), and database servers should be crash-resistant.

With PostgreSQL, checkpointing the database before taking the snapshot can make crash recovery quicker when loading the snapshotted database.
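The checkpoint-then-snapshot sequence is short enough to script (a sketch; the dataset name is a placeholder):

```sh
# Shorten WAL replay on restore by checkpointing first
psql -c "CHECKPOINT;"

# Atomic, crash-consistent snapshot of the Postgres dataset
zfs snapshot tank/pgdata@backup-$(date +%s)
```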


Hi Mani! For sure there are many serverless options - fewer that separate storage and compute, and fewer that are open source end-to-end. Neon is also 100% compatible with Postgres (unlike CockroachDB) because the compute is Postgres.

Our intention is to standardize the separation of storage and compute cloud architecture - that's why it's open source under the Apache 2.0 license.


You should update your HN profile :). Neon looks really great. Far more interesting to me and my team than S2.

I noticed you mention Azure Blob Storage in your RFCs as a potential backend. Have you done much work towards that yet?


For now we've focused on making the product production-ready on AWS, so that we can get users to try it out. Once the model has been proven we'll likely branch out to other clouds.

However, if you really can't wait to run Neon on Azure, you could contribute the integration yourself: the code is available under Apache 2 at https://github.com/neondatabase/neon/


Exact reason why I asked :) Thanks for the response & all the work you're putting in, it looks really promising.


Haven't done anything with Azure yet. Shouldn't be hard to add though, we don't rely on any special cloud storage features, just simple upload/download of files.


(Some of these are options I've not looked deeper into - Fauna, Planetscale)

This sentiment is perhaps right, but I was careful about calling out scale-to-zero. We do have options that are zero cost (or pay as you use), but there's a fundamental difference between that and something that may be zero cost only because a cloud provider is using it as a customer-acquisition ploy.

Options like litestream+sqlite+s3, or what Neon seems to be, are verifiably pay-only-while-the-db-is-up; otherwise the cost is storage only.

So the trifecta that will be very productive for the masses is a database that 1) has compute that scales to zero, 2) is open source or commoditised, and 3) is an RDBMS.


There is a big difference in architecture between Neon and PlanetScale, CockroachDB, and Yugabyte. Neon is shared storage (storage is distributed but shared) and the others are shared-nothing. Shared-nothing systems are hard to build while supporting all the features of the base system. E.g. https://vitess.io/docs/13.0/reference/compatibility/mysql-co....

Neon is 100% compatible with Postgres because we didn't (or almost didn't) change the Postgres engine.


What is the effective difference to you? Technically all compute can enter hibernation by dumping RAM (or just using virtualized memory backed only by SSDs).

CockroachDB already does true scale-to-zero if that's your requirement: https://www.cockroachlabs.com/blog/how-we-built-cockroachdb-...


The effective difference is commoditization of the paradigm - there's confidence in a commodity technology that a proprietary offering cannot give (standardized use, wide support and community, multiple providers, self-hosting).

E.g. of innovation -> commodity: EC2 was innovative, but is today commonplace. S3 was the same. DBs that scale to zero have not reached that state yet.

Thanks for the link on CockroachDB - it sounds promising. I wonder what the minimum self-deployable unit of CockroachDB is - will google around a bit.


I think it is more "market leadership" than commoditization that's working in favour of the MySQL-Postgres duopoly.


> "minimum self-deployable unit of cockroachdb"

If you host it yourself, or use their dedicated enterprise clusters then you still have instances which are individual (virtual) servers.

The serverless model has no instances exposed to you and instead is a multitenant architecture that uses a pool of shared compute nodes with a routing layer that intercepts and introspects your connection to load up your specific database context for query processing.


Most stuff you mentioned:

1 - Is not an actual relational DB

2 - Doesn't really scale to zero

Planetscale does scale to zero, but has a ridiculous billing model.


Both CRDB and Planetscale "scale" to zero - but neither expose any concept of individual instances so I'm still not sure what difference it makes.


What's the billing model / how is it ridiculous?


Per-row read/write billing -> how do you estimate that?


Either you have an existing db running and check the metrics, or you don't have a workload to move and you can just try it out from scratch. You can always migrate away if the billing model doesn't work for you.

Alternatively, it's likely that Planetscale will eventually provide a provisioned billing model, similar to DynamoDB. Then you can choose between scalable usage based pricing or less flexible flat rate pricing. Either way, innovating different billing models for RDBMS is a win.
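One way to do that estimate is to pull row-throughput numbers from your existing database's metrics (e.g. tup_returned in pg_stat_database) and multiply by the advertised rate. A sketch; every number below is a placeholder, not PlanetScale's actual pricing:

```python
# Back-of-the-envelope estimate for a per-row-read billing model.
# All inputs are assumptions; plug in your own metrics and rate card.

rows_read_per_second = 500            # observed average from your metrics
seconds_per_month = 30 * 24 * 3600    # ~2.6M seconds
price_per_million_rows = 1.50         # hypothetical $ per 1M row reads

rows_per_month = rows_read_per_second * seconds_per_month
monthly_cost = rows_per_month / 1_000_000 * price_per_million_rows

print(f"{rows_per_month:,} rows/month -> ~${monthly_cost:,.2f}/month")
```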


MongoDB & CockroachDB are the only open source ones, the only ones we can hack on & improve & grow.

Neon seems like a vast vast improvement & great & desperately needed potential leap for mankind.


Planetscale is Vitess, which is also open source: https://vitess.io/

> "great & desperately needed potential leap for mankind"

Are you being serious? That's very hyperbolic if so.


> Are you being serious? That's very hyperbolic if so.

Yes, I'm serious. This is one of the most foundational & key levels of computing: storing & querying data. Without this, computing isn't good for much.

Getting good at this is a huge task for humanity. Right now that task is almost entirely being fulfilled by far-off hyperscalers. Aurora, BigTable, Firebase, DynamoDB, CosmosDB, more special works like Datadog's Husky... the world is running off vast, super-awesome data engines. But ones that are not ours, that we can't hack on, that we can't run ourselves. They might as well be the Martians' (little green men's) databases as far as I'm concerned: these are not humanity's heritage & humanity is cut off from active participation with them.

Right now there are so few scalable data systems available to the world that we actually have. This seems like a great & novel effort to radically open up the range of human capabilities in one of the most important sectors of computing: handling data.

We have lots of other cloudware for humankind but data has seemingly been much slower & un-scaled. I agree with this @anilgulecha comment[1]:

> This is the missing piece on cloud for masses

[1] https://news.ycombinator.com/item?id=31537313


There are tons of databases; I just named several, and there are hundreds more for every possible niche.

> "But ones that are not ours, that we can't hack on, that we can't run ourselves."

Who is "we"? Is your entire issue that not everything is open-source?

> "these are not humanity's heritage & humanity is cut off from active participation with them."

This is still incredibly exaggerated. Everything humanity does is humanity's heritage. There are plenty of open-source databases of all types and sizes if you look: https://db-engines.com/en/ranking


You're free to your opinion, but I respectfully do not share your values & sense of collective ownership over remote, far-off software I can't see, shape, control, or rework.

The most notable feature of Neon, what makes it so exceptionally smart, is that it takes humanity's existing first-choice, default-go-to database, by a country mile, and layers in really smart decoupling of storage to help it scale. Reinventing things from scratch can be a win, but that this is already a well-known quantity, much loved & cherished & used by all, hacked on by all, grown by all, is a planet-sized plus.

I think you vastly overrate the broad availability of alternatives to this, of potential starting places others might explore. Most of your list is dominated by proprietary whatever. Even skipping that, yes, there is a ton of novelty & promise, the potential for breakthroughs. But this is starting with the best, and pushing it into the cloud-layer, into the troposphere.


> "I think you vastly overrate the broad availability of alternatives to this"

There are dozens of "scalable postgres" alternatives, from Redshift to Citus to CockroachDB to Yugabyte. I find it strange that you say it's overrating the availability when you're also comparing it to something that isn't even a full product yet.


[flagged]


I've met my share of developers without perspective, and hey, I was probably one of them for my first 5-6 years as a developer, so I don't think one needs to assume corporate toadying when hyperbolic importance for mankind is ascribed to a corporate offering.


rektide is not a Neon employee, nor was he or she asked to post this.

Thank you for the comment rektide. We of course would love to see constructive comments and criticism.

- Neon CEO


[flagged]


Can you please stop posting these off-topic comments (which are against the site guidelines, as you'll see if you review https://news.ycombinator.com/newsguidelines.html)? Also, can you please stop posting unsubstantive comments generally? You've been doing that in other threads as well.

I appreciate your concern for the integrity of HN discussions, but the thing to do if you suspect abuse or manipulation is to email hn@ycombinator.com so we can look into it. Posting about it in the threads themselves is explicitly against the rules.


Understood, dang. Sent you an email.

But please take a look at the flagging attack on all of my comments outside this thread. The timing is too peculiar here.

Somebody really didn't like what I wrote above ;)


Moderators did that—your comments have been breaking the site guidelines quite badly!


That explains why my criticisms are getting flagged, seemingly on YC-money-related threads.


Users can flag comments too, and yours get flagged because they break the guidelines in overt and obvious ways, which is pretty easy to avoid.


Doesn't seem like an employee due to another comment they posted. https://news.ycombinator.com/item?id=31537613


Minor correction: both of them are source-available, but not open source.

Cockroach DB license - https://github.com/cockroachdb/cockroach/blob/2c4e2c6/LICENS...

Mongo license - https://github.com/mongodb/mongo/blob/39e4b70/LICENSE-Commun...


Cockroach is eventually open source, which I rather appreciate. Any code that is 3 years old goes Apache, if I recall correctly.


That only applies to the core. The "enterprise" features are proprietary until the copyright expires.

IIRC, backup was enterprise but is now part of the core. However, restore is still enterprise.


There is also gigabyte as open source. In our tests (which means nothing in general as it’s specific to our business case), it outperforms cockroachdb.


Sorry, cannot edit; Yugabyte (ios spellcheck).


Yes, but they are not "scale-to-zero"


Both CRDB and Planetscale are.


> This opens up a world of try-out mini applications that cost cents to host

Given how much performance you can squeeze out of a $5/month VPS (I've been spinning them up and indeed down regularly over the last couple of years), is this really a paradigm shift?


On that $5/month VPS there's management overhead - you need to have at least basic Linux knowledge, and ideally more than that to know not to do stupid things like chmod 777 and database exposed on the public internet. You also need to do your updates, etc.

I'm a (former) SRE and run my own Kubernetes cluster for fun, but still use serverless (containers as a service, static website hosting) depending on the project.
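For the "database exposed on the public internet" failure mode specifically, the safe defaults amount to two lines of configuration (a fragment; stock postgresql.conf/pg_hba.conf assumed):

```
# postgresql.conf - listen on loopback only (this is the default)
listen_addresses = 'localhost'

# pg_hba.conf - local TCP connections only, password-authenticated
host  all  all  127.0.0.1/32  scram-sha-256
```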


> On that $5/month VPS there's management overhead - you need to have at least basic Linux knowledge

Don't you need that as well as cloud-specific knowledge if you go serverless?

> and ideally more than that to know not to do stupid things like chmod 777 and database exposed on the public internet.

You still need some arcane knowledge to make sure your serverless doesn't experience cost overruns, right?

IME (yours obviously differs), the amount of cloud-specific + vendor-specific knowledge needed to avoid using a $5/m VM is a lot more in volume and a lot less in stability[1] than learning basic Linux once and using VMs everywhere[2].

[1] How the different cloud providers bill, when they bill, how to control your limits, etc. changes much more often than knowing how to keep your server patched. Knowing how to get your serverless DB going on AWS doesn't help when you want to use Azure. And each cloud vendor regularly requires you to update your knowledge. Knowing how to keep a PostgreSQL-on-Linux setup up to date can be learned once and used for years. Even if running a managed DB, you'll still need to gain some of that knowledge anyway.

[2] Once you get to a scale where you treat your machines like cattle rather than pets, you'll obviously have the team required to use cloud stuff optimally.


> You still need some arcane knowledge to make sure your serverless doesn't experience cost overruns, right?

I know people who have literally written their own analysis tooling just to figure out what's impacting their AWS spend. It's gotten better, but I could have retired many times over on what I've seen clients overpay to cloud providers because they didn't understand what would drive cost.

> [2] Once you get to a scale where you treat your machines like cattle rather than pets, you'll obviously have the team required to use cloud stuff optimally.

The "problem" with the cloud story is that at that point you also have a team that could save you a fortune with a hybrid setup. Cloud providers get their margins off those who don't understand how much they're overpaying or who are too small to a) care or b) have leverage. Those big enough to have leverage who understand either negotiate hefty discounts or build out cheaper setups (basically at the point you're spending 7 figures a year, if you're paying anywhere near list prices for cloud services you're a chump; below that it's hit and miss)

I'm not at all against using cloud services, but I wish more people actually understood their costs and picked based on merits rather than cargo-culting. Some teams benefit greatly from cloud services, but usually if they're not cost sensitive. In my current job we have everything on AWS because we're never going to scale to somewhere where it'll get expensive and it's convenient. We'd save money if I moved it to, say, Hetzner, but the hosting bill is too small to matter. For that use it's fine.

The moment the bill starts to bite people ought to at least price out alternatives, and consider hybrid setups. E.g. I've had setups where even just putting a caching proxy in front of AWS to cache images to cut the egress bill would have paid for a team to keep it running. Their egress cost is still bad, but at the time it was just pure highway robbery.


With every cloud service there's a management overhead too. Just different skills you need to learn.

I've done SRE/devops work in various capacities including consulting longer than cloud services have existed, and my experience is that I've consistently earned more from clients who insisted on cloud services because they consistently need more help. Nothing is driving more demand for devops consulting services than cloud providers.


The issue with a full Linux system’s overhead is that if there are any new security vulnerabilities the situation could blow up in your face (e.g. the system is used to send spam, or host malware), so you need to maintain it at least minimally. With a serverless cloud architecture at worst it’ll stop working.


Or you just use Flatcar (a derivative of CoreOS), and don't worry about anything more than rebooting once a new image has been (auto-)installed, and run everything else in app containers where you have to worry about nothing more than what you would in your regular cloud setups.

This is not hard to get right. Yes, you need to learn how to do it, but the amount of money I've made from clients who thought cloud was simple and proceeded to create massive security holes for themselves is fairly substantial. People who think they're reducing their attack surface by using these services need to reevaluate - they're large, complex architectures that very few users understand properly. You need to learn the skills either way.


We don't know. But we built it anyway because it may well be.


Millions of students and enthusiasts around the world would find that cost sufficiently friction-ful to not try out things.


> Millions of students and enthusiasts around the world would find that cost sufficiently friction-ful to not try out things

I appreciate there are indeed billions of people for whom $5 is a lot of money, but just how many of them are "students and enthusiasts" itching to get started with Postgres?

I realise that - perhaps particularly here - a $5/month VPS is a deeply unsexy thing. You can, however, achieve (and learn) an awful lot with one.


Nowadays there's a ton of idling compute capacity running client-side that could be used to spin up a personal cloud environment with all core services, if one has the proper knowledge. For anyone that has a core to spare and a few GBs of memory, which should be easy on most modern midrange hardware, I would recommend they save that $5 paying for a third-party VPS, deploy an open source hypervisor such as KVM, and run the virtual server on their own hardware.
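Getting a VM up on local KVM is a short exercise these days (a sketch; the name, sizes, and ISO path are placeholders):

```sh
# Create and boot a small VM on KVM/libvirt
virt-install --name devbox --memory 2048 --vcpus 2 \
  --disk size=20 --os-variant debian11 \
  --cdrom /path/to/installer.iso
```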


Who absolutely would not want to give a credit card to AWS, where the bill is dynamic? If $5/mo is bad, they definitely can't handle the screw-up that scales up and runs overnight for $500.


Free-tier instances on various providers provide an option. And if you can't afford $5 a month, you really shouldn't be playing with services where there's a risk of huge overages if you face a sudden spike in users.


You cannot sign up for the AWS Free Tier without credit card info, and many students end up with a huge bill. The subreddit reddit.com/r/aws has many such posts.

Some links and discussions:

- https://cloudirregular.substack.com/p/please-fix-the-aws-fre...

- https://twitter.com/alexwlchan/status/1399095011178958851

- https://news.ycombinator.com/item?id=27044371


Well, and that is just as much of an issue if you use most serverless offerings. But if you're not prepared to carefully manage your use, cloud services should not even be on your radar.



Fly.io does not scale to zero.

Lambda has many limitations.

In particular, for some reason AWS is allergic to providing a container deployment service that actually scales to zero.


> AWS is allergic to providing a container deployment service that actually scales to zero

Isn't this what Fargate is?


No.


Not yet, but soon.


AWS Aurora Serverless v1 (in MySQL and Postgres flavors) has had serverless, scale-to-zero for quite a while.


Aurora Serverless v1 has cold boot times of ~30 seconds when scaling up from zero, which precludes it from being a viable option for most use cases.


Unfortunately, V1 is getting very little love from AWS, and the new one, V2, does not scale to zero.


What's the cold start time for something using sqlite+litestream on scale-to-zero compute? I think you'd need to pull the DB out of storage, so it would be slow to go from 0 -> 1 instance. Anyone know if that's right?

Is there any cold start delay for Neon?
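For the sqlite+litestream case, the cold start is essentially one restore pulling the latest snapshot plus WAL from S3, so the 0 -> 1 delay scales with DB size (a sketch; bucket and paths are placeholders):

```sh
# On cold start: rebuild the local DB from the S3 replica
litestream restore -o /tmp/app.db s3://my-bucket/app.db

# Then serve traffic while replicating changes back to S3
litestream replicate /tmp/app.db s3://my-bucket/app.db
```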


Right now it is 2 seconds. We are working on improving it.


> This is the missing piece on cloud for masses

I like this perspective a lot & think it's absolutely key here.

We - the world - still pick single-node-writer Postgres & read replicas when we have to store & query data. There are great Kubernetes Postgres operators, but it's still a distinctly pre-cloud, pre-scale type of technology, & this decoupling & shared storage sounds ultra promising: it allows independent & radical scale-up & scale-down, and sounds principally much more manageable.


If you can scale your app to zero, couldn’t you also just scale your database to zero once no more app servers are running?

Or for try-out apps, as you mention, you could just run Postgres next to your app in the same container.

This might be possible with fly.io, or will soon, I think.

I’m not sure how comfortable I am using a custom flavor of Postgres (even if it’s just the storage layer).


We've already had a serverless DB for ages, and it's called... Google Sheets. You can even query it with a simple SQL-like language.

The problem with most other "serverless" databases is that they don't offer an HTTP API to query them from restricted environments like serverless functions.



I virtually never self-promote, but that exact article got me to investigate the offering:

https://ldoughty.com/2022/05/exploring-aws-aurora-serverless...

Short answer if you don't want to read my post: it constantly uses CPU; it's always on. After creation, waiting 2 days, never logging into it, never running a script against it, never giving it access to any networks, the minimum cost is $43/month, because it can't actually scale down to 0.5 units unless you cap it at 0.5, which makes it unusable, because it consumes all of that capacity just to exist.

It sounds like this Neon offering is exactly what I hoped AWS was offering... or they are using language to suggest it and mislead the customer just the same. If it's the former, I'd probably sign up and try it out. If it's the latter, I'll probably never touch it for the false hope.
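The $43/month floor follows directly from the ACU math (assuming the commonly cited us-east-1 rate of $0.12 per ACU-hour; check current pricing):

```python
# Minimum monthly cost of Aurora Serverless v2 at its 0.5 ACU floor.
# The $0.12/ACU-hour rate is an assumption based on us-east-1 list pricing.

min_acu = 0.5
price_per_acu_hour = 0.12
hours_per_month = 730  # average hours in a month

monthly_floor = min_acu * price_per_acu_hour * hours_per_month
print(f"~${monthly_floor:.2f}/month just for an idle instance")
```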

Edit: lots of typos from phone keyboard


I tried recreating your experiment: created an Aurora Serverless v2 DB with 0.5-2 ACUs in us-east-1. Since you said it was for WordPress, I disabled the multi-AZ replication, since AFAIK WP can't use separate reader/writer connections (mentioning this because you didn't say anything about it in your article)... then I let it sit overnight so it had time to create everything.

It's sitting at 23% of 0.5 ACU.

So it's either the replication setting (haven't tested with it on yet), or... AWS is a shared service... I wonder if it's similar to EC2, in that sometimes you get an instance on a machine that's more overloaded and the instance doesn't perform as well, and you have to destroy it and try again. Might want to try it again.

Edit: I don't think it's the replication setting... tried that with a new DB and it's at 25% on each replica after an hour.


I am using Serverless v2 with min/max ACU of 0.5/8 and it spends most nights at 0.5.


No scale to zero unfortunately


It does scale to zero, no?

> It automatically starts up, shuts down, and scales capacity up or down based on your application's needs.

The only tradeoff is the additional latency someone will hit when connecting to the DB after it has shut down, waiting for it to spin back up and become ready.


AWS aurora serverless says:

> You pay only for the capacity your application consumes.

> Scales down to 0.5

But it actually can't scale down to 0.5 or the DB falls over just existing.. auto scaling won't let you go down that low unless you set 0.5 as the max, which literally makes it not scale up, and it's dead, because the DB can't run with that little CPU.

So it's fair to ask if neon can scale to 0, both in marketing, and in practice.


We do scale the compute part down to zero after 5 minutes of inactivity now (no active transactions). This 5-minute threshold is an arbitrary pick; it could become 1 minute or 30 minutes later, or even customizable by the end user. The storage part is heavily multi-tenant, so it's always running, and our main objective there is to make resource utilization as effective as possible.

It still has significant latency on the first connection attempt after suspend (1-2 seconds), but we are working on that, and it seems realistic to get the startup time under 1 sec.

The pricing model is still work-in-progress, so I cannot say much about it. Yet my personal intention is to make it cost-effective for both the end user and us. I'd prefer not to build a service with claims like "here is your free-tier serverless Postgres with zero latency on connect" which actually means that under the hood there is an always-running compute burning the investors' money. Hope it's realistic to achieve :)

-- Cloud engineer @ Neon


That's interesting to hear. That probably works great for my use cases, which are typically waking up to refresh a CDN for guests, but being ready to work for a bit if a content creator logs in (e.g. a WordPress instance without comments or non-author logins).

Looking forward to seeing how this works out. I have no issue paying for services; I just hate that the minimum entry-level cost is $20... I can't imagine why, at scale, it can't be more affordable for hobby/fun-level projects.


How do you plan to start a PostgreSQL instance in less than 1 sec? Sounds interesting.

I tried fast-booting PostgreSQL instances and it always took multiple seconds. So I am really curious!


That's where the separation of storage and compute kicks in, I guess. Startup process of our Postgres instance (compute node) is a bit different from vanilla Postgres. We need to go to the network storage service (pageserver and safekeepers) to get the last known commit LSN, but we don't need to perform any sort of recovery on the compute node side. That way, compute is mostly stateless.

Basically, to start we need to know this LSN and to bootstrap the Postgres processes. This is really that quick. After that compute is ready to accept connections and serve requests, as it's able to get any missing pages from pageserver with GetPage@LSN request.

We do have a whole bunch of problems to solve: query latency after a cold start; startup after an unexpected exit of a heavily loaded Postgres instance could be slower; etc.
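The startup sequence described above can be caricatured in a few lines (toy pseudocode, not Neon's actual code; the class and method names are stand-ins):

```python
# Toy model of a stateless compute node booting against shared storage.
# All names here are illustrative, not Neon's real interfaces.

class ComputeNode:
    def __init__(self, storage):
        self.storage = storage
        self.cache = {}  # local page cache starts empty
        # 1. Ask storage for the last known commit LSN.
        #    No WAL replay happens here; storage already handles it.
        self.lsn = storage.fetch_last_commit_lsn()
        # 2. That's it: the node can accept connections immediately.

    def read_page(self, page_id):
        # 3. Missing pages are fetched lazily, on demand
        #    (the GetPage@LSN request in Neon's terms).
        if page_id not in self.cache:
            self.cache[page_id] = self.storage.get_page_at_lsn(page_id, self.lsn)
        return self.cache[page_id]


class DemoStorage:
    """Stand-in for the pageserver/safekeeper services."""
    def fetch_last_commit_lsn(self):
        return 0x1000
    def get_page_at_lsn(self, page_id, lsn):
        return f"page {page_id} @ LSN {lsn:#x}"


node = ComputeNode(DemoStorage())   # "boots" instantly: one LSN fetch
print(node.read_page(7))            # pages arrive only when first read
```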


Some parts of the PostgreSQL start-up sequence take a long time:

- Initializing shared memory -> We, for now, have only small instances, so that doesn't hit us as hard

- Reading data directories -> We don't have to do that at all

- Replaying WAL from a previous unclean shutdown -> We don't need to do that, PageServer is responsible for that

- When initializing a whole new database: Initializing the data directory -> We have a copy that each instance gets initialized from, which makes the process "copy those ~16MB in the background", which saves us from having to do the costly initialization process.

And there's several more infrastructural optimizations, such as pre-loading the docker images onto the hosts.


It seems Aurora v1 used to scale to zero but v2 has a minimum of 0.5 ACU.


Snowflake is probably the closest comparison.


Some users called us Snowflake for OLTP, others Snowflake for Postgres. Obviously because of the separation of storage and compute.


uh... AWS Aurora? Azure CosmosDB? GCP BigQuery?

All serverless, scale-to-zero or pay for demand...


GCP BigQuery is unusably slow for small datasets and is more like redshift/athena than postgres.


I don't see the benefits over a $5 VPS. Even granting that I'll save a few cents over a VPS (which is absolutely not guaranteed), the cost saving is so minimal that I won't bother rewriting everything under the serverless paradigm just for it. That is, of course, if your cloud doesn't cost an arm and a leg. I can understand people being excited to save on expensive AWS instances, but maybe you should just consider dumping AWS.

Scale doesn't matter for mini applications, and scaling vertically (= throwing money at a bigger server) will work for 99% of companies. The 1% who need horizontal scaling will have custom everything regardless and will need to hire experts; not a good niche for releasing a product.


I do. There are real benefits to using a hosted server for large projects (think hundreds of gigabytes to terabytes of data) because getting sharding, fallbacks, and downtimeless maintenance right is difficult, risky, and expensive.

These are reasons why one might go for a hosted database at AWS/GCE/Azure. There are tons of good servers out there for small to medium projects and I don't think this service is right for those, unless they can make do with the free tier. The real benefit is in the larger cloud application space.

A system that does this type of scaling automatically while also reducing the dependency on a single cloud provider's service can be a gamechanger for some companies with huge database servers that risk getting locked in.

I think using this service for most existing applications will introduce a performance drop, a rise in expenses, and a complicated migration path, but on the other hand I think that developing against this system for new projects that are very likely to grow in scale will end up with some major data control benefits.

The open source nature also allows for competitors in markets like the EU to start serving databases that don't break privacy laws (although most companies don't care until they receive a fine).

I'm not sure how this company will become profitable while giving away its special sauce that can be modified to run at competing companies relatively easily, but that's a whole different story.


> Neon allows to instantly branch your Postgres database to support a modern development workflow. You can create a branch for your test environments for every code deployment in your CI/CD pipeline.

> Branches are virtually free and implemented using the "copy on write" technique.

Unless I've missed that everyone supports this, this could be a killer feature and should be advertised more prominently.


Agreed, this direction is underestimated and deserves more development. We (Postgres.ai) do it for any Postgres with our Database Lab Engine [1], and Neon would bring even more power if it's installed in production.

[1] https://github.com/postgres-ai/database-lab-engine


You can get that feature on any Postgres server by installing Citus


Does Citus provide any such storage-level multi-cluster features? I can't seem to find any documentation on that...


I don't understand what you are looking for. Care to explain?


AWS Aurora Postgres supports this to an extent with "clones". You can even clone cross-account. The same copy-on-write stuff applies, so they're relatively cheap and fast. I hope that Google's new AlloyDB will also support it.

https://aws.amazon.com/about-aws/whats-new/2019/07/amazon_au...

There are some annoying restrictions, though. You can only have a single cross-account clone of a particular db per account.


The problem with Aurora's thin clones is the extra cost each clone adds.

For CI/CD, you want multiple clones running on the same compute, in a shared environment, to keep the budget constant.


It sounds like something you might be able to accomplish with a copy-on-write VFS on top of a Firebird database file. (Not sure about PostgreSQL, but with Firebird, you only deal with one file, so with Firebird, this should definitely work.)


There is an enterprise company called Delphix that does this on top of ZFS, so the idea was in the air.

Instead of duct-taping this together with a filesystem, we purpose-built database storage. The advantage is that we can control execution paths much more tightly and profile them end-to-end. Additionally, this allows us to integrate with S3 and makes it much, much cheaper to run.


Technically Firebird just requires a block device, so you might not even need a filesystem.


What are the intended usecases for "branching" a database? Currently, I use separate databases for different environments, are branches better?


Right now the most common setup is to copy the production database to staging once in a while and test migrations against staging. With branching, you can test each PR against its own branch of the production database: just put branch creation in your CI config. Hence it has fewer moving parts, is a bit easier to set up, and reduces the lag between prod and staging.


Have a staging / qa env, then fork it for a branch for testing. Much faster than reseeding / restoring.


It's a great feature on heroku for branches, it shares data between review apps. Quite nice.


Really interesting. I've seen so much disaggregated database work, and so, so much of it exposes Postgres interfaces. But all the good stuff has been closed source!

I'm very very excited to hear about a team taking this effort to postgres itself, in an open source fashion! From the Architecture[1] section of the README:

> A Neon installation consists of compute nodes and Neon storage engine.

> Compute nodes are stateless PostgreSQL nodes, backed by Neon storage engine.

> Neon storage engine consists of two major components: A) Pageserver. Scalable storage backend for compute nodes. B) WAL service. The service that receives WAL from compute node and ensures that it is stored durably.

Sounds like a very reasonable disaggregation strategy. Really hope to hear about this wonderful effort for many more years. Ticks the boxes: open-source with a great service offering: nice. Rust: nice.


We are committed to building a durable company and we are well funded. So yes, you will hear from us for years to come as we will be shipping more and more features.


I could not find funding information on the Neon site. Is that information not public?

edit: I found the info here: https://boards.greenhouse.io/neondatabase/jobs/4506003004


We will announce in a few weeks. Top tier Silicon Valley investors.



Oops thanks!


Just yesterday I was comparing managed serverless Postgres offerings and was sad to temporarily end my investigation with a compromise: using managed AWS RDS for development, hoping that a fully serverless Postgres with a nice free tier would pop up before going to production. And here we are!

Congrats to the team for what feels like an amazing product. Signed up for the early access, can't wait to get my hands on this!

For anyone interested, these are the DB offerings I looked into:

* DO managed Postgres: no free tier, but price scaling was not too aggressive. The issue is that it's not natively serverless and we're gonna have hundreds of ephemeral connections.

* Cockroach was the best option for our use case, but it doesn't support triggers and stored procedures, so we can't use it right now (closely following https://github.com/cockroachdb/cockroach/issues/28296)

* Fly.io price scaling is too aggressive ($6 -> $33 -> $154 -> $1000s a month) and there's no free tier that I could find.

* Aurora Serverless v2 is only for AWS-internal access, and we are using GCP.

* Aurora v1 was what we were gonna go with, but a lot of people online have shared negative opinions about its slow scaling. I didn't investigate enough, but I'm thinking we'd need to set up RDS Proxy for it to handle all our connections, which would've bumped up the price by a good amount. Also no free tier.

* AlloyDB looked promising, but also no free tier, and the starting price is a bit much for our current phase of development; it's definitely something we'd look into in the future.

And now Neon, natively serverless with a (hopefully) good free tier to test things out and some hints about cross region data replication, amazing stuff!


If CockroachDB was fitting your use case the best, you should have a look at YugabyteDB. It does triggers, stored procedures, extensions, almost everything. Some alter table features aren’t working yet but it’s getting there.

Not associated with the company but a very happy user.

Bonus point: YugabyteDB is full Apache 2-licensed so you can roll your own.


Just took a look and it seems pretty nice!

But I found their pricing page (which was very hard to find; everything else leads to the generic "contact sales" page) and it seems the starting price is 360 USD/month; that's not something we're comfortable with right now.

USD 0.25/vCPU/hour, minimum 2 vCPU = 0.25*2*24*30 = 360

https://www.yugabyte.com/yugabytedb-managed-standard-price-l...


> Fly.io price scaling is too aggressive ($6 -> $33 -> $154 -> $1000s a month) and there's no free tier that I could find.

Fly has a general purpose free tier of 3 of their smallest instances. You can use that to run their 2-node Postgres cluster plus an app server.

The pricing you pulled is examples of various compute + storage configurations, not the exhaustive list of options. It should look like $4 (or free tier) -> $11 -> $21 -> $62 -> $82 ... + storage, since it's just 2x their VM price (for the two nodes) + any storage above free tier.


Last I used them (last year) their postgres offering, even scaled up to larger nodes, was significantly slower than the cheapest DO offering. I filed a few issues but haven’t checked back since.


Ah nice!

So the prices I mentioned were just example configurations. That's pretty cool then, especially with that free tier.

Will put fly.io back on the list and do some benchmarking in the future.

Thanks a lot!


Curious why a free tier is so important?

I think a free tier encourages bad behaviors on both sides. I don't think pricing should be linear at all. But even for development one is using resources, though most of the time they can be minuscule for individual devs.

Aside from production reliability, Postgres is one of the easiest things to get running on a VM and runs fine on a $5 a month instance.


Free tier, like all things, is pretty bad if misused.

The reason we want a free tier is to try things out before we can actually commit to something. We don't know if what we are doing is actually gonna make money and sometimes we go a few months without working on it. So it's kind of a pain to pay for something we don't use.

That's why serverless is also nice to have at our current stage: things can just scale to 0 and there's no wasted resource.

> running on a VM and runs fine on a 5$ a month instance

Easier said than done, unfortunately.


"Aurora Serverless v2 is only for AWS-internal access and we are using GCP." You can have public access to Serverless v2; I'm using it with Retool, for example. That said, I moved a Postgres DB to Aurora, and the process was hilarious in how crazy it was. Also, they haven't implemented scaling to 0 yet!!!! And the minimum 0.5 compute units is actually pretty expensive.


Nice point about the minimum 0.5 ACU, forgot about that one. From what I've read, 0.5 on v2 is the same price as 1 on v1, which seems pretty dumb coming from AWS.

Could you elaborate more on this:

> You can have public access to serverless v2

Because the docs mention the following:

> You can’t give an Aurora Serverless DB cluster a public IP address; you can only access it from within a VPC based on the Amazon VPC service.

Potentially I could set up an RDS proxy or a VPN inside the VPC and give that public access, but that seems a bit of a roundabout way of handling this. https://aws.amazon.com/blogs/database/best-practices-for-wor...


I can 100% confirm public access works :)


Could you elaborate on how you got it to work without a public IP?


It's a completely crazy process, but it works roughly like this (going from RDS Postgres on 12.6 to Aurora Serverless v2):

1. You create a snapshot of your original db

2. You update that snapshot to 13.4 (NOT 13.6!!)

3. You have to use the AWS CLI (because the online migration doesn't work) to create a cluster from the snapshot

4. Remember this cluster is 13.4, serverless only works with 13.6, so we have to upgrade it later again

5. However we can only upgrade it when there is a running system

6. So we create a non-serverless Aurora instance. Most instance types returned an error, but t3.medium worked; you can create that with public access.

7. when they are both created you can already try to ping your DB (it should work)

8. upgrade to 13.6

9. Now you are able to change the DB instance type to serverless

Edit: BTW I am available for hire as an external contractor for this kind of stuff ;)


AlloyDB is free during its preview phase (not sure how long that is).

https://cloud.google.com/alloydb/pricing#fair-usage-limits


I read GCP's policy and they say preview periods can last around 6 months, and I'm not sure when the AlloyDB preview started.

But even if there's a free period, it'd be complicated to develop stuff around the DB for free just for it to turn into hundreds of dollars after 6 months; that's not something we want to see happen. So an indefinite free tier with limited resources would be better, like AWS Lambda's 1M or Firebase Functions' 2M request free tiers.


Did you look at Crunchy Bridge? Not sure if they support that use case.


Took a look just now and they start at $35/month. They have some nice points around support, backup, and disaster recovery. But if that's the starting point, I'd prefer something like DigitalOcean, which has a similar product starting at $15.

Thanks for the tip though!


Postgres is mind-boggling, coming from SQLite. In a good way; both are amazing tools.

   with ordinality

   jsonb_*

   '3 minutes'::interval

   create index on (my_json ->> 'a key')
It's amazing how much stuff there is available. All the toys!


Just a quick point in defense of SQLite: that last one is almost verbatim possible in SQLite, and it is possible to calculate ordinals, although it's done with standard SQL rather than custom syntax. The SQLite docs mention that they never found a use case for jsonb that ended up being faster or more efficient than json, so they left it out, although they do reserve the BLOB data type for jsonb in case such a use case is discovered.
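For instance, the expression-index-on-JSON case works in SQLite through the built-in JSON1 functions, which are available in the SQLite bundled with recent Python releases (and SQLite 3.38+ even accepts the `->>` operator directly):

```python
import sqlite3

# Expression index over a JSON field in SQLite, roughly equivalent to the
# Postgres example above. Assumes a SQLite build with the JSON1 functions
# compiled in (the default in recent distributions).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE docs (data TEXT)")

# Index on the extracted JSON value rather than the whole document.
con.execute("CREATE INDEX idx_docs_key ON docs (json_extract(data, '$.key'))")

con.execute("""INSERT INTO docs VALUES ('{"key": "a"}'), ('{"key": "b"}')""")

# This predicate matches the indexed expression, so the planner can use it.
row = con.execute(
    "SELECT data FROM docs WHERE json_extract(data, '$.key') = 'a'"
).fetchone()
```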


Well this is a doozy: so you’re saying they are both equally awesome as opposed to being individually awesome in different ways.

What a time to be a developer.


From the teams page, the CEO of Neon is the cofounder of MemSQL/Singlestore which is one of the best database products I've used. Looks like a solid team to get this done. Very similar approach to Yugabyte (real postgres compute layer + custom scale out data layer) and many others in the OLAP space.


Manish Jain of dgraph.io noted that building on top of Postgres or betting on Postgres seems like a necessary condition for database startups to be successful.

Some are commoditizing Postgres' wire format but implementing their own query and storage layers (like CockroachDB / Aurora / AlloyDB), while others are modifying parts of Postgres (like Timescale / EdgeDB / YugaByte), and others still are building atop it (Supabase).

https://twitter.com/manishrjain/status/1496174276474732544


Interesting note, but that seems to be recency bias from the news more than anything concrete. Companies from MongoDB to Snowflake to FaunaDB have been successful. Manish himself is from Dgraph, which is a brand-new graph database with no relation to Postgres.


Thank you for the kind words Mani! Singlestore is indeed an amazing product and company. I'm really proud of it!


Nikita, CEO of Neon here. We intended to post this at the launch next month, but since it's here, I'm happy to answer any questions.

We have been hard at work and looking to open the service to the public soon.


Hey Nikita, could you maybe put some more legal information on the webpage?

I'm trying to find out if you're a company and where you are located. Is there no legal entity behind this? Do you have a privacy policy?


The company is Neon, Inc., which is registered in the USA. We're a remote company, with a significant portion of the developers being located in Europe.

Privacy policy and related stuff will be ready when we publish the public beta, which we expect to happen soon.


Yep, Delaware corp with top tier US investors. I'm in the Bay Area. Heikki is in Finland. Stas is in Cyprus. Majority of engineering is in Europe, some in the US and Canada.

Postgres is a global phenomenon.


How “cheap” is it to create new db instances?

I can imagine a world where it might be practical to have one master db for all of your customers/accounts. But a separate db instance for each customer’s data.

Is that the kind of architecture you think might be workable with your system?


It's cheap. The storage footprint is 15 MB and will be shrunk further. The minimum compute footprint is a 1-core container that shuts down when not used.

We are already working with customers that do that. This is for sure a great use case for Neon.


Won't the tiny compute units on AWS have relatively slow storage? (No NVMe allowed for them, I think.) Fine for small datasets that fit in RAM, but benchmarks are needed to show the bigger picture.


This is really exciting, and thank you for making it open source. I am still trying to wrap my head around Neon, but is there any design document or architecture description? I want to learn more about the Neon storage engine and how it all fits together.

Also, how do I get an invite code to try?

edit: found this to get started - https://neon.tech/docs/storage-engine/architecture-overview/


We will send you an invite code soon. This is a good start, and there are also RFCs on GitHub. We will be publishing more and more.


Would love one as well.


Hey Nikita! I was just looking at the docs but I was a bit confused about what the various compute instances were doing. Do they all serve reads and writes? If so, is there data partitioning or does this support distributed transactions?


Various compute instances are different endpoints to separate databases. So for now it's a single-writer system. You can get a lot of power out of a 128-core compute node. In the future we will also spin up extra compute to scale reads.

In the future after that future, we will introduce data partitioning. We have a cool design for it, but one step at a time.


Ah got it thanks! And what's the consistency on the instances that serve reads?

Super interested in this space since we're always looking for ways to evolve our pg!


Ignore this. I misread your previous reply ( ̄ー ̄;


Do you plan to solve for global data-at-the-edge availability? That to me is the killer feature for databases and one I’m direly in need of at work.


Yes, we are discussing either simply using Postgres replication to move data to other regions and using our proxy to route reads to the datacenter closest to the user (like fly.io). This will have issues supporting more than ~5 regions.

OR we can separate storage from replication and purpose build a multi-tenant replication service. This will support as many regions as you want (over 200) but it's more work. We will publish an RFC for that.
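The proxy-routing idea in the first option is simple enough to sketch (a toy, with hypothetical names): given measured latencies from the client to each region that holds a replica, reads go to the closest one.

```python
# Toy sketch of latency-based read routing: pick the replica region with
# the lowest measured latency to the client. Region names and latency
# numbers are made up for illustration.
def route_read(latency_ms, replica_regions):
    candidates = {r: latency_ms[r] for r in replica_regions if r in latency_ms}
    if not candidates:
        raise ValueError("no reachable replica region")
    return min(candidates, key=candidates.get)
```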


Cool stuff! Is PostGIS support difficult?


It's supported. The beauty of the architecture is that it doesn't break plugins.


Could you include it in the tech preview? https://neon.tech/docs/cloud/compatibility/


Sure, we will


Seems like this might implement database branching in the way most people would assume: branching both the data and schema? I remember being a bit disappointed to learn that PlanetScale's database "branching" was only for the schema [1], which is still quite useful, but this would be so much cooler!

I couldn't find much info about the replication models available/planned however. I would consider this to be table stakes at this point for a serverless database with the recent trend of pushing compute to the edge. This is much more interesting to me than scaling to 0, which is only really useful during the prototyping phase.

PlanetScale is single-primary with eventually consistent read replicas; Fauna has strongly consistent global writes (or regional if you choose, but no replication between regions if you do) with a write-latency cost; Dynamo/Cosmos are active-active with eventually consistent replication and fast writes globally. All useful in different scenarios, but I'd love to have one DB tech that can operate in all of these modes for different use cases within the same app, using the same programming model to interact with data across the board.

I think the decoupled storage engine here would open up some really interesting strategies around replication. What are the team's plans here?

[1] https://docs.planetscale.com/concepts/branching


Great questions!

1. Yes schema and data via "copy on write". This will let you instantly create test environments, backups, and run CI/CD. There is a long video here that shows a prototype with GitLab: https://www.youtube.com/watch?v=JVCN9X-vO1g&t=1s.

2. We don't have this feature at launch, but Matthias van de Meent is already working on it. We will publish an RFC and solicit comments from the community.

3. We are working on two: regional read replicas, and consistent multi-region writes (together with Dan Abadi, who helped design FaunaDB). The former is much, MUCH easier.

4. An obvious one is a time machine: we want to allow you to query at an LSN (or timestamp). A less obvious one is templates: you can start your project with a pre-populated database. We will allow you to create and publish such "templates". Disclaimer: it might not be called templates when we ship it.
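The copy-on-write branching in point 1 can be sketched in a few lines (a toy model, not Neon's storage code): a branch starts as an empty overlay over its parent, so creating it copies no data, reads fall through to the parent, and writes stay on the branch.

```python
# Minimal copy-on-write branching sketch: branch creation is O(1)
# regardless of database size, because only pages written after the
# branch point are stored on the branch itself.
class Branch:
    def __init__(self, parent=None):
        self.parent = parent
        self.pages = {}                 # only pages written on this branch

    def write(self, page_id, data):
        self.pages[page_id] = data

    def read(self, page_id):
        branch = self
        while branch is not None:       # fall through the ancestor chain
            if page_id in branch.pages:
                return branch.pages[page_id]
            branch = branch.parent
        raise KeyError(page_id)

main = Branch()
main.write("users", "v1")
ci = Branch(parent=main)    # "instant" branch for a CI run: nothing copied
ci.write("users", "v2")     # e.g. test a migration on the branch
```

The key property: the migration on `ci` never touches `main`, which is why per-PR test branches are cheap.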


For those unfamiliar, LSN is "Log Sequence Number", a pointer to a location in the WAL (Write-Ahead Log).

https://www.postgresql.org/docs/current/datatype-pg-lsn.html
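The textual form is two 32-bit hex halves of a 64-bit WAL position, e.g. '16/B374D848'. A small helper to convert between the text form and the underlying byte offset:

```python
# Convert between Postgres pg_lsn text form ('hi/lo' in hex) and the
# 64-bit WAL byte position it represents.
def lsn_to_int(lsn: str) -> int:
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def int_to_lsn(pos: int) -> str:
    return f"{pos >> 32:X}/{pos & 0xFFFFFFFF:X}"
```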


Amazing work by the team. Congrats, y'all. It was one of the best presentations at PGCon 2022.

I emailed Heikki the following questions, in case someone from Neon is around here.

a) How does Neon compare to polardb https://github.com/ApsaraDB/PolarDB-for-PostgreSQL.

b) The readme mentions a component "Repository - Neon storage implementation". Does it use any special FileSystem? Any links to read more about it?

c) Heard the cold start is a second (IIRC), how does that value differ if one runs Neon on bare metal instead of k8s?


Thank you!

a. PolarDB is based on a similar idea. https://www.cs.utah.edu/~lifeifei/papers/polardbserverless-s.... This paper describes it. The biggest difference I see glancing through the paper is that we really integrated S3 into the storage. In the Neon architecture, branches, backups, and checkpoints are all the same thing and are instant to run. This simplifies a good amount of database management AND delivers better costs. S3 is cheap.

b. Neon doesn't need a special filesystem. Neon storage is, in a way, a filesystem, but it doesn't expose a filesystem API. It's a key-value store that serves 8K pages to Postgres, plus a consensus-based update API into that key-value store. Pages are organized in LSM trees, and background processes upload layers of the LSM trees to S3.

c. The cold start is 2 sec right now. There is a dependency on K8s. A bare-metal implementation will require new code to orchestrate starts and stops.


> S3 is cheap.

S3 has its limitations, though: with too many small files, GET/DELETE/LIST ops get very expensive. There's also an upper limit on throughput per S3 bucket partition. I guess the SSTables that the pageserver flushes periodically help work around these issues?

> Neon storage is in a way a filesystem, however it doesn't expose filesystem API.

Genuinely curious: when would anyone consider using a filesystem like Amazon FSx for Lustre (which is backed by S3 anyway) over implementing a filesystem-esque abstraction of their own (like neon.tech does, and other solutions like rockset.com, tiledb.com, xata.io, and quickwit.io do)?

> Pages are organized in LSM trees and background processes put layers of the LSM trees to S3.

Curious how merges are handled? Also, are you using RocksDB / some other engine underneath?

> Bare metal implementation will require new code to orchestrate starts and stops.

Speaking of new code... SingleStore started as a very high-throughput OLTP database and eventually evolved into an HTAP (?) database. Do you see Neon evolving in a similar manner, too?

Thanks!


1. Yes. Our first attempt at a storage implementation had a problem with many small files. Then the team rearchitected it around LSM trees and it got a LOT better. Our benchmarks show that we are very close in performance to vanilla Postgres and Aurora. There are some "worst case" scenarios where Neon is worse than vanilla Postgres. Aurora has similar problems too.

2. It's best to custom build a storage system here. External distributed filesystems introduce complexity, cost, and bottlenecks that you don't control.

3. Purpose built. LSM trees also have a temporal dimension - LSN. You can fetch a page by pageId and LSN. This is what allows time machine and branching.

4. I call it convergence, when OLTP and OLAP are one system: the ultimate dream for a database systems engineer. Since I spent 10 years building that, I have both scars and aspirations. I think it will come, but it will take a long time. HTAP is in a way a subset of convergence; most systems will have some HTAP. Neon will have some too, but for now it's squarely focused on OLTP and helping developers build apps.
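The temporal dimension in point 3 can be illustrated with a toy versioned store (hypothetical names, not Neon's code): each page keeps versions keyed by LSN, and a read at LSN L returns the newest version written at or before L. That one lookup primitive is what makes time travel and branching possible.

```python
import bisect

# Toy versioned page store: GetPage@LSN returns the newest page version
# whose LSN is <= the requested LSN.
class VersionedStore:
    def __init__(self):
        self.lsns = {}    # page_id -> sorted list of LSNs with versions
        self.data = {}    # (page_id, lsn) -> page image

    def put(self, page_id, lsn, page):
        bisect.insort(self.lsns.setdefault(page_id, []), lsn)
        self.data[(page_id, lsn)] = page

    def get_page_at_lsn(self, page_id, lsn):
        lsns = self.lsns.get(page_id, [])
        # rightmost version with version_lsn <= lsn
        i = bisect.bisect_right(lsns, lsn)
        if i == 0:
            raise KeyError("no version of page at or before this LSN")
        return self.data[(page_id, lsns[i - 1])]
```

Reading "as of" an old LSN is the time machine; a branch is just a new line of writes starting from some LSN.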


The way you describe it, to me, is one of those “this sounds obvious in retrospect”. Sounds completely elegant and “right”. Congratulations on a great idea. I really hope you pull it off!


Thank you! We are super hard at work. You can see our velocity here: https://github.com/neondatabase/neon


> we really integrated S3 into the storage

Will it be possible to use something else in place of S3? I'm thinking on-premise or what some would call a private cloud.


Right now, it should be possible to use anything that is compatible with the S3 API, as our current focus is on getting the product to the market. Once the business model is proven, we'll likely branch out to other clouds, with their storage providers.

If you can't wait that long to run Neon on your own cloud, feel free to contribute an integration to your persistent blob storage: the code is available under APLv2 here: https://github.com/neondatabase/neon/


> serves 8k pages to Postgres

will page size be tunable on neon cloud for larger datasets?


No, Postgres only uses 8K pages. One can imagine adapting Neon storage to other engines; then, of course, this could be extended.


> It was one of the best presentations in the PGcon22.

I can't find it on Youtube, do you have the link?

edit: I found the link, seems it is not on the Youtube yet: https://www.pgcon.org/events/pgcon_2022/schedule/session/236...


I can't recommend this presentation enough!


> c) Heard the cold start is a second (IIRC), how does that value differ if one runs Neon on bare metal instead of k8s?

Yeah, as Nikita mentioned, it's 2 seconds now. We did some tests and measurements, and on bare metal it's usually sub-500 ms, so the remaining part is the k8s (+ our own control plane) orchestration overhead. For example, with plain Docker (which we use in CI in addition to k8s) it's already around 1 second.

K8s provides a convenient abstraction layer, though. So I think we'll continue using it; optimization will come with a pods pool / over-provisioning, and it'll be realistic to bring the startup time closer to bare metal.

-- Cloud engineer @ Neon
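The pods-pool / over-provisioning idea can be sketched like this (a toy model, names hypothetical): keep a few compute pods already started so "starting" a database is just handing one out, with a replacement spawned behind the scenes.

```python
from collections import deque

# Warm pool sketch: spawn() is the slow path (starting a pod); acquire()
# is the fast path users see. Refill is synchronous here for simplicity;
# in a real system it would happen in the background.
class WarmPool:
    def __init__(self, size, spawn):
        self.spawn = spawn
        self.pool = deque(spawn() for _ in range(size))

    def acquire(self):
        pod = self.pool.popleft() if self.pool else self.spawn()
        self.pool.append(self.spawn())      # keep the pool topped up
        return pod
```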


Why is this a good idea? In my experience, getting Postgres up and running is trivial. Docker, anyone? And in many cases your data is your business, so why hand it off? And if you are going to offer this product, why not just call it what it is, "Postgres as a service", instead of serverless, which seems a bit misleading? Really, it is simply Postgres running on a server.


Not everyone can manage a database properly and, sorry to say this, but Docker is a terrible idea for a database in general. Setting up your own database somewhere still puts your data on someone else's server, more or less.

All these "serverless" keywords pretty much mean you don't have to spin up servers (cloud) or set up and maintain one. Nothing is "serverless" per se, so it's time to move on from picking on this, I agree, bad choice of words.


> Docker is a terrible idea for a database in general

Why? Genuine question. My gut feeling is that there's something wrong with it too, but I can't put words to it, nor have I found a benchmark that convinced me. It's worth noting I'm not sure what I'm looking for.


I manage about 200 servers, and Docker crashing accounts for 20%+ of my issues so far. The servers are brought back up easily on crash, so that's not an issue for my services. For a database, Docker is nothing more than an extra layer of complications on top, with iptables, the volume system, and all the layers it brings. It's just a bad wrapper for a production database, which needs stability.


To be fair, that sounds like a problem with your infra. There are plenty of enterprises running Postgres in Docker/Kubernetes successfully. A lot of technical problems are not about the stack or the engineer's capability, but about company culture and a lack of funding, time, or resources. If you had more of all 3, I'm sure your problem could be "solved" to an acceptable margin of error. Likely by you.


I knew my bet to sticking with Postgres would pay off! This looks super exciting.

I thought of doing something similar for our data warehouse with AWS Fargate and Postgres but the cold starts and limited disk space required too much engineering on top to make it work.

Moving to Snowflake comes at the cost of losing so many Postgres features in exchange for speed: things like foreign keys, constraints, extensions, etc., which require so much engineering to replace in Snowflake. I would be happy to pay 25x the price for a 10x speed increase on a specific query.


Thank you!

Snowflake is a better cloud data warehouse than Postgres, but of course Postgres is very versatile. Neon will give you some of the Snowflake features: time machine, cloning (we call this branching), data sharing.


Snowflake is a data warehouse though. Completely different use case.

If your data can be handled by PG, I highly recommend that over SF. Especially with this concept.

Snowflake is great when you use a tool like dbt; their modern SQL approach and functions are fantastic. The downside is that it's pretty pricey and can catch you out.


We already use dbt and the data is less than 10 TB, something Postgres can handle well. And most of the data is concentrated in a few tables. With a serverless approach, I'd be happy to allocate 10x resources for just a query or two, and for the rest a minimal server is fine.

I manage the data warehouse mostly alone because Postgres offers guarantees: unique constraints, triggers, and relationships between columns of different tables. It does the work of two engineers. Snowflake is fast but not Postgres-compatible. To move to Snowflake, I'd have to write tests and maintain them, which Postgres does for me for free.

I'd stick with Postgres at least until 20TB before considering Snowflake.


Snowflake is an OLAP system. It's an entirely different kind of "speed" designed for analyzing vast amounts of data through scans and aggregations.


Neon now opens the door (at least in my mind) for Postgres to be used for analytics or as a data warehouse for almost an order of magnitude more data before having to consider Snowflake.

Basically if someone is already using Postgres as a warehouse, then they can prolong their migration to Snowflake by at least a year by using something like Neon.


Sure but there are plenty of OLAP solutions like Greenplum, and extension-based offerings like Citus and Timescale, that can all partition and scale across nodes to massive datasets with column-oriented storage.

AWS Redshift is also built on Postgres (although a much older and customized version).


I hope y’all have a plan for when AWS decides to pick up your open source project and turn it into a managed cloud solution. It’s a pattern of theirs. And with the way egress charges are structured they’re likely to snap up any clients straddling their cloud and yours.


AWS already has Aurora, which is their own in-house closed-source variant that does very similar things.

We think we'll be able to provide a better experience at lower cost for smaller developers, while having some very useful quality-of-life features like zero-cost branching and instant PITR.


AWS already has this; search for Aurora Serverless. It is not cheap though, and this might well be cheaper.


AWS already has Aurora Serverless v2 for PostgreSQL, along with RDS for PostgreSQL. It doesn't scale to zero, though; the floor is around $40/month.


I am trying to understand how it works without digging into the code. It sounds like the disk-backed storage here uses S3 which would introduce some severe latency as well as orders of magnitude more access errors (S3 is not going to be more reliable than EBS, let alone physical disk arrays on a day to day basis). Also how do they mitigate latency from their network to mine? In other words why would I run this over a local install if performance mattered at all to me?


How it works is:

PostgreSQL WAL is sent to 3 'Safekeeper' nodes, which provide temporary persistence of WAL on their local disks. This allows us to provide low commit latencies.

After Safekeepers acknowledge the WAL, a PageServer will receive the WAL from these Safekeepers and transform it into LSM-tree "Layers" - blocks of lookup-optimized changelogs, which (when complete) are sent to S3. At that point, the data is considered fully persisted against most, if not all, outages.

The PageServer (which serves as the long-term data server for the running compute nodes) maintains a local cache of Layers. Still, by design, that is only a cache -- it allows for fast responses but is not strictly necessary for the persistence model.
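The write path described above can be caricatured like this (a toy sketch with invented names, not Neon's actual code): a commit is acknowledged once a quorum of Safekeepers has the WAL on local disk, while the upload of layer files to S3 happens asynchronously afterwards.

```python
# Toy model of the write path: WAL -> safekeepers (quorum ack) -> S3.
# All names are invented for illustration; this is not Neon's code.

QUORUM = 2  # 2 out of 3 safekeepers must persist the record

class Safekeeper:
    def __init__(self):
        self.wal = []            # stands in for WAL on a local disk

    def append(self, record):
        self.wal.append(record)  # ack after a (simulated) local fsync
        return True

def commit(record, safekeepers):
    # Low commit latency: acknowledged once a quorum has it on disk.
    acks = sum(1 for sk in safekeepers if sk.append(record))
    return acks >= QUORUM

def upload_layer(wal_buffer, s3):
    # Later, accumulated WAL is turned into a lookup-optimized layer
    # and shipped to S3; only then is it fully persisted long-term.
    s3.append(list(wal_buffer))
    wal_buffer.clear()

safekeepers = [Safekeeper() for _ in range(3)]
assert commit({"lsn": 1, "op": "insert"}, safekeepers)

# The pageserver consumes acknowledged WAL and ships layers to S3.
pageserver_buffer = list(safekeepers[0].wal)
s3 = []
upload_layer(pageserver_buffer, s3)
```

The key design point this illustrates: the expensive, slow durability tier (S3) is kept off the commit path entirely; only the quorum of fast local disks gates the client's latency.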


Again, I am super impressed with the technology involved, but I do want to clarify: in order to have the D in ACID, the update must be sent to S3, right? Is there a mode which makes it so that an INSERT, UPDATE, or DELETE does not return until this happens? What kind of latency does that introduce, and is that latency affected by throughput at all?


Kind of. S3 is the long-term low-cost durability guarantee, while our Safekeepers (3, each in a different zone) provide a high-cost short-term durability guarantee with their local persistent disks.

Latency from PostgreSQL WAL to S3 depends on WAL throughput and the configured pageserver checkpoint distance (default 256MB, and this config field is not equal to that of PostgreSQL).
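A rough back-of-envelope (with hypothetical throughput numbers) of what that dependence looks like: the worst-case time before a given WAL record reaches S3 is bounded by the checkpoint distance divided by the WAL throughput.

```python
# Back-of-envelope: with a 256 MB checkpoint distance, the worst-case
# lag before a WAL record lands in an S3 layer scales inversely with
# WAL throughput. The throughput numbers below are hypothetical.

CHECKPOINT_DISTANCE_MB = 256

def worst_case_lag_seconds(wal_mb_per_sec):
    return CHECKPOINT_DISTANCE_MB / wal_mb_per_sec

heavy = worst_case_lag_seconds(32)   # busy system: a layer fills in ~8 s
light = worst_case_lag_seconds(0.5)  # quiet system: up to ~512 s of lag
```

This is why the Safekeepers matter: during that window the data's durability rests on the three-zone quorum, not on S3.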


When you say short term, do you mean for hot data or that the guarantee is short term? As in, once it is written to the Safekeepers, is there any chance that the data will disappear?


We keep it there for a short duration, until the changes are confirmed to also be written to S3.

Writing to 3 instances in 3 availability zones is considered persistent enough while still maintaining high performance. Even though it does not provide the 11 9s of durability that S3 has, losing 3 availability zones along with all instance-local storage is considered rare enough that we do not think it will impact our availability and durability guarantees.


That makes sense, thank you! Sounds pretty damn robust.


Many distributed systems offer ACID by using distribution + replication for the initial write commit.

It's much faster and cheaper to just have your data on multiple nodes (RAM or local disk) and provides better reliability against crashes. Data can then be compacted and streamed out in an async fashion to more durable storage.


Can we set our own logical replication? I'm listening to the WAL the same way Supabase does it: https://github.com/cpursley/walex#logical-replication


In our cloud offering logical replication is not yet supported. There's an open epic to support receiving a logical replication stream [0], but we've not yet checked on sending logical or physical replication streams.

[0] https://github.com/neondatabase/neon/issues/1632


Everything you've described sounds unbelievably awesome - this is the only thing I can think of that would make it better. If we can get all of these features (I've been dreaming of branching for years, scale to zero!, unlimited storage - just not having to think ahead about it!) plus easy, even on by default logical replication, you will have found the holy grail of cloud databases imo.


Well, you can also run this as a local install if you want to eliminate network latency. :-)


Short answer: page servers and cache on compute nodes hide S3 latency. An architecture description is here: https://neon.tech/docs/storage-engine/architecture-overview/. We will be publishing more on that.


How fast can a "scaled to zero" database start up? Does Neon use an "uninitialized hot spare" strategy to reduce startup latency like CRDB?

How much memory do they expect a typical single postgresql compute instance to take? I saw that Neon is targeting 'thousands' of postgresql processes per server, though with giant multi-TB servers these days that doesn't really narrow it down.

Are the postgresql processes multi-tenant as well, or is multitenancy isolated to the storage layer?

---

Heikki from the Neon team presented a talk at Rust Finland 2022 about why they chose to develop Neon in Rust and what their experience was: https://www.youtube.com/watch?v=kAQeout-mh8


Postgres processes are single tenant. Right now, both provisioning a new Postgres instance (we call it a project) and cold start take 2 seconds. We will be improving on that.


https://youtu.be/kAQeout-mh8?t=700

Nobody knew Rust, so they started out by hiring someone who did. Good move.

Business idea: consultancy that hires out competent Rust devs to new projects.


Are they going to stay up-to-date with the latest version of Postgres? One problem with Yugabyte, TimescaleDB, Aurora etc. is that they are stuck on older versions of Postgres, which makes it feel like an entirely different product after a few years.


"One of these things is not like the other"

TimescaleDB, because it is packaged as a PostgreSQL extension (and not a fork, unlike the others), stays compatible with mainline PostgreSQL, especially as PostgreSQL improves. This is one of the key advantages of our approach.

(Timescale co-founder)


We are planning on supporting the latest stable version of PostgreSQL. Right now, we're a bit behind (we're at 14.1, latest PostgreSQL is 14.3) but that shouldn't be much of an issue.

We don't yet know how we're going to do major version migrations, as the product is still not even out of private beta.


"how will this handle major version upgrades" was one of my first/biggest questions when reading through the homepage fwiw


Major versions don't come out suddenly. We have time to test and make sure they work well with our storage.

Customers don't always want to upgrade to a new version when the old version just works. However, this can lead to version creep. Ideally we want to always run the latest version of Postgres. We will hold this line as long as possible.


Sorry, I meant less "how long until they catch up to trunk" and more "what is the downtime / operational burden of performing major upgrades like".

With separate storage and compute it seems tantalizingly possible to have low-effort zero downtime upgrades for the user, which would be a huge selling point.


Having a relational database where you're charged purely for the calls you make is a game-changer.

All of the relational databases I looked at in the past required you to have a gateway node on at all times, which is far too expensive for a simple hobby project.


In case anyone else is wondering, it was called Zenith / ZenithDB before launch: https://github.com/neondatabase/zenith


Neon is a better name!


This is really amazing, super excited to try it out.

I read the docs and noticed you can run it locally, but have the Kubernetes bits been made available? I see https://github.com/neondatabase/helm-charts and https://github.com/neondatabase/neon/tree/main/.circleci/hel... but I think there are some charts missing?


Correct: we do not yet use k8s for provisioning the Safekeepers and PageServers for our closed-beta cloud offering, and the PostgreSQL instances are managed in k8s by our closed-source console. As such, there's little we can open-source at the orchestration level at this point in time.


Is there a timeline on releasing orchestration things?


> Branches are virtually free and implemented using the "copy of write" technique.

Copy on write, presumably.


Fixed, thank you!


This sounds really interesting! I wonder what kind of scaling use cases neon is good for. Is it e.g. good for custom scenarios like a geospatial timeseries database on top of postgres?

We admittedly don't really have a clue about current database cluster tech, as we are IoT/ML researchers, but we are running a custom TimescaleDB cluster that receives constant non-chunked write load from a lot of devices and may encounter long-running queries on an around 500GB DB filled with geodata (even timing out if users are too creative). That is why we split it into a single ingress master and multiple WAL read-only replicated query clients, to relax the consistency and sync that seemed to be killing us (we need Postgres because of PostGIS and have no capacity to rewrite the front-end). I wonder if Neon would be good for such a use case, and whether it easily supports Postgres extensions like TimescaleDB hypertables and PostGIS. Most of the time our system just ingests measurements, but sometimes we really need to scale up for PoCs, which makes dimensioning really hard (for us).


Neon has the ability to make a copy-on-write replica of a database, which would allow you to create an instant read-only copy of the data that has been ingested up to that point, without significant storage overhead. The new data would still be written to the primary database, and long-running queries would only see the snapshot that the instance was started with (using their own pool of CPU and memory resources).

Assuming that the extensions that you use are compatible (that is, they don't access database files in a way that PostgreSQL doesn't, and the licence is compatible) then Neon could be a good solution to that issue.
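A minimal sketch of the copy-on-write idea behind such a replica (the structure here is invented for illustration): the branch shares the parent's pages until it writes its own, so creating it costs almost nothing, and the parent's snapshot stays stable under long-running queries.

```python
# Minimal copy-on-write branching sketch (invented structure, not
# Neon's implementation). A branch reads through to its parent's
# pages until it has written its own copy of a page.

class Branch:
    def __init__(self, parent_pages):
        self.parent = parent_pages   # shared, treated as read-only
        self.delta = {}              # pages this branch has rewritten

    def read(self, page_no):
        # Prefer the branch's own copy; fall back to the parent.
        return self.delta.get(page_no, self.parent.get(page_no))

    def write(self, page_no, data):
        self.delta[page_no] = data   # copy on write: parent untouched

main = {0: "row v1", 1: "row v2"}
replica = Branch(main)
replica.write(0, "row v1'")
assert replica.read(0) == "row v1'"  # branch sees its own write
assert main[0] == "row v1"           # parent snapshot is unchanged
```

The cost of creating a branch is independent of database size, which is why such branches can be described as "virtually free."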


Nikita, I am a huge fan of SingleStore - amazing what you’ve accomplished there and now what you’re building at Neon.

Do you have plans to build a columnstore index on top of Postgres that supports insert/update/delete?

Love how MSSQL has a columnstore index for a subset of columns on a row store table.

Always wondered why nobody has built something like that for postgres yet.

Citus is nice but it’s append only, which is a huge restriction.


We are not building it in-house right now; we are focusing on OLTP. But we have plenty of ideas for how to do it in the future. In the first iteration we are thinking about an integration with SingleStore, Snowflake, and BigQuery. SingleStore will be best due to low-latency updates, but the other two are very popular, so integration with Snowflake and BigQuery just makes sense.

In the future we can integrate a columnstore right into the engine to make a smooth single-system experience. There are some awesome open source implementations: Arrow and DuckDB. Updatability is tricky but doable, as proven by SingleStore and SQL Server (I'm ex-SQL Server and a huge fan of this feature). Not this year.


Makes sense. I agree.

DuckDB is pretty phenomenal. I enjoyed reading its source code and playing around with it.

Another open source nice C++ codebase is Typesense for Hyperfast text search (algolia competitor).

It’s been on my mind for many months how to build indexing like this as postgres extensions.

I love how versatile postgres is with so many indexing datastructures.


Again, it is a big commitment to do a good job implementing columnstores. If you don't do it all the way, it's not very usable, and you're confusing your users by giving them too many options. The performance expectation is now set by great columnstore implementations, and you just can't afford a half-assed job here.

Here is an example from Google AlloyDB: https://twitter.com/mim_djo/status/1527900193626025984. My understanding is that DuckDB is even faster on the TPC-H benchmark. TPC-DS is much harder, and I doubt AlloyDB can even run it at any reasonable scale.


Well this seems like a big deal. "Needs a database" has been an ongoing pain point for working with serverless apps. Staying in the postgres command library, if it works, means you can now just prototype and assume "postgres will be there".

I have a couple of apps I would've used this exact service for in the recent past. Looking forward to trying it in the future.


This sounds awesome, but one of my first reactions to the notion of separated compute/storage Postgres with copy-on-write is "okay, so… slow Postgres?"

Is there anything the dev team can share on read/write performance compared to RDS, NVMe EC2 instances, EBS-backed EC2, etc? In what situations would this setup perform poorly, and in what situations would it excel?


I am glad someone is taking on this challenge! It's been a few years since I started saying that the last piece of the serverless puzzle is a good serverless Postgres database.

If I was not working on my startup I would apply for sure. It would be nice to present the project on the CMU database group youtube channel at some point to dive into the implementation.


I'll work with Andy Pavlo, who is a friend, to set up a presentation.

And yes, we are hiring! So if and when you are ready let us know.


Can you use extensions with it like normal postgres?


Sure. So far, we have just precompiled a few popular extensions, and they are available for installation. Ultimately we want to provide an option to bring your own extensions, e.g., by specifying a Docker image based on our base image. But that requires some work on the security front: with a custom extension, you have access to the corresponding Unix user and can construct malicious WAL, send requests to the control plane, etc.

Disclaimer: Neon co-founder


I'm guessing no, or at most just have set of extensions that you can use (kind of like it's done in AWS RDS or AWS Aurora).

The claim is "serverless", i.e., you don't have access to the server, and if you could install any extension you'd essentially have full access to the server, as there's no restriction on what you could do in an extension. I don't think that would be allowed.


https://neon.tech/docs/cloud/compatibility/ says "During technical preview Neon has restrictions on user ability to install PostgreSQL extensions. Following PostgreSQL extensions come pre-installed: [..]"


What’s the licensing story? Can I run this in my org and expose this as a service to users outside of my org?

Or does it have strings attached, like CockroachDB?

Asking because github says Apache 2 but the devil is in details.


Yep, for now the storage engine and the modifications to PostgreSQL are licensed under Apache 2.0. Our cloud offering, which orchestrates the scale-to-0 and other glue between the components, is closed-source and not covered by that license.

Do note however that the licensing story is not entirely fleshed out yet, as the product we're building is still in closed beta. As we work towards a paid cloud offering, we'll further flesh out the license model for the code, but for now we're planning on keeping this license.


You can, no strings attached


Awesome, will take it for a spin.


How do I keep my application servers physically close to Neon?


Right now, we have our hosting in AWS (us-west-2).

We're planning on expanding to other regions and cloud providers eventually, though.


Having data in Europe will be important for many users, hope you expand to EU soonish.


I'm a bit concerned that the free trial mentions "compute up to 1 vCPU / 256 MB"

Why would I need to worry about this for a serverless database provider?


We are limiting the amount of compute we are giving for free during the technical preview. We will open it up to much larger computes later; those will be in the paid tier.


That sounds good, but my question was (and I probably phrased it poorly), as a solo developer, why would I need to concern myself with resources associated with servers (compute, CPUs, MBs of RAM) on a serverless platform?

I could maybe see it in the sense if it's only applicable when Postgres is executing complex custom queries or whatever, but for any operation? That's a huge red flag, for me at least.

When I think of serverless, I think of things like Cloudflare Workers, FaunaDB, Ably.io, all of which have pricing based on the usage of their features, not the consumption of their resources. The whole point being that I can way more easily calculate how many messages my users are sending, than how much CPU all those messages are taking to send.

Maybe I'm operating on the wrong definition of serverless, and all of the other features look amazing, but that concept is really a dealbreaker.


FYI, the image at the bottom says "Zenith's prices" instead of "Neon's prices"

https://neon.tech/static/saas-illustration-lg-410ada378df755...


Fixed. Thank you for catching it. Renaming a company is hard, even pre-launch :)


I know the pain :)

Good luck, this is a very exciting project. I'm extremely curious to see how it unfolds…


This is really interesting. Are there restrictions or limitations on the PageServer + Safekeeper design when running OLAP queries on larger datasets?

Phrased another way, would a query that needs to access a relatively large amount of data (10-100 GB) ever need to read from s3, incurring extra latency?


In general, all data of live clusters will also be stored locally at a PageServer.

Only in recovery scenarios will a PageServer not hold the data that is needed to serve the requests of a compute node - but that would recover quickly as the local cache of the PageServer is repopulated with data from S3.


Awesome, thank you!


Does this storage layer still require VACUUMing? The Aurora system that AWS has uses a log based storage system, so along with point in time restore there’s no longer a need to go free up old garbage rows. This system says it’s copy on write, wondering if vacuuming is still necessary.


The difficulty comes from the fact that VACUUM requires most of the same state that is required when processing a user transaction. For example, VACUUM needs the commit log to check whether a given transaction ID is for a transaction that committed or aborted (VACUUM also truncates the commit log to remove information about older XIDs when it is no longer required).

Aurora Postgres does still have VACUUM, which seems to work in the same way -- as it does in Neon. AWS have in the past promoted Aurora as having significantly better performance characteristics when VACUUM is run. That may well be true, but the benefits seem to come from not having to generate full page writes in WAL, which are a way of preventing a low-level problem called torn pages. Theoretically you could just turn off full-page writes in standard Postgres if you had hardware that offered atomic page writes, though I don't think that it's a widespread practice.

I spend most of my time working on problems with VACUUM in Postgres itself (I'm one of the committers). An approach to organizing storage within transactional constraints seems necessary to push vacuuming down to storage, and that would more than likely need plenty of work in Postgres itself to be practical -- since it would cross a few layers of abstraction. Currently heapam doesn't specifically try to keep tuples inserted around the same time together, so it's hard to make anything that VACUUM does work implicitly or logically.
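The commit-log dependency mentioned above can be caricatured like this (a toy model, not Postgres internals): a tuple stamped with a deleting XID is only reclaimable once that transaction is known to have committed and no snapshot can still see it.

```python
# Toy illustration of why VACUUM needs the commit log (clog): a
# delete stamps a tuple's xmax with the deleter's XID, and the tuple
# is only dead if that transaction actually committed AND no active
# snapshot could still see the old version. Not Postgres internals.

clog = {}  # xid -> "committed" | "aborted"

def tuple_is_dead(tup, oldest_active_xid):
    xmax = tup.get("xmax")
    if xmax is None:
        return False                  # never deleted
    if clog.get(xmax) != "committed":
        return False                  # deleter aborted or still running
    return xmax < oldest_active_xid   # no snapshot can still see it

clog[100] = "committed"
clog[101] = "aborted"
assert tuple_is_dead({"xmin": 50, "xmax": 100}, oldest_active_xid=200)
assert not tuple_is_dead({"xmin": 50, "xmax": 101}, oldest_active_xid=200)
```

This is the sense in which VACUUM needs "most of the same state" as a user transaction: the visibility check is the same machinery either way.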

Disclaimer: I work for Neon


Thanks for the explanation. I'm nowhere near your experience level with this, but from what I understand, Aurora is an append-only log with rolling snapshots (this might be wrong). In an append-only log storage system there would be no concept of page writes, right? The VACUUM process might still be necessary to find and mark dead tuples, but this seems very minor compared to the filesystem operations needed to reclaim (sometimes gigabytes of) storage. The reclamation of storage would simply be left to the background snapshotting process and would not affect DB performance in any way.


The most important job of VACUUM is usually to make sure that queries remain responsive -- it's roughly comparable to a "full scan" within generational garbage collectors (for managed language runtimes). VACUUM does reclaim space in various ways, but that's often considered to be of secondary importance.

In Postgres, the on-disk representation is virtually the same thing as the in-memory representation used by pages stored in the buffer cache. In a system like Aurora or Neon, the representation of pages in the buffer cache is identical to that of Postgres (or has very minimal divergence to deal with one or two isolated problems). That part doesn't really change, which makes it possible to have a very deep level of compatibility without much effort. So it's radically different in one narrow, scoped way, but otherwise very similar.

While the storage knows how to materialize pages on demand, these are not transactionally consistent pages -- they often need to be interpreted by using metadata about transaction commit status, just like in Postgres.


Yes, it still requires VACUUM. I believe Aurora does too.

It would be cool to push at least parts of VACUUM down to the storage layer, but it would require more invasive changes to the Postgres code, which we try to avoid. Maybe in the future. Ideally, though, I'd like to improve PostgreSQL itself to reduce the need for VACUUM in the first place.

Disclaimer: Neon co-founder


What are your thoughts on OrioleDB? It's a slightly modified Postgres that does away with the need for VACUUM, amongst other changes, ostensibly done by a dev who is "solving Postgres' wicked problems". Eventually the team behind OrioleDB plans to upstream their minor changes to vanilla Postgres such that OrioleDB becomes simply an extension of Postgres itself.

Would you ever consider supporting the orioleDB extension in the future?


If it becomes more widespread and mature, yes. We are in contact with Alexander Korotkov, a Postgres committer and the author of OrioleDB, and compare notes from time to time. Right now it makes a lot more sense to us to build our storage at the page level and have minimal changes in Postgres itself.


This is exactly what I was looking for for Windmill, an OSS tool for multi-step automation from scripts. By any chance, is documentation for hosting it on top of Nomad on the roadmap? If yes, I would try it immediately to replace the Postgres server we currently use.


No documentation on how to self-host yet. It's not hard; it requires K8s and an S3-compatible object store. We want to harden Neon operationally before documenting and supporting on-prem production installs.


Requires k8s as in - uses k8s APIs? Or is supported on k8s?

Apologies haven’t read the docs but wanted to highlight this. A hard requirement on k8s to the exclusion of other schedulers would be a shame and an odd choice


We've built our current cloud offering on k8s, and that is our only scale-to-0 implementation.

That does not mean that you cannot run Neon outside k8s, but we are not actively maintaining nor supporting other hosting options.


Got it, makes sense. I think we are pursuing similar strategies with OSS/self-hostable products. I would love to get in touch with you about it. Could you drop me an email at ruben@windmill.dev ?


Replied


that's one of the best lookin landing pages i've seen this year!

very well executed! congrats!


Thank you. PixelPoint (https://pixelpoint.io/) designed the landing page. The copy may change; we want to make the story clearer and highlight more features.


Could you document some of your differentiation against Aurora, both on price and architecture? I don't care about scale to 0. I care more about scale to NNN, efficiently and reliably, with minimal devops needs.


We will put more of the Aurora differentiation docs up in a bit. Architecture-wise we are slightly different because of the S3 integration and splitting storage into Safekeepers and PageServers. This is very technical, and we will be highlighting both this and user-visible features, such as branching.


Incredibly good sales copy by someone who understands their target market.


I wonder how well this will work with PostgREST. Looks very interesting.


We will be launching a REST API with PostgREST later this year.


Why not use normal networked storage (nbd/drbd, AWS EBS, etc.) instead of the ad-hoc Pageserver + Safekeeper architecture?

Or even better, use simple local SSD/HDD storage where data is small enough.


Because all of those don't or can't have both low-latency reads and regional persistence guarantees, which is what we're after.


don't know too much about this yet but i've been desperately wanting there to exist a postgres equivalent to planetscale. Bravo guys! I'll keep you guys in mind for my next mvp.

PS: btw why call it Neon? that's already the name of the rust/nodejs interop library, which I assume you know about since your storage layer is written in rust.

PS: imagine a collab between this and fly.io. I wish this stuff was available when I was starting Blinq


We already have a shared Slack channel with Kurt, the CEO of Fly.io, and a number of folks from Fly.io. We are discussing a partnership.


thats awesome. what will the pricing be when its done? my startup currently spends around $400-500/month on aurora. would be curious what the differences would be here in terms of TCO.

the ability to branch databases alone is a pretty compelling killer feature imho


How exciting!

What version of Postgres are they targeting? And do they have a strategy for keeping up to date with new versions?


One idea we have is to allow running future unreleased and beta versions of Postgres. Do people want this?


Right now, we're based on PG14 (14.1 to be precise).

We're looking into supporting PG15 when that comes out too, but I wouldn't hold my breath on that as we still have a lot to do.


Says 14 in the docs


Does this handle sharding? Or is it like ordinary Postgres, which only performs replication?


It is ordinary PostgreSQL, in which only the storage interface with the file system is replaced with our own storage interface.


Anyone got a spare invite code? =D


Same here please!


same here ^_^


We got a lot of "join waitlist" submissions. Thank you, HackerNews, for this!

Please email beta@neon.tech for the invite code. We will ask for feedback in return.


I need some kind of pricing for me to consider this. Did I miss the page describing that?


We haven't launched pricing yet. In fact our official launch is in a few weeks. This HN post "leaked". We will launch, allow people to use it for free and introduce pricing later this year.


The site just invites you to a trial run. Is there any basic doc, tutorial, etc.? I come from DB2 …


The docs are available on https://neon.tech/docs/cloud/about/ . It contains some information, but we're looking to expand that while working towards a public beta that's arriving soon.


In what way is this serverless? Surely it's running on a server?


It refers to management and autoscaling, not to whether physical hardware exists somewhere. From your point of view it has DNS entries, you've defined how high/low it's allowed to scale and some other configs, you've defined network access rules, and it accepts client connections. That's it. You don't manage the server instances. Hence: "serverless"


Thank you for explaining. There must be a better term than serverless though. "Auto-scaled cloud database" would be more accurate, for example


Cool! Can it be used in production already?


We are planning to open public tech preview next month. We'll go to general availability when we see a decent SLO for several months. Tech preview should be okay for hobby projects and staging workloads, though.


We cannot yet recommend that users run their critical workloads on Neon.


"We separated storage and compute"

No. just no.

Compute where the data lives, else you incur traffic cost and latency.

A lesson learned in life, known from the dawn of time..


Do note that many installations of PostgreSQL already have some form of "separation of storage and compute" through a networked storage solution like NFS, EBS, or other SAN-like systems.

The major part of what Neon does is remove the file system abstraction that is between that Storage and Compute, so that we can better utilize the available resources because we can better select what information is or isn't being lost.

A good example of what removing the file system abstraction enables for us is effectively free PITR, (lagging) replicas, and data branching. This is because PostgreSQL's file-system-based storage engine expects to be the only one working on the data directory, which means that any FS attached to a replica cannot be shared. If you remove that file-system based storage engine and plug in a different storage engine, those expectations are removed too, and after some effort integrating into the smgr-APIs, we're now able to provide a storage layer that only needs to contain one copy of the data for N physical replicas, instead of N copies.
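The page-level idea behind all of this can be sketched as follows (names invented for illustration): the storage engine materializes a page at a requested LSN from a base image plus the WAL records that touched it, which is what lets PITR, lagging replicas, and branches all read from a single shared copy of the data.

```python
# Sketch of page materialization at an LSN (invented names, not the
# real smgr API). A page at target_lsn = base image + the WAL records
# up to that LSN; different readers just ask for different LSNs.

def materialize(base_image, wal_records, target_lsn):
    page = list(base_image)
    for rec in wal_records:          # assumed sorted by LSN
        if rec["lsn"] > target_lsn:
            break                    # a lagging replica or branch just
        page[rec["offset"]] = rec["byte"]  # requests an older target_lsn
    return bytes(page)

base = b"\x00" * 4
wal = [{"lsn": 1, "offset": 0, "byte": 0xAA},
       {"lsn": 2, "offset": 1, "byte": 0xBB}]
assert materialize(base, wal, target_lsn=1) == b"\xaa\x00\x00\x00"
assert materialize(base, wal, target_lsn=2) == b"\xaa\xbb\x00\x00"
```

Because the LSN is just a parameter of the read, serving N replicas or a point-in-time view doesn't require N copies of the data directory.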


The fact that some systems do it doesn't mean it is correct or optimal.

The latency of NFS & EBS or EFS is actually the reason many businesses *do not* use them for their databases.

I've seen deployments that had to go bare-metal because the tiny latency of EBS caused their compute times to rise exponentially (doing AI training).


I agree that it does have its limitations, but databases are all about trade-offs.

Talking about the "correctness" of a choice between tradeoffs is weird though. Running your database on hard drives nowadays is not a great choice - yet people still did that because the cost of hard drives was way cheaper than that of memory.

Running your database in a way that doesn't guarantee that a "committed" response actually retains the changes it was responding to is risky, yet people still ran their databases in such configurations to scrape out the last bit of performance.

All Neon does is put up another option: If you don't mind the implications of networked storage, then here's one database system that has zero-cost cloning and does not lose data on single-node failure.


This isn't binary as much as it's a continuum of abstractions already. Cloud servers already use virtualized disks where the actual drives can be in an entirely different building from the compute racks. Do you consider that separate?

Separating storage and compute at the database nodes is just one architecture method that allows for simpler and more efficient scaling in modern cloud-based deployments where you have large pools object storage that's fast and close. I think you'd be surprised by what you can achieve with some metadata and caching.


What? This isn't controversial, Snowflake did this years ago and this is the main reason it became a $40B business.

Amazon came out with Redshift, a cloud OLAP database, but it tied compute to data, so teams couldn't scale compute and data separately and thus had to pay costs disproportionate to their required workloads.

Sure, there's good reasons to keep compute and data together. But there is obviously a massive market for technologies that keep them separate...


Snowflake is a BI data platform, which doesn't require the low latency and fast response times that systems powered by relational databases do.

Apples and oranges.


I believe you were the one who implicitly started the apples and oranges comparison with your original post

> "no. Just no."

Conveys the feeling that there is no scenario where doing this is feasible, yet you yourself just said it's acceptable for BI but not for AI training.

So the "no. Just no." actually means, "my use case does not allow for this, so I believe no one else should use this" which is a fallacy on its own.

In conclusion, this is fairly usable, but like everything else, it's not for all use cases.


An easy trade-off to get scalability. Writing the log off-host is fast with a 10G network and NVMe on the other side (see Amazon Aurora, Microsoft Hyperscale, etc.).


The goal of the "cloud" is to extract maximum $$$ from its users. Once you realise that, it's not hard to see through the marketing propaganda and the decisions that follow from it.

> else you incur traffic cost and latency

Thus, they are separated so they can charge you (more) for them separately.


Zero bandwidth cost within the same zone.


What is “bottom-less” storage? This just seems like a managed hosting of PostgreSQL, nothing else…


We wrote a custom storage layer for Postgres, so in our setup a Postgres node (a k8s pod, actually) doesn't store any data, and it is easy to start/stop/reschedule. So while we actually are a DBaaS, the closest analogy here is Aurora or AlloyDB, not RDS-like setups.

A bit more detail is at https://github.com/neondatabase/neon


Typically it means S3 and the like: not being bounded by the disk size of your server.

Or in their case they might be referring to the horizontal scalability of their storage layer, which is independent of compute.


Yes, you will never run out of storage with Neon. The reason is that in our tiered storage implementation S3 is the cheapest tier, and we offload data to S3 if it gets too big.



Our Postgres is just a cluster of Postgres servers. It's not "serverless"; it's server-y. :)


Typo in footer 'Maid in SF and the World' should be 'Made in SF and the World'


Fixed - sorry, sleepless night.


Please stop this abuse of language.

Here's serverless sqlite: https://www.sqlite.org/serverless.html


People don't say "serverless" to mean "there are no servers involved." They say "serverless" to mean that the data architecture is such that:

1. scaling does not require thinking about servers; and

2. you don't have to pay for committed capacity measured in servers.


Then why not call it “scalable”, “elastic”? All these neologisms are tiring.


Also, simpler answer than I already gave: “elastic” and “scalable” X don’t imply a lack of committed resources, but rather just a control plane that auto-scales your commitment to ensure that you’ve always got more than you need.

E.g. Amazon Elastic File System: an auto-resizing NAS-like abstraction, but with a committed size at any given point, where you're paying for unused space.

Vs. Amazon S3 (which might as well stand for “serverless storage service”) — where you only pay for what you use, with no committed capacity beyond current usage.


Marketing doesn't care about your personal preferences :)

And marketing is king for bottom-up SaaS businesses (businesses selling to individual developers), which companies open-sourcing their core projects usually are.


The other key feature of "serverless" is that the store-of-record for a "serverless" system isn't a part of the running system — i.e. the system's state doesn't live on any particular server / cluster that is run as part of the operations of the system.

Instead, a "serverless" system's store-of-record is some external and semantically-abstracted storage system — e.g. "a remote git repository" (GitOps); "an object-storage bucket" (Snowflake); "a document store" (Lambda); "a blockchain"; etc. Where all that's important about this storage system is its API, such that the design is portable to any backing storage service that supports the same API.

This means that, in serverless systems, the mutable state in the cluster is just an ephemeral cache representation of the externally-managed data-at-rest.

Another way to think about this is by thinking of "a server" as a thing with durable state that you have to worry about — e.g. protect from disk corruption, make backups of, etc; vs. something like "an ephemeral immutable-infrastructure container workload" that can die and be recreated with no problems. Serverless systems are systems without any "servers" in this sense — nothing to back up; nothing to disaster-recover; etc. Nobody operating these systems ever needs to think about individual servers. Nobody ever needs to SSH into a server, upgrade a server, restart a server, etc. The operations for such systems can be handled entirely at the "cluster" and "ephemeral workload" levels. Nodes within the cluster that "go bad" can simply be drained and deleted — this may even be automated.

And further, because of this lack of local durable state, there's no need to worry about "allocating" that state, and thereby allocating customers to particular clusters. Serverless compute clusters are usually just one huge cluster (per region), where customers' individual request workloads just get scheduled onto that cluster wherever they'll fit.

Of course, the external store-of-record for a serverless system must have "servers" — the data-at-rest ultimately has to reside durably on some disk somewhere. But 1. they're not your servers, and their ops problems are not your ops problems; and 2. because they're a much lower-level abstraction, they can be scaled much more robustly, and so have far fewer operational problems in the first place; and 3. because they're a much lower-level abstraction, they benefit from economies of scale in shared tenancy in ways domain-specific compute/DBMS/etc. clusters usually don't; and so your system can benefit from a storage layer that's hyperscaled and hyper-robust from serving millions of tenants' low-level needs.


From your link:

Classic Serverless: The database engine runs within the same process, thread, and address space as the application. There is no message passing or network activity.

Neo-Serverless: The database engine runs in a separate namespace from the application, probably on a separate machine, but the database is provided as a turn-key service by the hosting provider, requires no management or administration by the application owners, and is so easy to use that the developers can think of the database as being serverless even if it really does use a server under the covers.

It sounds like this is neo-serverless but serverless nonetheless.


It “is” only in the generous sense that the quoted section adopts precisely so it can interpret the term generously.


This ship is so sailed already that the stories of its voyages are legendary classics that were told to our great grandparents.


It’s a tale so old that “server” lost meaning before we could even be in a state of “serverless”ness. First there was a cloud, and then there was no clarity of what conditions a process even runs under. And it wasn’t good, but it became stable. And on the next day, there were databases from on high, as if they’d spontaneously burned a bush.


> This ship is so sailed already that the stories of its voyages are legendary classics that were told to our great grandparents.

Specifically, the term is 10 years old this October: http://readwrite.com/2012/10/15/why-the-future-of-software-a...

It has never, ever meant that the software wasn't running on a server. From TFA:

"The phrase 'serverless' doesn’t mean servers are no longer involved. It simply means that developers no longer have to think that much about them. Computing resources get used as services without having to manage around physical capacities or limits."


Ah, the captainless ship (where the captain is behind a locked door).


Don't say server-"less" if you mean in-process (which may well run on a server).

Here's Sqlite without a server, running in the client and backed by static file hosting: https://phiresky.github.io/blog/2021/hosting-sqlite-database...


Modern marketing makes me mad sometimes. They call something "serverless" while in fact it runs on their servers somewhere in the world, where you have even less control than in your own cloud. I believe the term was popularized by Amazon, which doesn't care much about being honest in its expansionism.


You are really nitpicking and taking offense over something you made up in your head. The meaning of "serverless" has shifted to refer to cloud offerings.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: