New PostgreSQL Interface for Cloud Spanner

wmitty · on Oct 12, 2021

The Cloud Spanner SQL frontend/analyzer has been open sourced by google as ZetaSQL. https://github.com/google/zetasql

It is amazingly good.

You give it textual SQL (+ schema + all your function definitions) and it returns a really clean logical query plan. It is also happy to do this via a protobufs so you can use it from languages other than C++. It is also tested and documented up the wazoo. It has been such a pleasure to work with.

Anyway, the big problem with ZetaSQL is that it is not a common SQL dialect.

It seems that the only reasonable way to do this PostgreSQL interface for Cloud Spanner is to add a second parser (and other extensions) to ZetaSQL. If I am correct, I really really hope they open source that part of ZetaSQL as well - it would be a massive step forward for open source SQL tooling.

rad_gruchalski · on Oct 12, 2021

Wow, thanks for that info, very useful. When you say “I really hope they open source that part”, do you mean the mechanism to plug other dialects in, or the PostgreSQL interface?

wmitty · on Oct 13, 2021

I mean the postgres parser (and semantic changes) for ZetaSQL. The zetasql parser is in a file called zetasql/parser/bison_parser.y, I strongly suspect they now have a file called something like zetasql/pgparser/bison_parser.y as well (and much more pervasive changes to support the deeper differences in the dialects).

This is the lexical structure and syntax docs for the new postgres inteface to cloud spanner:

https://cloud.google.com/spanner/docs/postgresql/lexical

And this is the zetasql lexical structure and syntax docs:

https://github.com/google/zetasql/blob/master/docs/lexical.m...

Notice that the new PG docs are an edit of the Zeta ones - evidence that my hypothesis is correct.

orf · on Oct 12, 2021

Looks awesome! A few data types not yet supported: https://cloud.google.com/spanner/docs/postgresql/data-types, like interval or array, which is a shame.

Is just like to mention how awful the Google cloud docs are on a slow connection. Each link does some weird SPA bullshit that seems to take way too long, and you seem to often end up in an inconsistent state where refreshing the page takes you to a completely different article. Then pressing back seems to re-load the current page you are on whilst it does more SPA bullshit.

Small, simple HTML doc pages that are fast to load please. Copy how AWS does it.

numbsafari · on Oct 13, 2021

Google cloud docs are garbage on a fast connection, too.

ttul · on Oct 13, 2021

Back in the dinosaur age, I built a WiFi hotspot billing system using PostgreSQL. MySQL had a larger following, but was less reliable and lacked transactions. PostgreSQL was a competitor to Oracle, which was seen as the undisputed leader for large-scale databases containing mission-critical business data.

Fast-forward to 2021 and I am not surprised at all that PostgreSQL now seems to be taking over everywhere. The ethic of the PostgreSQL team seems to revolve around correctness and reliability rather than features and speed. I congratulate the team for sticking to their mettle all these years and achieving such success.

gopalv · on Oct 13, 2021

> I am not surprised at all that PostgreSQL now seems to be taking over everywhere

To a large extent, Postgres didn't out-compete Mysql in mind-share by being a better database. Arguably it was a better DB in the Web 1.x LAMP days and everyone knew it - it just was SO hard to setup replication, which is where it still hurts (not for RDS though, which is another part of the story).

Mysql went out of its way to lose free users which was compounded when it was bought out by Oracle.

The death-blow of sorts for Mysql+Java was the GPL licensing of the jdbc binaries. This made Mysql a very sticky backend to use with a Spring or similar JVM based framework in general, because you had to convince someone that GPL is okay for a product that never leaves your building.

The postgres community on the other hand reaps the rewards of being very very friendly towards people who want to embrace its client protocols or even outright cut-pasting their code into a proprietary system which offers postgres compat but nothing else under the hood.

Thaxll · on Oct 13, 2021

Until 2011~~ PG had a terrible replication / no official system so not so much "mission-critical business data".

the_duke · on Oct 12, 2021

> the PostgreSQL interface implements a core subset of PostgreSQL data types and DDL syntax.

> Familiarity and portability are the goals, not 100% compatibility

So, is there a compatibility matrix?

In general I see this as a good thing. It allows using Spanner without requiring a big rewrite should one want to migrate away from GC. Or using Spanner as a secondary data warehouse with the same queries.

But both of those do require a high level of conformance.

dudus · on Oct 12, 2021

This page describes what is or isn't supported.

https://cloud.google.com/spanner/docs/postgresql/overview

e1g · on Oct 12, 2021

Notably missing: JSON(B), triggers, sequences/serials, arrays, custom functions.

jvolkman · on Oct 13, 2021

Definitely seems like a work in progress, which I suppose is why it's marked pre-GA.

ellisv · on Oct 13, 2021

and I'm out

Lack of JSON(B) is a complete no-go for me

epberry · on Oct 13, 2021

jsonb has become the crutch on which I build everything. And I'm not going back.

e1g · on Oct 13, 2021

Same. PG jsonb is so fantastic that it’s closer to exoskeleton than a crutch.

babelfish · on Oct 13, 2021

What do you use JSONB columns for? The only use case that comes immediately to mind is stuff like settings or config blobs for individual entities

e1g · on Oct 13, 2021

I run a multi-tenant SaaS where every tenant can define custom properties on every entity. PG allows us to dump everything into a single jsonb column, and still have superb performance for filtering / analytics / searching etc. The backend codebase remains mostly agnostic to the data shape, and we rarely have to change the database during new releases.

derefr · on Oct 13, 2021

Postgres jsonb + redis_fdw = you can JOIN your structured Postgres data against your unstructured app-layer Redis data. Not a good idea to write any code that does this—just move the Redis data into the DB, for crying out loud—but for BI use-cases where you’re writing one-off SQL queries and you need presentation-layer data that lives in Redis, it’s perfect.

dboreham · on Oct 13, 2021

PostgreSQL json functions allow me to dispense with the ORM. Just send and receive JSON (for regular table field types, note, not jsonb).

isuckatcoding · on Oct 13, 2021

You know it’s funny because I’m looking into using the JSON type for spanner. Is that comparable to JSONB

slownews45 · on Oct 12, 2021

Wow - I did NOT see this coming.

Crazy - and GREAT!

Cloud Spanner in my mind was always "cool" - but I never saw myself understanding how to use it.

I think for folks bridging from something like postgresql, this could also make spanner more accessible.

My own use cases can all be handled so simply with postgresql I don't see needing the spanner scale, but for others perhaps valuable.

slownews45 · on Oct 13, 2021

Quick follow-up. They are now offering Spanner at < $700/month (which was old pricing I think for 1 node). Ie, you can do something like processing units in increments of 100, so maybe $70/month.

Kind of weird, google paying attention to what users want!

What's next, google home / assistant will support google work calendars? Alexa has supported google work calendars forever, but google's own devices did not - despite plenty of requests.

I always wrote google off as so disconnected from users / developers wasn't worth dealing with them, but this is interesting to see some changes that look user / developer directed (vs just wowing with google fu features with modest uptake).

jeffxtreme · on Oct 13, 2021

Re your Assistant question, think this should be available by: https://support.google.com/assistant/answer/10265998

Do not know why it's not more marketed... a lot of people miss this!

whoknowswhat11 · on Oct 13, 2021

This wasn't supported for years.

https://services.google.com/fb/forms/googleassistantforgsuit...

was where you could try to get into a beta.

Google Home, Home Mini Home Mox Home Hum Nest Mini etc did not support gsuite accounts during launch as far as I can tell.

jeffxtreme · on Oct 13, 2021

Ah I see, yeah this is a newish feature and still doesn't support GSuite accounts as the main driver.

It just lets you link your consumer account to your GSuite (for calendars, etc.)

But as it seems so many people wanted it (including myself), just surprised it wasn't marketed

da39a3ee · on Oct 12, 2021

Is anyone able to comment on the extent to which Cloud Spanner and CockroachDB are in competition with each other? (CockroachDB is also wire-compatible with Postgres and originates as an implementation of something similar to Spanner)

Nican · on Oct 12, 2021

Just being "Postgres" is not simple. Both Spanner+CRDB provides some custom domain specific language that helps the data distribute itself around the cluster (or the world). Or for example, change the primary key of a table, so that the database can change the distribution of a table without having any down time.

I feel like the usual "lift and shift" claims that Cloud providers claims never goes as planned, and it is not something that you can just easily find by just reading documentation. For example, only after I started using CRDB, I found out that case-insensitive indexes is not yet implemented.

But I have been using CRDB+k8s for >1 year now, and it has been a great database. Basically 0 downtime, with minimal effort, including database upgrades.

rad_gruchalski · on Oct 12, 2021

There’s also YugabyteDB which is built on actual Postgres 11.2, reuses most of Postgres itself.

dudus · on Oct 12, 2021

They achieve the same goal, but cockroachDB usually relies on the NTP and Spanner makes use of specialized clocks on Google datacenter to be able to sync the multiple instances.

dilyevsky · on Oct 13, 2021

Both have managed versions but crdb has “run it yourself “ option as well as completely free version (missing chngefeeds and incremental backups i believe but other than that is pretty feature complete)

Diggsey · on Oct 13, 2021

Spanner, CockroachDB and similar are really exciting, but the fact that none of them support stored procedures begs the question of whether it's inherently hard/impossible to do so in a distributed database, or whether that just hasn't been a priority...

dalyons · on Oct 13, 2021

Why would they prioritize stored procedures ? They’re pretty much seen as an anti-pattern by most people today, outside of very niche use cases

electrum · on Oct 13, 2021

Procedures are very useful to enforce business logic in the database, rather than in the application. It’s like putting a higher level application-specific API on top of your database schema.

For Cockroach, it could help substantially with latency for multi-statement transactions, since the procedure could be executed on a database node closest to the leaseholder for the data. Executing a multi-statement transaction on the client means incurring per-statement latency between the client and gateway node, and between the gateway node and the leaseholder.

dboreham · on Oct 13, 2021

It's also a bad idea.

Diggsey · on Oct 13, 2021

> They’re pretty much seen as an anti-pattern by most people today

That's a bold statement :)

teeohco · on Oct 13, 2021

I worked at CRL. There's nothing about the architecture of CRDB that makes it hard or impossible. The company has bigger fish to fry with limited product and engineering hours.

ddorian43 · on Oct 13, 2021

YugabyteDB does https://docs.yugabyte.com/latest/explore/ysql-language-featu...

sciurus · on Oct 13, 2021

> To enable PostgreSQL tools that don’t know anything about Spanner, the PostgreSQL interface also includes an open-source adapter that runs as a sidecar proxy, close to a client.

I wonder why they decided to do this instead of building a postgres-compatible endpoint into the spanner service itself.

ralusek · on Oct 13, 2021

This is awesome, but what are the limitations? Unless I can use it with ORMs that I already tend to depend on (like Sequelize), it's probably a no-go. And I'm assuming no joins, or very limited left joins?

thorum · on Oct 13, 2021

That’s very cool. Does anyone know if there is a good list of other databases that include a PostgreSQL interface?

bigbizisverywyz · on Oct 13, 2021

I'm surprised people are applauding this. The net result of this support (and cockroach's) is that it will take mindshare away from Postgres which has worked hard for decades to build a loyal fanbase.

It's also step 1 of the e.e.e playbook, not that I'm suggesting Google want to extinguish Postgres, but it's clearly aimed at building their market share at the expense of Postgres's.