How and why GraphQL will influence the Sourcehut alpha (sourcehut.org)
153 points by michaelanckaert on June 11, 2020 | 223 comments



I don't understand the attraction to GraphQL. (I do understand it if what you actually want is the things that gRPC or Thrift etc. give you.)

It seems like exactly the ORM solution/problem, but even more abstract and less under control, since it pushes the ORM out to browser clients and the frontend devs.

ORMs suffer from being beyond arm's length from the query optimizer in the database server.

https://en.wikipedia.org/wiki/Query_optimization

A query optimizer that's been tuned over decades by pretty serious people.

Bad queries, overfetching, sudden performance cliffs everywhere.

GraphQL actually adds another query language on top of the normal ORM problem. (Maybe the answer is that GraphQL is so simple by design that it has no dark corners, but that seems like a matter for mathematical proof that I haven't seen alluded to.)

Why is GraphQL not going to have exactly this problem as we see people actually start to work seriously with it?

Four or five implementations exist in JavaScript, Haskell and now Go. From what I could see, none of them mentioned query optimization as an aspiration.


GraphQL is quite similar to SQL. They’re both declarative languages, but GraphQL is declaring a desired data format, whereas SQL is declaring (roughly) a set of relational algebra operations to apply to a relational database. GraphQL is really nothing like an ORM beyond the fact that they are both software tools used to get data from a database. You might use an ORM to implement the GraphQL resolvers, but that’s certainly not required.

I wouldn’t expect the performance issues to be much more problematic than they would be for REST endpoints that offer similar functionality. If you’re offering a public API, then either way you’re going to need to solve for clients who are requesting too many expensive resources. If you control the client and the server, then you probably don’t need to worry about it beyond the testing of your client code you would need to do anyway.

As far as query optimization goes, that’s largely out of scope of GraphQL itself, although many server implementations offer interesting ways to fulfill GraphQL queries. Dataloader is neat, and beyond that, I believe you can do any inspection of the query request you want, so you could for example see the nested path “Publisher -> Book -> Author -> name” and decide to join all three of those tables together. I’m not aware of any tools that provide this optimization automatically, but it’s not difficult to imagine it existing for some ORMs like those in Django or Rails.
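
A sketch of what that query inspection can look like with graphql-js (the join decision itself is hypothetical and left as a comment; fragments are ignored for brevity):

    import { FieldNode, GraphQLResolveInfo, Kind } from 'graphql';

    // Returns true if the incoming query selects the given nested path
    // (e.g. ['books', 'author', 'name']) beneath the current field.
    function wantsPath(info: GraphQLResolveInfo, path: string[]): boolean {
      let selections = info.fieldNodes[0].selectionSet?.selections ?? [];
      for (const name of path) {
        const next = selections.find(
          (s): s is FieldNode => s.kind === Kind.FIELD && s.name.value === name
        );
        if (!next) return false;
        selections = next.selectionSet?.selections ?? [];
      }
      return true;
    }

    // In a hypothetical Publisher resolver:
    //   if (wantsPath(info, ['books', 'author', 'name'])) {
    //     // join publishers, books and authors in a single SQL query
    //   }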


Hasura is one example of a GraphQL server that uses the SQL-join pattern to optimize GraphQL queries. See https://hasura.io/blog/architecture-of-a-high-performance-gr.... It automatically generates a GraphQL schema based on your database schema and translates queries into efficient Postgres SQL. The post I linked also briefly discusses the Dataloader pattern for batching GraphQL resolvers to avoid the N+1 problem.
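
For anyone unfamiliar with the pattern, here is a minimal Dataloader sketch (the resolver shape and the fetch helper are my assumptions, not Hasura internals):

    import DataLoader from 'dataloader';

    // Assumed helper: one round trip for the whole batch, e.g.
    // SELECT id, name FROM authors WHERE id = ANY($1)
    declare function fetchAuthorsByIds(
      ids: number[]
    ): Promise<{ id: number; name: string }[]>;

    const authorLoader = new DataLoader(async (ids: readonly number[]) => {
      const rows = await fetchAuthorsByIds([...ids]);
      const byId = new Map(rows.map(r => [r.id, r] as const));
      // DataLoader requires results in the same order as the keys
      return ids.map(id => byId.get(id) ?? new Error(`author ${id} missing`));
    });

    const resolvers = {
      Book: {
        // Every book in a result set calls .load(), but the batch function
        // above runs once per tick, collapsing N+1 queries into two.
        author: (book: { authorId: number }) => authorLoader.load(book.authorId),
      },
    };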


For a fun hack-week project I created a GraphQL server that would automatically create any necessary SQL tables to support a query given to it. So if you made the query

    query {
        user {
            name
            email {
                primary
                secondary
            }
            posts {
                body
                karma
            }
        }
    }
It would create an entire database schema with users, emails, and posts, and the correct indexes and foreign-key relations to support the GraphQL query. It would also generate mutations for updating the entities in a relational, cascading manner, so deleting a user would also delete their emails and posts.
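
A rough sketch of the core idea (an illustrative reconstruction, not the actual code; every leaf becomes a text column since the query carries no type info, and DDL ordering is glossed over):

    import { FieldNode, Kind, OperationDefinitionNode, parse } from 'graphql';

    function ddlFromQuery(query: string): string[] {
      const op = parse(query).definitions[0] as OperationDefinitionNode;
      const ddl: string[] = [];
      const walk = (field: FieldNode, parent?: string) => {
        const children = (field.selectionSet?.selections ?? []).filter(
          (s): s is FieldNode => s.kind === Kind.FIELD
        );
        const cols = ['id serial PRIMARY KEY'];
        if (parent) {
          // cascading foreign key, so deleting a parent deletes its children
          cols.push(`${parent}_id integer REFERENCES "${parent}"(id) ON DELETE CASCADE`);
        }
        for (const child of children) {
          if (child.selectionSet) walk(child, field.name.value); // object -> child table
          else cols.push(`"${child.name.value}" text`); // leaf -> column
        }
        ddl.push(`CREATE TABLE "${field.name.value}" (${cols.join(', ')});`);
      };
      for (const sel of op.selectionSet.selections) {
        if (sel.kind === Kind.FIELD) walk(sel);
      }
      return ddl;
    }

    // For the query above this yields "email", "posts" and "user" tables,
    // with cascading FKs from email/posts back to user.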


That sounds cool! Is it online anywhere? I'd love to look at it.


Yo, this is legit what Hasura does!


Seems like you're looking at this through the lens of a single system that could submit a query to a single database and get all the data it needs. From that perspective GraphQL is definitely an extra layer that probably doesn't make sense. But even then there's still some value in letting the client specify the shape of the data it needs and having client SDKs (there's definitely non-GraphQL ways to achieve these too).

My impression is GraphQL starts to shine when you have multiple backend systems, probably separated based on your org chart, and the frontend team needs to stitch them together for cohesive UX. The benchmark isn't absolute performance here, it's whether it performs better than the poor mobile app making a dozen separate API calls to different backends to stitch together a view.


That's indeed one of its selling points. But most products I see that adopt graphql are exactly 1 database.


I think lots of people here are fixed on GraphQL from a backend perspective and are missing that GraphQL has a fantastic front end development experience using libraries like Apollo.

I honestly think the backend benefits are relatively marginal, but on the client, being able to 1) mock your entire backend before it even exists, 2) get automatic caching on every query and entity for free, and 3) use a fully declarative, easily testable and mockable React hook API with out-of-the-box support for loading and error states is so valuable. Components written against Apollo feel so clean and simple. And that's to say nothing of the benefits you get from being able to introspect the entire backend API, test queries against it with Apollo's extensions, or easily retain the entire entity cache across reloads with simple plugins to Apollo.

Can you do all that with REST? Sure. But writing queries against REST APIs is a pain in the butt compared to what you get for free by using Apollo.
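
For a flavor of what that looks like in component code (the schema and fields here are invented for the example):

    import React from 'react';
    import { gql, useQuery } from '@apollo/client';

    const GET_BOOK = gql`
      query GetBook($id: ID!) {
        book(id: $id) {
          id
          title
          author { name }
        }
      }
    `;

    // Loading and error states come straight off the hook, and the result
    // is cached automatically, keyed by query + variables.
    function BookCard({ id }: { id: string }) {
      const { data, loading, error } = useQuery(GET_BOOK, { variables: { id } });
      if (loading) return <p>Loading...</p>;
      if (error || !data) return <p>Something went wrong.</p>;
      return <h2>{data.book.title}, by {data.book.author.name}</h2>;
    }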


Plus TypeScript + auto-generating queries from graphql-let is literally heaven.


As GP said:

> But even then there's still some value in letting the client specify the shape of the data it needs and having client SDKs

It may not exactly "shine" in those cases, but it reduces round trips and makes it easy for frontend engineers to make views that revolve around use cases instead of the resources in the database.


> it reduces round trips

Is that really a big deal with HTTP/2 multiplexing?

I can also imagine situations where it results in a better UX to make multiple small calls, rather than one big one, as you'll have something to render faster.


If you are round-tripping because of data, you're having to traverse the network to the origin, and you're likely composing your queries with dependent data, so HTTP/2 is of little benefit. For example, if you are loading a book detail page and also want to show the author name, you need to retrieve the detail record to get the author ID first so you can call the author detail endpoint. GraphQL solves this by being able to fetch the book detail and the joined author name in one round trip.


Sure, I didn't mean multiple requests would always be a net benefit, only that I often come across cases where there is a need to load independent data simultaneously, in which case multiple requests can sometimes provide a better UX.

> GraphQL solves this by being able to fetch the book detail and the joined author name in one round trip

I don't see why we need GraphQL to solve this though - a REST backend could have an endpoint that returns the exact same data.

I can see how GraphQL might be somewhat nice for front-end developers when the data to be displayed hasn't been nailed down yet: maybe we decide to show the book's ISBN, and we can do that without changing the backend. Maybe this justifies the extra complexity for some teams, but I'd personally much prefer the simplicity of a REST API, to which you can always add OData on top if you really want.


The real benefit of this is that you can have all this data available, and now the mobile team and the web team can share a query. Maybe the web team wants everything: the ISBN, the author, etc. The mobile team only wants the author and title (and will show the rest when you click in, doing the same query or a slightly different one with all the fields).

It gives the frontend teams the power to create, in effect, a REST endpoint out of a schema.

When you combine it with TypeScript/JVM/Swift, you get automatic typing for the GraphQL queries: you know exactly the data model you get back based on your query. It's quite lovely.
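
As a sketch, for a query selecting book { title, author { name } }, a codegen tool (graphql-code-generator, for example) emits roughly:

    // Illustrative shape; exact output varies by tool and config.
    interface GetBookQuery {
      book: {
        title: string;
        author: { name: string };
      } | null;
    }

    // The client is then typed against it, so data.book?.title
    // autocompletes and typos fail to compile.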

The other aspect is that on the Apollo/GraphQL server you can utilize Dataloader and streamline a nested request into as few calls to each service as possible.

And the last benefit over a REST service: if you had to make multiple calls, you're doing round trips from the CLIENT to the backend services. The GraphQL server is _already_ in your backend service network, so all the data combining is on the order of <10ms versus <100ms (or much worse for mobile).

GraphQL has a major advantage over REST in that you can't just change the schema without the clients breaking, so you know that your API isn't going to magically screw you. (Most places use versioning for this, but not always.) You can get some of this with RPCs, but it's not as robust as the GraphQL schema.


I always find it sad when developers waste time developing over-engineered methods to reduce the number of API calls while having 10 advertising/analytics SDKs in the background sending countless requests (which can still saturate the data link and make the main API request stall, even if it's a single one).


There are automagic GraphQL layers that sort-of make sense to me, since at least they remove the biggest pain points. But AFAIK they're all single-database.

Actually stitching together multiple services or DBs with it manually seems like it'd be a hellish experience that'd end in a massive data breach or repeated accidental data loss + restore-from-backup. Or else valid GraphQL going to such a system would be so restricted that the benefit over just using REST (or whatever) is zero.


Yeah, as you say: "(there's definitely non-GraphQL ways to achieve these too)."

These are largely a matter of architecture design, and GraphQL doesn't really fix those problems (my sense is it will actually make them harder).


The advantage of GraphQL is that the code for each API endpoint, which depends on frontend design (e.g. how many comments should be visible by default on a collapsed Facebook story), is now part of the frontend codebase (as a GraphQL query, that is then automatically extracted and moved to the backend), and thus frontend and backend development are no longer entangled.

Without it or a similar system frontend developers have to ask backend developers to create or modify an API endpoint every time the website is redesigned.

Also, it allows you to combine data fetching for components and subcomponents automatically, without having to do that manually in backend code, and it automatically supports fine-grained caching of items.


Or the frontend devs could have just, like, learned how to write sql queries... :D

A major issue with pushing it to the frontend is that malicious clients can issue unexpected requests, putting strain on the database.

If the GraphQL query implementation doesn't allow that level of querying on the database, then it's not offering much more than a filterable REST endpoint before you need to speak to the backend devs.

This all came up years ago with OData.


My worry with GraphQL is that the server component is essentially a black box (as I don't have time to audit/review it) complex enough that there's more chance an edge case in a GraphQL query will end up exposing something you don't want.

A REST endpoint, on the other hand, is fairly simple and well understood; there's (mostly) a static set of SQL queries behind it, and as long as those are not returning any unwanted data, you are pretty much guaranteed not to expose something you didn't want to.


The GraphQL server has a contract (the schema) that it will follow, or 500. So you know what you get back is exactly to spec. Or you get nothing.

REST endpoints are usually way more blackbox.

You can't claim that REST is better because you can look at the server... when you could do the same thing with the GraphQL server.

GraphQL will -never- return you unwanted data. Because you wrote in the query exactly what you want.

If you want to examine an endpoint and JUST what it returns, you can do so really easily with graphiql.

https://developer.github.com/v4/explorer/

Just enter the API and you get an autocomplete list of all the data fields you have access to. Or just use the schema explorer and click through. 100x easier than going through a SQL query and analyzing a table.


>> GraphQL will -never- return you unwanted data. Because you wrote in the query exactly what you want.

But couldn't you intentionally or unintentionally write a query such that it returns too much data and borks the system? Unintentionally is the worrisome aspect.


There is nothing inherent in other systems that prevents this scenario, so why should GraphQL? This is a design decision orthogonal to whether it's REST, GraphQL, SOAP, or what have you.


With REST, for example, you usually have a smaller set of well-defined APIs whose surface area is pretty visible; it can be custom-optimized up front, and certain kinds of queries can be disallowed outright. GraphQL seems to provide so much flexibility for the front-end engineer to generate any kind of request that it might not be possible to anticipate all the kinds of requests that will be made and optimize for them up front.

While it might be orthogonal to the design decision, it might add to the amount of unanticipated work required, just because of the enormous flexibility.


Nothing you said can't also be applied to GraphQL. It takes the same level of work to add pagination to a REST API as it does to GraphQL, and you can add any arbitrary constraint you want as you see fit - nothing about GraphQL takes this away from you.


Having seen many product teams implement GraphQL, I'd say the concerns were never around performance, and more around speed of development.

A typical product would require integrations with several existing APIs, and potentially some new ones. These would be aggregated (and normalised) into a single schema built on top of GraphQL. Then the team would build different client UIs and iterate on them.

By having a single queryable schema, it's very easy to build and rebuild interfaces as needed. Tools like Apollo and React are particularly well suited for this, as you can directly inject data into components. The team can also reason on the whole domain, rather than a collection of data sources (easier for trying out new things).

Of course, it would lead to performance issues, but why would you optimise something without validating it first with the user? Queries might be inefficient, but with just a bit of caching you can ensure acceptable user experience.


I wonder if GraphQL would make more sense as a client side technology. The goals of dev ease seem better served by a graph the client can build (and thus span multiple remote services). Instead of transforming the backend, you simply get a better UI dev experience and the middleware handles query aggregation.

You'd want code gen to easily wrap REST services.

You could get some of the pipeline query/subquery stuff back (and lose caching) by setting up a proxy running this service or fallback to client side aggregation to span services not backed by the graph system (and maybe keep caching).

Maybe we're back to SOAP and WSDLs, though.


I always say that GraphQL is best thought of as frontend code that lives on a backend server. If in your current world you have front-end code that talks to multiple traditional REST endpoints, the frontend is therefore responsible for managing all the relationships between the API data in order to build a coherent object graph which the UI can interpret. GraphQL turns this object graph into a first class citizen that can be consistently queried, rather than something everyone keeps having to reinvent, and moves it closer to the source of truth (the downstream services) for performance reasons (so that each network hop can be measured in single digit milliseconds rather than tens or hundreds).

I've seen GraphQL schemas being implemented on the client, it's certainly doable, but the performance is terrible compared to doing it on a server close to the source of truth.


GraphQL is generally referred to as a BFF, meaning backend for frontend.

Not exactly client-side, but the main consumer is the client; s2s, or service-to-service, isn't the intended target.


Where I'm at now is my first foray with GraphQL - Graphene on the Django backend and Apollo on the frontend.

I'm not sure if it is the implementation - and it could very well be - but there has been more overhead and complexity than with traditionally accessed REST APIs. I can't see much value-add.

This becomes a lot more apparent when you start to include TS in the mix.

Perhaps it just wasn't a good use case.


It all depends on whether you expect to 1) use GraphQL as an ORM replacement or 2) use GraphQL as a layer to aggregate multiple disparate services into a unified API.

Most people want #1. Graphene is a bad choice there because you still have to write a lot of boilerplate code, but it has the added benefit that the current process is responsible for parsing the GraphQL query and directly calling the database, vs. using something like Prisma/Hasura, which (may) require a separate process that in turn calls your database (so 2 network hops).

GraphQL was never intended to be an ORM replacement, but many have steered it towards that direction. It's not a bad thing, but it's still the same level of abstraction and confusion that people have wrestled with when using traditional ORM's except now you're introducing a different query API vs. native code/fluent interfaces/SQL.


(I'm from Hasura)

While Hasura + GraphQL can be used as an ORM (especially for serverless functions!), Hasura is designed to be used directly by clients as well.

Hasura has a metadata configuration system that works on top of an existing database and allows configuring mappings, relationships and, most importantly, permissions, which make the GraphQL API feasible for even frontend apps to use directly. [1]

Further, Hasura has remote joins that can "join" across Postgres, GraphQL, REST sources and secure them. [2]

[1] https://hasura.io/blog/hasura-authorization-system-through-e...

[2]: https://hasura.io/blog/remote-joins-a-graphql-api-to-join-da...


> I don't understand the attraction to Graphql.

It's attractive primarily to frontend developers. Instead of juggling various APIs (often poorly designed or underdesigned due to conflicting requirements and time constraints) you have a single entry into the system with almost any view of the data you want.

Almost no one ever talks about what a nightmare it becomes on the server-side, and how inane the implementations are. And how you have to re-do so many things from scratch, inefficiently, because you really have no control of the queries coming into the system.

My takeaway from GraphQL so far has been:

- good for frontend

- usable only for internal projects where you have full control of who has access to your system, and can't bring it down because you forgot an authorisation on a field somewhere or a protection against unlimited nested queries.


As a full-stack dev, I'm going to always reach for things like Hasura for building my backend from now on. It auto generates a full CRUD GraphQL API from my Postgres DB schema. Most of the backend boilerplate is eliminated this way. If I need to add additional business logic, I can use serverless functions that run between the GraphQL query from the front end and the DB operations (actions in Hasura). Most of the heavy lifting is through the front end anyways, and this keeps everything neatly in sync.


It is not really a backend when it does not involve business logic but just an access layer for the DB. It is pretty much a client server model.


Hey, you might want to check out XgeneCloud: it makes it really simple to add business logic on top of generated APIs (both REST and GraphQL) over any SQL database.

We just launched this week [2]

[1] : https://github.com/xgenecloud/xgenecloud

[2] : https://news.ycombinator.com/item?id=23466782

Website : https://xgenecloud.com

(disclaimer: founder here)


We're exploring a GraphQL serv(er/ice) for an internal back office system. It needs to combine multiple APIs into a single GraphQL interface. And everything just breaks apart :) (we have PoCs in C# and Java for now).


You can use Remote Schemas in Hasura to combine multiple API's, and Remote Joins to join relational data across data sources:

https://hasura.io/docs/1.0/graphql/manual/remote-schemas/ind...

https://hasura.io/blog/remote-joins-a-graphql-api-to-join-da...

If you need to convert your API's into a GraphQL first, you can wrap the endpoints with resolvers yourself, or use automated tooling:

https://github.com/Urigo/graphql-mesh

https://github.com/IBM/openapi-to-graphql


I took a quick peek; it's no different from writing your own resolvers in any other implementation.


What about authentication, authorization?

Also, how do you handle transactional logic?


They have great recipes for authentication/authorization. It's much better IMO because it actually provides per-request authorization. There is also great support for transactions using GraphQL mutations. I'm not affiliated with Hasura in any way; it's just changed the way I view backend development. Backends (in most cases; my day job is actually not part of this generalization) should basically be a thin wrapper around your database, and any work you can outsource to the database, you should, rather than building business logic.


Those aren't hard to do if you declare up front what schema you need to conform to.

I'm working on a REST code generator (it generates a Go backend and a TypeScript/React frontend) that reads your Postgres/MySQL schema and some additional metadata you provide (should auth be enabled? which table is the users table, and which columns hold the username and the bcrypt-hashed password?). I'm still working on the authorization part, but it's basically an optional per-endpoint logic DSL for simple stuff and optional Go hooks for more complex stuff.

https://eatonphil.github.io/dbcore/


> - good for frontend

I'm not even sure about this part.

I worked on a project recently where the mobile front end team hated working with GraphQL. They were far more used to working with REST/HTTP APIs, and in this particular project they only communicated with a single backend.

The team saw it as extra layers of complexity for no benefit.

The GraphQL backend was responsible for pulling together data from several upstream systems and providing it to clients. But the architect never was able to convince me of a single benefit here compared to REST.


> usable only for internal projects where you have full control of who has access to your system, and can't bring it down because you forgot an authorization on a field somewhere or a protection against unlimited nested queries.

As someone who is building a public facing GraphQL API, I would disagree with this. Directives make it easy to add policies to types and fields in the schema itself, making it amenable to easy review.

A restful API also has the problem that if you want fine grained auth, you'll need to remember to add the policy to each controller or endpoint, so not that different.

The typed nature of GraphQL offers a way of extending and enriching behavior of your API in a very neat, cross cutting way.

For example we recently built a filtering system that introspected over collection types at startup to generate filter input types. We then built middleware that converted filter inputs into query plans for evaluation.
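
To give a flavor (an invented sketch, not our actual schema), the generated inputs end up looking something like this:

    import { gql } from 'graphql-tag';

    // One filter input per collection type, generated at startup;
    // middleware then maps a parsed filter object onto a query plan.
    const generatedTypeDefs = gql`
      input StringFilter { eq: String, ne: String, in: [String!], contains: String }
      input IntFilter { eq: Int, gt: Int, lt: Int }

      input BookFilter {
        title: StringFilter
        pages: IntFilter
        and: [BookFilter!]
        or: [BookFilter!]
      }

      extend type Query {
        books(filter: BookFilter): [Book!]!
      }
    `;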

I previously worked at another company that offers a public REST API for public transport. Public transport info is quite a rich interconnected data set. Despite efforts to ensure that filtering was fairly generic, there was a lot of adhoc code that needed to be written to handle filtering. The code grew exponentially more complex as more filters were added. Maybe this system could have been architected in a better way, but the nature of REST doesn't make that exactly easy to do.

Bottom line: I feel that for public APIs there is a lot of demand for flexibility, and eventually a public-facing RESTful API will grow to match or even exceed a GraphQL API in complexity.


> A restful API also has the problem that if you want fine grained auth, you'll need to remember to add the policy to each controller or endpoint, so not that different.

This is dependent on the framework, just as it is with GraphQL - for example, with ASP.NET Core you can apply an auth policy as a default, or by convention.

> Despite efforts to ensure that filtering was fairly generic, there was a lot of adhoc code that needed to be written to handle filtering.

I've never seen this problem with REST backends myself, but I work with a typed language, C#. Again though, this is more of a framework thing than a REST/GraphQL paradigm thing.


The transport API I was referring to was written in .NET Core. I think .NET core is great at what it does, but runs into the same kinds of problems that GraphQL tries to address from the start once your API becomes sufficiently featured, which is likely to happen if you're offering an API as a service.

I actually think that unless your company is massive or has a lot of expertise in GraphQL already, using it for private APIs may not be the best idea, as it could be a sign of certain internal dysfunctions or communication problems within or between engineering teams.

----

An example, however, of the kind of filtering I was referring to, and why I still think it would be non-trivial to do even in something like ASP.NET, is the following: https://www.gatsbyjs.org/docs/graphql-reference/#filter. This of course isn't something you get out of the box in GraphQL either, but the structure of the system made it (relatively) easy to do.

Of course you could add something like OData to your REST API which would definitely be a valid alternative, but that also would have its own warts, and is subject to similar criticisms as GQL.


> An example, however, of the kind of filtering I was referring to, and why I still think it would be non-trivial to do even in something like ASP.NET, is the following: https://www.gatsbyjs.org/docs/graphql-reference/#filter. This of course isn't something you get out of the box in GraphQL either, but the structure of the system made it (relatively) easy to do.

Ah, then I misunderstood; I was thinking along the lines of dotnet's authorisation filters.

Filtering might require some reflection, expressions or funcs, which aren't necessarily "everyday" things for some devs, but they shouldn't pose any real trouble for seasoned dotnet devs. If you really want a standard that works OOTB for Entity Framework (and I assume EF Core), you have the option of OData too.


This filtering is a custom DSL that really has nothing to do with GraphQL.

You might get away with it in a GraphQL implementation because you can possibly slap it in top a centralized endpoint, but I really question its efficiency in this case.


I agree with you, but I do wish for something that improves on REST in the ways GraphQL at least purports to.

Query chaining/batching and specifying a sub-selection of response data seem like solid features.

The graph schema seems to make good on some of the HATEOAS promises.

I like the idea of GraphQL but the downsides have me worried.


Graphiti always seemed like a cool project in this vein.

https://www.graphiti.dev/guides/


> but the downsides

What do you consider the downsides?


No caching. Not enough benefit at the present to warrant throwing away all the restful tooling we currently have.


GraphQL was developed by Facebook to be used in conjunction with their frontend GraphQL client library called Relay. Most people opt for Apollo + Redux because they were more active early on in releasing open source, and people argue it's an easier learning curve. IMO Relay is a huge win for the frontend to deal with data dependencies, and is a much better design than Apollo + Redux.

GraphQL formalizes the contract between front and back end in a very readable and maintainable way, so they can evolve in parallel and reconcile changes in a predictable, structured place (the GraphQL schema and resolvers). And it allows the frontend, with Relay, to deal with data dependencies in a very elegant and performant way.


It is attractive for a front-end dev who does not have control over the backend endpoints, i.e. a public API like Facebook's.

It clearly looks like a questionable adoption for a single organization.


That is up to the GraphQL framework and the consumers of it. GraphQL is just a query language.

You need to have data loader (batching) on the backend to avoid n+1 queries and some other similar stuff with cache to improve the performance.

You also have cache and batching on the frontend usually. Apollo client (most popular graphql client in js) uses a normalized caching strategy (overkill and a pain).

For rate/abuse limiting, GraphQL requires a completely different approach. It's either point-based, on the number of nodes or edges you request, so you can calculate the burden of the query before you execute it, or deep introspection of the query to avoid crashing your database. Query whitelisting is another option.
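
A toy version of the point-based estimate (field costs and the `first` convention are assumptions; libraries like graphql-query-complexity are more thorough):

    import { FieldNode, Kind, OperationDefinitionNode, parse, SelectionSetNode } from 'graphql';

    // How many children a list field fans out to, taken from `first: N`.
    const pageSize = (f: FieldNode): number => {
      const arg = f.arguments?.find(a => a.name.value === 'first');
      return arg && arg.value.kind === Kind.INT ? parseInt(arg.value.value, 10) : 1;
    };

    // One point per field, multiplied by how many parent nodes it will be
    // resolved under.
    function cost(set: SelectionSetNode | undefined, multiplier = 1): number {
      if (!set) return 0;
      let total = 0;
      for (const sel of set.selections) {
        if (sel.kind !== Kind.FIELD) continue;
        total += multiplier + cost(sel.selectionSet, multiplier * pageSize(sel));
      }
      return total;
    }

    const query = '{ users(first: 100) { posts(first: 50) { body } } }';
    const op = parse(query).definitions[0] as OperationDefinitionNode;
    // 1 + 100 + 100*50 = 5101 points: reject before touching the database.
    if (cost(op.selectionSet) > 1000) throw new Error('query too expensive');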

There are a few other pain points you need to address when you scale up. So yeah, definitely not needed if it's only a small project.


"You have to calculate the burden of the query before you execute it so you don't end up crashing your database."

This sounds like a disaster waiting to happen.


It's not nearly as complex as paging, which has a similar purpose of limiting single-query complexity.


Anyone use MSSQL before it got ROW_NUMBER and window functions? Paging was a literal nightmare: if you wanted records 101-110, you had to fetch 1-110 and truncate the first 100 rows yourself (either in the DB via a stored procedure or in your app code). I wish LIMIT/OFFSET were part of the ANSI SQL standard.


I don't see GraphQL as an ORM type solution, I see it more like a replacement for REST.


What would be the advantage of GraphQL over gRPC in a REST replacement scenario?


I'm only passingly familiar with gRPC, so forgive me if it offers some kind of linking like what I describe below.

In REST (and seemingly in gRPC), you define these siloed endpoints to return different types of data. If we're imagining a Twitter REST API, you might imagine Tweet and User endpoints. In the beginning it's simple - the Tweet endpoint returns the message, the user ID, and some metadata. You can query the User endpoint separately for the user's details.

Then Twitter continues to develop. Tweets get media attachments. They get retweets. They get accessibility captions. The Tweet endpoint expands. The amount of information required to display a User correctly expands. Do you inline it in the Tweet? Do you always require a separate request to the User, which is also growing?

As the service grows, you have this tension between reusability and concision. Most clients need only some of the data, but they all need different data. If my understanding of gRPC is correct, it would have this similar kind of tension: business objects that gain responsibilities will likely gain overhead with every new object that is added, since the clients have no way of signaling which ones they need or don't need.

In GraphQL, you define your object model separately from how it's queried. So you can define all of these things as separate business objects: a Tweet has a User, Users have Tweets, Tweets can link to Media which has accessibility caption fields, etc. Starting from any entry point to the graph, you can query the full transitive closure of the graph that you can reach from that entry point, but you don't pay for the full thing.

This relaxes the tension between reusability and concision. Each querier can request just the data that it needs, and assuming the backend supports caching and batching when appropriate, it can have some assurances that the backend implementation is only paying for what you use.
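
Concretely (field names invented for the illustration), a timeline client might ask for only:

    import { gql } from 'graphql-tag';

    // Captions, retweet counts, etc. exist in the schema, but cost
    // nothing here because they aren't selected.
    const TIMELINE = gql`
      query Timeline {
        tweets(first: 20) {
          body
          user { handle avatarUrl }
          media { url altText }
        }
      }
    `;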


Thanks for posting this - I think your Twitter example is first I've ever read where I've actually been able to see any real benefit over REST.


Honestly GraphQL is a fairly small step up from REST if you squint at it hard enough. You could get pretty much 90% of the effect of GraphQL with a REST framework and a couple of conventions:

- Have the client specify which fields to return, and return only those fields

- Use the above to allow for expanding nested objects when needed

- Specify an API schema somehow.

All GraphQL does is formalize these things into a specification. In my experience the conditional field inclusion is one of the most powerful features. I can simply create a query which contains all of the fields without paying a performance penalty unless the client actually fetches all those fields simultaneously.
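
A minimal sketch of that field-selection convention on a plain REST endpoint (Express here; the fields parameter syntax is invented):

    import express from 'express';

    const app = express();

    // e.g. GET /users/42?fields=name,email returns only those two fields.
    const pick = (obj: Record<string, unknown>, fields: string[]) =>
      Object.fromEntries(fields.filter(f => f in obj).map(f => [f, obj[f]]));

    app.get('/users/:id', (req, res) => {
      const user = { id: req.params.id, name: 'Ada', email: 'ada@example.com', bio: '...' };
      const fields =
        typeof req.query.fields === 'string'
          ? req.query.fields.split(',')
          : Object.keys(user);
      res.json(pick(user, fields));
    });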

GraphQL queries tend to map rather neatly onto ORM queries. Of course you run into the same sort of nonsense you get with ORMs, such as the N+1 problem and whatnot. The same sort of tools for fixing those issues are available, since your GraphQL query is just going to call the ORM in any case, with one large addition: introspecting GraphQL queries is much easier than introspecting ORM or SQL queries. I can avoid N+1 problems by seeing whether the query is going to look up a nested object and prefetching it. I've yet to see an ORM which allows you to do that.

Lastly GraphQL allows you to break up your API very smartly. Just because some object is nested in another doesn't mean they are nested in source code. One object type simply refers to another object type. If an object has some nested objects that needs query optimizing you can stick that optimization in a single place and stop worrying about it. All the objects referring to it will benefit from the optimization without knowing about it.

GraphQL combines all of the above rather smartly by having your entire API declared as (more or less) a single object. That only works because queries only run if you actually ask for the relevant fields to be returned. It's very elegant if you ask me!

Long story short: yes, you run into the same sort of optimization issues you get with an ORM, but importantly they don't stack on top of the problems your ORM is causing already.


I couldn't agree more. While GraphQL does allow you to be explicit about what you want from your backend, I've yet to see an implementation/solution that gives you back your data efficiently. If anything, the boilerplate actually seems to introduce inefficiency, with some especially inefficient joins.

And when you are explicit about how you want to implement joins etc, you pretty much have to hand code the join anyway, so I don't see the point.

In almost all use cases that I've come across, a standard HTTP endpoint with properly selected parameters works just as well as a GraphQL endpoint, without the overhead of parsing/dealing with GraphQL.


Super useful for bandwidth sensitive situations where you need to piece together a small amount of data from several APIs that normally return a large amount of data.


Pardon my naivete: are the benefits of the flexibility that GraphQL gives worth the unpredictability costs (or the costs of customizing it to add limits) of that same flexibility, compared to writing a tailored server-side call that does those calls and returns the limited data instead?


Yeah, I just don't get it - even if you use GraphQL, you're going to have to do server-side code to make the calls to the different backends.

Maybe it's just that the backend devs in my last project weren't very good, but the backend GraphQL code was ridiculously complex and impossible to reason about.


It's easy to write an incomprehensible GraphQL server the first time you try it, but it's by no means an innate trait of the technology. Assuming your downstream APIs are done well, it should be possible to write a GraphQL server which is easy to reason about and has predictable performance (largely on par with what you'd get if you built the endpoints by hand).


Yes: this is the main reason Facebook created GraphQL, as it is expensive for mobile phones to query that much data. Something else interesting is that as you write the tailored server-side call and optimize it, you will end up with something that looks like GraphQL anyway.


Server-to-server GraphQL seems much more reasonable as both sides are controlled. It's the client-to-server I have less of a justification for compared to targeted calls w/ individual, limited contracts that are quantifiable and optimizable. I have witnessed the costs of allowing highly flexible data fetching from the client once it grows large.


> seems much more reasonable as both sides are controlled

GraphQL has been around for years and people keep making this argument, but where are all the horror stories of unbounded queries being made and systems being hobbled? The argument is beginning to sound anemic.


I would say that you're also completely ignoring the benefits of typing. It's fair to say JS's lack of typing is a deep flaw, and so tools like TypeScript and GraphQL (which pair magically by the way; free type generation!) are ways to lift the typing from the backend to the frontend and give frontends stories around typing and mocking APIs that greatly improve the testability of the code.


I don't see the conflict? If the GraphQL query is translated into SQL on the server, then the query optimizer would optimize that just as effectively as if the query had been written in SQL originally.


The SQL your graphql implementation’s ORM middleware generates won’t optimize as well as hand-written SQL in many cases.

A decent system will provide the hooks you need to hand-optimize certain cases somehow, but there are always limitations, hoops to jump through, and additional complexity to manage. The extra layers that are meant to make your life easier are getting in the way instead. (It may or may not still be worth it, but the point is, it's not a foregone conclusion.)


But why won't it optimize as well as hand-written SQL? How can the query optimizer even tell the difference? Surely the SQL is transformed into some kind of syntax-independent abstract query tree before it is passed to the optimizer.


There are a lot of ways to write a SQL query against a set of tables, given a set of input parameters, to get the desired result. (Not just syntactic or other superficial differences, but differences in logic and how relationships are specified.)

And, of course, an optimizer will handle different SQL queries differently.


No such thing. The database interface is SQL. The query optimizer is something run in the database.


Yes, I mean inside the database engine. The engine receives SQL, parses it, and then passes the query tree to the optimizer, which generates a query plan. My question is how the optimizer could have problems if the SQL is generated as opposed to hand-written. How could that make a difference?


As mentioned in another comment, because SQL that is hand-written for non-trivial cases will too often be better than what an ORM generates.


This is so hand-wavy I can't really argue against it.


> then the query optimizer would optimize that just as effectively as if the query had been written in SQL originally

...and other lies we tell ourselves to sleep soundly at night.

But just like ORMs, they do work for the simple cases which tend to abound and you can hand-optimize the rest.


> ...and other lies we tell ourselves to sleep soundly at night.

So tell me how the query optimizer could possibly tell the difference between generated and hand-written SQL?


Because the SQL will not be the same. In complex cases, SQL generated by an ORM will do a combination of "more than it needs to" and plain inefficient queries. It can only do so much with what it knows about the database and the API used to call it.

Hand-written SQL gives the author a chance to be more precise about their needs, not only in the SQL itself but also by pre-creating the obvious indexes so things are performant from the get-go.


Appropriate indexes will be utilized (by the query planner) whether the SQL is hand-written or generated from some other query language. It doesn't make a difference. How could it?

And the whole point of GraphQL is that you specify exactly what data you need, so overfetching is avoided. This is in contrast to traditional REST APIs, where you get a fixed resource.


> Appropriate indexes will be utilized

If present. And for that they need to be created, and which ones are the right ones is less obvious when working higher up the abstraction staircase.

> It doesn't make a difference. How could it?

Because the SQL generated by ORMs can be wildly stupid in many cases. Here's one blog post with an example, where regular SQL and a query builder that maps almost directly to SQL generate a decent query on a simple relation, while a full ORM does something stupid: https://blog.logrocket.com/why-you-should-avoid-orms-with-ex...


The article shows one query generator which generates SQL that is logically equivalent to the hand-written SQL, and compares it to another tool which generates inefficient queries. So it is not an inherent problem with generated SQL, just with some particular tool. So... use the right tool?

GraphQL allows you to specify what data you need. Obviously, if you use some middleware which throws this information away and just fetches everything from the database, then you have an inefficient system. But this is not a problem inherent to GraphQL or generated SQL in general.


GraphQL is an instant API to frontenders.

Their justification for needing it is that the API team takes too long to implement changes, and endpoints never give them the data shape they need.

The silent reason is that server-side code, databases, and security are a big scary unknown they are too lazy to learn.

A big project cannot afford to ask for high standards from frontenders. You need a horde of cheap labor to crank out semi-disposable UIs.


There’s also a different side of the story: UI are very pricey to get right — that’s because user research is not cheap — so a very efficient approach is having disposable UIs since you know you are going to get them wrong at the start.

Carefully designed endpoints require a lot of back and forth between teams with very different skill sets, so to build them you need to plan in advance, but that does not work here, where you literally have to move fast (and fail fast!).

You only have two options left: (1) you ask frontenders (or UX-wise devs) to do the whole thing, or (2) you build an abstraction layer that lets frontenders query arbitrarily complex data structures with near-perfect performance (YMMV).

In case (1) you’re looking for REAL full-stacks, and it’s not that easy to find such talented developers. In case (2) well… that’s GraphQL.


> Their justification for needing it is that the API team takes too long to implement changes, and endpoints never give them the data shape they need.

I've certainly seen timing issues between frontend and backend teams; actually, I don't think I've ever been on a project where that wasn't an issue!

But on my last project, which had a GraphQL backend, this was still a problem. The backend integrated with several upstream databases and REST APIs, so the backend team had to build the queries and implementations before the frontend could do anything. At least with REST they would have been able to mock out some JSON more easily.


In my experience GraphQL can be much nicer to implement than REST, and it offers a good structure around things that many REST APIs implement in particular ways (like selecting which fields you want). The pain you'll experience depends heavily on your data model and the abuse potential that brings.

I think the biggest problem with GraphQL is the JavaScript ecosystem around it, and all of its implicit context. It seems to be built entirely on specific servers and clients, instead of on the general concepts.

Relay[1], a popular client-side library, adds all kinds of requirements in addition to the use of GraphQL. One of those is that until version 8, it required all mutation inputs and outputs to contain a "clientMutationId", which had to be round-tripped. It was an obvious hack for some client-side problem which added requirements to the backend. Somehow it had a specification written for it instead of being fixed before release. This hack is now in public APIs, like every single mutation in the GitHub API v4.
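
For reference, this is the shape it takes (ids shortened/illustrative; the addStar mutation is from GitHub's v4 schema):

    import { gql } from 'graphql-tag';

    // The Relay-classic mutation shape: the input carries a
    // clientMutationId that the server must echo back verbatim.
    const ADD_STAR = gql`
      mutation {
        addStar(input: { starrableId: "MDEwOlJlcG9zaXRvcnk=", clientMutationId: "1" }) {
          clientMutationId
        }
      }
    `;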

GraphQL also includes "subscriptions", which are described incredibly vaguely and frankly underspecified. There are all kinds of libraries and frameworks that "support subscriptions", but in practice they mean they just support the websocket transport[2] created by Apollo GraphQL.

If you just use it as a way to implement a well-structured API, and use the simplest tools possible to get you there, it's a pleasure to work with.

[1]: https://relay.dev/

[2]: https://github.com/apollographql/subscriptions-transport-ws


I don't think Relay is used that much outside of Facebook. The community seems to have settled on Apollo. Personally I find Apollo over-engineered. When I couldn't delete things from the cache because of a bug, and was faced with digging into the complex code-base, I ended up just using straight JSON with an HTTP client/fetch and caching in a simple JS object.

Other users of my API [1] just use straight HTTP with JSON as well. GraphQL clients seem to solve something we are not encountering. If urql [2] or gqless [3] work well when I try them I'd be up for changing my mind though.

[1]: https://github.com/kitspace/partinfo

[2]: https://github.com/FormidableLabs/urql

[3]: https://gqless.dev/


Relay has a PR problem and I'm not 100% happy with it, but I've been using it since 2015 (never worked at Facebook) and would still choose it over Apollo. The main reason is that Apollo is geared towards getting up and running easier, but this leads to a worse overall experience beyond that point.


Relay has quite a few users, a partial list of which you can find at https://relay.dev/en/users. I personally prefer it myself.


Tbh I'd expected a little better than framing this question in a "REST vs GraphQL" discussion coming from sourcehut.org. If you control your backend, you can aggregate whatever payloads you please into a single HTTP response, and don't have to subscribe to a (naive) "RESTful" way where you have network roundtrips for every single "resource", a practice criticized by Roy Fielding (who coined the term "REST") himself and rooted in a mindset I'd say is based more on cultural beliefs than engineering. That said, a recent discussion [1] convinced me there are practical benefits in using GraphQL if you're working with "modern" SPA frameworks and your backend team can't always deliver the ever-changing interfaces you need, so you're using a backend-for-frontend (an extra frontend-facing backend that wraps your actual backend) approach anyway, though it could be argued that organizational issues play a larger role here.

[1]: https://news.ycombinator.com/item?id=23119810


I like GraphQL, but if you're just serving a single SPA I wonder about all this busy work we still have to do. Why haven't we gone a step further and just abstracted all the networking and serialisation steps away, so our models are synced for us in the background? Maybe the Apollo team is heading in this direction, but their offline story isn't great yet.

Edit: I remember now that the Apollo team is made up of members of the former Meteor team, which worked in a similar way using a client-side database.


You can use rxdb for replication and offline support.


Yeah good point, WatermelonDB, rxdb and pouchdb all fit this model. This to me feels like the future for web and mobile.


Hopefully not. In an ideal world, this would be the responsibility of the browser, given how widespread web apps are.


I don't follow, how do you mean?


If you don't like REST, don't use it.

Whatever you do, don't even think that GraphQL will solve your problems. You were on the right track staying away from it till now.

I also can't advise strongly enough to stay away from a typed language (Go in this case) serving data in a different typed language (GraphQL). You will eventually be pulling your hair out jumping through hoops matching types.

After my last web project that required GraphQL and Go, I did some digging around, thinking there has to be a better alternative to this. I have worked with jQuery, React, and GraphQL.

My conclusion was that next time I will stick to turbolinks (https://github.com/turbolinks/turbolinks) and try stimulus (https://stimulusjs.org/).


> stimulus (https://stimulusjs.org/).

And here I thought Basecamp was still 100% rails. Interesting to see that they're also developing backend JS frameworks.


Stimulus is a frontend framework.


Turbolinks doesn't work with third-party (JS) anything.


Where are the GraphQL lessons learned? The author hasn't even implemented a solution with it yet, but that hasn't stopped him from declaring it to the world. I don't find an announcement useful.

Maybe GraphQL adopters aren't sharing their experiences with it in production because they're realizing its faults? People are quick to announce successes and very reluctant to own, let alone share, costly mistakes. Also, people change jobs so often that those who influence a roll-out won't even be around long enough for the post-mortem. GraphQL publicity is consequently positively biased. If the HN community were to follow up with posters who announced their use of GraphQL the last two years, maybe we can find out how things are going?


The reason this post was written was for users of Sourcehut, especially for people writing to its API. This post isn’t particularly relevant or explanatory for other audiences, but I don’t think it’s supposed to be so that’s fine.


I understand what you mean. That makes sense.


The author isn’t the one who shared the link on HN.


I think this article misses the explanation of why GraphQL over REST. I usually don't like "x versus y" articles, but here both have been tested on SourceHut, so the hindsight would probably be useful.

Thanks Drew and others for SourceHut.


Well, that's too bad. I always thought this was a cool project. But if you can't dev your way into decent performance for a small alpha project using Python/Flask/SQL, I don't think your tools are the problem. And I guarantee that GraphQL isn't the solution.

So, I mean, good luck.


I didn't read the post that way. I feel scaling is a small issue; it's more about organization. GraphQL does make sense for something like Sourcehut.

Why not use type hints in Python? Isn't that a good enough substitute?

I wonder why Go instead of Rust if he wanted static typing, long-term ease of maintenance and performance. Go's type system is not great, especially for something like GraphQL. Gqlgen relies heavily on code generation. Last time I used it, I ran into so many issues. I ditched Go altogether after several painful clashes with it that the community always answered with: oh, you don't need this.

(yeah except they implemented all those parts in arguably worse ways and ditched community solutions in the next few years)

One major benefit the GP fails to mention is that with GraphQL, it is easy to generate types for the frontend. This makes your frontend far more sane. It's also way easier to test GraphQL since there are tools to automatically generate queries for performance testing, unlike REST.

There is no need to add something for docs or interactivity like swagger.


As for the reason to use Go instead of Rust: it probably just comes down to the creator of Sourcehut; he has expressed multiple times that he dislikes Rust quite a bit.

He has written a blog post about how he chooses programming languages as well: https://drewdevault.com/2019/09/08/Enough-to-decide.html


It's the first time I've heard that Haskell has awful package management...


It did, a few years ago, before Cabal's v2-style commands. Nowadays package management is rather good, but its reputation hasn't caught up yet.


I hear the opposite: it's so awful they have to use Nix to keep it sane.


Interesting. Where did you hear that?


>One major benefit the GP fails to mention is that with GraphQL, it is easy to generate types for the frontend. This makes your frontend far more sane.

By GP (grandparent?) do you mean the article / blog post?

Because if so I see no indication that Drew plans to adopt a SPA architecture -- he seems intent on continuing to use server side rendering with little javascript, which would make "frontend types" sort of irrelevant.


You don't need an SPA to benefit from the generated types from GraphQL. If you use something like TypeScript on the backend (SSR app), the types will be stripped out in the end, so they don't affect your bundle size.


You must understand my confusion when your original comment explicitly states that frontend types can be generated, but your reply here seems to be talking about a javascript/typescript backend service.


Sorry, by backend I mean your SSR app.


What bundle size? There's no javascript being shipped to the client.


Technically there is some used for a few things, like a text editor (for writing build manifests) and for payments. But those are very limited and aren’t relevant to the use of GraphQL.


Code generation (of types and client/server stubs) can be done with any IDL (interface definition language). OpenAPI, gRPC, Thrift, etc. all support it. No reason to choose GraphQL only because of this.

The power in GraphQL comes from the graph and the flexibility in fetching what you need. Useful in general-purpose APIs (like GitHub's).


I never said it was the only reason. I said it is one of the benefits that GP failed to mention in the post. Other benefits that GP mentioned still applies.

You can of course do this with other standards, but IME it's easier to do with GraphQL, since you only have to build the API. There is less overhead overall, since type information is part of the standard, not something people add afterwards or choose to. Introspection, GraphiQL and all the tooling are easier to use, and there's no need to integrate something like Swagger.

It all comes set up by default on most solid GraphQL frameworks.


This quote stuck out to me:

>Today, the Python backends to the web services communicate directly with PostgreSQL via SQLAlchemy, but it is my intention to build out experimental replacement backends which are routed through GraphQL instead. This way, the much more performant and robust GraphQL backends become the single source of truth for all information in SourceHut.

I wonder how adding a layer of indirection can significantly improve performance. If I were writing this service, I would go all in on GraphQL and have the frontend talk to the GraphQL services directly rather than routing the requests from Python through to a GraphQL service then presumably to PostgreSQL.

Perhaps I am missing something. Indeed good luck to Drew here.


That quote is sort of exactly what's conceptually wrong with what's going on, in my opinion. Yes, I know, armchair quarterback and I'm not the one out there building stuff like this for free, etc., etc.

But claiming some nebulous backend that's more performant and robust than Postgres is like, WTF? Are you using an actual GraphDB like Neo4J? Are you putting a graph frontend on Postgres like PostGraphQL? None of the post really makes any sense because GraphQL is a Query Language, not a data store. What are the CAP theorem tradeoffs in the new backend? What does more robust mean? What does more performant mean? This is a source control app. Those tradeoffs are meaningful.

There seems to be a lot of conflation between API design and data store and core programming tools all mixed into a big post that mostly sounds to me like, "I don't get how to make this (extremely popular and well-known platform that drives many websites 10000x my size) work well, so I'm trying something different that sounds cool."

Which, again, the author has always said this is an experiment, and that's cool. But the conceptual confusion in the post makes me think that moving away from boring tools and trying new tools is not going to end up going well.

But this is a source control app, and it's hopefully backed up somewhere besides sourcehut so it should be fine if he needs to backtrack.


> Are you using an actual GraphDB like Neo4J? Are you putting a graph frontend on Postgres like PostGraphQL? None of the post really makes any sense because GraphQL is a Query Language, not a data store. What are the CAP theorem tradeoffs in the new backend? What does more robust mean? What does more performant mean? This is a source control app. Those tradeoffs are meaningful.

GraphQL isn't particularly "graphy". Its name sucks. But don't worry, plenty of half-techy middle managers are out there making the same mistake and going "we do graph things, why don't you guys look into this GraphQL thing that's getting so much buzz?" It's not a great fit for graph operations, in fact. Not more than SQL, certainly.

As for N4J in particular, don't count on that to improve performance even if you're doing lots of graph stuff. Depends heavily on your usage patterns and it's very easy to modify a query in a way that seems like it'd be fine but in fact makes performance fall off a cliff. OTOH Cypher, unlike GraphQL, is a very nice language for querying graphs.


The goal here is to generate a typed API across a bunch of microservices (written in some typed language suited for the job) that are consumed by a Python frontend. The current design is a pile of vertically-integrated monoliths that touch the disk, database, perform backend operations and rendering all in one process.

Python's single-threaded design makes it difficult to be responsive to small queries quickly while simultaneously serving large, time-consuming queries (i.e. git operations). You can get around this using worker queues to separate interpreter processes and an async design, or otherwise splitting your workload up... or you can use a language where "have a threadpool" is actually a properly supported concept, and an architecture where sharding git/email/etc backends is feasible.


> The goal here is to generate a typed API across a bunch of microservices

You are describing gRPC.


I'm describing a lot of things, JSON-Schema documented REST APIs being another one. The other thing about GraphQL is that you can make a query that contains multiple requests and allows the server to optimise how to process them, which is not something that REST or gRPC are very good at.
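
As a sketch (field names hypothetical), a single request like this fetches two unrelated resources at once, and the server is free to resolve them however it likes, including concurrently:

    import requests  # third-party HTTP client

    QUERY = """
    {
      me { name }
      repository(name: "example") { defaultBranch }
    }
    """

    resp = requests.post("https://example.com/graphql", json={"query": QUERY})
    print(resp.json()["data"])

The REST equivalent is two round trips, or a bespoke combined endpoint.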


Well said! Couldn't agree more.

The GraphQL confusion is one more piece of bullshit in the world of web dev.


I'm not totally against GraphQL in general. As an alternative to REST it can sometimes make sense. And let's be real, most REST APIs are absolute garbage. Anything would be better than a bad REST API.

And if, in fact, you are storing a graph in a graph database, the QL makes a bit of sense.

But nothing in the post makes any sense out of any of that. It's just Python bad; REST bad; I read too much hacker news, and I feel like it's time for a change.

Like, when I complain about other people's REST APIs, that's out of my control. This guy is saying that his API is garbage, and instead of fixing it to make it better, he's just going to redo everything with a worse result. I don't get it.


It will be performing this indirection over localhost. I don't see it as much different from the indirection of SQLAlchemy. Yes, there is the question of parsing the GQL and so on, but I think that they're surmountable and fit well within our desired performance budget.

Performance is also just one of many reasons why this approach is being considered.


Performance is a secondary concern. SourceHut already has the best performance in the industry, built on Python:

https://forgeperf.org

But I think it could be even better, and this work will help. It will make it easier to write performant code without explicitly hand-optimizing everything.

There are more important reasons to consider GraphQL than performance, which I cover in detail in TFA.


> And I guarantee that a graphql isn't the solution.

I agree. Out of the frying pan and into the fire.

He had a good idea though.


what do you feel the problem is?


There are lots of production sites that serve 10,000x as much traffic as sourcehut that are built on Python/flask/sqlalchemy serving RESTful APIs.

If you can't make that combination work well, there's another place to look for problems besides your tool kit. You might need to ask yourself if you really understand the tools you're trying to use.

But like I said, this has always been a very cool project. My "good luck" was meant more as actual good luck than a Morgan Freeman You're-trying-to-blackmail-batman kind of good luck.


What are some production websites that run Python stacks? I’m curious.

On a tangential note: if anybody has blog posts on scaling Flask/SQLAlchemy or Django stacks I would appreciate it.


Reddit is one you might have heard of.

pypi.org is another that's familiar. You know, every time you type `pip install x` yeah, that's pypi.

Although I think those are both powered mainly by Pyramid rather than flask. Still, same concept.

As others mention, large parts of google and youtube are still python. Dropbox was so invested in python that they employed Guido van Rossum for a while. Instagram, a lot of Yahoo! back when they were a thing, Spotify, Quora, Pinterest, Hipmunk, Disqus, and this really obscure satire site called The Onion that totally never gets any traffic at all.

All of them powered by python at their core, many of them Django, some Pyramid, and some Flask.

Yes, getting that big does require big teams. Becoming one of the top 100 or so sites on the internet always requires some special sauce as well as dedicated teams. But most of these companies started with Python and a framework and got to massive web scale along the way and never changed the core platform because there really wasn't a need. Handling scale isn't about your core language or framework. It's about dozens of other things that you can offload to other things if you're smart. But let's be real: sourcehut isn't close to any of that level of traffic.

My negativity on this isn't about stanning a particular language. I'm an agnostic in multiple ways. I'll use whatever tool seems like the best fit. I'm down on this because the explanation is tool-blaming, murky, unclear, and doesn't provide a lot of the detail I would want to have if I were depending on this service.

On the other hand, the guy has always said this is an alpha project and you should expect major changes. That's all fine. It's just weird to me to see a "why I'm changing from X to Y" post that doesn't really explain anything other than "I might be bad at this."


jfyi, The Onion hasn't run on Python for a few years now. In 2017, they migrated over to Gizmodo's Scala stack.


Don't Instagram and YouTube still have large Python code bases?


Dropbox is a notable one born here.


Especially considering they employed the BDFL.


No longer the BDFL, unfortunately.


The BDFL is a position that you have for life. It is embedded in the definition. So it is impossible to shed the title.


Like most “for life” positions (the papacy, US federal judges, the British monarchy, etc.), it can be shed though it is not regularly expected to be lost other than at the holder's decision.


Might I suggest taking a look at FastAPI?

It’s been a 10x+ improvement on Flask, in my experience.


GraphQL as a query language is simply better than REST in most cases imo. REST has too much client side state, which not only has the potential to make things harder for clients to consume, but also has all the inconsistent states to handle where your consumer gets part way through a multiple-REST method workflow, and then bails. REST also absolutely sucks for mutating arrays.

Really I just look at GraphQL as a nice RPC framework. The graph theory operations like field level resolvers are mostly useless. But if you treat each relationship as a node rather than each field, you can get it to work very nicely with a normalized data set. I haven’t found it hard to preserve join efficiency in the backend either, and it so far hasn’t forced me into redundant query operations.
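
A minimal sketch of that pattern using graphene (types and data hypothetical): resolve at the relationship, and the scalar fields underneath come along for free, so one join/lookup covers however many of them the client asks for:

    import graphene  # third-party GraphQL library

    class Author(graphene.ObjectType):
        name = graphene.String()
        country = graphene.String()

    class Book(graphene.ObjectType):
        title = graphene.String()
        author = graphene.Field(Author)

        def resolve_author(root, info):
            # one lookup here serves every Author field requested;
            # no per-field resolvers, no N+1 on scalars
            return Author(name="Ursula K. Le Guin", country="US")

    class Query(graphene.ObjectType):
        book = graphene.Field(Book)

        def resolve_book(root, info):
            return Book(title="The Dispossessed")

    schema = graphene.Schema(query=Query)
    result = schema.execute("{ book { title author { name } } }")
    print(result.data)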

Just as long as you don’t use appsync. Really, don’t even bother.


> GraphQL as a query language is simply better than REST in most cases imo. REST has too much client side state, which not only has the potential to make things harder for clients to consume, but also has all the inconsistent states to handle where your consumer gets part way through a multiple-REST method workflow, and then bails.

How much client state you maintain seems to me to be orthogonal to GraphQL/REST.

Take your example of a multiple-REST workflow. I presume your point was that the workflow could be implemented by a single GraphQL query/mutation/whatever - but just the same, you can put as much code and logic as you like behind a single REST call?


You could do that, but if you start creating endpoints for transactions rather than method -> resource endpoints, then you’re not really making a REST interface anymore. But even ignoring REST purity, I’d argue that GraphQL is better suited to that design pattern in general.


When most people refer to REST APIs, they really mean "HTTP APIs". I really don't think we should reach for GraphQL just because of the ideological notion of daring to not adhere 100% to REST.


If you’re going to abandon REST design principles, then you’ll eventually end up with something that pretty much does what GraphQL does. At that stage why not just adopt a more fit-for-purpose set of design principles?

Keep in mind that I’m only really advocating for it as a query language for HTTP APIs (which as a side benefit has some nice existing tooling which you may or may not find useful).


> With these, you can deploy a SourceHut instance with no frontend at all, using the GraphQL APIs exclusively.

This reminds me of Kubernetes' design. You have an API server which is practically the Kubernetes from the user's perspective. `kubectl` is just one out of possibly many clients that talk to this API.

Edit: typos.


> The value-add is difficult to understand

yup


The author says that he has soured on Python for “serious, large projects”. While it’s clearly personal opinion, and that’s fair enough, I can’t help but think his choice of framework hasn’t helped him and has likely caused significant slowdown when delivering features.

Looking through some of the code for Sourcehut, there’s an insane amount of boilerplate or otherwise redundant code[1]. The shared code library is a mini-framework, with custom email and validation components[2][3]. In the ‘main’ project we can see the views that power mailing lists and projects[4][5].

I’m totally biased, but I can’t help but think “why Flask, and why not Django” after seeing all of this. Most of the repeated view boilerplate would be gone ([1] could be like 20 lines), the author could have used Django REST Framework to get a quality API without much work (rather than building it himself[6]), and the pluggable apps at the core of Django seem a perfect fit.

I see this all the time with Flask projects. They start off small and light, and as long as they stay that way, Flask is a great choice. But they often don’t, and as they grow in complexity you end up re-inventing a framework like Django but worse, whilst getting fatigued by “Python” being bad.

1. https://git.sr.ht/~sircmpwn/paste.sr.ht/tree/master/pastesrh...

2. https://git.sr.ht/~sircmpwn/core.sr.ht/tree/master/srht/emai...

3. https://git.sr.ht/~sircmpwn/core.sr.ht/tree/master/srht/vali...

4. https://git.sr.ht/~sircmpwn/hub.sr.ht/tree/master/hubsrht/bl...

5. https://git.sr.ht/~sircmpwn/hub.sr.ht/tree/master/hubsrht/bl...

6. https://git.sr.ht/~sircmpwn/paste.sr.ht/tree/master/pastesrh...


Exactly agreed. I basically only use Flask for things I want to explicitly be single-file these days. For anything larger, I reach for Django, because I know that if I need at least one thing from it (and I always need the ORM/migrations/admin), it will have been worth it.

My current favorite way of building APIs is this Frankenstein's monster of Django/FastAPI, which actually works quite well so far:

https://www.stavros.io/posts/fastapi-with-django/

FastAPI is a much better way of writing APIs than DRF, I wish it were a Django library, but hopefully compatibility will improve as Django adds async support.


I built the backend for my knowledge-base platform[0] using Flask originally, but performance was definitely a struggle so I rewrote the whole thing with FastAPI. Have definitely seen a serious performance bump from that switch, and currently am quite happy with it. Many of our users are actually impressed with how fast everything is on the platform.

I still want to rip out SQLAlchemy ORM and replace it with pure SQL via `asyncpg`, as SQLAlchemy ORM is not async and that causes a bunch of extra switching in the backend that certainly doesn't help eke out more perf, but at the moment it's a bit too much effort and users are happy.
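
A minimal sketch of that pure-SQL approach (DSN, table, and columns hypothetical):

    import asyncio
    import asyncpg  # third-party async PostgreSQL driver

    async def fetch_cards(user_id: int):
        conn = await asyncpg.connect("postgresql://localhost/app")
        try:
            # $1-style placeholders; no ORM layer in between
            rows = await conn.fetch(
                "SELECT id, title FROM cards WHERE owner = $1", user_id
            )
            return [dict(r) for r in rows]
        finally:
            await conn.close()

    print(asyncio.run(fetch_cards(42)))

(In a real service you'd hold a connection pool via asyncpg.create_pool rather than connecting per call.)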

Scaling is handled by just throwing more instances of the application at the problem, behind a load-balancer.

[0] https://supernotes.app


You can use the async databases[0] library, and there's a guide[1] for it. It's not a full ORM but works pretty well :)

[0] https://www.encode.io/databases/

[1] https://fastapi.tiangolo.com/advanced/async-sql-databases/


That sounds like a good solution, and is a good data point to know, thank you. Did you try sync FastAPI? I'm wondering how its performance compares with async.


When you say sync do you just mean having sync endpoints? If so, then yeah, it's required since we're using SQLAlchemy ORM. Otherwise the calls to SQLA ORM would block the main event loop. As it is, FastAPI (well really Starlette) creates threads for sync endpoints to prevent blocking the main thread/event loop.
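
A tiny sketch of the difference (endpoint names hypothetical):

    from fastapi import FastAPI

    app = FastAPI()

    @app.get("/sync")
    def read_sync():
        # plain `def`: Starlette runs this in a worker thread, so blocking
        # calls (e.g. the SQLAlchemy ORM) don't stall the event loop
        return {"mode": "threadpool"}

    @app.get("/async")
    async def read_async():
        # `async def`: runs on the event loop itself, so it must not block
        return {"mode": "event loop"}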

So yeah, still seeing good speedups in our own benchmarks even though most of our endpoints are sync.

What was arguably more important though was how much switching to ASGI helped with handling WebSockets. We're using SocketIO, and trying to get a fundamentally async protocol working within sync (Flask) land was a massive pain. We had repeated reliability and deployment issues that were very hard to debug. Switching to FastAPI made that much easier.


Oh, I can imagine, ASGI must be immeasurably easier. Where do the async speedup gains come from, though, if your database is still sync? Wouldn't threadpools provide comparable performance before?


Well so actually we have both now.

For WebSockets, all of the code is async, so I'm already using `asyncpg` for any database stuff that is happening there.

With regards to why the sync endpoints are faster, I think it is a number of things, some of which are userland changes that could've been made under Flask, but all of which are somewhat related to the switch. With regards to things that FastAPI itself has changed, I think using a (de)serialization lib like Pydantic and serializing to JSON by default (which is what we were doing under Flask anyway, though with Marshmallow) makes a lot of the code paths in the underlying lib a bit faster, because with Flask there was more "magic" going on behind the scenes. For userland stuff, I think partly because there is less magic going on in the background (I really like FastAPI's dependency injection system), it's been easier to identify the bottlenecks and optimize hot code paths.


That makes perfect sense, thank you. I love FastAPI just for the code clarity and ease of working with better type objects (the Pydantic classes) alone, though the speed benefit is nice to have too.


I maintain a fairly complex Flask application and cannot see a better tool for that job. Our code looks similar as well. It's boilerplate repeated often, for sure, but there will always be that one endpoint where you need the flexibility to do something a highly opinionated framework just won't let you do. In the end it's deciding whether you write some extra code with flexibility or some extra code fighting the framework.

Can you show me a comparable codebase in django and how it looks? I'm genuinely curious how people deal with edge cases.


I haven't used Django in years, so maybe things have changed, but I recall two incidents that stick in my mind and prevent me from taking the whole project seriously.

The first was when they removed tracebacks. Singularly useless thing to do IMO. But there's a --show-tracebacks option (or something like that, it was a long time ago) to show tracebacks, but it didn't work. I dug into the code for this one. IIRC, the guy who added the code to suppress tracebacks didn't take into account the CLI option. I patched it to not suppress tracebacks but there turned out to be another place where tracebacks were suppressed, and I eventually gave up.

The second incident (although, thinking about it, they happened in chronologically reversed order) was when a junior dev came to me with a totally wacky traceback that he couldn't understand.

All he was trying to do was subclass the HTML Form widget, like a good OOP programmer, but it turned out that Django cowboys had used metaclasses to implement HTML Forms, and utterly defeated this poor kid.

I was so mad: Who uses metaclasses to make HTML forms? Overkill much?

(In the event the solution was simple: make a factory function to create the widget then patch it to have the desired behaviour and return it. But you shouldn't have to do that: OOP works as advertised, why fuck with a good thing?)

So, yeah, Django seems to me to be run by cowboys. I can't take it seriously.

FWIW, I'm learning Erlang/OTP and I feel foolish for thinking Python was good for web apps, etc. Don't get me wrong, I love Python (2) but it's not the right solution for every problem.


I appreciate your reply, but Django never “removed tracebacks” and I would love to see a declarative form that didn’t use metaclasses in some way.

Here’s the ~20 lines of cowboy code you’re referring to[1] - collecting the declared fields and setting an attribute containing them.

Not exactly the kind of thing that should make you mad, and rather than overkill it’s exactly the use case for metaclasses.

And to top it off, that metaclass is completely optional, if you want to create a list of fields and pass it into BaseForm then go for it. Most don’t.

1. https://github.com/django/django/blob/5776a1660e54a951591644...


> I appreciate your reply

Cheers!

> Django never “removed tracebacks”

I don't want to get into a juvenile back and forth, but I must insist that Django did so suppress tracebacks. I don't know what it does today but I remember clearly patching the server code to re-enable them and that the '--traceback' CLI switch didn't do it.

> I would love to see a declarative form that didn’t use metaclasses in some way.

Here you go: https://pypi.org/project/html/

Clever design, elegant code, under 20k (including docs and tests), no metaclasses.

> Not exactly the kind of thing that should make you mad, and rather than overkill it’s exactly the use case for metaclasses.

There is no use case for metaclasses. GvR called that paper "The Killing Joke" for a reason. ( https://www.python.org/doc/essays/metaclasses/ https://www.youtube.com/watch?v=FBWr1KtnRcI ) I read it for kicks, because I'm that kind of freak, but it's not the kind of thing that you should ever use in production code.

What made me mad is that Django's gratuitous use of metaclasses broke my junior dev. The kid was doing the right thing and it was exploding in his face with an inscrutable error message: that's on Django.


> Here you go: https://pypi.org/project/html/

That’s not even the same thing - it’s a (horribly old) library for generating HTML.

We’re talking about forms: sets of typed fields including validations that can be optionally rendered to HTML. Think wtforms[1]

> There is no use case for metaclasses.

There are; that essay (about Python 1.5, no less) does little to dissuade people from using them, going so far as to offer concrete code samples.

It’s also hopelessly outdated: nobody uses metaclasses like that at all, especially not for tracing! It’s hard to blame it though; this document was written before even decorators were introduced.

And let’s not ignore the appeal to authority of pointing out that GvR uses metaclasses extensively while working on type hints, GAE libraries, and even in his early asyncio code.

In actual fact metaclasses have a few use cases, including the most common: syntactic sugar. Like anything, this can be heavily abused, and is most useful when creating libraries rather than within traditional application code. In any case, shunning them wholesale is stupid.

Half-remembered issues with junior developers are not great arguments against a useful part of a language. Who’s to say it was even related to metaclasses, and that your apparent allergy to them isn’t colouring your memory?

1. https://wtforms.readthedocs.io/en/2.3.x/crash_course/#gettin...


I really don't want to argue about this.

You're not going to convince me that Django isn't an overblown toy. I'm not going to convince you that it is.

Same with metaclasses, you're not going to convince me that they're a good idea (despite what GvR does with them) and I'm not going to convince you that using them is irresponsible.

So what are we left with?

> That’s not even the same thing

But html.py (would have) solved the problem we had, without the nasty surprise.

> In any case, shunning it wholesale is stupid.

No, it's conservative. That's a different thing.

I want to be able to hire someone who can modify a form. The more complex and obscure the code is (even if it's only twenty lines long), the smaller the pool of folks who can use it with mastery.

Think about it.

Anyway, I'm off learning Erlang/OTP now and it really makes Python's runtime look like a joke in comparison. Web-app backends are Erlang-shaped, not Python-shaped. Not using it sooner makes me feel stupid.


Some devs prefer clear, flexible and performant/unambiguous code to 20 layers of abstraction.


They do indeed. But then they realise that copy-pasting repetitive authentication and validation boilerplate everywhere leads to inconsistencies, bugs and at worst security issues.

Then they build their own abstractions. And then, congratulations, they’ve spent longer than they should have to end up with a worse version of Django, that nobody but them finds “clear” or “unambiguous”.

If only we could capture these common, repetitive and important patterns and put them in some kind of library. A “framework”, if you will. That way you don’t need to copy-paste this stuff over and over again, and anyone who knows the library will find it clear and unambiguous!

In fact this is such a good idea that I’m going to do it myself. I’ll call the library Franz, after a famous pianist.


Based on the top comments in this thread, I can see my previous attempts to dispel mistaken beliefs around graphql have failed, but I'll still try!

1. The biggest mistake GraphQL made was putting 'QL' in the name so people think it's a query language comparable to SQL. It's not: https://news.ycombinator.com/item?id=23120997

2. Some benefits of GraphQL over REST: https://news.ycombinator.com/item?id=23124862


I suspect as long as https://graphql.org says what it does, you're going to have a hard time with that fight...


I've been using home-grown RPC for my servers: POST + JSON over HTTP, or binary serialization for lower-level access. Works like a charm, and has for years. Never felt like I was missing anything.
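
A minimal sketch of that style (method names hypothetical), using Flask since this thread is Flask-heavy:

    from flask import Flask, jsonify, request

    app = Flask(__name__)

    METHODS = {
        "user.get": lambda params: {"id": params["id"], "name": "example"},
    }

    @app.route("/rpc", methods=["POST"])
    def rpc():
        payload = request.get_json()
        handler = METHODS.get(payload.get("method"))
        if handler is None:
            return jsonify({"error": "unknown method"}), 404
        return jsonify({"result": handler(payload.get("params", {}))})

One POST endpoint, one JSON envelope, no REST resource modelling to argue about.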


I wanted to learn GraphQL recently and I wrote a small library to automagically generate GraphQL schemas from SQLAlchemy models. [1]

It's inspired by Hasura, the schema is almost the same. It's not optimized at all, but it's a nice way to quickly get started with GraphQL and expose your existing models.

[1] https://github.com/gzzo/graphql-sqlalchemy


So I've now had the opportunity to use both GraphQL and protocol buffers ("protobufs" is the more typical term) professionally and I have some thoughts on this.

1. Protobufs use integer IDs for fields. GraphQL uses string names. IMHO this is a clear win for protobufs. Changing the name of a field in GraphQL is essentially impossible. Once a name is there, it's there forever (e.g. mobile client versions are out there forever), so you're going to have to return null from it and create a new one. In protobufs, the name you see in code is nothing more than the client's bindings. Get a copy of the .proto file, change a name (but not the ID number) and recompile, and everything will work. The wire format is the same (see the sketch after this list);

2. People who talk about auto-generating GraphQL wrappers for Postgres database schemas (not the author of this post, to be clear, but it's common enough) are missing the point entirely. The whole point of GraphQL is to span heterogeneous and independent data sources;

3. Protobuf's notions of required vs optional fields was a design mistake that's now impossible to rectify without breaking changes. Maybe protobuf v3/gRPC did this. I'm honestly not sure.

4. Protobuf is just a wire format plus a way of generating language bindings for it. There are RPC extensions for this (Stubby internally at Google; gRPC externally and no they're not the same thing). GraphQL is a query language. I do think it's better than protobufs in this regard;

5. GraphQL fragments are one of these things that are probably a net positive but they aren't as good as they might appear. You will find in any large codebase that there are key fragments that if you change in any way you'll generate a massive recompile across hundreds or thousands of callsites. And if just one caller uses one of the fields in that fragment, you can't remove it;

6. GraphQL does kind of support union types (e.g. foo as Bar1, foo as Bar2) but it's awkward, and my understanding is the mobile code generated is... less than ideal. Still, it's better than not having it (sketch below). The protobuf equivalent is to have many optional submessages, with no way to express that only one of them will be populated;

7. Under the hood I believe the GraphQL query is stored on the server and identified by ID but the bindings for it are baked into the client. Perhaps this is just how FB uses it? It always struck me as somewhat awkward. Perhaps certain GraphQL queries are particularly large? I never bothered to look into the reason for this but given that the bindings are baked into the code it doesn't seem to gain you much;

8. GraphQL usage in Facebook is pervasive and it has first class support in iOS, Android and React. This is in stark contrast to protobufs where protobuf v2 in Google is probably there forever and protobuf v3/gRPC is largely for the outsiders. It's been several years now since I worked at Google but I would be shocked if this had changed or there was even an intention of changing it at this point;

9. The fact that you can do a GraphQL mutation and declare what fields are returned is, IMHO, very nice (sketch below). It saves really awkward create/update-then-re-query hops.

10. This is probably a problem only for Google internally but another factor on top of protobuf version was the API version. Build artifacts were declared with this API version, which was actually a huge headache if you wanted to bring in dependencies, some of which were Java APIv1 and others Java APIv2. I don't really understand why you had to make this kind of decision in creating build artifacts. Again, maybe this has improved. I would be surprised however.
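
To make points 1, 6, and 9 concrete, a few sketches (all names hypothetical):

    # 1. Protobuf's wire identity is the field number, not the name, so a
    #    rename that keeps the tag stays wire-compatible:
    PROTO_RENAME = """
    message User {
      string handle = 1;  // was `string username = 1` -- same tag, same wire format
    }
    """

    # 6. GraphQL union results are consumed with inline fragments that
    #    dispatch on the concrete type:
    SEARCH_QUERY = """
    {
      search(term: "alpha") {
        ... on Repository { name }
        ... on User { username }
      }
    }
    """

    # 9. A mutation declares the fields it wants returned, avoiding a
    #    separate re-query after the write:
    CREATE_MUTATION = """
    mutation {
      createTicket(subject: "bug report") { id created }
    }
    """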

Lastly, as for Sourcehut, I had a look at their home page. I'm honestly still not exactly sure what they are or what value they create. There are 3 pricing plans that provide access to all features so I'd have to dig in to find the difference (hint: I didn't). So it's hard for me to say if GraphQL is an appropriate choice for them. At least their pages loaded fast. That's a good sign.


> People who talk about auto-generating GraphQL wrappers for Postgres database schemas (not the author of this post, to be clear, but it's common enough) are missing the point entirely. The whole point of GraphQL is to span heterogeneous and independent data sources;

I don't think they are missing a point; rather, they have a completely different point: eschew a backend and use GraphQL on a DB + a frontend that gets all that data. If you're developing rapidly and don't have complex backend logic, I can see why you'd want to do that.


I think the problem with this approach becomes apparent when the database model changes: suddenly the API doesn't match the model anymore, or the API needs to change. Both can be difficult problems leading to breaking changes in the client(s). Of course, for some rapid testing and simple personal projects this doesn't matter so much, but for most other projects I think this might bite you hard in the future.


> 3. Protobuf's notions of required vs optional fields was a design mistake that's now impossible to rectify without breaking changes.

Can you elaborate on this?

I've barely used protobufs, but in thrift I've found optional fields to be very useful.


I think you're on the same page.

The gist is that while you'd hope to have a stable interface, in reality things change. By marking fields required, they basically have to stay that way forever so you can support older clients. In doing so, you create a brittle API that may not be able to evolve with changing requirements.

For this reason, I believe Google got rid of required fields in proto3 (everything's either implicitly optional, or a repeated field which may have 0 elements).
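
A quick sketch of what changed (message contents hypothetical):

    # proto2: `required` can never be relaxed or removed without breaking
    # old clients, so the schema can't evolve.
    PROTO2 = """
    syntax = "proto2";
    message Login {
      required string username = 1;
      optional string totp_code = 2;
    }
    """

    # proto3: the keyword is gone; every singular field is optional, with a
    # default value when absent.
    PROTO3 = """
    syntax = "proto3";
    message Login {
      string username = 1;
      string totp_code = 2;
    }
    """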


Ah! I read your comment as "optional was a mistake", not "required was a mistake". We are indeed on the same page.


Addressing #7: GraphQL queries are dynamic and not stored on the server.


> Another (potential) advantage of GraphQL is the ability to compose many different APIs into a single, federated GraphQL schema.

If anyone else can share experiences of this sort of problems and solution, I'd be really interested to hear it. I've written non-GQL APIs before that back onto other internal and external services; what am I missing?


I think Drew is referring to composition in terms of API end-user code, not sr.ht code, e.g. making it possible for the user write a single GQL query that combines data from multiple sr.ht services.


Ah, I think that makes a little more sense. Thanks!


I've mentioned my (negative) experience elsewhere in this thread. It wasn't actually me doing the implementation, but the result was a horrendous, unmaintainable and slow code base. Ultimately, no part of the team benefited from the use of GraphQL instead of REST.

To be fair, the same devs that built it using GraphQL would likely have made many of the same mistakes with a REST API, but I do feel it would at least have been easier to reason about the code.


A lot of places have a hell of a time dealing with very nested GraphQL queries that effectively DoS your app servers, or a resolver that’s very slow. Caching is also an open question. For all but the simplest systems, I’d hesitate to recommend GraphQL at this point.


Author here. Wow, there is a ton of "didn't RTFA" comments here. It seems like half of this thread saw GraphQL in the title, it knocked two gears into place in their head, and they started writing up their own little essay about how bad it is.

I evaluated GraphQL twice before, and discarded it for many of the reasons brought up here. Even this time around, I give a rather lackluster review of it and mention that there are still many caveats. It's not magic, and I'm not entirely in love with it - but it's better than REST.

Query optimization, scalability, authentication, and many other issues raised here were part of the research effort and I would not have moved forward with it if I did not feel that they were adequately addressed.

Before assuming some limitation you have had with it in the past applies to sr.ht, I would recommend reading through the code:

https://git.sr.ht/~sircmpwn/git.sr.ht/tree/master/api

https://git.sr.ht/~sircmpwn/gql.sr.ht

If you're curious for a more detailed run-down of my problems with REST, Python, Flask, and SQLAlchemy, I answered similar comments last night on the Lobsters thread:

https://lobste.rs/s/me5emr/how_why_graphql_will_influence_so...

I would also like to point out that the last time I thought a rewrite was in order, we got wlroots, which is now the most successful project in its class.

Cheers.


I tried GraphQL but my resolvers became extremely complicated. Am I missing something or is this just how it goes?


One thing I've noticed is it seems redux and graphql are solving the same problem in an entirely different way


Redux brands itself as a tool for managing state. How does GraphQL manage state?


The way I see it is Redux is a local normalized store which can then be used via the selector pattern to generate custom objects / views. With GraphQL you are requesting that derived data to be generated for you via the query language.

Regarding managing state I don't see GraphQL helping with that at all.


I firmly believe GraphQL is a fad. It will never work, exposing the ORM to your clients.

I'm curious, why don't we see more public-facing APIs using gRPC?


Currently working on a public-facing gRPC API and it's much more faffy than a REST API. Browsers don't talk gRPC which means you need to support gRPC-web endpoints. You probably need a gRPC aware proxy too and most of those have their own challenges. Some stuff doesn't like protobufs which means you need JSON transport which means protoc plugins. etc.etc. It's all just more work for (IMHO) little gain.


gRPC is awkward on the web because it assumes HTTP/2, and the web is still half HTTP/1. Also, gRPC does almost everything, but not quite: it doesn't know how to synchronize state in general. It is a heavyweight choice considering that you may only need protobuf. And protobuf is not natively supported in browsers; web devs can't agree on serialization libs, only on the serialization format. And they didn't choose protobuf ;)


Big OOF. Even the flow of the article is muddied to the point where it takes significant effort to dig out the main points of OP's migration from REST to GraphQL. What I got from it:

> The system would become more stable with the benefit of static typing, and more scalable with a faster and lighter-weight implementation.

OK static typing I'm with you so far. Faster and lighter weight? I'm not so sure, sounds like you're having troubles with Flask and SQLAlchemy completely unrelated to REST. All production graphQL implementations I've seen are very heavy when they add in authentication and more advanced query capabilities. Is this REALLY so superior to REST?

> Another (potential) advantage of GraphQL is the ability to compose many different APIs into a single, federated GraphQL schema.

I guess the discoverability of GraphQL is better, but 90% of APIs on the internet prove that large REST APIs are very effective and achieve the same thing.

> I also mentioned earlier that I am unsatisfied with the Python/Flask/SQLAlchemy design that underlies most of SourceHut’s implementation. The performance characteristics of this design are rather poor, and I have limited options for improvement. The reliability is also not something I am especially confident in.

This is where you completely lose me. It's fine if you hate ORMs, its fine if you hate the SQLAlchemy API, but you're blaming your hammer for the fact that you think you built a shoddy house. Going out and buying a hammer won't fix the fact that you're lining up your nails all wrong.

> The GraphQL services are completely standalone, and it is possible to deploy them independently of the web application...

> ...it is my intention to build out experimental replacement backends which are routed through GraphQL instead.

I think these two captions go together, are you describing microservices? This can be achieved just fine with REST using simple load balancing strategies based on url routing or similar.

> Almost all of SourceHut today is built with Python, Flask, and SQLAlchemy, which is great for quickly building a working prototype. This has been an effective approach to building a “good” service and understanding the constraints of the problem space. However, it’s become clear to me that this approach isn’t going to cut it in the long term, where the goal is not just “good”, but “excellent”

This is a classic example of using a handful of annoying issues to justify an exciting large re-write that doesn't actually address the main issues you are having. If you are struggling with the SQLAlchemy library you will find alternative (and perhaps larger) struggles in all GraphQL implementations. Best of luck, this is a road I would not follow you on. Seriously though I wish you the best and hope your product succeeds despite this.


This is a better way of saying what I've tried to say in like 3 comments.


Does this mean that SourceHut will become completely written in Go? Amazing news if so!


Why? I am currently converting a Flask/SQLAlchemy application to Go and gRPC. Go has not been that much of a win from my perspective. Lots of rough edges and missing functionality in libraries available, a lackluster type system that does enough to get in the way but not enough to properly express programmatic intention, and an atypical non-ideal error handling model have left me with little reason to champion Go. Not that it is a terrible language, I just do not see the value add compared to even Python (let alone Kotlin or Rust).


Had a similar experience.

Typed languages are great for systems development and, I think, not so good for writing web applications.

I also think Ruby, Python, and JS have dominated the web dev world largely because they don't get in the developer's way by forcing them to constantly convert HTML (untyped) and JS (untyped) into types for the backend programming language.

Remember how ActiveRecord (not sure about SQLAlchemy) simply took away the pain of managing types? You didn't have to explicitly parse or cast "0.001" into 0.001.


ActiveRecord is parsing those types, you just don't do it explicitly. IME, a dynamic language may not get in your way when you need to go between JS and the backend, but it also certainly won't get in your way when you're about to shoot yourself in the foot in a large-ish project with errors that could've been caught at compile.


I have had a very pleasant experience using types in a FastAPI project (detailed here): https://www.stavros.io/posts/fastapi-with-django/

I quite like Python's type system, it's not very mature yet but it's definitely good enough to already catch a lot of bugs.


Could you name some specific rough edges and missing functionality that have frustrated you? I'm a big fan of Go and have a good experience with it, so I'd be very curious to hear from someone with an opposite experience.


On several occasions I have gotten panics (segfaults), many times without using CGo. This is not something I would expect from a "high level" language that has a garbage collector.

The constant use of generated code is another real pain point, particularly when I am writing business logic that needs to operate on generated types for which there is no interface available that does what I need (and why would there be? how would the library/codegen-tool author know all the permutations of business logic that might be out there?).

The sql library has some fairly annoying documentation, e.g.

> Scan implements the Scanner interface.

https://golang.org/pkg/database/sql/#NullBool.Scan

> Scanner is an interface used by Scan.

https://golang.org/pkg/database/sql/#Scanner

There is only really one concrete example of how to use Scan on the godoc page.

The introspection capabilities (reflect package) are quite obtuse and error prone (basically requiring 100% test coverage because you lose nearly the entire type system), yet absolutely critical to implementing anything notably complex with Go.


>The constant use of generated code is another real pain point, particularly when I am writing business logic that needs to operate on generated types for which there is no interface available that does what I need (and why would there be? how would the library/codegen-tool author know all the permutations of business logic that might be out there?).

Not sure what you mean here. Is there a particular codegen tool you found lacking?

>The introspection capabilities (reflect package) are quite obtuse and error prone (basically requiring 100% test coverage because you lose nearly the entire type system), yet absolutely critical to implementing anything notably complex with Go.

Ah, the lack of generics. I haven't really written any particularly large projects; where have you had to use reflection when working with Go in your projects?

I hope my questions don't come across as dismissive. I think the typical go response to your complaints are "it wont come up" (a YAGNI variant), so I'm always interested to hear about the cases where that argument fails.


I'm not working on anything particularly interesting. Standard CRUD service. Codegen tools used are protoc, SQLBoiler, and some custom bits. Reflection is being used to intermediate between protobuf generated types and orm types.


Null pointer exceptions in Go are, technically, SEGVs. But they're just NPEs. Do you have something more interesting than that?



Yes.


> My work on SourceHut has, on the whole, really soured my opinion of Python as a serious language for large projects.

A lot of the things that make Python great for small projects really bite you on a large or long-lived project. For me, the two biggest are the lack of types and indentation for scoping. It is really easy to mess up whitespace during an edit or refactor. In many languages you would just reformat. In Python, you have to be careful that a statement didn't suddenly end up outside or inside an if block.
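
A tiny sketch of how that bites (function names hypothetical) - both versions are syntactically valid, so nothing catches the change:

    def total_with_tax(prices, taxable):
        total = 0
        for p in prices:
            total += p
            if taxable:
                total += p * 0.2   # tax per item, as intended
        return total

    def total_with_tax_refactored(prices, taxable):
        total = 0
        for p in prices:
            total += p
        if taxable:                # accidentally dedented in a refactor:
            total += p * 0.2       # now taxes only the last item
        return total

    print(total_with_tax([10, 10], True))             # 24.0
    print(total_with_tax_refactored([10, 10], True))  # 22.0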



Even though I like Python a lot, I clicked on a reply to this comment because I was absolutely certain I was going to have to disagree with you. That just didn't seem possible. But . . . I was wrong. That third link with generics makes me cry when I think about the go code I've written.


I'm really liking Python static types to the point where I don't write new programs without them. Give them a shot, they really help.
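
For example (function name hypothetical), a checker like mypy flags this before it ever runs:

    def listen(port: int) -> str:
        return f"listening on port {port}"

    listen(8080)      # OK
    listen("8080")    # mypy: Argument 1 to "listen" has incompatible type "str"; expected "int"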


Do most libraries use static types? Or have something widely-used and well-supported available like the @types project for TypeScript to provide them, fairly seamlessly, to consumers of those libraries?


Some projects do, yes. Django, for example, has a library to provide types. In my experience, types are useful even just in my own application, since the libraries I use tend to generally only accept simple/fundamental types anyway. Things could definitely be better when it comes to library type support, though, I agree.


Are those tears the result of overflowing joy, or of sadness?


Can it be both?


Who will win:

Obsolete garbage without code blocks, lambdas, if-expressions, pattern-matching, or speed; kids thinking they're metaprogramming/FP gurus.

vs

Garbage dumber than C, without generics or error handling, with a sadistic linter that hard-errors on unused variables; kids thinking they're programmers.

PS: of course Python's type system is better than Go's, b/c anything is better than nothing. Moreover, Python's type system has nothing in common with Python: sum types are useless without pattern matching and don't play well with classes, and scope-leaking variables aren't really static.

I believe there will be modern high-level programming language someday.

RIIR!


I like my front end team to not write queries.

I have seen too many front end developers write queries equivalent to "select * from users, t1, t2, ... tn, left outer join on users.id = t1.user_id... etc etc etc".

I think that is just bullshit.

I prefer to tell the frontend people: these are the endpoints that the backend team provides, go live with it. If you really have issues, talk to them, and they might just add a new endpoint for you.

If you let frontend developers dictate backend queries, all you will get is a big master clusterfuck. I'm talking about the average-joe developers we usually find working at startups.


I think Frontend has the right to take over certain domains of the backend. A very clear example of this is templating. Templating used to be a backend task, but it became very clear it was the frontend developers that needed power over this domain. Going over to a backend dev to make adjustments to the templates is an inefficiency that had to be solved, period.

I agree with you that writing optimal queries or data modeling should not be shifted over to the frontend. With that said, there are basic aspects of this equation that can be factored out. There should be a layer where frontend can at the very least pick or mix the data they need. That doesn’t necessarily mean Graphql, but could also mean a simple middle stack layer where one can do this basic thing without being able to shoot themselves in the foot.

There’s a responsible way to do this, and most likely will be an ongoing discussion.


I'm a Sourcehut user and all I care about is that I can host my public Mercurial repositories, and that the web frontend is somewhat usable (it currently is). I don't care what language is used to build it, nor do I care about APIs and other features. If I were to choose, I would like an API that I could use with curl. Don't complicate things.


I think this is simply a technical blog post - it could be about any site. It's simply an insight to the way another engineer is thinking, so you can challenge your own.


It was not a comment about the blog post! It was just my thought as a user of the product. He should probably spend his time on marketing, as he already has a working product, rather than rewriting stuff.



