The GraphQL stack: How everything fits together (apollodata.com)
210 points by quodestabsurdum on Nov 9, 2017 | 113 comments



My main concern with GraphQL is access control. What happens if the user doesn't have access to part of the requested data (a subtree)? Will the GraphQL engine return an incomplete result, an inline error, or will the whole query fail? What if you only want to allow showing specific fields of a resource; for example, when a user requests another user's account details, we need a way to block them from getting certain fields like passwords or keys that might be attached to the account.

How does this access control interact with caching? What happens when the access control rules change?

A while ago, I wrote a real-time REST-based plugin which solves all the problems and I use it in production, it's much simpler than GraphQL. See https://github.com/SocketCluster/sc-sample-inventory

I'm surprised that more libraries aren't following the REST-based model.


The solution for authorization/access control is to use "Dataloader" [1], which is also made by Facebook. You write a single source of truth for how authorization is handled, and make sure that GraphQL resolves through it.

Dataloader is not as well known as GraphQL, but it's crucial for complex authorization systems imo. It also has a bunch of other features, like batching and caching, which make your life easier when opting for this solution.

[1]: https://github.com/facebook/dataloader


I've only ever seen DataLoader used for batching database queries, how do you create a single source of truth for authorization with it? Do you have a code snippet somewhere?


Check out this article [1] (or the video) on Dan Schafer's talk about how they use Dataloader and GraphQL internally at Facebook. It covers most of it.

To summarize, they create a class for each GraphQL type, each with its own "gen" function that is the only way to fetch that type's data. This way you get a single source of truth.
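The shape of that pattern is roughly the following. This is a hedged sketch: `FakeDB`, `canSee`, and the field names are invented for illustration, and the real Facebook version wraps Dataloader so `gen()` also gets batching and per-request caching.

```javascript
// Sketch of the per-type "gen" pattern (names are illustrative, not
// Facebook's actual API). Authorization lives in exactly one place.

const FakeDB = {
  users: {
    1: { id: 1, name: 'Ada', email: 'ada@example.com', isPrivate: false },
    2: { id: 2, name: 'Bob', email: 'bob@example.com', isPrivate: true },
  },
};

class UserType {
  // The single, centralized authorization rule for this type:
  // private profiles are visible only to themselves.
  static canSee(viewer, user) {
    return !user.isPrivate || viewer.id === user.id;
  }

  // The ONLY way any resolver obtains a User, so the rule above
  // cannot be bypassed by one forgetful call site.
  static gen(viewer, id) {
    const user = FakeDB.users[id];
    if (!user || !UserType.canSee(viewer, user)) return null;
    return user;
  }
}
```

Every resolver then calls `UserType.gen(viewer, id)` instead of touching the database directly.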

There's also a video of Lee Byron going through Dataloader's source code which was pretty fun to watch.

[1]: https://dev-blog.apollodata.com/graphql-at-facebook-by-dan-s...


Model access control into the schema. Create a BaseUser type and an UnauthorizedAccess type, then a union of the two, and refer to the union in nearly all scenarios where you want to create a relationship to a User.
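A sketch of what that could look like in schema language (type and field names are illustrative):

```graphql
type BaseUser {
  id: ID!
  name: String!
}

type UnauthorizedAccess {
  # Optionally explain why the viewer can't see this user.
  reason: String
}

union UserResult = BaseUser | UnauthorizedAccess

type Post {
  # Relationships to users go through the union, so consumers
  # must handle the unauthorized case explicitly.
  author: UserResult!
}
```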

GraphQL gives you access to a relatively powerful (but sadly incomplete) type system; you have a much better time if you take full advantage of it.


Which features are you missing in GraphQL's type system?


Unions on scalars, unions on input types, intersection types, and generics. Arrays are already a generic type, but we're unable to create new ones.


We let our GraphQL server accept a JWT authorization header, which is then propagated to all service calls it does. When the user doesn't have permission to some (sub)resources being requested, the GraphQL server doesn't see it either.


But what comes back?

Let's say I request a user, their name, email, and all their friends.

But I don't have access to the friends.

Do I get nothing back, plus an error that my request would include data I have no access to?

Or would I get the name and email, but an empty list of friends? Would I get an additional warning somewhere, so I know it's a permission problem and not that the user simply has no friends?

Would I get an inline error in the place of the friends?


That's entirely up to you and how you choose to implement your fields' resolutions.


Are there any best practices?


I don't know if there are for GraphQL in particular. I have my GraphQL implementation as a part of a Ruby app. graphql-ruby has this handy error handling doc: https://github.com/rmosolgo/graphql-ruby/blob/master/guides/... and there's the graphql-errors gem which sort of does this in a generic manner, allowing individual fields to fail without the whole query failing: https://github.com/exAspArk/graphql-errors
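For what it's worth, the GraphQL spec itself allows partial results: a failed field resolves to null and an entry is appended to a top-level "errors" list whose "path" points at the field. A response for the name/email/friends example above might look roughly like this (field names and message are illustrative):

```json
{
  "data": {
    "user": {
      "name": "Ada",
      "email": "ada@example.com",
      "friends": null
    }
  },
  "errors": [
    {
      "message": "Not authorized to view friends",
      "path": ["user", "friends"]
    }
  ]
}
```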


Hi, author of the post here!

With GraphQL, you can think of each field as a tiny endpoint, and you do access control on that in the same way as before.
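A hedged sketch of what per-field access control looks like in practice. The resolver-map shape follows the common JS GraphQL convention; the viewer/context shape is made up for this example.

```javascript
// Illustrative only: treating each field's resolver as a tiny endpoint
// with its own access check.

const resolvers = {
  User: {
    // Public field: anyone may read it.
    name: (user) => user.name,

    // Sensitive field: only the account owner (or an admin) may read it.
    email: (user, args, context) => {
      const { viewer } = context;
      if (viewer.id === user.id || viewer.isAdmin) return user.email;
      return null; // or throw, which would surface as an inline error
    },
  },
};

const user = { id: 7, name: 'Ada', email: 'ada@example.com' };

const asOwner = resolvers.User.email(user, {}, { viewer: { id: 7 } });
const asStranger = resolvers.User.email(user, {}, { viewer: { id: 8 } });
// asOwner is 'ada@example.com'; asStranger is null
```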

It turns out that while GraphQL allows the frontend developers to select the data they need, that doesn't result in an unlimited set of queries. You often get a number of queries which is similar to what you would get if you hand-coded specific endpoints for different UI views, which turns out to be a common pattern outside of GraphQL.

> What happens if the user doesn't have access to part of the requested data

In this case, the gateway just falls back to the underlying server implementation, and it's a cache miss.


> It turns out that while GraphQL allows the frontend developers to select the data they need, that doesn't result in an unlimited set of queries.

Could you please elaborate on that? I'm new to GraphQL and I assumed that you only have to provide it with a schema that defines entities and their relations, and it will let you query any combination over it.

Do queries have to be explicitly defined on the server? As an example, if I have the table "doctors" with a 1:n relationship to the table "patients", do I have to explicitly define something like:

    queryDoctors {
      id
      name
      patients {
        id
        name
      }
    }

To be able to query a doctor's patients?


You could do pretty much any of the things you describe, actually. Auth is really not straightforward at all, though; they could definitely do a better job at making it easier to get started with GraphQL.
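To the parent's question: you define the types and their relations once in the schema, and clients then compose any query over them, so nothing like queryDoctors has to be enumerated ahead of time. A minimal sketch, with type and field names assumed from the example:

```graphql
type Doctor {
  id: ID!
  name: String!
  patients: [Patient!]!   # the 1:n relationship, resolved server-side
}

type Patient {
  id: ID!
  name: String!
}

type Query {
  doctor(id: ID!): Doctor
  doctors: [Doctor!]!
}
```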


Access control and permissions shouldn't be part of your API implementation, whether REST or GraphQL. They should be part of your business logic, so they can be shared regardless of which API protocol you put on top.


> GraphQL knows all of the data requirements for a UI component up front, enabling new types of server functionality. For example, batching and caching underlying API calls within a single query becomes easy with GraphQL.

And immediately after that the article spends two pages of text explaining how insanely complex the "becomes easy with GraphQL" really is, and offers no actual details on the "easy" part.

Oh, your client has to be cache aware. Oh, and there has to be a gateway that's cache aware, and schema aware, and has to cache all responses... And invalidate them... But there are no tools yet. And then the graphql server should be cache aware. Oh, and your database layer should also cache all responses.

Caching is a hard problem, and there's nothing in GraphQL to make it easier. Heck, by relying exclusively on POST requests, they deliberately remove the most obvious and easiest part of the equation.

I love how each subsequent part starts with a sentence that shows how complex caching with GraphQL is (emphasis mine):

> With GraphQL, frontend developers have the capability to work with data in a much more fine-grained way than with endpoint-based systems. They can ask for exactly what they need, and skip fields they aren’t going to use.

I would love to see an explanation how GraphQL makes caching for this easy.

> Schema stitching is a simple concept: GraphQL makes it easy to combine multiple APIs into one, so you can implement different parts of your schema as independent services. These services can be deployed separately, written in different languages, or maybe even owned by different organizations.

It's called REST APIs and we've known how to do them since 2001.


> It's called REST APIs and we've known how to do them since 2001.

The aim of GraphQL, insofar as I understand it, is that instead of having to develop two things (a front-end and a back-end REST API that talks to your database or other services), you need just one (a front-end that talks to GraphQL). Mind you, GraphQL becomes the back-end-for-front-end then, and will still need a lot of work. The main point is that with GraphQL you'll only need one for all of your apps, while traditionally you'd write one REST API for each client.


> you'd write one REST API for each client.

Erm... What?

Granted, web requirements are a bit different from mobile requirements, but still. Why would you write one REST API for each client?


Because requirements are different. Our admin GUI needs much more data and different access controls than our web and mobile apps. Third party consumers have even more different requirements.

We never need to bikeshed with developers from all other teams to make sure things are consistent or whatever.

We generate endpoints from a Swagger spec. With good abstractions it is not much work. Frontend devs do modifications to these endpoints as well since they are so trivial.


> Because requirements are different. Our admin GUI needs much more data and different access controls than our web and mobile apps.

Indeed. And how does GraphQL solve the need for different data and different access controls to that data? :)


Not sure, I haven't seen a good story for it. We'll stick with REST.


I don't think APIs in general should have anything to do with authorization, other than passing along a token.


> you'll only need one for all of your apps,

So where are my clients for statically typed languages? I looked into GraphQL and promptly dropped it cause I couldn't seem to call into it from Java in any sensible way. If all your "apps" are javascript, maybe this is true, but that hardly makes it a universal technology.


It could have better documentation but here's the Java client for Android: https://github.com/apollographql/apollo-android There's also a Swift client available: https://www.apollographql.com/docs/ios/


That looks pretty decent indeed, and makes me a lot more hopeful about the promises of GraphQL, but it sounds like it's an Android-specific implementation, rather than something I could actually use for JVM service-to-service communication.


(Author of the post here)

Thanks for the notes!

> And immediately after that the article spends two pages of text explaining how insanely complex the "becomes easy with GraphQL" really is, and offers no actual details on the "easy" part.

Hi, thanks for this piece of feedback! I could have done a better job here, since I was summarizing a 38 min talk into a few paragraphs.

These parts are actually talking about different concepts:

The first is about ensuring that a single fetch from the UI knows about all the data it needs, so you can easily avoid multiple roundtrips to the database. This is something you can do with a basic in-process caching tool called DataLoader: https://github.com/facebook/dataloader
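For illustration, here is a toy stand-in for the batching idea. This is NOT DataLoader's actual implementation, just a sketch of the technique: loads queued in the same tick are resolved by one batched fetch.

```javascript
// Simplified stand-in for facebook/dataloader: collects keys requested
// in the same tick and resolves them with a single batched fetch.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn;   // (keys) => Promise of values in the same order
    this.queue = [];
  }

  load(key) {
    return new Promise((resolve) => {
      this.queue.push({ key, resolve });
      // Flush once the current tick's loads have all been queued.
      if (this.queue.length === 1) {
        process.nextTick(() => this.flush());
      }
    });
  }

  async flush() {
    const batch = this.queue.splice(0);
    const values = await this.batchFn(batch.map((item) => item.key));
    batch.forEach((item, i) => item.resolve(values[i]));
  }
}

// Hypothetical data source: one "query" no matter how many ids were asked for.
let queries = 0;
const userLoader = new TinyLoader(async (ids) => {
  queries += 1;
  return ids.map((id) => ({ id, name: `user-${id}` }));
});

// Three resolver-style loads in the same tick become a single batch.
const demo = Promise.all([userLoader.load(1), userLoader.load(2), userLoader.load(3)]);
```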

The second is about caching _across_ requests, something that people usually have special infrastructure for in REST, such as Varnish. This talk was elaborating on how a GraphQL-specific piece of caching infrastructure might work, and sit in exactly the same place as REST caching.

> Oh, your client has to be cache aware.

I think the intention was to say that it _could_ be cache aware. It doesn't need to be, but if it is, you end up with a really nice situation.

> I would love to see an explanation how GraphQL makes caching for this easy.

My intention here was to say that GraphQL's ability to understand which fields are being asked for makes it easy to return specific cache controls, rather than having to put one on the whole query. So your server basically generates the control for you (this is something that exists today).
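For example, Apollo Server's cache-control extension lets you annotate the schema with per-field hints, and the server computes the response's overall policy from the fields actually selected (the maxAge values here are just examples):

```graphql
type Post {
  id: ID!
  title: String                         # inherits the query's default
  votes: Int @cacheControl(maxAge: 30)  # changes often, cache briefly
}

type Query {
  latestPost: Post @cacheControl(maxAge: 60)
}
```

A query selecting only `title` can be cached for 60 seconds; one that also selects `votes` gets the most restrictive hint, 30 seconds.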

> It's called REST APIs and we've known how to do them since 2001.

I don't think REST has a way to automatically combine multiple REST services, and then query them all in one HTTP request without multiple roundtrips to the client, which was the goal here.

Happy to talk more, and thank you for bringing up some of the places where I can communicate more clearly in the future! The audience of the talk was a GraphQL conference, but I should definitely consider next time that people who aren't already bought into the idea of GraphQL will be taking a look as well.


> a single fetch from the UI knows about all the data it needs, so you can easily avoid multiple roundtrips to the database.

This needs to be backed up by data. In my opinion there's no difference between

    GET /users/1
    GET /users/1/friends
and

    {
      user(id: 1) {
        name
        age
        friends {
          name
        }
      }
    }
Unless the backend is extremely smart and can generate (optimized!) SQL queries on the fly from GraphQL schemas, this will be two trips to the database.

Granted, there's LINQ and some ORMs where you just chain requests. And still... And the above can also be solved by providing an `Accept: application/vnd.my-company.users-full+json` in REST.

> My intention here was to say that GraphQL's ability to understand what fields are being asked for makes it easy to return specific cache controls, rather than having to put one on the whole query.

How? Skimmed through the video. Oh. Right. By providing custom `maxAge` fields on the returned data so that a custom-built gateway (which has to be schema-aware) can build a cache of that data. The exact same thing can be done in REST: add custom fields, develop a custom resolver/caching layer, voila. There is a reason no one is doing that :)


But even in your example, you already made 2 REST requests as opposed to 1 via GraphQL. What if your scenario is to GET all friends whose names start with D of friends of a friend?


> but even in your example, you already made 2 rest requests as opposed to 1 via graphql.

You can easily mitigate that with something like:

    GET  /user/1
    Accept: application/vnd.my.user-full+json
Where the server will return full data if the client sends the `application/vnd.my.user-full+json` Accept header

> What if your scenario is to GET all friends whose names start with D of friends of a friend ?

I really wonder what your scenario would be on the server for this request in GraphQL

For REST you would look at actual use cases and design for that.

Most likely something like

    GET  /user/1?filter=friends&by=D
    Accept: application/vnd.my.user-full+json

    GET  /user/1/fof
    Accept: application/vnd.my.user-full+json


Why "mitigate" anything when you are given a language specifically designed to describe and solve these types of problems? Of course, you can create a custom endpoint for any type of request. The point is, you do not really have to. Also, as far as real-world scenarios go, traversing some kind of graph is practically an everyday kind of thing, from structured organizations to billing to who follows a guy who liked some tweet to things like SNOMED


> given a language specifically designed to describe and solve these types of problems

It's not the first language "specifically designed" for this, and it won't be the last. It is a language designed to solve Facebook's problems. Do you have Facebook's problems? I highly doubt it :)

Meanwhile there are multiple questions that are left unanswered in all the "GraphQL is amazing" articles:

- caching. Where is it? How do you do it? Official docs on caching are laughable at best [1]

- data access. With ad-hoc queries how do you make sure the right people get the right data? GraphQL docs on that are equally laughable [2]

- how do you make sure the server doesn't buckle under ad-hoc uncacheable queries where auth is solved by "passing a fully-hydrated object instead of an opaque token or API key to your business logic layer". Oh, hey, if you rely on microservices, and you need to return only a subset of data for the query, how do you do that? Remember, all queries are ad-hoc.

There are definitely more issues than the ones right off the top of my head, such as "data colocation", "mutations are guaranteed to execute sequentially", etc.

[1] http://graphql.org/learn/caching/

[2] http://graphql.org/learn/authorization/


> "GraphQL is amazing" articles

Considering how bad most web technology is, anything reasonably designed is probably going to be overhyped. It's important to realize that GraphQL is not some kind of magic graph query language; it's actually a very simple RPC protocol allowing you to submit multiple calls per network request in a tree shape (which is a very neat idea, btw). It also has a schema. You still have to solve all the hard problems yourself, but there is hope that since GraphQL is relatively popular, and not as dysfunctional as SOAP or as simplistic as a typical bespoke schemaless one-call-per-request RPC protocol (e.g. non-HATEOAS "REST"), it might be an OK target for writing some actually useful utility middleware libraries and dev tools. That's pretty much the whole reason for all the excitement.


Exactly, most web APIs feel like KV stores built on top of SQL.


> My intention here was to say that GraphQL's ability to understand what fields are being asked for makes it easy to return specific cache controls, rather than having to put one on the whole query.

To be more specific: "the entire query" in REST (or, really, in any HTTP-request) can be cached, alongside with all its data on multiple levels of the existing infrastructure with nearly no extra configuration. Cache-Control, ETags, Vary etc.
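Concretely, a validating request/response pair needs nothing but standard headers (values illustrative); the first block is the client's revalidation request, the second the proxy's or server's response:

```http
GET /users/1 HTTP/1.1
Host: api.example.com
If-None-Match: "v42"

HTTP/1.1 304 Not Modified
ETag: "v42"
Cache-Control: public, max-age=60
Vary: Accept
```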

Most proxies don't even have to parse the query or the response to handle caching in this manner. Match headers, boom, you're done. With proper headers the request might not even leave a user's computer.

GraphQL on the other hand:

- eschews cacheable requests. Everything is a non-cacheable POST

- the "caching becomes easy with GraphQL" in reality becomes a custom caching server that has to parse both the request and the response for each request to find which fields are requested, and match them against whatever's in the cache.

- Actual quote from the video: "add cache-control to your data" ... "and all you proxy or your gateway has to do is interpret your data". WAT. The only time an API gateway should interpret data is when we're transparently upgrading calls from v1 to v2 and back :)


> REST APIs and we've known how to do them since 2001.

Meh. With the exception of the usual HTML+browser example, I've yet to see even one HATEOAS API.


Even non-HATEOAS services can be quite good ;)


GraphQL is above all, just an optimization. It optimizes the amount of bytes sent over the network.

I'm very happy with REST APIs (maturity level 2/3) and see no reason to change as we've never had performance issues. Most companies are not Facebook with 1 billion customers.

The fact that clients can compose their own queries brings nothing new, as you still have to allow these capabilities on your server, just like with REST; you have to code against all the possible variations (primary resources, filters, sorts, etc) and optimize the DB/remote reads. So no, the users have no more freedom, the server remains the limiting factor.

The way it lets you compose queries is nice for developers, but you can very easily do that with a single aggregation REST endpoint too, provided you have resource links in your responses (which also makes it nice to use with human tools like POSTman) and you can still cache the individual REST responses with Varnish, etc.

Doing subscriptions or mutations over GraphQL brings very little benefit. In fact, it's even dangerous, as most systems shouldn't allow multiple mutations per request.

The whole "just cache it on the client" is a big joke, and many people seem to underestimate getting caching right. The default in Relay used to be "cache forever"; this can't be serious. You could only do that if the current user was the only person able to modify the data, or if you could guarantee perfect synchronization with the server via continuous events, and that's actually pretty hard to get right and generally a big investment. In practice, most apps/sites don't work like that.

I'm not sure what that leaves. "Free" barebones documentation? You can have that too with a good type system (e.g. Scala's), albeit with a lot more work, but with a much nicer type system / expressivity.

I mean, I get why it's popular, but the thing's totally blown out of proportion.


(Author of the post here)

> GraphQL is above all, just an optimization. It optimizes the amount of bytes sent over the network.

This was a common thought when GraphQL was first announced, but working with organizations that are adopting it, we've found just the opposite: it's actually the tooling and development-velocity benefits that people get the most value out of.

It's kind of like if you could design your API to be super orthogonal and fine-grained, while still getting the optimized network transport of hand-coded endpoints for each view.

> The whole "just cache it on the client" is a big joke and many people seem to underestimate getting caching right.

The post goes over a new architecture specifically for server-side caching. Caching on the client is definitely not sufficient! And Relay isn't the way most people are doing GraphQL today. Also, clients are about to start supporting cache control and TTLs, making life a lot easier. I'm curious what the comparison here is, since people using technologies like Redux with REST APIs are also usually caching responses forever.

This is really useful feedback for those of us that think GraphQL is going to be a super important technology going forward, and I hope you give it another shot in a year or two! It's still pretty fresh, especially compared to REST, but I think it will improve quickly :]


Thanks for your reply. I've not ditched GraphQL forever, but there's a lot of buy-in involved for what it could give me right now. The tech is quite intrusive on your web server.

I didn't know people using redux also cached responses forever, that seems beyond naive to me :)


Note that "forever" here means "for as long as this specific browser tab is open", both for GraphQL and for Redux.


Can't upvote this enough. From what I've read it sounds like GraphQL solves a problem at Facebook where they have so many people working on related data at the same time that they started seeing duplicate API endpoints, and duplicate requests for the same data from different parts of the team. GraphQL provides a chokepoint to prevent that from occurring.

Makes total sense at that scale. Why startups are adopting this without that sort of problem, I have no idea. Because the query syntax is pretty?


There's some truth to what you're saying, but I think you're also selling GraphQL a bit short. Don't forget that Netflix also independently developed a very similar technology (Falcor) because of a need somewhat different than what you describe.

In their case, there was a need to have a single server-side be able to support literally hundreds of different client-side implementations that often had vastly different capabilities. An implementation for an embedded "smart" BluRay player is going to be very different from the first-party implementation that someone pulls up when they visit the .com in Chrome. Not only will it have different memory/CPU requirements, the BluRay player's client probably won't even be written by Netflix employees.

What GraphQL solves, to me, is any situation where the server-side implementation cannot make assumptions about the client-side. Whether that's because there are multiple clients or the client team is not coordinating their delivery with the server team, it's yet another example of the observation that software architecture eventually mirrors the way teams are organized. And I agree that many smaller businesses are probably jumping on the GraphQL bandwagon prematurely/unnecessarily. But I think the problem it solves is broader than you acknowledge and there are far more instances where the organization of the humans writing code favors an approach that decouples the server and client in the way that GraphQL and Falcor accomplishes.


A huge selling point for me is that the client explicitly enumerating all fields they're interested in allows targeted deprecation warnings and finely grained usage statistics.
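GraphQL has this built in via the `@deprecated` directive; tooling surfaces the warning to any client still selecting the field (field names here are illustrative):

```graphql
type User {
  name: String!
  handle: String!
  # Clients still selecting this field show up in usage stats
  # and get a deprecation warning in their tooling.
  username: String @deprecated(reason: "Use `handle` instead.")
}
```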


That's nice yes. Isn't it rare to deprecate fields though? And you can't be sure some of your clients are not overfetching "for convenience / just in case", especially if they don't use the full suite of Facebook libs.


Another thing that sounds worthy when working on gigantic teams and less so when at a startup / small company.


It's mainly useful if the API consumers are outside your own company. For example my company is a small B2B SaaS startup and our customers integrate themselves into our platform via a web api.


Choose the right tech early and you’ll have less growth pain.

I can’t tell you the number of startups I’ve seen that spend time thrashing on their REST APIs who would be better off using GraphQL as the consumable interface.

It would have made many companies’ application development go WAY smoother.


You're saying "graphQL is always better than REST for new/small companies"; that's simply not true. It's a tradeoff. Same with (g)RPC.

Just like you wouldn't buy and install 20 powerful servers to run kafka on day 1 just in case your startup meet huge success in 2 years.


I did not say that at all, I urge you to re-read what I wrote.


> Why startups are adopting this without that sort of problem, I have no idea. Because the query syntax is pretty?

IMO, we're seeing a trend where the server doesn't necessarily strongly control what's sent to the client. You can build and deploy a simple server then move most complex logic to the client.

It seems to make the development process a lot easier, since you really only need to write the client-side data handling.

That being said, I don't intend to use GraphQL anytime soon. It seems to be a bit of a loose cannon.


The query syntax isn't just pretty. It's simple enough that non-technical users can actually use it with very little training.


Do you have good examples of

> easily do that with a single aggregation REST endpoint too, provided you have resource links in your responses

I work with/on a "REST" API that has about 20 types of objects, 5-100 fields on each, with many different relationships between them. We've been looking at GraphQL to solve both the querying and mutation aspect, since it is extremely cumbersome to do it efficiently.

GraphQL seems great for this case, but if there is some way just adding resource links could get us most of the benefits I'd love to hear it!


This is a POC in scala I put together rather quickly:

https://github.com/AlexGalays/POC-api-aggregation

How you would write a query:

https://github.com/AlexGalays/POC-api-aggregation/blob/maste...

Unfortunately the PokeAPI is not the best at showcasing this, with many levels of unneeded nesting and resources with nothing but an "url" property, but hopefully you get the idea :)

The fact that the query "language" is indentation-based rather than GraphQL's a-string-that-looks-a-bit-like-JSON-but-isn't was just an arbitrary choice; it's easy to change.


It really isn't just an optimization, though. I actually work at a relatively small team (< 10 developers) and we're having a lot of success with GraphQL. For a variety of reasons we ended up moving to a microservices approach, and GraphQL has worked great for us for aggregating those services. It's not much more work than what a REST API with Swagger docs would be, but the tooling is a lot better and the schema is much more powerful. With a REST API, we would just have a bunch of endpoints and some basic code generation as a time saver for clients. With a GraphQL schema, we have the relationships between all of our data well defined and it's super easy to continue adding onto. There's more of an upfront investment, but I'm pretty confident it's already saved us time.

At the same time, I agree that it's overly hyped. It's not some magic bullet, and there's definitely a learning curve. It's a really powerful tool though, much more than just an optimization.


(Author of the post here)

If you aren't yet familiar with GraphQL and the problems it solves, it might be better to watch the talk on YouTube, since it includes much more of an introduction from the start, plus more color and detail: https://www.youtube.com/watch?v=ykp6Za9rM58

The blog post summary glosses over the introduction and focuses on what a GraphQL-familiar audience will find new, so it's all in the context of the current GraphQL community.

You can also catch up on some of the tools and benefits, and see some case studies from people using GraphQL in production, on our Explore GraphQL site: https://www.graphql.com/


Can anybody explain to me how it solves any problems? The way I see it, they only move the responsibility for combining data from the client to the server. At the same time, you now don't just expose a couple of endpoints; you need to support a query language, which seems to require much more effort for things like authorization. I admit I don't see the benefit.


I'm in the process of moving a large legacy Rails app to GraphQL. I'm able to deprecate massive amounts of code.

We currently have a massive number of view models, which are responsible for transforming models or collections of models into payloads for consumption by various client endpoints. Each payload method takes a whole suite of options to allow us to select this or that subset of fields, or to eager load this or that join. In the cases where we have heavily disparate payloads, we've ended up with several parallel implementations of "here's how to generate a JSON object". It's a giant mess - it's worked, but it's a mess.

GQL completely eliminates all of that. Each payload specific to each view is now specifically enumerated via a GQL query. We basically took all our individual field helpers out of the view models and implemented our GQL field resolutions with them - the actual translation work was minimal.

The end result is that a) we have massively less code developing payloads and b) we aren't overdelivering massive amounts of JSON to client endpoints because implementation X happened to be a superset of the data that we wanted. It also solves the problem of "slow bloat" - you implement a #to_json for one view, use it, and later another view uses this, but needs to add an extra field. Now both views retrieve that extra field. Repeat this process over 5 years and multiple consuming views and you have gigantic payloads which are mostly wasted in any individual context.

We even still support our legacy REST API with this system - the app can act as a GQL client, making a query and then emitting the response as JSON. Our REST API is now just one specific, brittle subset of our GQL functionality.

As soon as you find yourself wanting to customize your data serialization per view/endpoint, GQL becomes extremely useful.


Moving the responsibility of combining data to the server is a huge advantage for many organisations. It allows application developers to move much quicker and provides a common data access layer for all applications. It sounds to me like you are over-indexing on the burden placed on the backend developer in your evaluation and not considering the whole picture.

From my experience, the biggest advantage of GraphQL is that having a well defined standard allows the community to build advanced tooling such as https://github.com/graphcool/graphql-playground by Graphcool and all the cool stuff Sashko is talking about in this presentation.


That I understand.

The problem is that along the way you basically reduce your server to a thin layer over the DB. Exposing a querying API means that you significantly lose control over what, where, and how you serve data from the server.

We are currently considering using GraphQL in our current project, and I took a long look at both the standard and a library (Absinthe for Elixir). It looks terrible to me. Basically you need to provide yet another layer, supporting mostly generic operations. If your action is more than CRUD, you still need a REST API.

Moreover, you slowly lose control over what can be changed and where; your frontend absorbs more and more logic, yet is unlikely to work on its own.

I understand that it helps when you have records with multiple associations and you want to increase performance and not load everything everywhere, but the cost appears to be huge in terms of maintenance. I totally get why FB is doing it, but it seriously worries me that they market it as a successor to REST instead of an alternative with such-and-such tradeoffs.


We've used GraphQL in production for about a year now, alongside some legacy rest APIs. It's a lot easier to maintain the GraphQL endpoint, like it's not even close.

GraphQL's secret weapon is the static type definition. We can safely alter fields in the backend, and be notified at compile-time if client-side code is about to break as a result of those changes. We don't have the same guarantees with the rest api, which is the only reason they're still around at all - it's so hard to make changes if you have little idea which client-side code depends on which field.


> We can safely alter fields in the backend, and be notified at compile-time if client-side code is about to break as a result of those changes.

I think that's an immensely useful thing to do - that's why I've been doing it for years by using Thrift instead of REST :). Is there still a value proposition for GraphQL if I'm already using strongly typed interfaces?


Are you talking about a single client and its server? (a.k.a backend-for-frontend). This is not a hard problem to solve when you share a single programming language, with or without graphql.

Because you can't possibly know what your API consumers use on their side.


> Because you can't possibly know what your API consumers use on their side.

Not only do we know exactly what fields the API consumers are using, we also know the types they're expecting. We can make schema refactors fairly safely. That goes a long way towards making the codebase less brittle as a whole


> Because you can't possibly know what your API consumers use on their side.

That's the best part - with GraphQL you _do_ know, since people have to ask for every single field they want.


Think of a GraphQL server as client code that happens to live on the server.


>, you now don't just expose a couple of endpoints, you need to support a query language,

Here's how I think of it. (Others can correct or elaborate my understanding.)

Let's say you have 1 REST endpoint such as "xyz.com/customerlist" to return a JSON response and behind the implementation is a SQL "SELECT ∗ FROM T" and you get back a 100k response with all rows and all columns. You really only wanted customer's name and zipcode and you only wanted it for region of New York. (Unfortunately, "SELECT ∗" returned 30 columns which is 28 more than you need. The REST endpoint also didn't have a SQL WHERE clause which returned all 1000 rows where you only needed 20 rows.) You actually only needed 10k out of that 100k so you threw away 90k of data. Downloading data you throw away is especially wasteful with smartphones on slow mobile connections.

To address the finer grained slices of data, you either create more REST endpoints ("xyz.com/customerlist_name_zipcode") or add query parameters at the end of that 1 REST endpoint. It's doable but multiple endpoints will lead to a combinatorial explosion and the maintenance of them is not ideal for fast iteration.

With GraphQL, the client can request the "shape" of the data without having a pre-defined static REST endpoint that matches that shape. You can thin-slice the data without wasted bytes. The clients can get unforseen shapes of data that the developers of REST endpoints didn't envision.
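The thin-slicing idea can be sketched in a few lines of plain JavaScript. This is a toy stand-in for what a GraphQL server does, with a made-up in-memory customer table, not real GraphQL:

```javascript
// Toy sketch of GraphQL-style field selection: the client names only the
// fields and filter it wants, and the server returns nothing else.
// The "customers" data and column names here are invented for illustration.
const customers = [
  { name: "Ada", zipcode: "10001", region: "New York", phone: "555-0100" }, // imagine 30 columns
  { name: "Bob", zipcode: "94103", region: "California", phone: "555-0101" },
];

function query(rows, { fields, where }) {
  return rows
    .filter(row => Object.entries(where).every(([col, val]) => row[col] === val))
    .map(row => Object.fromEntries(fields.map(f => [f, row[f]])));
}

const slim = query(customers, {
  fields: ["name", "zipcode"],
  where: { region: "New York" },
});
// slim: [{ name: "Ada", zipcode: "10001" }] -- the other columns never leave the server
```

The REST equivalent would either ship every column to the client or need a new endpoint per shape; here the shape lives in the request.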

I actually think the GraphQL landing page explains the rationale and motivation very clearly: http://graphql.org/


What if you had one endpoint called xyz.com/sqlquery and you just sent it a SQL query in the form of a string; the server would validate and authorize it and return the data or reject the request.

Is that different than GraphQL?


The difference is that GraphQL was designed to be used like that, so it is relatively easy to limit what capabilities the client can use. SQL was not designed to be used like that, so for complex queries it will be very difficult to verify that the client isn't doing something that it isn't allowed to do.
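One concrete way GraphQL servers exploit that "designed to be limited" property is capping query depth before executing anything. A toy validator, counting braces on a GraphQL-like query string (real servers do this on the parsed AST via a validation rule, but the principle is the same):

```javascript
// Toy depth limiter for a GraphQL-like query string: count selection-set
// nesting and reject anything deeper than the configured limit.
function maxDepth(queryString) {
  let depth = 0, max = 0;
  for (const ch of queryString) {
    if (ch === "{") max = Math.max(max, ++depth);
    else if (ch === "}") depth--;
  }
  return max;
}

function validate(queryString, limit = 4) {
  if (maxDepth(queryString) > limit) throw new Error("query too deep");
}

validate('{ project(name: "GraphQL") { tagline } }'); // fine, depth 2
// validate("{ a { b { c { d { e } } } } }");         // would throw: depth 5
```

Doing the equivalent analysis for arbitrary SQL (subqueries, CTEs, joins) is far harder, which is the parent's point.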


The difference is that you'd have to have a very smart query builder on the front-end side that would construct one SQL query from the combined declared needs of all the front-end components on one page. Then after it gets the response (flat, as is usual with SQL) it would have to split it up and feed it to all of the components.

On the backend you'd probably need a SQL query analyser that would figure out which tables and rows the query touches, to determine if the currently logged-in user should be able to see this data.

So on the frontend you need a language in which components can specify their needs (why not GraphQL?). The next step is asking yourself why flatten those to SQL before sending them to the server, only to have the server parse them again. It's easier to pass them along as GraphQL and let the server decide whether it should respond and which data source to draw from (possibly multiple data sources, for example Solr and a database).

Congrats, you invented GraphQL. Too bad nobody came up with that 15 years ago; it could have saved a lot of people a lot of trouble.
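The "components specify their needs, client merges them" step can be sketched without any library. Plain field lists stand in for GraphQL fragments here; all the names are made up:

```javascript
// Each UI component declares which fields of which type it needs
// (a crude stand-in for GraphQL fragments).
const avatarNeeds = { user: ["name", "avatarUrl"] };
const bioNeeds = { user: ["name", "bio"] };

// Merge all component needs into one deduplicated request,
// roughly what Relay/Apollo do when composing fragments into a query.
function mergeNeeds(...needs) {
  const merged = {};
  for (const n of needs)
    for (const [type, fields] of Object.entries(n))
      merged[type] = [...new Set([...(merged[type] || []), ...fields])];
  return merged;
}

const combined = mergeNeeds(avatarNeeds, bioNeeds);
// combined: { user: ["name", "avatarUrl", "bio"] } -- one request serves both components
```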


Then you need to validate, authorize, etc. queries correctly. I see no reason why anyone would want this.

Instead use GraphQL which is a thin layer, and hide all the business logic behind it. Way more manageable.


It got more approachable for me when I started to think about it as a specific "backend for frontend" pattern. I don't see it as a _real_ competitor to REST, even though it's trying to market itself this way.

I think recent "@rest" addition from Apollo ( https://dev-blog.apollodata.com/apollo-client-2-0-5c8d0affce... ) shows they can easily live together - backend uses REST and frontend can rely on GraphQL.


I would say Backend For Frontend solves the same problems for most people, and the tooling around REST is available everywhere.

http://samnewman.io/patterns/architectural/bff/

We have data repositories living behind very simple REST controllers, an endpoint probably doesn't take that much longer to write than a GraphQL query.


(Author of the post here)

I think there are 2 differences:

1. It's very, very easy to write a GraphQL query, since GraphQL comes with auto-completing tools to do so. Even non-technical staff at our company often use it to get information about customers, etc.

2. The query is written inside the client codebase, which means you don't need to redeploy the server in response to a change in client data requirements. This reduces a significant amount of the friction caused by frontend features needing to wait on backend deploys.

Other than that, you've got it 100%: GraphQL is a technology to make the Backend For Frontend pattern much easier and more flexible.


It solves problems if you are an organization with many endpoints and data models (think hundreds if not thousands), and you can't predict how that data will be used because you have dozens or hundreds of clients, each with specific data needs - more or less fields, joins and mixes of data from various sources, etc.

It ties in to the microservices architecture, I'd say.


I'm an admitted dumbo about this stuff. Here's what it looked like from that point of view:

Hey cool, a new nice looking common query language that can do neat composition of results. Perfect, I've been wanting to break free of the SQL box since forever.

Wait, what? You still have to write SQL queries for everything? So you define types, and have to write SQL queries for those types, and GraphQL gives you some composition on top of that?

Seems kind of redundant. Like wrapping your own Christmas present.

But then I realize I'm thinking from the perspective of a one/two person team. Perhaps the real value of this comes from very decoupled consumers.


You're in luck, there are libraries that can do it for you! https://github.com/stems/join-monster

And even a system that generates your whole schema: https://github.com/postgraphql/postgraphql

Plus, the Graphcool framework allows you to bring your own DB: https://github.com/graphcool/framework


Holy crap, that's awesome, thanks!


Yeah the benefits are mainly when you have many different consumers all of whom want to look at the data in different ways, or when you have many different data sources in your architecture and want to aggregate them on the fly based on the exact incoming query.

If you're writing rest endpoints for an SPA and you have one backing store, then the benefits are not really there, it's just another layer. There are some nice things in relay like intelligent caching portions of the data and only refetching the parts needed, but it is very likely not worth it if you can just handcraft your endpoints and make your consumer change in sync.


I think what you're missing is that GraphQL isn't SQL, but rather a SQL-ish query language that supports a limited subset of what SQL does. Additionally, GraphQL lets the server define the schema that's being exposed to the client independent of the actual storage underneath that, including defining and limiting the types of mutations that can be applied. If you were to simply expose the ability for clients to feed SQL queries into your backend then you have to worry about policing those queries, scrubbing problematic values out of the queries, and in general preventing users from doing bad things with your database. By exposing a GraphQL interface you're providing a limited and tightly controlled view into your data, but still allowing the client the freedom to specify the shape and partitioning of the data being returned.
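A minimal sketch of that "tightly controlled view" idea: the resolver alone decides which fields exist and how they map to storage, so column names and the rest of the row never leak to the client. The schema field names, resolver shape, and fake db below are all illustrative assumptions, not any particular library's API:

```javascript
// Fake storage layer with its own internal column naming.
const db = {
  projects: [{ id: 1, project_name: "GraphQL", project_tagline: "A query language" }],
};

// The resolver exposes only the fields the (implied) schema declares,
// translating from storage columns to the public field names.
const resolvers = {
  project: ({ name }) => {
    const row = db.projects.find(p => p.project_name === name);
    if (!row) return null;
    return { name: row.project_name, tagline: row.project_tagline }; // id stays hidden
  },
};

const result = resolvers.project({ name: "GraphQL" });
// result: { name: "GraphQL", tagline: "A query language" }
```

Letting clients send raw SQL gives them the whole row structure by default; here the server has to opt every field in.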


The thing that bothers me about GraphQL is that it adds multiple layers that I have to debug if something goes wrong.

I have an app that displays a hello world from the server, and it's not displaying. Where is the problem? Is it in my UI code? Is it in Relay? Is it in the server-side part of GraphQL? Is it in my actual API that returns hello world?

If I'm using a good MVC framework on the backend, I most likely model my schema already in some way; having to duplicate that same information for GraphQL feels wrong.

GraphQL fixes the over-fetching problem, but it requires me to implement a server-side component, which makes me wonder whether it would be less effort to just refine the server-side endpoints and not have to support an extra service.


Has anybody found a way to combine GraphQL and REST in a common API? I've searched but so far found no idiomatic ways to accomplish that.

The reason I need REST in addition to GraphQL is that there are sometimes oddball cases where only GraphQL is a poor fit or where compatibility with older clients that I can't modify has to be ensured.

Also, is there a good way to do file upload of big files with GraphQL nowadays?


We do this @ https://tipe.io: you can use REST and GraphQL. Basically, given a REST URL, you dynamically create a GraphQL query on your server and then execute that query against your schema. It's all meta.


Wasn't able to find a library or a GitHub repo or anything like that from a quick glance. I'm assuming this is a closed platform. What I was inquiring about was something that I can use in my own software.


You seem to have misunderstood the parent here. They are saying that at their company, they are doing this (_not_ as a library or product you can purchase).

Parent was saying that to implement it, they essentially wrote their REST API by grabbing a URL, dynamically creating a GraphQL query for it on their backend server, and then executing that query against their own GraphQL schema and returning the result.
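That pattern can be sketched as a route-to-query table. The routes, the query string, and the `execute` stand-in below are all hypothetical; in a real server `execute` would be an actual GraphQL execution function run against your schema:

```javascript
// Each legacy REST route is pinned to one fixed GraphQL query,
// so the REST API becomes a thin facade over the GraphQL layer.
const restRoutes = {
  "/api/project": '{ project(name: "GraphQL") { tagline } }',
};

// `execute` is injected: a stand-in for running a query against the schema.
function handleRest(url, execute) {
  const query = restRoutes[url];
  if (!query) return { status: 404 };
  return { status: 200, body: execute(query) };
}

// Usage with a fake executor:
const res = handleRest("/api/project", q => ({ data: { project: { tagline: "..." } } }));
// res.status: 200
```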


During a Q&A at GraphQL Summit, GitHub said that they are accomplishing this by making their REST endpoints use the GraphQL API instead of querying the data directly. They’ve been releasing the talks online slowly so that should be available soon.


Why do you believe you need to combine them?

We have a REST API and GraphQL API. They both work well for their respective use cases and together satisfy our needs.


I would like to present them to the user as "here is our API" as opposed to "here are our somewhat complementing APIs and for the one you use these docs and do this and for the other one there's the docs here and you use it differently."


In that case couldn't that be done with different endpoints?


Yes, of course. However, it's still a bad user experience, I think.


We're developing a REST framework which lets the client shape the response on a predetermined and constrained set of endpoints on the server. The client can include/exclude parts of the data (attributes and relationships), set filters and request aggregation. You may take a look: http://linkrest.io


So you're basically cloning OData, aren't you?


So OData then?


You can use directives to retrieve a REST API through GraphQL.

For example: https://github.com/n1ru4l/graphql-schema-generator-rest


You can get some of these benefits for free and today by using https://github.com/brysgo/graphql-gun (disclaimer: I did not write this, but I am the author of the library it depends upon).

1. For instance, caching also happens on the client. This means if you reload the page, your GraphQL query will pull immediately from localStorage!

2. And then, quite important, it will be backfilled by the server with any new updates, including realtime updates that happen from other peers while the user is still using your app.

Great work on the tracing and schema support pieces! Those are certainly handy features.


I'd love to see some examples of a proper GraphQL gateway that consumes the cache control


We've got one here you can try out, and I hope people build more!

https://www.apollographql.com/engine/

(Disclaimer: I work on Engine)


Oh I know about engine. I'm talking an open source one


Why graphql instead of SQL with proper access control?


This question does not make much sense: a GraphQL API can be, and often is, implemented with SQL!

You could ask: why graphql instead of RESTful?


Of course it makes sense. They're both query languages, right? Why is

   { project(name: "GraphQL") { tagline } }
Better than

  select tagline from project where name = "GraphQL"
Especially if the GraphQL server is just an intermediary layer that ends up being translated to SQL anyway.

The real answer to the parent's question is that most people don't have confidence in SQL servers' access controls.


If you're proposing that web servers should not exist and we should just expose the SQL DB via HTTP: what about when you don't have all of your data in a single database?

What if I want/need Cassandra, ElasticSearch or MongoDB? An SQL DB is just one place where you have data and there are very valid reasons for using other solutions.

In fact, some people might argue that you want to do everything with event sourcing / kappa and use Kafka + specialised data stores.

GraphQL fits this universe perfectly. It also fits the one where you only use a single SQL DB since a GraphQL query translates quite nicely to SQL with minimal amount of code in between.


Using SQL as a query language doesn't limit your choice of backend data source(s) any more than using GraphQL does.


One immediate improvement: SQL queries return rows of scalars. It's very tedious to reconstruct nested objects from the resulting column aliases that are ultimately necessary (and in this case, the client would even need to do it all themselves!). Your example is simplistic in that it requests one top-level scalar, but any real API will not be like that. Try:

    {
      project(name: "GraphQL") {
        tagline
        authors {
          name
          friends {
            name
          }
        }
      }
    }
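For comparison, here is roughly the re-nesting work a client would have to do by hand if the same data came back as flat SQL join rows. The sample rows are made up; a GraphQL server returns the nested shape directly:

```javascript
// Flat rows, as a SQL join (project x author x friend) would return them:
const rows = [
  { tagline: "A query language", author: "Lee", friend: "Dan" },
  { tagline: "A query language", author: "Lee", friend: "Nick" },
];

// Re-nest the flat rows back into the tree shape the UI actually wants.
function nest(rows) {
  const project = { tagline: rows[0].tagline, authors: [] };
  for (const r of rows) {
    let a = project.authors.find(x => x.name === r.author);
    if (!a) project.authors.push(a = { name: r.author, friends: [] });
    a.friends.push({ name: r.friend });
  }
  return project;
}

const tree = nest(rows);
// tree: { tagline: "A query language", authors: [{ name: "Lee", friends: [{ name: "Dan" }, { name: "Nick" }] }] }
```

And this toy version ignores the hard parts: multiple one-to-many joins multiply row counts, and NULLs from outer joins need special-casing.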


Modern SQL databases like PostgreSQL offer full JSON capabilities to serve this need.


I've used its JSON support extensively and no, it's not really the same at all. What you're saying is that you're OK with (1) planning to not use columns to store your individual data fields in the first place, forgoing the primary SQL features of column-based lookup, joins, etc., or (2) having the client write complicated Postgres-specific SQL queries that dynamically construct JSON strings built from the fields that are stored in the real row structure.


Two completely different things. Would you have clients type in raw SQL? I mean, I guess they're comparable in that they're both query languages, but the application is quite different. You probably could come up with a safe way to send SQL queries to an API, or an SQL-to-GraphQL translator if you prefer SQL.


For anyone interested in what a concrete implementation of this architecture can look like: we created an example project that includes the Gateway pattern with schema stitching and powerful resolver middlewares: https://github.com/graphcool/graphql-boilerplate


I'm interested in GraphQL but definitely haven't been following closely.

This seems to be a marketing video for Apollo's products. Is that what most GraphQL stacks look like in production these days?


How is this different, better, worse than OData?

http://www.odata.org/


Jonas Helfer gave a talk about this: https://www.youtube.com/watch?v=coU6OmISOBM


That wasn't a very objective or thorough comparison. But I certainly learned that the guy didn't like OData.


That sounds good!



