Hacker News

> doing things that would be even the simplest of queries becomes a project in writing what are effectively bad performance joins via random http APIs

In some ways, is GraphQL trying to slap a fresh coat of paint on what is otherwise a bad problem under the hood? At my employer, we're trying to adopt it, but we've never addressed the underlying performance issues of the microservices, and now the slowness is quite apparent with the GraphQL wrapper tier.

This isn't a diss to GraphQL. The queries are awesome to write. The developer experience of writing / being the client of a GraphQL server is excellent. But I suspect I'm not alone in a journey that is playing out as a mediocre implementation because we are avoiding the core problem.




If you find yourself doing in-code joins across microservices, you need to stop and think:

Are my services correctly scoped? Am I doing these joins because there is perhaps a higher-level domain that spans these two services?

Should I re-scope my services? Or should I build an Aggregator-View service that holds a JOINED copy of the data?

These aggregator-view services often make sense as Backend-For-Frontends. They must be restricted to a read-only API since they mirror data from other microservices.
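As a rough sketch of that idea (all service, event, and field names here are invented for illustration), an aggregator-view service can consume change events from the source services and maintain a pre-joined copy, exposing only a read-only API:

```python
# Minimal sketch of an aggregator-view (read-only) service.
# The event names and fields are hypothetical; a real system would
# consume these from a message bus and serve the view over HTTP.

class OrderCustomerView:
    """Holds a pre-joined copy of order + customer data."""

    def __init__(self):
        self._customers = {}   # customer_id -> name
        self._orders = {}      # order_id -> (customer_id, total)

    # Write side: only reachable via upstream change events.
    def on_customer_upserted(self, customer_id, name):
        self._customers[customer_id] = name

    def on_order_upserted(self, order_id, customer_id, total):
        self._orders[order_id] = (customer_id, total)

    # Read side: the only API exposed to consumers.
    def get_order(self, order_id):
        customer_id, total = self._orders[order_id]
        return {
            "order_id": order_id,
            "total": total,
            "customer_name": self._customers.get(customer_id),
        }

view = OrderCustomerView()
view.on_customer_upserted("c1", "Alice")
view.on_order_upserted("o1", "c1", 42.0)
print(view.get_order("o1"))
```

The point of the split is that consumers never write to the view; the join is paid for once, on update, rather than on every read.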

GraphQL is a dead end when you are doing joins in code.


What you are describing is in my opinion the exact problem happening with almost all microservices. Building an aggregator microservice to join together two services just adds another microservice to an already complex microservice landscape...

People on HN seem to hate monoliths, but a monolith with just a few services outsourced to microservices is much easier to handle. No more joining microservices, no more race conditions when trying to do transactional processing across several microservices, and none of the other problems microservices bring.


I agree that a well-maintained monolith is simpler and more stable than microservices.

The emphasis is on well-maintained. If you’ve ever had team A write a crazy query and kill the database for team B, you’ll start to appreciate the damage containment and encapsulation that microservices provide.

EDIT: Of course there might be other, better solutions to this problem.


With “kill the database” I mean things such as transaction deadlocks.


>If you find yourself doing in-code joins across microservices, you need to stop and think:

This is a place where far too few people stop to think about what they're doing. I've seen too many cases where an operation requires coordinating requests across multiple service methods THAT ARE ONLY USED IN THAT PARTICULAR OPERATION! The devs spend all of their time a) coordinating things, b) trying to make it performant, and c) realizing none of the gains of having microservices to begin with.


It does raise the question: if the developer authors a GraphQL server to do joins, why didn't the application just do that in the first place? Is it always a symptom of a dead-end piece of engineering?


The alternative is an explosion of REST API routes as each application has different requirements. It is much simpler to have one graphql endpoint that is flexible enough for all consumers.


I've seen the explosion of REST endpoints you mention, several times. But isn't GraphQL just adding a thin veneer over that, giving the appearance of a single URL, but actually kind of containing all those REST routes under the hood? If you have an explosion of REST routes, you'll likely end up with an explosion of GraphQL queries too.


Yup, pretty much. However, the implications of that veneer can be significant.

Providing a single unified interface which provides granular, unified querying of disparate types has several benefits. In particular, it can simplify logic in the consuming application, and reduce network load (both in terms of the number of requests, and especially in the amount of data returned).

For example, a pet website backed by a traditional REST architecture may have separate endpoints for, say, `/pets`, `/owners`, and `/purchases`. In this context, the front-end may need to make calls to all three of these endpoints, retrieving the full payload for each - only to discard most of the fields and keep the 2 or 3 relevant ones from each entity.

By comparison, a GraphQL-based approach would allow a single consolidated query for just those specific fields (from all entities).

Of course this isn't relevant in every use case, and there's no silver bullet - in many cases, a REST-based approach may well be better.
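To make the pets example concrete, here's a toy Python sketch (the payloads and field names are invented) of the difference: three full REST payloads versus one GraphQL-style selection that keeps only the fields the query asked for:

```python
# Hypothetical full payloads as /pets, /owners, /purchases might return them.
pets = {"id": 1, "name": "Rex", "species": "dog", "age": 4, "weight_kg": 20}
owners = {"id": 7, "name": "Alice", "email": "a@example.com", "address": "1 Main St"}
purchases = {"id": 99, "pet_id": 1, "price": 300.0, "date": "2023-01-01"}

def select(record, fields):
    """GraphQL-style field selection: keep only what the query asked for."""
    return {f: record[f] for f in fields}

# One consolidated "query" asking for 2-3 fields per entity,
# instead of three full REST payloads that mostly get discarded.
query = {"pet": ["name", "species"], "owner": ["name"], "purchase": ["price"]}
result = {
    "pet": select(pets, query["pet"]),
    "owner": select(owners, query["owner"]),
    "purchase": select(purchases, query["purchase"]),
}
print(result)
```

The win is both fewer requests and a smaller response body; the server-side work is roughly the same.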


No.

It contains only a subset of the REST routes and allows you to combine them efficiently. That is what avoids the explosion. You don't have to add a new "REST route" equivalent in GraphQL if some user just wants a new combination of data but doesn't want to make 10 requests to get it.


That's the theory. But I've only seen it work in practice if there is a single backend data source, and that data source is a database. In which case, we've been able to achieve similar results using OData for several years.

Often in reality, you end up with specialised GraphQL queries for performance reasons, because they are reading and combining data from a plethora of backend data sources - cloud-based APIs, on-premise APIs, external APIs, databases...


That's the theory and it can very well be practice. It has worked very well for the teams where I used it or have seen it being used. And I'm talking about a service that provides an API to a frontend (or other backend services) and aggregates data from multiple datasources of different kinds into one GraphQL API.

> Often in reality, you end up with specialised GraphQL queries for performance reasons

Ah, is that so? And with REST you don't?

You seem to have _a lot_ of experience with GraphQL. I wonder why your previous post has the form of a question. Or was that just a sneaky way of giving your opinion without backing it up?


> Ah, is that so? And with REST you don't?

No, that's exactly what you end up with using REST; I'm pointing out that, at least with deployments I've seen, GraphQL ends up with the same problems, only hidden behind a single URL.

> You seem to have _a lot_ of experience with GraphQL

I didn't (and wouldn't) say that I have a lot of experience with GraphQL (I have far more with REST/HTTP-based APIs), but I've seen it used enough times to know it's not a panacea, and doesn't always solve the expected problems, especially when it's fronting dozens of (sometimes painfully slow) backend data sources.

> I wonder why your previous post has the form of a question

It doesn't? I was proffering an opinion, based on my observations when GraphQL is fronting many data sources.

> Or was that just a sneaky way of giving your opinion without backing it up?

I would politely and respectfully point you towards HN's comment guidelines [0].

[0] https://news.ycombinator.com/newsguidelines.html#comments


> I'm pointing out that, at least with deployments I've seen, GraphQL ends up with the same problems, only hidden behind a single URL.

Cool, let me give you a mini example, you'll immediately understand it.

Two endpoints:

1. users, which contains video ids and can be queried efficiently by user id (indexed in the database)

2. videos, which contains video data and can be queried efficiently by video id (indexed in the database)

How do I get all uploads of a user? I get the user by id (request 1), then each of their videos by making one request per video (n requests). No big problem for the database (it's all indexed) and sure, we can improve the performance, but for now the bottleneck is the number of requests going over the network if we use REST. This doesn't work. You need to make a new endpoint or change an existing one to make this performant.

In GraphQL we have one endpoint with two different queries. So the same problem can happen if someone queries naively. But, they can also write one nested query that says "give me user for id X and for each video id, give me the video data". It will be one request from frontend to backend, but still multiple selects to the database, unless we use some "smart" GraphQL framework that does magic for us.

But we already solved a big part of the problem. And maybe that is already performant enough for what we need, and we don't need to improve the database query part. Lots of time and code saved on the backend and frontend. Yay.
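A rough sketch (invented data, no GraphQL framework) of how that single nested query resolves server-side:

```python
# In-memory stand-ins for the two indexed tables.
USERS = {"u1": {"id": "u1", "name": "Ada", "video_ids": ["v1", "v2"]}}
VIDEOS = {"v1": {"id": "v1", "title": "Intro"},
          "v2": {"id": "v2", "title": "Outro"}}

def resolve_user_with_videos(user_id):
    """Server-side resolver for the nested query
    "give me user X and, for each video id, the video data".
    One HTTP request from the client; still several lookups here."""
    user = USERS[user_id]
    return {
        "id": user["id"],
        "name": user["name"],
        # The "subquery": resolved on the server, not by n client requests.
        "videos": [VIDEOS[vid] for vid in user["video_ids"]],
    }

print(resolve_user_with_videos("u1"))
```

The n lookups still happen, but they happen next to the data instead of over the wire.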

You might say "but we didn't have indexes". Then my answer is: well, you are not worse off with GraphQL, and you still gain the benefits in all the cases where you do have indexes in place. If you have none of these, you're probably doing something wrong.

> It doesn't? I was proffering an opinion, based on my observations when GraphQL is fronting many data sources.

Yeah, seems you edited your post. When I read it, there was 100% a question-mark in it. ;)


> How do I get all uploads of a user? I get the user by id (request 1) then all his videos by making one request per video (n requests). No big problem for the database (it's all indexed) and sure, we can improve the performance, but for now the bottleneck here is the number of requests that goes over the network if we use REST. This doesn't work. You need to make a new endpoint or change an endpoint to make this work in a performant way.

The answer to this problem is resource expansion. Dylan Beattie actually has a nice and somewhat humorous presentation about how to address this with REST interfaces: https://youtu.be/g8E1B7rTZBI

Here's the part that addresses such data access in particular with a social media site example (people who have photos, which have comments etc.): https://youtu.be/g8E1B7rTZBI?t=1807

Sadly, approaches like that and even HATEOAS are not awfully widespread, mostly because of the added complexity and the support by frameworks simply not being quite there.
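A resource-expansion sketch in that spirit (the endpoint shape and the `include` parameter name are assumptions, not a standard): the server honours an include list so the client gets the expanded resources in one response:

```python
# In-memory stand-ins for the two backing tables.
USERS = {"12": {"id": "12", "name": "Ada", "video_ids": ["v1", "v2"]}}
VIDEOS = {"v1": {"id": "v1", "title": "Intro"},
          "v2": {"id": "v2", "title": "Outro"}}

def get_user(user_id, include=()):
    """Handler for something like GET /users/12?include=videos."""
    user = dict(USERS[user_id])  # shallow copy; don't mutate the store
    if "videos" in include:
        # Expand server-side instead of forcing the client into n requests.
        user["videos"] = [VIDEOS[v] for v in user.pop("video_ids")]
    return user

print(get_user("12", include=("videos",)))
```

Without `include`, the client gets ids and can fetch lazily; with it, the join moves server-side, much like the nested GraphQL query.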


Yeah sure. You can use resource expansion and also field targeting (to only get the fields you need), add some type system on top of it, make sub-resource expansion look a bit nicer by... I don't know, creating a small DSL for it. And then standardize documentation and tooling around it.

Voila, now you have your own implementation of GraphQL that might even be better than the original one, who knows. Only one way to find out!

Until then I will continue to use GraphQL wherever it works well. :)


> In GraphQL we have one endpoint with two different queries. So the same problem can happen if someone queries naively. But, they can also write one nested query that says "give me user for id X and for each video id, give me the video data". It will be one request from frontend to backend, but still multiple selects to the database, unless we use some "smart" GraphQL framework that does magic for us.

In reality, you'd more likely have an endpoint like `/videos/user/12` or `/users/12/videos`, which would provide all videos for a user, but we'll put that aside for the sake of an example.

Hmm, so I think you're saying that there would still be separate queries, but you're "batching" them, such that the client sends multiple queries simultaneously to the backend, and then the backend is responsible for executing them? With HTTP/2 (which supports request multiplexing), you get a similar result using REST endpoints.

> Yeah, seems you edited your post. When I read it, there was 100% a question-mark in it. ;)

I'm not entirely sure if you're kidding, but I assure you I didn't edit my post.


> In reality, you'd more likely have an endpoint like `/videos/user/12` or `/users/12/videos`

Sure and then you do the same with images and friends. And all of those have tags. Friends are also users so they also have videos and images... which have tags. Friends can also have friends of course... it can even be circular!

I wanted to keep my example simple, but here you go...

> Hmm, so I think you're saying that there would still be separate queries, but you're "batching" them, such that client sends multiple queries simultaneously to the backend

Sorry, I used the same term for two different things. Nothing happens in parallel. Frontend sends one request (not one query...) to the backend which contains two graphql queries (one of them a nested query). It is all one http request and one response. No batching here, you can send this request using curl or postman.

HTTP/2 does not help here, because the second query is dependent on the first one (hence a subquery). Because you need to know the video ids first. Sometimes HTTP/2 helps, but not in this case.
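The dependency can be made explicit with a toy round-trip counter (data invented): the client cannot issue the video requests until the user response arrives, so multiplexing cannot collapse the 1 + n round trips:

```python
# In-memory stand-ins for the two endpoints.
USERS = {"u1": {"video_ids": ["v1", "v2", "v3"]}}
VIDEOS = {"v1": "a", "v2": "b", "v3": "c"}

round_trips = 0

def fetch(table, key):
    """Stand-in for one HTTP request/response cycle."""
    global round_trips
    round_trips += 1
    return table[key]

# REST: requests 2..n+1 depend on the *response* to request 1,
# so HTTP/2 multiplexing cannot start them any earlier.
user = fetch(USERS, "u1")
videos = [fetch(VIDEOS, vid) for vid in user["video_ids"]]
print(round_trips)  # 1 for the user + 3 for the videos

# GraphQL: the nested query ships both steps in ONE request;
# the server performs the dependent lookups locally.
```

Multiplexing helps when requests are independent; here the fan-out is data-dependent, which is exactly the case the nested query solves.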

> I'm not entirely sure if you're kidding, but I assure you I didn't edit my post.

Not kidding at all - maybe I mixed it up?


I've worked with quite a few REST APIs where you can send a GET request along the lines of "GET /users, include: videos".


It's essentially a facade pattern[0], which in itself can be a good thing.

For example, if all consumers move to GraphQL and away from interacting with the separate microservice REST endpoints individually, you then have the option of refactoring your architecture (merging/adding microservices, for example) without worrying about disrupting consumers, so long as you ensure that the GraphQL queries still work.

[0]: https://en.wikipedia.org/wiki/Facade_pattern


Look into the patterns of CQRS, event sourcing, flow based programming and materialized views. GraphQL is an interface layer, but you still have to solve for the layer below.

API composition only works when the network boundary and services are performance compatible to federate queries. The patterns above can be used to work around the performance concern at a cost of system complexity.


> API composition only works when the network boundary and services are performance compatible to federate queries.

Isn't that the whole point?

The main reason individual responsibilities are peeled out of a monolithic service into independent services is to be able to offload those requests to services that can be scaled independently.



