Building a High Performance Data Integration Framework in Go (cloudquery.io)
123 points by sghosh2 on Nov 28, 2022 | 31 comments



Code generation is becoming really important in Go.

Why even use an ORM when https://sqlc.dev/ will generate everything from vanilla SQL?

Why make the frontend team write a TypeScript client when https://goa.design on the backend will produce an OpenAPI schema they can just point https://openapi-generator.tech at?

Why write out GraphQL boilerplate when https://github.com/99designs/gqlgen will take your GQL typedef and generate it all for you based on how you want it to look?

Why write validation rules when you can just define your input struct and let https://github.com/mustafaakin/gongular generate the rest for you?

Honestly, I'm loving this. I want to focus on the entities and business logic - not writing yet another handler/resolver for basic auth + CRUD work.


It's a real shame it's kind of a bad language for it. Looking at what people did with Rust macros, it's a shame that Go's code generation story is either "just run some random binaries to compile stuff" or "put code instructing the compiler to do stuff in fucking comments".

And I say it without being sarcastic, but Go makes me miss the C preprocessor, and nothing should make anyone miss the C preprocessor.


> put code instructing the compiler to do stuff in fucking comments

Funny considering how Rust does stuff.

Both approaches have pros and cons. One either digs through a shitload of macros, or commits explicit auto generated code.


Yeah, the annotations in Rust suffer from a bit of that, but macros don't.

I yearn for the ability to make error handling macros...


It’s like metaprogramming in Ruby again, except with the idea that it’s somehow more straightforward because you can inspect the generated code. Except nobody will ever inspect auto-generated code because a) it’s usually awful code b) ignoring that code was kind of the whole point.


> Looking at what people did with Rust macros, it's a shame that Go's code generation story is either "just run some random binaries to compile stuff" or "put code instructing the compiler to do stuff in fucking comments"

Why is it a shame? Rely on macros and this is what you get:

1. Slow compile times

2. Unsearchable (grep, sourcegraph...) code

3. Magic codebases. Longer learning curve

To me, working in a large organization, those are big downsides.


The article is about code generation that is more magic than macros.


But you can inspect the generated code, jump to definition, search it... How the actual generation works may be magic, but the code is reviewable, searchable, debuggable...

It's clearly better than macros for my use cases.


Check out Ent https://entgo.io/docs/code-gen

Pretty easy to generate GraphQL (most fully featured extension), OpenAPI, Protobuf, etc. from your database schema. Ent also makes it easy to implement your own generators (e.g. OAS, glue logic, etc).


The problem with Ent is you are neither 1) writing your SQL structures or 2) writing your Go structures. Instead, you're writing Ent-specific structures.

It's still a good option, just sad we keep inventing more abstraction layers that are unique in each language. That is a benefit of GraphQL typedefs, JSON, and SQL - it's the same in every language.


It would be interesting for the annual Go survey to include a question about how much code generation is being used in the wild. My personal impression is that it is not widespread at all, but I phrase it that way on purpose.


We've been using it for mocks and protocol buffers for many years. I'm sure we're not alone.


100%. ORMs are becoming less and less relevant, especially in Go, with code generation and a bit of Copilot help :).

Also, Protobuf is another good (though not new) example of client- and server-side code generation that has stood the test of time.


sqlc is great, but what I dislike about it is that it forces you into using only the supported underlying drivers. For example, with PostgreSQL the pgx v5 driver has been out for some time now, but sqlc only supports v4 (this might have changed recently, I haven't checked).

Overall, there are many reasons why I would like to generate code slightly differently from what the library does. I wish there was a way to have a highly customizable code generation experience.


It's nice that they exist, but these workarounds are only necessary because Go isn't a very expressive language. I don't really consider having to use a third party tool to generate boilerplate as part of a multi-stage build process to really be a great thing, but I guess it's better than not having those tools if you choose to use Go for some reason.


Gongular looks pretty cool, but it seems like development stopped. Is there any other library that does similar things?

I love learning about new frameworks/libraries in golang for exactly that reason of reducing boilerplate and spending more time on logic.


Interesting response here in the issues.

The project is still active. It seems that people simply aren't filing feature requests or issues against the project.

https://github.com/mustafaakin/gongular/issues/21


sqlc quickly breaks down when you want to do dynamic selects. This is where an ORM or query builder shines.
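To make "dynamic select" concrete: the shape of the query depends on which filters the caller supplies at runtime, which a fixed set of SQL statements cannot easily express. Below is a rough hand-rolled sketch of what a query builder does under the hood. The `UserFilter` type, the `users` table, and its columns are all hypothetical, and the placeholders assume PostgreSQL-style `$1, $2, ...` parameters.

```go
package main

import (
	"fmt"
	"strings"
)

// UserFilter is a hypothetical set of optional filters; a nil pointer or a
// zero Limit means "not set".
type UserFilter struct {
	Name   *string
	MinAge *int
	Limit  int
}

// BuildUserQuery assembles a parameterized query whose WHERE and LIMIT
// clauses depend on which filters are set. Values are never interpolated
// into the SQL text; they are returned separately as arguments.
func BuildUserQuery(f UserFilter) (string, []any) {
	var (
		conds []string
		args  []any
	)
	if f.Name != nil {
		args = append(args, *f.Name)
		conds = append(conds, fmt.Sprintf("name = $%d", len(args)))
	}
	if f.MinAge != nil {
		args = append(args, *f.MinAge)
		conds = append(conds, fmt.Sprintf("age >= $%d", len(args)))
	}

	q := "SELECT id, name, age FROM users"
	if len(conds) > 0 {
		q += " WHERE " + strings.Join(conds, " AND ")
	}
	if f.Limit > 0 {
		args = append(args, f.Limit)
		q += fmt.Sprintf(" LIMIT $%d", len(args))
	}
	return q, args
}

func main() {
	minAge := 21
	q, args := BuildUserQuery(UserFilter{MinAge: &minAge, Limit: 10})
	fmt.Println(q)    // SELECT id, name, age FROM users WHERE age >= $1 LIMIT $2
	fmt.Println(args) // [21 10]
}
```

A real query builder adds type safety and dialect handling on top of this idea, so treat the sketch as an illustration of the problem rather than a recommended implementation.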


Agreed. I am experimenting with using go-jet instead and it’s going real good so far.

https://github.com/go-jet/jet


Awesome to see this on HN! Founder/Author here. I'll be happy to answer questions.


Great post! I was especially impressed to hear about the 80% of plug-in code being auto-generated through the SDK.


Thank you for putting your time into the project!

Besides this project, I like sqlc a lot, but in both cases I was wondering how one gets started with code generation tooling, so that I could contribute, extend, or even create my own (e.g., I would like to extend sqlc for better integration with FastAPI).


Thanks! I think the best places to start with code generation in Go would be: 1) Go templating: https://pkg.go.dev/text/template 2) Go reflection: https://go.dev/blog/laws-of-reflection (there are many articles and documentation on this).

Then I would try to build something small to experience both libraries hands-on, as otherwise it's hard to get the hang of it. Once you have done a small project, I would try to contribute a small fix or feature to sqlc to understand its code structure, and from there potentially a bigger feature or extension for sqlc.


Seems like strange choices for "CloudQuery vs Others". Why not compare against FiveTran, Airbyte, Meltano or other EL tools?

Also, it'd be nice to know what the transfer protocol is like. What format is used to transfer data between a Source and a Destination?


Great question! We specifically started with connectors/plugins to cloud infrastructure providers and other infrastructure vendors - This is why most of our users migrated and/or compared us to the tools specified here (https://www.cloudquery.io/docs/cq-vs-others/overview).

There are no connectors for AWS, GCP, Azure in FiveTran, Airbyte, Meltano but as we extend the number of our source plugins I believe this question will come up more and we will add those products as well to our comparison list.

The protocol is just standard/plain gRPC (https://www.cloudquery.io/docs/developers/architecture). Protobuf is defined here - https://github.com/cloudquery/plugin-sdk/tree/main/internal/...

Basically just the normalized, verified data + metadata, types


Will you adopt the Singer interfaces? That's what Meltano does, and it's chef's kiss.


Interesting. We will take a look at this, even though we usually build our own connectors due to the huge concurrency/speed advantages of using the CloudQuery SDK.


understood :) thanks for the reply


Much more of a focus on security and infrastructure than other EL tools. Like these security policies for example - https://www.cloudquery.io/docs/core-concepts/policies


So golang macros when?

Honestly, the language is the new JS, and just another one to remind us how miserable software can be with half-baked, enterprisey solutions meant to just get the job done faster.


This is a lot of confidence for an opinion that doesn't seem very thought through. There are a lot of downsides to macros, and a lot of (apparently under-appreciated) upsides to code generation.



