The other day I was trying to work with Git LFS. I was very surprised to find out that git-lfs, the CLI binary, is the only (open) implementation in existence. There is nothing else. And even it does not offer itself up as a library, so even native Go code (its implementation language) has to fall back to shelling out to the CLI git extension! Not even bindings are possible. Such a painful loss of interoperability: IPC via return codes and parsing stdout/stderr.
It seems a similar story with the rest of git. I have hopes for gitoxide aka gix, and think the approach of library-first is correct going into the future. A CLI is then simply a thin wrapper around it, mapping argv to library operations basically.
> It seems a similar story with the rest of git. I have hopes for gitoxide aka gix, and think the approach of library-first is correct going into the future. A CLI is then simply a thin wrapper around it, mapping argv to library operations basically.
It's worth noting that there is currently a push to "lib-ify" git internals, and it's a gradual process. I'm not sure how much of this work has actually made it into the tree yet, but I've been seeing patchsets toward that goal on the mailing list since at least January.
Dulwich[1] is a pure-Python Git implementation that's been around for many years, meant to be used as a library. I used it a long time ago to make a git-backed wiki. There's also libgit2, which is exactly what it sounds like, and it has mature Go bindings[2]. I'm sure there are more implementations.
I always respected the fact that the authors of Subversion, right from the start, structured their software as a library, with the CLI being a user of that library.
The way IDEs and GUIs interacted with CVS was to shell out to the CLI, which inevitably had problems with filenames containing spaces, parsing of error messages, etc. Subversion understood in 2000 that things were changing, and that the CLI was only one way you'd use a VCS. People were more and more interacting with the VCS via IDEs, or via right-click menus in Windows Explorer, etc.
I felt happy knowing I'd never again have to deal with VCSs via tools just shelling out to their CLI. How wrong I was...
JetBrains forgot this memo: their IDEs require you to configure svn.exe in order for Subversion support to work. And usually TortoiseSVN is the way to go for Subversion...
Isomorphic Git is a Git implementation purely in JS (no WASM). I wrote a minimal library to handle LFS with it, it’s not that hard, the spec is pretty small.
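The LFS pointer spec really is tiny: a pointer file is just a few "key value" lines (version, oid, size). Here's a minimal sketch in Go (the language most of this thread is about) of parsing one; the type and function names are my own invention, not from any existing library.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// LFSPointer holds the fields of a Git LFS pointer file.
type LFSPointer struct {
	Version string
	Oid     string // e.g. "sha256:<hex digest>"
	Size    int64
}

// parseLFSPointer parses the small key-value format defined by the
// LFS pointer spec: one "key value" pair per line, "version" first.
func parseLFSPointer(data string) (LFSPointer, error) {
	var p LFSPointer
	for _, line := range strings.Split(strings.TrimSpace(data), "\n") {
		key, value, ok := strings.Cut(line, " ")
		if !ok {
			return p, fmt.Errorf("malformed pointer line: %q", line)
		}
		switch key {
		case "version":
			p.Version = value
		case "oid":
			p.Oid = value
		case "size":
			n, err := strconv.ParseInt(value, 10, 64)
			if err != nil {
				return p, fmt.Errorf("bad size: %w", err)
			}
			p.Size = n
		}
	}
	if p.Version == "" || p.Oid == "" {
		return p, fmt.Errorf("missing required pointer fields")
	}
	return p, nil
}

func main() {
	pointer := "version https://git-lfs.github.com/spec/v1\n" +
		"oid sha256:deadbeef\n" +
		"size 128\n"
	p, err := parseLFSPointer(pointer)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Oid, p.Size) // prints "sha256:deadbeef 128"
}
```

The real client also has to download/upload blobs over the LFS batch API, but the pointer files themselves are this simple.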
I'm guessing you're being sarcastic. If so, it's not really a fair criticism. Go has properly-typed return values (not just parsing text from stdout) and any type can implement the error interface, allowing for error types with any custom, properly-typed attributes. You use Go's type assertion or errors.Is / errors.As to handle the latter.
Maybe it's my job, but I don't see a lot of Go code treating errors as anything but strings. Errors I get from Go programs are similarly vague and redundant and lacking context. ("error from RPC: failure making from call: couldn't complete operation: 7 is not a valid flag" ah thanks, so much better than stack trace)
Yes, but don’t you value the fact that this error message was painstakingly crafted by hand under the care of skilled golang artisans instead of being mass-produced by an automated exception handler?
Yeah, because that's the error interface? It's literally a method that returns a string.
Unfortunately, I guess they decided after the fact that pattern matching was useful, so they did it via reflection (errors.As). And they don't warn you that errors.Is might return false where errors.As would match. It is certainly a mess, and the pedagogy could be improved.
I'd like to join the weekend bitching. I do think Go's errors are often shitty. Stack traces are easy to get with [0], and you can always do fmt.Errorf("%s", debug.Stack()) to get one into an error. I've not profiled them in Go, but in native code they're horribly inefficient to gather, so I'd bet you don't want to be using them unless you really need to.
That said, other applications have exceptions and people still manage to write functions that return a Boolean to indicate success, and log an equally unhelpful message rather than returning anything I can work with. Maybe we need new programmers.
You jump to a critical interpretation instead of allowing for, for instance, "It's the weekend, when I'm off the clock and don't police myself so much"? Interesting.
I think that's a fair critique. You have to really put thought into good error messages and error handling. When you do, they're nicer and more informative than an ugly stack trace. But a poor error message chain like the one you've shown is significantly less helpful than a stack trace.
The problem with the IPC approach isn't error handling; in the error case a good string is usually fine. The problem is parsing the output of the subprocess. A library can return structured data, whereas parsing always carries the risk of format changes or quickly gets very heavyweight, both in the syntax of the output and in the parsing routines. SVN sets another good example here: its commands usually have an "--xml" option that produces structured output in XML format, so you can use standard parsers to consume it robustly.
This is also one thing I like about powershell, that commands can produce and pass objects instead of just writing to stdout.
I don’t know what that is, but their docs very prominently and strongly say this:
> However, we do not maintain a stable Go language API or ABI, as Git LFS is intended to be used solely as a compiled binary utility. Please do not import the git-lfs module into other Go code and do not rely on it as a source code dependency.
I made Kubernetes into a library once. Like the apiserver and default controllers, running in-process. Copying and pasting were involved. I was not expecting support.
If code is out there under a compatible license, you can do whatever you want with it. If it breaks, you get to keep both pieces.
That you technically have the ability to call into an internal module does not in any way constitute it "offering itself up as a library", and doesn't make it effectively useful in that way.
A library and a module are the same. Having an open and available module does not make it "offered up as a library," was the point I was trying (and failing, evidently), to make.
You made the point perfectly. So did the docs in the first place.
It's not on you that someone decides not to agree the sky is blue because they use their own definition of blue.
I have been wanting something like this, but with a few more features such as "git diff". I took a crack at it, but the popular (and maybe only) Go Git implementation has some issues:
In my opinion github.com/go-git/go-git is a very high-quality project. Just because it doesn't solve some super-specific use-case that you have, doesn't mean the project isn't good. It's open source, have you tried opening a pull request to solve your own issue?
Sorry to hear that, but go-git is good enough to be useful to Vault, Pulumi, kubernetes/test-infra, and many other projects which directly import it.
Some folks seem to expect it to be a feature-complete library-usable Go re-implementation of the entirety of Git, despite it being (afaik) largely unfunded / volunteer-driven in recent years. I don't think that's realistic. And yes in many use-cases it may indeed make more sense to shell out to `git`, there's nothing wrong with that.
In any case, in my direct experience, the go-git maintainers do happily accept helpful PRs.
It just seems really weird to me to see folks trash go-git, while simultaneously cheering this new gogit project which is just 400 lines and intentionally by design/scope contains only a tiny fraction of go-git's functionality.
> go-git aims to be fully compatible with git, all the porcelain operations are implemented to work exactly as git does.
It literally says this on the tin, yes. But it's not. I'm simply providing my own experience and a disclaimer.
Sure, they "happily accept helpful PRs", but the code is so complex, and so many paths have such horrible performance, that it's an incredible uphill battle. To be clear, I'm not simply shitting on this library; I'm saying it's not a reasonable choice over just using git directly, and is unlikely to ever be.
I haven't made any comment on TFA so while I'm not cheering it on, I do think the spirit is correct: just do the thing you need to do, don't try to be feature complete, just solve the task at hand.
I interpret "aims to be fully compatible" as meaning the operations it implements are intended to be compatible with how Git implements those operations. I do not interpret this statement as saying they implement all features of Git.
The godoc also says right upfront it "nowadays covers the majority of the plumbing read operations and some of the main write operations, but lacks the main porcelain operations such as merges." - https://pkg.go.dev/github.com/go-git/go-git/v5#pkg-overview
> I'm saying it's not a reasonable choice over just using git directly, and is unlikely to ever be.
OK, that's apparently true for your use-case. But again, what go-git implements is directly useful to a number of very popular projects, as well as literally two thousand less popular ones.
I find the exported functionality to be high quality, at least for my own use-case. I'm not commenting on the code quality. If I need a shed for bikes, and someone is giving out free but ugly bikesheds, I'm thankful. I don't complain about the color of the bikeshed.
It's useful for toy projects or narrow stateless use cases, like testing or fetching something one time, like in Terraform, but as soon as you have to deal with real world scenarios related to actual distributed version control, the wheels completely fall off.
Can you use it to implement a git remote? No. What about to write some commits? Also no. What about a read-only client? Nope.
It only works well if you have a narrow use case, and only works predictably if you are in full control of the repository being interacted with. Otherwise it is simply going to cause more pain than just using git directly. Look at the open issues, the open issues related to it in projects that depend on it.
You clearly have a horse in this race and that's fine, if it works for you, great. But I don't recommend it. And no amount of lobbying on your part is going to alter that reality. If you're doing serious things, use git directly, and if you're not, it's probably simpler to write it yourself.
And lastly, the exported functionality is not high quality. It performs poorly in many scenarios where shelling out to git does not, and it breaks with any sort of complicated set up.
Not really. It's littered with interfaces, in some cases many levels deep, which in Go is an anti-pattern. The original author clearly came from Java or some other deeply OOP language. Also:
If I understand your issue correctly, it's `git diff --cached` that you're specifically looking for, not just `git diff` in general?
Your expectation seems to be that someone has already implemented this in Go for you, for free, but this is not the case. Why is this your expectation, and what does complaining about it accomplish?
> If I understand your issue correctly, it's `git diff --cached` that you're specifically looking for, not just `git diff` in general?
Incorrect. I am simply looking for a normal "git diff", which compares the index with the working directory. It's shocking this is not available out of the box, hence my issue.
object.Tree has a Diff() method. You can get an object.Tree of any commit hash from a Repository with its TreeObject() method. I don't recall offhand how to get an object.Tree of the working directory or index (perhaps from the Repository.Storer?) but worst-case you could just create a new commit in order to get a hash and then run the diff.
The package is more read-oriented than write-oriented; the docs specifically say it "covers the majority of the plumbing read operations and some of the main write operations". If you're trying to diff working directory modifications, that's a write-path use-case since it implies files are being changed / you're not trying to diff two pre-existing commits.
(edit: removed some text based on an initial misreading of your statement.)
> You can get an object.Tree of any commit hash from a Repository with its TreeObject() method.
OK, but I am not dealing with a commit, as mentioned in the issue and my previous comment, I am dealing with THE INDEX and working directory, not a commit.
> I don't recall offhand how to get an object.Tree of the working directory or index (perhaps from the Repository.Storer?)
Right, so you can see it's not as easy to hand-wave away the problem as you initially thought.
> worst-case you could just create a new commit in order to get a hash and then run the diff.
no, I am not making a commit just to diff the working directory. would you tell someone to do that with the command line tool as well?
> The package is more read-oriented than write-oriented; the docs specifically say it "covers the majority of the plumbing read operations and some of the main write operations".
cool, we are talking about a read operation, no writing is being done.
> If you're trying to diff working directory modifications, that's a write-path use-case since it implies files are being changed / you're not trying to diff two pre-existing commits.
No, it's not. I am not writing to anything, only reading.
> no, I am not making a commit just to diff the working directory.
OK, you do you. I say it depends on the situation: if this is a throwaway clone (especially an in-memory one), creating a commit is harmless. I mean it's certainly not an ideal solution, but at the end of the day it solves the problem at hand.
> would you tell someone to do that with the command line tool as well?
It's not my library, I didn't design it, I'm just trying to provide a solution to the problem you posed. If you don't like that solution then, well, OK? Shell out to `git` and call it a day, or write your own `git` implementation from scratch, or send go-git a PR. Any of these would be more productive than complaining about how a free open source library doesn't provide a solution to your use-case.
> I am not writing to anything
Clearly you've written to the files in the working directory, otherwise your diff would be blank.
Again, it's a read-path oriented library. If you're writing to working directory files, your use-case may not be aligned with that of the package authors.
> creating a commit is harmless. I mean it's certainly not an ideal solution, but at the end of the day it solves the problem at hand.
You're making my arguments for me here. You're essentially saying the library is so poorly designed that the "easiest" solution is to create a commit, rather than actually just diffing the worktree directly.
> Shell out to `git` and call it a day, or write your own `git` implementation for scratch
Again, making my arguments for me. You're essentially saying the library is so poor that it can't support simple use cases, and it would be easier to shell out to Git than to write a program that diffs the worktree.
> a solution to your use-case.
I would say it's a solution for essentially every command-line user. How many people DON'T use git diff?
> Clearly you've written to the files in the working directory, otherwise your diff would be blank.
The diff is not writing anything. The issue is not "how do I write to a file"; that can already be done with the standard library.
> If you're writing to working directory files
Writing to files is done outside the scope of the Git package. The Git package is only needed to handle diffs of changes that were made with some other tool.
> your use-case may not be aligned with that of the package authors.
I didn't say that was the "easiest" solution, or ever imply it. Don't put quotes around words I didn't say. That's really not cool.
The library is designed for use-cases like getting git metadata or contents from git repos. IIRC, the previous main sponsor (creator?) was a product that let you run SQL SELECT queries against git repos. There's no need to interact with the working tree at all in that type of use-case, so why would they spend a bunch of time implementing diffs for it?
If you're trying to use this library for an IDE, or something else like that where arbitrary modifications are made to working dir files, you're going to have a bad time. The library simply wasn't created to do what you want it to do. That doesn't mean it's bad or useless. It's directly imported by Vault, Pulumi, k8s test-infra, etc because its use-case is aligned with what these projects need.
Personally I think it's cool that I can use go-git to clone a git repo to memory and then perform programmatic read operations on the repo's contents. That's useful to me. It's not useful to you, and that's fine, but clearly we have different opinions and expectations around community-driven FOSS software projects.
>The verbosity of Go’s error handling has been much-maligned. It’s simple and explicit, but every call to a function that may fail takes an additional three lines of code to handle the error
Putting error nil checks into a function is an anti-pattern in Go. There is no need to worry about the LOC count of your error checking code.
> Putting error nil checks into a function is an anti-pattern in Go.
I assume you mean into a helper function like I've done with check()? If so, I agree with you for normal "production" Go code. But for simple throw-away scripts you don't want half your code littered with error handling, when you could just throw a stack trace.
> There is no need to worry about the LOC count of your error checking code.
Well, it means some functions are more than half error handling, obscuring the guts of what a function actually does. Even the Go language designers agree that Go's error handling is too verbose, hence proposals like this from Russ Cox: https://go.googlesource.com/proposal/+/master/design/go2draf... (there are many other proposals, some from the Go team)
Agreed. When I see people talking about LOC, my eyes roll. It's verbose for a reason: the language designers WANT YOU to pay attention to the errors, not ignore them.
I think the ultimate question for me is, does increased tedium demand/imply/require "attention", or does it just create a new opportunity for mistake? If devs are so frequently writing these "check()" style functions, are they paying attention to or ignoring errors?
That's exactly why every other language took the wiser decision of actually having runtime errors.
To force you to pay attention to them before your application state went into an unknown configuration, thus making it nearly impossible to troubleshoot or even pretend to be deterministic.
I still have no idea how any programmer thinks this is OK. Nondeterminism and unknown/un-considered application state are literally the source of all bugs. I much prefer (and honestly believe it makes a ton more sense) to do what Erlang/Elixir does, which is to fail, log, and immediately restart the process (which only takes a few cycles due to the immutability-all-the-way-down design).
If you hit my Phoenix application with a million requests in 2 seconds and each throws a 500 error, my webserver will keep chugging along, while every other technology's webserver will quickly exhaust its pool of ready-to-go webserver processes and fall over like a nun on a bender.
I don't really know how to take comments like this.
I work on a pipeline that processes millions of events per day in a pipeline that contains two pretty busy Go programs. One has an embedded Javascript engine and does all kinds of pattern matching and string manipulation. The other does a ton of database read/write and some crypto-calculations to do data integrity for us.
Millions of events. Per day. Never a panic in production.
It is easily possible to write deterministic, highly performant, and durable applications in Go.
I was talking about millions of events per every couple of seconds, not per day. A million events per day can be accomplished with just 11 per second, btw; a very... achievable number.
WhatsApp, which is built on OTP, handled 64 billion messages in a day. Averaging out to 744,000 a second. Almost 10 years ago. (Granted, also on 8000 cores. But thanks to the concurrency...)
> every other technology's webserver will quickly exhaust its pool of ready-to-go webserver processes and fall over like a nun on a bender
Most other technologies don't use one process per request, but instead have a single server process which handles all exceptions that come out of a request by returning 500 without crashing the server.
You can absolutely do this in Go. In a webserver you panic and recover your goroutines for 500’s or other failures that you deem invalid states.
There’s a qualitative difference between what we call 500 errors or 400 errors and it’s a good thing to handle the former by throwing exceptions or panicing and the latter with normal program flow and error values.
Errors that you can sensibly handle and display are part of your domain logic, just values and functions. They should be handled and contextualised right there at the site they come up.
Errors that represent invalid program states should be thrown as far as possible and handled at the edge.
You may have misread this. The OP means "extracting error nil checks into a function is an anti-pattern", not "your functions should not contain error handling".
`git pull` is not easy! It implies implementing a merge algorithm, for example. (One could half-ass this by only implementing fast-forward merge, I suppose.)
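The fast-forward special case amounts to an ancestor check: a pull can fast-forward exactly when the local head is an ancestor of the remote head. A toy sketch over an in-memory commit DAG (a stand-in for a real object store, not how git stores anything):

```go
package main

import "fmt"

// dag maps each commit ID to its parent IDs; a toy stand-in for a
// real commit graph, just to illustrate the fast-forward check.
type dag map[string][]string

// isAncestor reports whether a is reachable from b by walking parents.
func (d dag) isAncestor(a, b string) bool {
	seen := map[string]bool{}
	stack := []string{b}
	for len(stack) > 0 {
		c := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if c == a {
			return true
		}
		if seen[c] {
			continue
		}
		seen[c] = true
		stack = append(stack, d[c]...)
	}
	return false
}

// canFastForward: a pull fast-forwards when the local head is an
// ancestor of the remote head, so no merge commit is needed.
func canFastForward(d dag, local, remote string) bool {
	return d.isAncestor(local, remote)
}

func main() {
	d := dag{
		"c3": {"c2"},
		"c2": {"c1"},
		"c1": nil,
	}
	fmt.Println(canFastForward(d, "c1", "c3")) // prints "true": local is behind remote
	fmt.Println(canFastForward(d, "c3", "c1")) // prints "false": local is ahead
}
```

Anything beyond this (diverged histories) needs a real three-way merge, which is where the hard work starts.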
Feels like you're missing the spirit of the article. Nobody's advocating it as a git replacement -- the author is just posting thoughts about something they built.
The second paragraph explains why this exists, and it's not to provide a useful implementation of Git.
> I wanted to compare what it would look like in Go, to see if it was reasonable to write small scripts in Go – quick ’n’ dirty code where performance isn’t a big deal, and stack traces are all you need for error handling.
It's a toy problem that's just big enough to be interesting. Comparing it to Hoyt's earlier Python implementation of the same problem lets him evaluate how Go would fit into a certain place in his development workflow.