I think if git submodules were architected differently, they could have been a really interesting solution for package management. In fact, my last place of work used a layer on top of submodules as part of their internal package manager. I think they could be really special, but unfortunately they're now sort of relegated to history.
Is there anything wrong with the submodule architecture itself? My impression is that the data model is fine, but the interface for manipulating them is confusing even by git's standards.
Your impression is correct: They are functionally identical to svn externals that pin to a revision, but require an odd set of commands to update, instead of just being batched in like how "svn up" includes externals.
I've never had a problem with them and don't understand why so many do.
I won't suggest others are necessarily referring to the same issues, but try switching between branches (or commits) when you have submodules, let alone rebasing; they simply don't "just work". You always get some odd vestige of a submodule hanging somewhere that ends up causing problems, either in the index or in .gitmodules or somewhere in .git/ or whatever. Hell, the fact that there are so many places where git submodules are recorded has itself caused me so many headaches.
The most fun is had when a submodule was introduced to replace what was previously an ordinary directory, then you try bisecting across that commit. Qemu's git repo has at least one of these (the slirp/ submodule/directory) which I curse every time I need to bisect something.
Hm, this sounds like a good example of something I never encountered just by chance. Be it svn externals or git submodules, I never scatter them around the codebase - I put them in a subdirectory that's just for that purpose.
Switching commits and rebasing works like I expect it to. You need to update after switching of course.
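Concretely, "update after switching" just means something like this for me (a sketch; --recursive only matters if you have nested submodules):

  git checkout some-branch
  git submodule update --init --recursive   # sync submodule working trees to the commits this branch records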
I've always assumed the copy in .git/modules is just a cache so I don't have to re-clone the whole submodule if I switch to a commit where the submodule doesn't exist, then switch back.
I don't know if it's the interface or the data model, but stuff goes real sideways when you move across the commit boundary where the submodule was added, basically breaking the fluidity of git. I think some of this is the files in the .git directory and the filesystem, and some is the tools not handling the edge cases.
Then there's the other part: a git clone really should be recursive by default, pulling the submodules, instead of surprising you with "oh crap, I have to go run submodule init (or whatever the command is) like 50 times."
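For what it's worth, the incantation that saves you from that (a sketch; the URL is a placeholder):

  git clone --recurse-submodules https://example.com/some/repo.git
  # or, if you already cloned without it:
  git submodule update --init --recursive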
Exactly so. The problem with submodules isn't the concept itself, it's the lack of transparency. The submodule state should always be kept up to date with the state of the working tree, so that you never end up in a situation where it is stale and out of sync with everything else.
Git LFS has a similar problem, but there it does at least check out the content by default. Even so, there have been too many times I've had to fetch things manually, or push things manually to different remotes.
In practice the biggest issue I've found with git+pinned hashes as dependencies is that most public sources of remote git repositories allow the repository to be taken down by the author at any time, i.e. an author can turn a public github repo private or simply delete it at will.
Whereas most public package registries generally don't allow removal of publicly published packages outside of special circumstances, so the references will be more durable.
That's a pain. It's the same problem as a class that constructs a particular object that you want to customize. The solution is also the same: dependency injection.
I wish they didn't. Git clone is very slow relative to downloading a tarball. Glide/Dep would take tens of seconds to download what amounted to a few megabytes.
Completely agree. It's probably possible to create a "monadic" DVCS, but I expect it would have to be designed for composition from the outset, rather than having composition grafted on as an afterthought.
That doesn't make sense. Winter is the best time to invade Russia because you have fresh supplies, you are prepared for winter and you have spring, summer and autumn to secure it.
The mistake is to invade Russia in late summer because the campaign will be prolonged and you will have to secure Russia in winter, with your summer clothing, which is the problem.
*edit: But don't forget about the nuclear bombs and the evilness of war itself. It's never a good idea to invade Russia.
The quote, usually known from the Princess Bride, is actually from Field Marshal Montgomery addressing the House of Lords:
The next war on land will be very different from the last one, in that we shall have to fight it in a different way. Rule 1, on page 1 of the book of war, is: "Do not march on Moscow". Various people have tried it, Napoleon and Hitler, and it is no good. That is the first rule. I do not know whether your Lordships will know Rule 2 of war. It is: "Do not go fighting with your land armies in China". It is a vast country, with no clearly defined objectives.
Montgomery didn't say except in Winter. He said don't do it at all.
I’m not sure what you mean by “won”, the Japanese most certainly did not conquer Russia, which is the entire point of this old truth: that the geography and vastness of Russia makes it extremely difficult, if not impossible, to conquer. It’s a very different thing to attack Russia and have them cede territory to you, this has been done many times even by European countries.
Adding and updating submodules feels straightforward, but removing... yeah, bring out the big guns. I can never get it done right and have to resort to searching.
I started using them again and it seems fine. You're ok as long as you use it for things like external library projects and not splitting your project repo into smaller bites just for the sake of doing it.
> splitting your project repo into smaller bites just for the sake of doing it
That's exactly what lots of people want to do.
Would be nice if it worked like the development mode of Python's packages: I can install a local development copy of the dependency somewhere, the main program will see the changes I make and therefore it's possible to develop both at the same time.
Removing them after adding to a project is still a 6-7 step process (unless this has changed and I missed it), which makes them feel clunky to use, to me.
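For reference, the dance as I remember it (treat this as a sketch; `path/to/sub` is a placeholder, and older git versions needed extra manual edits to .gitmodules):

  git submodule deinit -f path/to/sub     # empty the submodule's working tree and drop its config entry
  git rm -f path/to/sub                   # remove the gitlink and the .gitmodules entry
  rm -rf .git/modules/path/to/sub         # delete the cached clone git keeps under .git/modules
  git commit -m "Remove submodule path/to/sub"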
This please! Submodules are the only time I ever have real issues with git. I manually edit .gitmodules more than I care to admit, naturally this causes its own set of issues.
Hmmmm.... I think the main problem with git submodules is that if you google 'git submodule' you don't find anything useful. When you somehow discover how they actually work they are... uhm... let us say... 'usable'... Maybe someone should write a tutorial about git submodules that is actually informative.
Correct me if I’m wrong but these all seem to fix problems that exist in a local clone only. And that’s where none of my “oh shit” moments happen.
All of my oh shit moments come in two flavours: I have pushed to remote, or I have pushed to remote and force pushes are disallowed.
I want a list of as graceful solutions as possible for those scenarios. And certainly they can be real-world solutions like "this cannot be undone in a desirable way, so here's an approach to handle the commit(s) necessary to fix the problem."
This also brings up a question I didn’t realize I had: are people using git without frequently pushing to remote? I feel like having a remote backup of my work in progress is the killer feature. I can’t imagine doing any amount of work and not backing it up.
> are people using git without frequently pushing to remote? I feel like having a remote backup of my work in progress is the killer feature.
I use git without a remote pretty often. When doing personal projects or exploratory work, the problem is either so trivial that I don't care about backup, or already backed up using something like Nextcloud/Syncthing. But having the commit history allows me to easily look at old code, revert changes, and make big structural changes without having to back up the entire project folder in case I end up disliking the new structure.
I guess you're just not working on many things at a scale where version control is useful but setting up an online repo doesn't seem worth the time.
> I guess you're just not working on many things at a scale where version control is useful but setting up an online repo doesn't seem worth the time.
It is literally a couple of clicks to set up a private repo on github. For me, either the problem is too small to set up git, or it is big enough to spend half a minute to create a github repo.
If you have something sensitive or just want to avoid github/gitlab for some other reason, an empty remote repository with SSH access works just as well.
You could just set up a private Github monorepo for all your utility projects and experiments. One branch per project, and you check out exactly the branch you want. Since 2012, `git clone` supports `--single-branch`, which makes it download only the commit history of that specific branch.
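i.e. something like this (a sketch; the repo and branch names are made up):

  git clone --single-branch --branch my-experiment git@github.com:me/scratch.git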
When I do exploratory research, having a history of commits, days, months, years after the end of the research project is invaluable for many possible future needs.
I’m still not sure I see a strong case for not bothering with a remote. But I appreciate we all do things at different fidelities.
That's not really true. It is pretty difficult to delete any data from git, especially by accident. You may screw up your history but you can always fix that if you know what you are doing.
Yes. Git push does not remove the previous commit; it just moves the tip of the branch to another commit.
The old commit becomes unconnected, but you can still get to it if you know its hash. There's also a way to list all unconnected commits, which is a bit involved.
Unconnected commits will exist and take up space until garbage collection is done.
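If anyone wants the "bit involved" way, it's roughly this (a sketch; <hash> is a placeholder, and the reflog is usually the easier route anyway):

  git reflog                    # where HEAD and your branches used to point
  git fsck --lost-found         # list dangling commits (and copy them under .git/lost-found)
  git show <hash>               # inspect one of them
  git branch rescue <hash>      # pin it with a branch so garbage collection won't eat it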
If you pushed to a remote branch by mistake instead of creating a PR from a new branch, and then you revert the commit, you're going to have trouble with the PR not showing all the changes, because they match the orphaned commit.
Note that PR is not a fundamental git concept, but something a code forge can decide to implement any way they like. I presume you're talking about GitHub, because in GitLab, you will be able to see all changes when browsing commits constituting the merge request.
Is this a general rule? If so, I'd appreciate a suggested best practice for the following case: say I commit, push, then a while later fix some minor error and don't want it to be a separate commit (otherwise my commit history would get unmanageable). So I `git commit --amend` and then `git push --force`. What would be the correct way to do this?
For the specific thing you want, force push is the only way AFAIK. However, my argument is that you should let the commits reflect the true history of the code. Fix the minor error and explain it in the message.
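Concretely, if you do go down that road, it's roughly this (a sketch, and only reasonable on a branch nobody else has based work on):

  # fix the minor error, then fold it into the previous commit
  git add path/to/file
  git commit --amend --no-edit
  git push --force-with-lease    # refuses to push if the remote branch moved since your last fetch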
The downside of the "true history" approach, besides obvious aesthetic reasons, is that it complicates the use of such tools as `git bisect` and `git blame`.
What is the reason? Like are people just not pushing changes to remote branches? Or is there some value in stuff to do in the small time between commit and push? Like are you checking your work before pushing?
Intermediate commits. For example, while building a UI - HTML is done, but you haven't begun styling. You want to be able to revert to a good point in case you screw up, but there isn't much reason to push it to a remote because it's not a shippable state. As long as your computer is being backed up (you are using backups, right?) you won't lose any data by not pushing to a remote.
I try to keep my commits small, and I don't push each time because I have a Jenkins pipeline that's going to update a server for my branch, run unit tests, and all that.
In the past I've done stuff like not pushing until the end of the day/coding session. Usually though I just end up committing and pushing immediately and then realizing I need to amend something and make a note to rebase later (I totally always do that rebase /s).
If y'all think git is hard, wait 'til you hear about computer algorithms!
Git isn't hard, it might be complex, but how it works and how to use it is simple as far as applications and/or processes go.
I find myself constantly trying to dispel the idea that git is too difficult to understand in depth. The problem has always come down to the missing attempt to understand the issue, rather than just ^C, ^V-ing a solution.
Seriously, if you think git is hard: don't ask for it to be easier, ask instead to be stronger. (Or in this case smarter, something you surprisingly have control over)
It's not that it's hard or complex, it's that I really don't care about it and am not at all interested in learning. It's a weak attitude, I know, but I just can't help it.
So like most other developers, I go about my day totally not understanding git at all, and when something weird happens, I talk to that co-worker who understands git (there's always at least one such psychopath on each dev team in existence) and ask him to fix it. He invariably talks about how easy and elegant git is and then finally gets around to giving me a couple of commands with nonsensical names and options that fix the problem.
I don't care, and I'm okay not caring. Not learning git might even give me extra time to learn about these magical computer algorithms I've heard so much about.
> there's always at least one such psychopath on each dev team in existence
I'm that psychopath :D
> I don't care, and I'm okay not caring. Not learning git might even give me extra time to learn about these magical computer algorithms I've heard so much about.
I'm fine with this, actually. If you don't care, and don't want to, that's totally fine. I personally will look at you a bit strange, but when it comes down to it, you should only care about what you want to. But please, for my continued (admittedly borderline) sanity, don't try to convince others that it's too hard to learn, and/or that they shouldn't try. I want to work with people who are curious, people who want to learn how cool things work. But there's a trend in software engineering that parts of it are too hard, and that it's a waste of time to learn them. That's a scary thought to me; imagine the next generation of software devs, all convinced that they can't understand something, or that it would be a waste of their time to try?
I'm that psychopath too, but only because I use git 1000 times a day. It's not by choice. Git has a horrendously shitty UI and it's quite hard to learn because the UI is inconsistent, patternless, and uses a different vocabulary than mainstream computer science (a staging area is called an "index", revert is called "reset" and "revert" means something completely different, "fast-forward" is a term from the world of linear tape recorders which is meaningless when applied to graphs, but git uses it for graphs anyway, etc).
Git is worth learning but it's about 10x harder to learn than it would have been had its designer known anything about human interface principles. And its manual is useless because that same designer wrote the manual too.
Git would be completely unlearnable without Google and Stack Overflow, but since those things exist, it's worth learning.
I see it like learning a build system: ultimately a pointless piece of trivia that isn't very interesting or worthwhile. This is in contrast to math or CS, which is generalizable and timeless.
But you're right, we should all understand this stuff. And if it wasn't for that one magic coworker (you're really taking one for the team), I probably would.
Git is a little bit different. It is a really good implementation of the 'history is immutable' philosophy of programming.
Learning the implementation details and reading the mystifying documentation to learn what something is called can be a bit wasteful but understanding the git model is very worthwhile. It is interesting to see how it handles the pragmatic issues of fixing the situation up when the history becomes snarled.
Learning git will make some not insignificant number of people better at concurrent programming. Most build systems don't do that.
> Git is a little bit different. It is a really good implementation of the 'history is immutable' philosophy of programming.
I beg to differ. Mercurial is all about immutable history. Git is explicitly designed to mutate history (that's what rebase does) in the demented (IMHO) pursuit of a linear change log.
Rebase does not mutate. It creates new commits and updates the rebased branch to point at them. The original commits still exist (even if no other branch points at them).
Git commits are never mutated. They are referred to by a hash of their content after all.
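You can watch it happen (a sketch; branch names are placeholders):

  git rev-parse feature      # note the tip hash before the rebase
  git rebase main feature    # replays feature's commits onto main as new commits
  git rev-parse feature      # a different hash now
  git reflog feature         # the old tip is still right here, checkout-able by hash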
What's the UI for getting at those old commits? Is it possible to get a true chronological log from them? Do they stay around forever or do they get garbage collected eventually?
> Is it possible to get a true chronological log from them?
The output of git log with --graph is not chronological, because it does a topological traversal of the commit graph.
> Do they stay around forever or do they get garbage collected eventually?
They get garbage collected eventually. As a rule you should not assume dangling commits are safe. If you need to keep one, assign a trivial branch name to it.
The same as any other commit. You can check them out by referring to their hash, by referring to any tags that point to them, or any branch that points at them.
The other reply gives options assuming the commits have been orphaned (no tags or branches point at them).
As far as I know, commits that were once ancestors of a branch can sometimes no longer be ancestors of that branch. If that's not a mutation, then the word is not as useful as I thought.
Branches are just pointers into a graph of commits. The graph cannot be mutated (it can be pruned by garbage collection, but that never happens in relation to an oh-shit event; it takes weeks even in a heavily modified repo). The pointers can be modified to point anywhere else in the graph, which makes things look mutated without being mutated. Using the reflog you can get back to where your pointers pointed at any earlier point in time.
I'd say it's a pretty standard use of the word. Branches are analogous to pointers. If a program changes where the pointer points, it has not mutated any values, besides the pointer itself.
I'd wager a lot of developers, good ones, share your sentiments. Git is a well designed, dependable SCM, there is no question. But its command line UI is awful, far too close to the metal. I'm not at all surprised though, it's a UNIX cultural thing -- ignore the 80/20 rule and use the command line for everything. Compare installing anything on Linux vs. Windows. It's ridiculous.
I will never understand this. A couple of hours investment now and you've saved yourself a lifetime of stress and confusion. There aren't many tools that I'm sure are going to stick around for the long-haul, but git is one of them.
The problem with git is not that it's particularly hard to understand, it's that it's the wrong level of abstraction for day to day dev tasks. Git is a great data structure with an arguably bad CLI wrapper. What's missing is a layer of workflow rules specific to each repo. What names should branches have? What are the rules of engagement for PRs? Are rebases preferred over merges? Git doesn't care, nor should it. Yet there's more to creating a branch in a structured dev team than merely typing `git checkout -b new-branch`
Git might not be hard, it might be easy, but for someone who has not spent enough time trying to understand it, it gives options to royally F up if you don't know what you're doing. So I make it a point to never use any command other than cherry-pick. Everything else I use GitHub Desktop for. It's a simple UI that only lets you do the bare minimum of git actions, so it's less F-up prone.
That's my personal story. The other problem I see is a large number of engineers who say git is not hard, it's very straightforward and they understand it very well, but actually they don't. And every once in a while they royally F up the entire repo and cause issues in deployment because of that. So git's main problem from my perspective is that it's an extremely advanced, unforgiving tool that gives a false sense of expertise to many mediocre programmers.
> Git might not be hard, it might be easy, but for someone who has not spent enough time trying to understand it, it gives options to royally F up if you don't know what you're doing.
You've just described nearly every tool, show enough carelessness with a hammer and you'll certainly fuck something up.
> So I make it a point to never use any command other than cherry-pick. Everything else I use GitHub Desktop for. It's a simple UI that only lets you do the bare minimum of git actions, so it's less F-up prone.
Yes: when you handicap yourself, limiting to a subset of available features, you'll usually have fewer ways to screw up. But I hope that's not advice you give out to others. That's exactly the mentality I'm trying to encourage people to avoid: "I don't understand it, so instead of trying to learn more about it, I'll just limit myself to a trivial subset of the actual options." That's not how you get good at anything.
> That's my personal story. The other problem I see is a large number of engineers who say git is not hard, it's very straightforward and they understand it very well, but actually they don't. And every once in a while they royally F up the entire repo and cause issues in deployment because of that.
Other people making a mistake is pretty common. Sorry you've had to deal with it; I've been there myself. Fixing others' mistakes is pretty annoying. It's especially annoying when they pretend like they already know everything they need to about git, refusing to learn anything new.
> So git's main problem from my perspective is that it's an extremely advanced, unforgiving tool that gives a false sense of expertise to many mediocre programmers.
Again, it's not extremely advanced. MS Word has more features/options than git does. And I'd never call git unforgiving. In fact, git has been my best friend; I deleted a live directory with a bunch of code that I (foolishly) didn't have backed up anywhere, including a good portion of the .git directory. Thankfully, git was very forgiving to me, and I was able to restore most of it from the objects that git keeps. But that's something I didn't know how to do before deleting my repo. It's something I had to go out and learn.
>You've just described nearly every tool, show enough carelessness with a hammer and you'll certainly fuck something up.
True, but some tools are easier to be careless with. A hammer isn't one of them, which is why you don't find it in every kitchen. Git (and now that I think about it, every advanced command line tool) is very un-intuitive for many people. I've been using computers for several hours every day since 1997, so take me as whatever type of user you want, but I don't fully understand how rebase works yet. Or the hard flag. I'm sure if I sit down for an afternoon I'll understand, but without that I won't. Which is what I'll define as "un-intuitive".
I chose to handicap myself here because the benefit till now has been very marginal. If I have a full afternoon to study I'd rather spend it on ML lectures or understanding something else, instead of source control.
However, this equation is changing, because my team is thinking of starting a mono-repo style coding architecture. This probably can't be done without a full understanding of git's features, so I probably will spend the time needed to understand its features, how it works, and some of the newer features that can be helpful in the pursuit. You don't have to be an expert in every tool you use every day, and a good tool is one that lets even a non-expert avoid F-ing it up, is all.
> but I don't fully understand how rebase works yet
Git maintains a graph of commits, each commit pointing to its parent(s). When you rebase, by default git finds the last common ancestor of the branch you're rebasing and the branch you're rebasing onto. That common ancestor is the base, and the subsequent commits on the branch you're rebasing are replayed on top of the latest commit of the branch you're rebasing onto.
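A tiny before/after, as a hedged sketch (branch names are made up):

  # before:   main:  A---B---C
  #           topic:      \---D---E
  git checkout topic
  git rebase main
  # after:    main:  A---B---C
  #           topic:          \---D'---E'   (D and E replayed as brand-new commits on top of C)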
Good programs aren't that easy to fuck up. Easy when messing with the file system, or deliberately destroying data, but not during normal use. Git is very easy to fuck up even during normal use (multiple people working on a software).
> when you handicap yourself, limiting to a subset of available features.
Yes. I practice a very similar approach to the GP, but I use a couple of other commands too: checkout, reset, clean, bisect.
Just because some people wanted a feature doesn't mean I agree with their estimation of risk/reward. The risk being destroying a couple of days of other people's work. We were really close one time; it didn't happen only because my local copy was not yet synced.
> Git is very easy to fuck up even during normal use (multiple people working on a software).
I disagree that git's design is responsible for this. If you have enough people doing something, especially when they all have different subsets of knowledge about the tool they're using, the chance that someone will do something incompatible with everyone else rises. Git is clearly more complex than a saw, but give one to enough people who don't understand how to fell a tree, ask them to build you a deck, and forbid them from talking to each other, and something's gonna go badly.
> Just because some people wanted a feature doesn't mean I agree with their estimation of risk/reward. The risk being destroying a couple of days of other people's work. We were really close one time; it didn't happen only because my local copy was not yet synced.
There's a story here that I'm interested to hear. But git was kinda designed to avoid this exact thing. There's a reason that your local copy is 100% complete by default. Which means that if you really came close to losing a lot of work, all of you were going out of your way to use git in a very not-normal way.
Design of the VCS is almost fine. "Almost" because I would prefer to track renames, but I can live without that.
The CLI is horrible.
> I'm interested to hear
Someone used the wrong command-line switch, and the merge commit was made in the wrong way, destroying changes made in the other branch. Then the feature branch was deleted, I think.
I don't know or care who did that or what the command-line switch was; the point is, the thing is too fragile.
People aren't perfect, they make mistakes. Good interfaces are designed in a specific way, to reduce the probability and consequences of these mistakes. In some areas like flight control I think there're even laws about that.
What you described can't happen. If you merge one branch into another, both branches must exist in the repo. Deleting the feature branch following a merge wouldn't delete the commits it contained. But even if that weren't the case: that means multiple people would have needed to all delete their feature branch without checking that the code they wrote was merged and working. rm can delete code too; would you blame rm for having a confusing interface you just couldn't understand?
Come to think of it, git does warn you when you try to delete a branch that hasn't been merged into any other branch.
> That means multiple people would have needed to all delete their feature branch
That switch might have done something about rewriting history. Or maybe it was a rebase; I never do either of those things. Multiple people (except me) had synchronized after that happened. If you're interested in how we fixed it: I copy-pasted a couple of commands from a chat, sent by a co-worker who knows more about git. These commands pushed my local branch under a new name, and then people managed to sort it out with a merge.
Just make sure everything is checked in and drop a tag on a known-good commit before any risky operations. You can easily recover to that tag with git reset.
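In other words, something like this (a sketch; the tag name is arbitrary and the rebase stands in for whatever risky thing you're about to do):

  git tag known-good                 # bookmark the current commit
  git rebase -i HEAD~10              # ...the risky surgery...
  git reset --hard known-good        # bail out if it went sideways
  git tag -d known-good              # drop the bookmark once you're happy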
There's also the reflog. The github interface probably has a friendly representation of it.
In some grand sense I think you're right, but your comment fails to appreciate the economics of the issue. I put computer algorithms to work on problems I want to solve; they advance my goals. On the other hand, git puts me to work on problems that are a necessity of some external requirement, or some other purpose in the category of "distraction": I get paid for the computer algorithms part, not the understanding-git part.
You are right that it's not hard, it's complex. My problem is that my actual problems can be both hard and complex... and putting a complex distraction in the way detracts from my goals, it doesn't enhance them.
I think many in technology fail to appreciate this distinction. I often hear open source advocates respond to complaints about software quality by jumping up to suggest opening issues or contributing fixes... which is fine in its own right: but if those issues/fixes aren't reasonably in line with my own goals in trying to use some class of tool, and some other solution such as a closed source one meets the goals more readily, I'm probably not going to spend my time advancing someone else's goals at the expense of my own.
" constantly trying to dispel the idea that git is too difficult "
If that's the case, why wouldn't you concede that it is?
"The problem has always come down to the missing attempt to understand the issue"
Isn't that always the case with misunderstanding something?
That 'it's common' and 'people misunderstand the issue' is essentially proof that at least from a certain level, 'git is hard'.
- Git is very powerful. Nobody doubts this.
- Git is sometimes easier to understand when the model is well understood - yes, this is a common refrain.
But - it's still a very complex model, and the UI is bizarre + counter intuitive and as a command line, without a lot of visual reference.
Even standards and practices are not well established as there continue to be wars over idiomatic usage, arguments on HN often by 'experts' who actually make some mistakes in their argumentation.
I suspect a lot of people who think they are 'good at git' 1) don't realize how much they don't know and 2) may be lacking in self-awareness as to how much trouble it often causes.
I've seen very smart people with very strong git skills befuddled far too often for me to believe the tool does not have problems.
The fact that 'Oh Shit Git' even exists, is in some ways problematic - because it's obviously not clear to many how to truly 'undo' things they've done. OSG could be 100x longer, easily.
This is really scary. It's not a complicated question necessarily, maybe a little specific - but - there are just so many answers, so many variables, incredible complexity in the answers.
Git has a kind of 'unbounded complexity' that allows for so, so many situations and uses cases. There really wasn't much 'Product thinking' rather, just a decade long cadence of 'let's add this feature with this flag' over and over again.
It's such an interesting case study from a Product perspective.
It may only seem that way because there are parts of the model you don't understand. There is a finite (relatively small) number of concepts necessary to mastering git.
"There is a finite (relatively small) number of concepts necessary to mastering git."
This is only partly true. The degree of variation and options in the command line actually fiddles with that 'simplicity' quite fundamentally.
Even if one has a 'clear understanding of the concepts' - the commands don't necessarily map clearly to that model. And there are so many commands, so many little variations.
Maybe you can expand a bit on this? Or point to a resource for learning the concepts?
I've tried a few tutorials about git internals before, but always get confused in the middle for some reason. I get as far as git being a list with pointers, where each element has a hash and whatnot. But still something seems to be missing in the end to make it all coherent.
>> "The problem has always come down to the missing attempt to understand the issue"
> Isn't that always the case with misunderstanding something?
No, some things are difficult. Git might be one of them, but I have no reason to believe that. It's the attempt that's missing. Most often, they don't understand git not because they tried to and failed, but because they've never really tried to understand the problem. They skip the part where they find the root cause, actually figure out exactly what's happening and what's gone awry, and instead just search Stack Overflow for the `git reset --hard` command they need.
> But - it's still a very complex model, and the UI is bizarre + counter intuitive and as a command line, without a lot of visual reference.
As someone who's used a lot of command line tools, I reject this outright. Git is a very simple command line tool. The difficulty is more akin to learning how to use a CLI, not necessarily git itself. Try teaching MS Word to someone while also trying to teach them to write an essay. The missing visual reference will throw people off, but that doesn't make learning how to use it harder. Instead of clicking around for the button that might do something, they have to read something; git --help is super useful, just like F1 or clicking the help dropdown at the top.
> Even standards and practices are not well established as there continue to be wars over idiomatic usage, arguments on HN often by 'experts' who actually make some mistakes in their argumentation. I suspect a lot of people who think they are 'good at git' 1) don't realize how much they don't know and 2) may be lacking in self-awareness as to how much trouble it often causes. I've seen very smart people with very strong git skills befuddled far too often for me to believe the tool does not have problems.
I haven't had these experiences, so I really can't comment. I have to say, I don't really believe people when they say they know and understand git. If someone says they know how to use a chainsaw, do they also know you have to oil it? Knowing how to use something isn't the same as understanding it. Which is what I want people to do....
> This is really scary. It's not a complicated question necessarily, maybe a little specific - but - there are just so many answers, so many variables, incredible complexity in the answers.
I'm not scared by this at all, so I have no idea what you're talking about.
> Git has a kind of 'unbounded complexity' that allows for so, so many situations and uses cases. There really wasn't much 'Product thinking' rather, just a decade long cadence of 'let's add this feature with this flag' over and over again.
That git is feature-rich doesn't mean much. Name a software engineering tool you think is simple/easy, and (if you're being reasonable) I'd be willing to bet it has nearly the same, if not more, features than git does.
" I reject this out right. Git is a very simple command line tool. "
That's great, but most people would disagree, using the argument that there are rather a large number of variations of commands.
Can you think of another 'command line tool' with more complexity?
Git questions are hugely popular on Stack Exchange. It's not that hard to use most other commands. All you need is --help.
"I'm not scared by this at all, so I have no idea what you're talking about."
It's 'scary' because it's ostensibly a very basic question, with apparently quite a number of varied solutions, each of them fairly complicated, and apparently not without controversy.
The second to top answer starts with: "Lots of complicated and dangerous answers here, but it's actually easy:"
If you put on your 'Product/Manager/Investor/I want to make a thing, not tinker in intellectualisms' hat for a moment, it is 'very scary'.
How can answers to simple questions about a tool be 'dangerous'? Of course, it's because some of these operations will screw up a repo, or create undue confusion. That's not good.
Why are there so many variations in answers?
Why is someone making a diagram? If someone has to 'get out the whiteboard' to answer simple tooling questions, then there are added levels of abstraction - which could be fine - but why on earth is that necessary?
Why are people having to go down rabbit holes of 'possible dangerous operations' to do something simple?
There are literally 62 addendums and comments to the top answer!
This is the 'opposite' of a 'good product' and the opposite of 'easy' - it has red flags all over it to the point wherein I'm thinking that I wish there were administrative options to make sure bad outcomes cannot happen, to start.
Complexity is costly, expensive, risky. Git is already a 'secondary factor' in making a product, the software being the 'primary factor'. So we're having to invest in knowledge, risk, problems etc. with 'a tool to help the tool' ... and when we run into serious problems, who do we call?
I don't want my devs to be tripping over themselves, falling into holes, possibly unrecoverable situations, spending time solving problems they should not have to.
Does the problem we are trying to solve imply this kind of necessary complexity? I don't think so.
"Name a software engineering tool you think is simple/easy"
So the question really is - what kind of complexity (magnitude, necessary complexity vs. arbitrary, required abstractions etc.) does a tool provide vs. the value provided - that is the question.
Git provides quite a lot of power, but it's constrained by its own complexity - for very standard operations it's quite easy, but beyond that it's a minefield.
There are literally people in this thread indicating that they don't even allow themselves to use commands they don't know well. There are ample stories of people getting themselves into problems they can't get out of, losing code, etc. Where else does this happen with tooling?
There is a 'Golden Rule of Git' of something never to do - this to me is a huge Red Flag. Why the hell is something that 'should never be done' - even doable?!? This to me is a perfect example of the 'unbounded' nature of the complexity.
Finally - I have actually never met a Git expert. All the smart people I know who are 'strong in Git' are often befuddled in various situations. Yes, that happens in software, but it doesn't need to happen with our code management systems.
Since you asked for an example: awk may be more complicated than git.
> Why is someone making a diagram? If someone has to 'get out the whiteboard' to answer simple tooling questions, then there are added levels of abstraction - which could be fine - but why on earth is that necessary?
It's an inherent aspect of distributed version control that the history of your repository forms a graph, and therefore drawing diagrams can be helpful.
Please take the following as the friendly advice it's meant to be: the fact that you even ask this question will suggest to every advanced user of DVCS that you really don't understand what you're talking about. Which is fine, if you have the humility to accept it and improve.
Please take the following friendly advice: if you don't know the difference between a 'command line tool' and a 'programming language' - then maybe you don't know what you're talking about. It's ok, if you have some humility, you can improve and overcome.
FYI I'm well aware that systems such as CVS have 'abstractions' that diagrams could possibly help with. That said, diagrams relate to the 'inherent abstractions' in many things, and yet they are used very seldom on Stack Exchange.
The fact is - git is quite obviously very complicated, more complicated than it needs to be, and poses risks that most other tools do not.
The bizarre myopia of those not willing to concede that, and the inherent risks is problematic.
It speaks to a specific kind of intellectual arrogance within CompSci and it's a huge blind spot for the profession.
Every git question - even the most mundane - is a rabbit hole:
"What's the difference between Git Pull and Git Fetch?" [2]
That should be easy enough - but no - there are 35 comments (on just one of the many answers) arguing about the specific nature of what is happening, and no full agreement on some material things.
This is all exemplary of a 'bad product'. A 'good product' would provide a definition, and a few comments on possible corner cases - and that's it.
Git is what happens when you take a 'very powerful, but complicated concept' - and then don't manage it as a product, rather, just lazily throw complex command lines on the fire and let it stir. We are wasting a lot of time on Git.
The reason there's so much discussion is because while the question seems simple, it's ill-posed and there are several distinct things they could actually want to be doing: do they want to make a commit which undoes another commit, do they want to check out an older version to either test something or make a branch, or do they want to remove some commits they made accidentally? All of these are reasonable things to want to do and the only reason the answer may seem easier for other systems is because for some of those options either can't do them or they are so difficult and risky you're only going to do them if you really have to. That's why the top answer gives three different options based on different possible interpretations of the question. The rest of the discussion is either people who think that some of those options shouldn't exist for mostly ideological reasons or people who haven't taken the time to understand what each of those options actually mean.
If all you want from a VCS is a snapshot of each part of history, git can do that (and will actually do it better than a more traditional option like SVN, because it doesn't hide merges by doing them on your working directory before you commit), but ultimately there's a lot of value in doing more, and if you want to exploit that you should understand how the VCS represents your code and how to use it, not paste together advice from other people on the internet who may not fully understand it themselves.

I agree the git UI has problems, mostly relating to naming things and the fact that many commands have convenience options which mean they invoke actions which are different from their base operation (like git checkout -b <name>, which is basically a way of doing git branch <name> followed by git checkout <name>. It would make more sense if this was just the way git branch operated by default, or at least a flag on git branch instead of git checkout).

The underlying model git presents is really not hard to learn (and it helps a hell of a lot when googling how to do something, because you can be precise about it), but it's where people tangle themselves most in knots, because they think it's either too difficult or not worth understanding, and it's wrong on both counts because it will cost them hugely in time and stress.
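To make those three readings of "undo" concrete (a sketch; the hashes are placeholders):

  # 1. undo a commit by recording a new commit that reverses it
  git revert <bad-commit>
  # 2. go look at an older version, optionally branch off from there
  git checkout <old-commit>          # detached HEAD, fine for poking around
  git switch -c fix-from-here        # turn it into a branch if you want to keep working
  # 3. throw away commits you made by accident (rewrites that branch's history)
  git reset --hard <last-good-commit>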
" there are several distinct things they could actually want to be doing"
Yes - I get that - but this is my point.
It's such a complicated tool (or rather, specifics are not obvious) that people don't even know how to ask questions.
This is part of my point.
Ambiguous questions, ambiguous answers, a real propensity for people to be giving really bad answers.
This is endemic with Git, more so than I've seen with anything else.
I sometimes wonder if git should actually be used with some kind of visual representation as a default standard - and whether our propensity for the command line interface as the de-facto means of real interaction with computers may not be appropriate for communicating what is going on.
Much like people fuss desperately over road signs, human factors in automobiles ... perhaps we should be thinking in these terms much more with this tool.
I agree on the GUI front: git is a lot more comprehensible when you have a graphic representation of the commit graph at all times. This is my default way of interacting with git: even if I'm using the CLI to mutate the repository, I have a GUI open to view the repo.
Code repository tools should be transparent; unfortunately git very often stands between my changes and the repo. It's funny how other utilities (build pipelines/dependency management/deployments) became more user-friendly in all languages, yet the step from SVN was a step in the wrong direction - all thanks to the 'cowboys' with an attitude of "let's use it as if we were facebook/google" or "if Linus is using it, it's super cool", without understanding the context where they operate. I can hardly think of a dev tool that became mainstream yet brought so much inefficiency on such a scale into our everyday lives. And when they say 'oh just use these beautiful GUI tools' - oh wow, we got the same set of tools we had 15 years ago - what an innovation, indeed!
Maybe the cowboys contributed too, but despite the horrible interface I think the VCS part actually got better. Before everyone switched to git, I used cvs, svn, SourceSafe, and ClearCase. They all required two things to work well:
1. You must be in a room with a low-latency 1 Gbps link to the server.
2. Dedicated sysadmins on site whose job is supporting the infrastructure, VCS included.
When people don’t screw up with the interface too much, git can be better: doesn’t care about network latency by design, handles conflicts better, branching/merging is very easy…
The issue is that many developers don't care about learning to use version control effectively (as a biased git user would see it). They just see it as a means to an end, necessary evil for team projects, but overhead, mostly. They certainly don't want something they need to learn how to use (despite the fact that it's still relatively easy for novice users to do basic things).
It's not that it's conceptually hard, the interface just sucks. Like, how is "checkout" both for switching branches and also for reverting files? And it's completely counter to what a "checkout" means in every other VCS. It's just bizarre stuff like that. (I love git, but Mercurial's interface makes way more sense)
git checkout isn't designed to revert files. It's designed to allow you to check out a particular version of a file without changing the branch you're on.
"Reverting a file" is a higher level concept you can build on top of that, but that's not the fault of git checkout. Also arguably to truly revert a file you'd have to follow it up with a git commit.
I would also point out that recent versions of git have added a git switch subcommand because of some of these complaints.
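For anyone who hasn't seen it, the newer split (git 2.23 and later, if memory serves) looks roughly like this; treat it as a sketch:

  git switch other-branch                     # change branches (was: git checkout other-branch)
  git restore path/to/file                    # discard working-tree changes (was: git checkout -- path/to/file)
  git restore --source=HEAD~2 path/to/file    # pull a file's contents out of an older commit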
Well, TIL after using git for 8 years. And yet 90% of the time what I'm typing is "git checkout --", which has nothing to do with checking out a branch or a particular version of the file; I'm just getting rid of something. It's a bad UI when the obscure case is favored over the common one, and believe me, I can pick out 100 other cases.
This is interesting. I find myself using checkout to check out individual files extremely rarely. It feels like a bit of an anti-pattern to me. What I do instead is commit the changes I intend to keep, using some interactive form of commit (usually git gui) followed by git reset --hard to remove whatever I don't want to keep.
The underlying rule is never to make destructive changes on a working directory that also has changes I intend to keep. This massively reduces the risk of data loss by accidental fat fingering of a command.
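So the loop is basically (a sketch):

  git gui              # stage and commit only the hunks I actually want to keep
  git reset --hard     # then wipe whatever is left in the working tree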
I've been almost exclusively using the GitHub Desktop client as my primary git tool, and I'm much more comfortable with git now and helping people out. Merge conflicts were super scary, but with VS Code markers it's a breeze. The desktop tool also makes sure to fetch origin first. If my commit is incorrect, I just use the "revert" button and it lets me fix things. There are a few things like renames and branch deletes that I use the shell for, but I'm happy and more confident in using git since the desktop tool is effective. One thing I hate is the pesky .DS_Store on Mac. No matter how many times I ignore it, it just wants to be a part of my commit.
...but this shouldn't be a problem with a good git client such as Fork (git-fork.com) or SourceTree. They show you the changed files and let you select which ones to commit.
I would never commit anything without at least skimming the changes first.
In the GitHub client - if you point it to a folder and say “add repo from existing folder” it automatically makes an “init commit”. So you don’t get a chance.
.DS_Store is mostly a problem for me if I do my first commit without a .gitignore file, on a machine where I don't have a global .gitignore. In those cases you have to explicitly remove already-committed .DS_Store files from the repo before your new ignore settings will take effect.
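The fix-up in that situation is something like (a sketch):

  echo .DS_Store >> .gitignore
  git rm --cached .DS_Store                  # repeat for any other copies already committed
  git commit -m "Ignore and stop tracking .DS_Store"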
> Git documentation has this chicken and egg problem where you can't search for how to get yourself out of a mess, unless you already know the name of the thing you need to know about in order to fix your problem.
This is exactly why I have been running "git diff HEAD" instead of "git diff --staged" to diff my changes that have already been added to staging. I didn't even know that flag was a thing. I really wish there were some sort of autocomplete that could go from me describing the problem to the tool suggesting commands. Oh well, until then, thanks for this!
Note that the two commands are not in general equivalent. Try `git add some-modified-file`, then change some line of the working file you just "added". `git diff HEAD` will show the new change, whereas `git diff --staged` will not.
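A quick way to see it for yourself (a sketch; the file name is made up):

  echo one >> notes.txt
  git add notes.txt         # stage the first change
  echo two >> notes.txt     # then change the file again
  git diff --staged         # shows only the staged change ("one")
  git diff HEAD             # shows both changes, staged and unstaged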
In general the command line is like navigating a cave with a flashlight. I love the idea of some multi-line hinting system that shows you common paths.
Huh. Neither was I. I've been using -C HEAD, which has the same effect (reusing the commit message from the branch head), but --no-edit makes it more obvious what's going on... sort of. It's still not clear what isn't being edited.
Do you mean the target of the editing? It could also be the commit author or committer, perhaps. Maybe the message is the only logical choice, but for long flags, I think more explicit is still better.
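For reference, the two spellings side by side (a sketch):

  git commit --amend --no-edit    # amend the last commit, keep its message as-is
  git commit --amend -C HEAD      # effectively the same: reuse the message from HEAD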
A little bit off topic, but I hear a Pijul release is coming some time within the next week (it's been a year since the last one). I've never really understood git and always hoped for something else to come along, at least as an alternative.
Imho, just start using one of the desktop clients. I really like GitKraken, but I've heard good things about Tower too. Can't emphasize enough how easy GitKraken has made my life when I eventually get into these scenarios.
"The Power of Unix Pipes Except for The Zillions of Times Per Day Developers Awkwardly Drag a Mouse Over a Human Unreadable String Just to Make Trivial Git Commands Work"
If you mean generating completion entries for your own tool, my answer will be 'I am not that kind of a guru ;D'. Go look up creating zsh completions.
If you mean the interface with tabs, I am just using the zsh-grml config (a config file you download, for those who don't care for a zsh plugin manager). (I believe very similar behavior is available in bash.)
Especially for handling branches that were rebased and force pushed.
90% of the times that I needed to help someone with git were due to a branch that was rebased and (force) pushed to the remote, and the person I was helping didn't know how to handle that and align his local copy with the remote.
--force-with-lease is almost always the flag you ought to use instead.
"--force-with-lease alone, without specifying the details, will protect all remote refs that are going to be updated by requiring their current value to be the same as the remote-tracking branch we have for them."