Better alternatives to git submodules

juped · on March 3, 2023

For avoidance of confusion, a submodule is just:

- committing a commit hash from a foreign repository in a text file (.gitmodules), along with

- some convenience tooling to check out that repository's tree at that commit in a subdirectory.

It is useful exactly when you want both of those things at once. Wanting the first is common enough; it's wanting the second that's rarer.

For example, you wouldn't submodule a Rust dependency, because you get what you want by committing the hash to a text file (Cargo.toml), and tooling to check it out in a subdirectory of your project gets you nothing.

I think the error the Git project made with submodules is trying to make them transparent, i.e., allowing you to use Git inside the submodule checkout. This is basically never a good idea; it is understandable that people get confused trying it (the thing they are trying to do is inherently confusing; if you have already been traumatized by submodules, imagine doing it with Cargo.toml or similar - it would be a mess!).

You probably don't want submodules. But they're useful, and not at all broken or poorly specified.

michaelmior · on March 3, 2023

> This is basically never a good idea

I find this to be one of the most useful features of git submodules. I can easily just cd into the module and do whatever related work I need to do there and then commit to both projects when I'm ready.

ChrisMarshallNY · on March 3, 2023

Same here. Like I said above [below?], I generally despise submodules, but that is one thing that makes them easier to deal with than packages.

For example, in the project that I'm working on now, I have refined the app's "business logic" to an SPM module that I integrate, through GitHub.

The idea is that I don't change the logic, and I enforce that, by using it as an SPM module.

However, every now and then, I encounter a bug in this module, or I find the need to add something to its API. When that happens, I need to exclude the module from the dependency list in the project, include it directly, as a local link, and work on it that way. Committing is a separate task from the main app (Xcode allows aggregate commits, but Xcode's Git integration is so poor, that I don't use it).

It actually works well. A submodule would also work well, with more Git integration, but the drawbacks far outweigh the advantages.

jeffreygoesto · on March 3, 2023

Alone is ok. You know what you want and own that work flow. North of maybe five people this tends to break down fast.

galangalalgol · on March 3, 2023

I've had it work fine on large teams. The rule for me is to use a submodule if contributors to the repo are as likely to change the submodule as the repo itself to resolve any given issue. More succinctly, submodules are for internal dependencies we might change. The reason is that, with some package managers and languages, making small incremental changes to a dependency as you try to resolve an issue is a lot of extra typing compared with it just being in a submodule.

tough · on March 3, 2023

It's just confusing if you don't expect it and are thrown into submodules without any guidance.

Izkata · on March 3, 2023

What confuses me is that people complain so much about git submodules, but I can't recall ever hearing complaints about svn externals. They're pretty much the same thing.

Co-workers who have moved from svn to git have had no trouble understanding and working with git submodules.

Macha · on March 3, 2023

1. Git is much more common in the wild, so more people have opinions on it.

2. Git submodules are occasionally used, but svn externals is something I'd never seen used back when I was on svn-using teams.

To be complained about, something must be known about, and I think for most developers, svn externals isn't known about.

luc_ · on March 3, 2023

Why wouldn't you just keep a local version? Do you have a "top git project" for checking out others? Seems like it could be convenient, I suppose.

michaelmior · on March 3, 2023

I usually do this when I'm developing one project that has a third-party dependency which either doesn't have a release, or I need to make modifications. That way I can work on the dependency and easily keep track of what version is used in the parent project at any point in history.

baby · on March 3, 2023

I don't think I would agree with that statement. I would say that both are actually what most people want when using submodules, otherwise you would just vendor your dependency.

I would also agree with OP that submodules are horrible. Source: I have to deal with them everyday in my company's repo and they cause issues pretty much every time I pull, switch branches, merge/rebase, etc.

IMO, one of the MAIN issues with submodules is that there isn't an actual FILE, checked in the repository, and understood by Git, that lists the hash of all the submodules used. So diffs and PRs on Github, git, etc. don't show the actual changes in a way that we're already used to with files.

solarkraft · on March 3, 2023

I have a convenience project that sets up several other repositories for ease of development, and the things you call rare/a bad idea are perfect for this use case. It's working very well so far.

dezgeg · on March 3, 2023

For that use case, how often do you want the other repositories to be fixed to some particular hash, however? At least every time I've wanted such multi-clone behaviour I've wanted to have all the repositories on HEAD of some branch, and very rarely some fixed commit.

k0k0r0 · on March 3, 2023

I do generally want a fixed commit. I often want to improve the code in the submodule, possibly (voluntarily or accidentally) introducing breaking changes, without worrying about breaking repos that contain it as a submodule.

solarkraft · on March 3, 2023

During development I don't really care, but every once in a while I commit them so that there's a known-good configuration to start over with.

stefantalpalaru · on March 4, 2023

> committing a commit hash from a foreign repository in a text file (.gitmodules)

No, that file is just for submodule URL, path, branch and options.

The submodule's target commit hash is stored in the superproject's ".git/modules/[path here]/HEAD".

juped · on March 7, 2023

well, no, it's an entry in the tree object
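
This is easy to check in a throwaway repo. A minimal sketch, assuming a reasonably recent git; the repo names `lib` and `app` and the path `vendor/lib` are made up for illustration:

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway identity so the demo does not depend on global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# The "foreign" repository, with a single commit.
git init -q lib
git -C lib commit -q --allow-empty -m "lib: initial commit"
lib_head=$(git -C lib rev-parse HEAD)

# The superproject. Recent git versions refuse local-path submodule URLs
# unless the file protocol is explicitly allowed, hence the -c override.
git init -q app
git -C app -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git -C app commit -q -m "app: add lib as a submodule"

# .gitmodules holds the path and URL only; there is no commit hash in it.
cat app/.gitmodules

# The pin is a tree entry: mode 160000, type "commit", then the hash.
entry=$(git -C app ls-tree HEAD vendor/lib)
echo "$entry"
```

The `.gitmodules` output contains only `path` and `url`; the hash appears solely in the `ls-tree` line, which is also why a submodule bump shows up in `git diff` as a changed `Subproject commit` line.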

ChrisMarshallNY · on March 3, 2023

I’ve used submodules (a lot), in the past, but I hate them.

I feel as if they are a “half-baked” solution. In order to be useful, they’d need to be much more tightly integrated into Git. Instead, they are sort of “duct taped” to the outside.

But they are an ironclad way to ensure that you have an exact version of a foreign repo. It’s actually a pain to stay at the head.

I like using package managers, but they can be quite dangerous, so I tend to write my own packages.

What was really cool, and possibly the only good thing that VSS ever did, was the ability to create “aggregate symbolic repos.” I think I remember how it worked (been a long time):

You could define a repo that was composed of “symlinks” to files from various other repos. Submitting a change into the repo would submit that change into the portion of the other repo.

Perforce had the concept of workspaces. I don’t think you could integrate other repos into them, but you could define a workspace to be a “mask” over your repo, so it would only integrate the files you want. That could be useful to me, as my testing code usually dwarfs my implementation code. Many of my packages[0] consist of a single Swift file, but the test harness might be an entire app.

[0] https://github.com/RiftValleySoftware

michaelmior · on March 3, 2023

> I like using package managers, but they can be quite dangerous

I'm curious to hear what you think is dangerous about package managers. I'm assuming you're referring to left-pad and similar situations.

ChrisMarshallNY · on March 3, 2023

It's because it's possible to obfuscate the provenance of some packages.

You can do the same thing with submodules, but it's more difficult.

Like I said, I'm a control freak.

Packages are fine, as long as they're mine.

michaelmior · on March 3, 2023

I'm curious what you mean about obfuscating provenance. Most package managers download the package from a specific source and will allow you to store the hash of that package in your repository.

ChrisMarshallNY · on March 3, 2023

The whole deal with packages, is to make it really, really easy to discover and integrate them, without having to worry about where they came from, or who has had their fingers in the pie. It's [theoretically] possible to find out, but I have never met anyone that admits to vetting their dependencies in anything near complete fashion. Most look at "buzz" around the package, and at how many stars it has.

That's wonderful. So is going out clubbing, and "getting to know" a whole bunch of different folks that you meet randomly.

Both can have unfortunate side effects.

It's entirely possible to do so safely, but Christian coffeehouses might not be your idea of a good time on Saturday night.

But nothing is perfect, and a dedicated blackhat can leverage just about anything.

pancrufty · on March 3, 2023

You're using them wrong. I've been using them for years and I have no issues with them, however I treat them as a read-only dependency.

They have their use and it's not that hard once you figure out the exact sequence of git commands to use (but that applies to all of git)

frizlab · on March 3, 2023

Same here. I’ve used them a lot. Yes they’re not trivial and feel half-baked, but used correctly they do work.

Anyway, when a git article says “throwing everything away is the only way to recover”, I know it’s not a good article…

globular-toast · on March 3, 2023

It is a great way to tell if someone knows what they're doing with git. Even joking about deleting the repository and recloning is enough to let me know. Unfortunately this includes 99% of people I've ever worked with.

fanf2 · on March 3, 2023

I had awful problems with git submodules in the Ansible repo. When I wanted to change branch, I couldn’t use `git checkout` as usual: I had to blow away the submodules, switch branch, then reinitialize them again. Appalling failure to leave submodules so unfinished that branching doesn’t work properly any more.

michaelmior · on March 3, 2023

I haven't had a problem in general with switching branches and submodules. You just change branches and then `git submodule update`. Or `git checkout --recurse-submodules`. You can also set this as the default behavior so that checkout automatically updates submodules.

`git config --global submodule.recurse true`
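
A minimal sketch of the difference between these options, assuming throwaway repos in a temp directory (the names `lib`, `app`, `dep`, `old` and `new` are invented for the demo). It builds a superproject whose two branches pin different commits of the same submodule, then switches between them:

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway identity so the demo does not depend on global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# A dependency with two commits, v1 and v2.
git init -q lib
echo v1 > lib/VERSION
git -C lib add VERSION
git -C lib commit -q -m "lib v1"
v1=$(git -C lib rev-parse HEAD)
echo v2 > lib/VERSION
git -C lib commit -q -a -m "lib v2"
v2=$(git -C lib rev-parse HEAD)

# Superproject: branch "old" pins v1, branch "new" pins v2.
git init -q app
cd app
# The file-protocol override is only needed because the demo URL is a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/lib" dep
git -C dep checkout -q "$v1"
git add dep
git commit -q -m "pin dep at v1"
git branch -M old
git checkout -q -b new
git -C dep checkout -q "$v2"
git add dep
git commit -q -m "pin dep at v2"

# Default behaviour (forced here in case a global config says otherwise):
# the pin changes, but the dep checkout does not follow.
git -c submodule.recurse=false checkout -q old
after_plain=$(git -C dep rev-parse HEAD)

# Option 1: update explicitly after switching.
git submodule --quiet update
after_update=$(git -C dep rev-parse HEAD)

# Option 2: let checkout carry the submodule along.
git checkout -q --recurse-submodules new
after_recurse=$(git -C dep rev-parse HEAD)

echo "plain: $after_plain, update: $after_update, recurse: $after_recurse"
```

After the plain checkout the submodule is still sitting on v2 even though the branch pins v1, which is the state that tends to surprise people; the other two steps bring the checkout back in line with the pin.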

fanf2 · on March 3, 2023

I think the worst problems happened when switching between branches where the same directory changed between being a submodule or being part of the parent repo.

Having to set a configuration option to make submodules work is another example of the feature being unfinished.

michaelmior · on March 7, 2023

They work just fine without that option. But I can see why having that as a default might be a better choice.

baby · on March 3, 2023

I had that command set previously, and OMG it made every checkout slow as hell! There's like a dozen submodules in the repo I work in, it's a nightmare.

chongli · on March 3, 2023

By read-only dependency do you mean that you’re not a developer for those repositories? What if you do develop a library and then want to use it in an application?

oleganza · on March 3, 2023

What is meant (I presume) is that even if you are a developer of those repos, do not edit them within the host repo. Work on them separately, as if you were an independent developer, and then bump their revision as a submodule - the same way you bump a dependency version in your Makefile/package.json etc.

worksonmine · on March 3, 2023

I assume read-only from the perspective of the dependent. Any fixes belong in the module's upstream repository and are then pulled into the dependent once pushed.

rusticpenn · on March 3, 2023

I consider microservices and submodules to be mainly a solution to organizational problems.

usr1106 · on March 3, 2023

Having had no issues with them sounds unreal.

Everyone has to learn what they can do and what to avoid.

We use them to build our whole system out of one commit, although we have several repos. We made the artificial rule that a commit that updates a submodule must not contain any other changes. It has reduced the number of problems, especially those related to rebasing.
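
A rule like this could be enforced mechanically. A minimal sketch, assuming it is called from a pre-commit hook; the function name `check_staged_changes` is hypothetical, and it relies on `git diff --raw` reporting submodule entries with mode 160000:

```shell
#!/bin/sh
# Reads `git diff --cached --raw` lines on stdin.
# Fails (exit 1) when submodule pointer updates are staged together with anything else.
check_staged_changes() {
    awk '
        # Raw diff lines start with ":<old mode> <new mode> <old hash> <new hash> <status>".
        # Mode 160000 on both sides means an existing submodule pointer was moved.
        $1 == ":160000" && $2 == "160000" { bumps++; next }
        { others++ }
        END { exit (bumps > 0 && others > 0) }
    '
}
```

In a hook it could be wired up as `git diff --cached --raw | check_staged_changes || exit 1`. Adding a brand-new submodule (old mode 000000) is deliberately not counted as a bump here, since that commit also has to touch `.gitmodules`.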

pancrufty · on March 3, 2023

I had no issues that were directly attributable to the way they behave and not to my own ignorance.

In short: once I learned and documented the checkout/update/reset commands, I was set.

The other “issues” were that they require extra config in some cases (CI), but again it’s just 2 lines in most cases.

baby · on March 3, 2023

I don't trust GP, I deal with them every day and they are a nuisance every day. Perhaps they are not working in these submodules and just set them to a hash once and never had to make changes there. Then sure, but as soon as you're working actively in these submodules it's mayhem.

baby · on March 3, 2023

for read-only dependencies why not just vendor the dependency then? Or better, use a package manager at this point.

0xy · on March 3, 2023

The fact some developers hate them is reason alone to never use them. Using them means you don't care about developer productivity.

You will incur thousands or tens of thousands of dollars in additional costs by using submodules, via wasted developer time.

isaacremuant · on March 3, 2023

You can make that incredibly reductive argument about anything. Even unit tests.

If the tool is right, you generate buy-in. If you don't, then either it isn't that good or you're not good at generating buy-in.

eru · on March 3, 2023

Yes. And buy-in doesn't necessarily mean that people will love the chosen tool. Just that they understand and accept the compromises made.

mcny · on March 3, 2023

> Yes. And buy-in doesn't necessarily mean that people will love the chosen tool. Just that they understand and accept the compromises made.

I think we are saying the same thing in different words. I don't hate most things. I don't have an opinion on most things. I am pretty indifferent about those things. I don't hate git submodules. However, I think we should at least hear out the concerns of people who hate git submodules. If you still want to use git submodules, then at least you've made an educated decision.

eru · on March 3, 2023

I'm not sure. Going by that reasoning, you couldn't use anything at all ever.

Really silly example: not indenting your code is universally seen as bad. But there are both people hating tabs, and there are people hating spaces.

cupachabra · on March 6, 2023

How would you quantify impact of something like that? Using a tool like Jellyfish, linearB, Adadot etc or just hope people would see enough difference to justify investment?

nly · on March 3, 2023

Developers' time isn't that valuable. It's a nice myth, but it isn't.

codetrotter · on March 3, 2023

I found Git submodules excellent for certain use cases.

Here are some real world examples:

- I have some software I’ve written that runs on a single board computer. Inside of this repository I have a sub module pointing at the commit from which the Linux image is built, that my software runs on. This is extremely useful because if I ever have to rebuild the Linux image there is never any doubt about which version of it that I have been running my software on. So I can rebuild the Linux image from the same commit as before, and then add my software to it. In the future I might also automate more steps and again then I will get even more benefits from this. But already today it is hugely useful for me to have a sub module like this.

- I generated some Rust code for the types of a third-party API that we use. I generated this code from their OpenAPI spec which they have in a git repo. I added their git repo as a sub module in our repo and committed the generated code alongside the sub module pointing at the commit from which the code was generated. Perfect!

Git sub modules are super useful for several things. They can be a bit confusing and frustrating at times. But there are ways like what I mention above where they are perfect to use.

moffkalast · on March 3, 2023

I've found that vcstool[0] is the better solution for your first example and have been using it extensively both in personal and company projects without any issues, except on occasion forgetting to commit some subpackage changes as git sometimes doesn't indicate untracked changes correctly.

The principle is similar but with explicit repos and branches defined in a config file that can then be pulled or cloned as one.

[0] https://github.com/dirk-thomas/vcstool

sligor · on March 3, 2023

How about using git-subrepo? https://github.com/ingydotnet/git-subrepo

>This command is an improvement from git-submodule and git-subtree; two other git commands with similar goals, but various problems.

ihnorton · on March 3, 2023

In practice it is POSIX-only, which is not viable in an organization where Windows needs to be a first class development and deployment platform. Running under cygwin or WSL is a non-starter for a number of reasons (I've done that myself for tooling in the past, after which I would never impose it at an organizational level).

nopurpose · on March 3, 2023

It is a great and more ergonomic alternative, can't recommend enough.

polski-g · on March 3, 2023

That's what I switched to. Mainly because Ansible Galaxy collections don't pull the submodules when fetching the collection.

keshet · on March 3, 2023

git subrepo is what I chose to include a "core" library of functionality in a number of repositories. Works fairly well, the main thing is that git treats all the files in the subrepo the same as any other files, so no surprises.

soraminazuki · on March 3, 2023

> Provide an ad-hoc in-tree script to download the dependency

> Yes, really, git submodule is worse than ad-hoc Makefile runes

Please don't. Submodules are much better.

Projects using submodules and standard build tools are easy to build. They're easier to customize. They can be packaged downstream with little or no patching. On the contrary, projects using ad-hoc scripts are often hard to build, have no convenient methods of customization, and resist downstream packaging.

That's because with the former approach, build descriptions are standardized and (often) declarative. Standardized build descriptions are easier to build and customize because they're predictable and provide uniform methods to configure builds. Declarative build descriptions are easier to customize because they only specify the end state. It doesn't matter what changes you make to the build process as long as it doesn't create conflicts with the described end state.

When building a project, submodules can be viewed as a declarative way of specifying dependencies because they only specify a Git URL, a commit hash, and a target subdirectory. It doesn't really matter how you fetch them. This is convenient for things like package managers because it allows them to fetch dependencies beforehand. With ad-hoc scripts, dependencies are specified imperatively. Or as the article puts it, the script is "in precise control of when/whether the download occurs." If package managers want to prefetch dependencies, they would have to patch that part of the script out.

mmis1000 · on March 3, 2023

The concept is not the problem. The submodule's problem is that it has the worst UI ever. The default 'git clone' doesn't set up the submodule at all. Checkout, rebase, etc. work half of the time. And some commands leave the submodule untouched unless you add some weird parameters. It's a pain in the ass even if you are not the one writing the submodule but just the one using it.

In practice, a lot of repos add shell scripts just to… set up and update the submodule. This should be git's own task to do. But it really isn't done well.

It would be much better if git just treated a submodule as a read-only subdirectory and force-synced everything unless I tell it I want to edit it.

Package managers, on the other hand, don't give you such pain. An 'npm install' will just bring every dependency to the proper state without bothering you about anything.

cabirum · on March 3, 2023

Looks like a rant from someone who never bothered to add submodule.recurse=true to their git config.

Git requires knowledge and manual configuration. It is a low-level tool that is not user-friendly. Nor is it expected to be. See, early on, git was a toolkit for building VCSs, so it historically contains many plumbing utils not designed to be used as-is.

crabbone · on March 3, 2023

> Use git subtree

No, god, no! I see where the author is coming from, and I'll give him that submodules aren't a well-designed or well-implemented feature, but dear lord, subtrees are so much worse...

I work on a project that uses them. In my particular case, the "genius" who set it up decided to create this kind of setup: a "framework" part, and a bunch of subtrees which are taken from a separate repository having several branches for specific versions of the program, and each version is made into a separate subtree in the framework repo. This leads to immense duplication of commits, totally worthless history, no ability to go back without a humongous effort... It's the worst Git repo I've seen in my life. And it was created by someone with a decent knowledge of Git, which, unfortunately, didn't translate into making useful things...

utunga · on March 3, 2023

One of the most pernicious problems with submodules is the way that anyone who finds them problematic thinks it's their own fault, and people who have finally figured out how they work (sort of) are so dang pleased with themselves that they are now fully indoctrinated into submodule cognitive dissonance. Meanwhile git submodules themselves are leaving massive footguns all over the place, are an absolute nightmare during complex merges, and are absolutely terrible for code transparency.

WirelessGigabit · on March 3, 2023

We use Git submodules for Actions because our security has disabled 3rd party Actions. So we add them as a module and update the module regularly.

It's faster than having to fill out 3 sets of paperwork, sit through 2 meetings with some guy just out of school in a time zone opposite mine, and wait out a 3-week delay.

All while it's all good to adopt npm packages left and right.

And this way it's checked for security issues, so it's even better than going through the paperwork hassle and getting it adopted like that.

cube00 · on March 3, 2023

What the "security" team doesn't realize is by blocking actions in this way they've opened up a wider hole.

If a third party action had a problem it can be blocked centrally, instead these modules now live scattered throughout the code base relying on individual development teams to keep them updated.

Those teams will not always be fully funded to keep an eye on every vulnerability in every module and keep everything up to date the same way a centrally managed dependency system can.

rany_ · on March 3, 2023

The only thing I dislike about submodules is that older repos have remotes that no longer exist. So I'm left with an incomplete clone because I can't find the submodules anywhere. Otherwise I don't mind them and will continue to use them.

In any case, this is no longer an issue with GitHub monopoly.

ranting-moth · on March 3, 2023

Submodules are amazing until you start using them.

The whole idea on paper looks brilliant, but the pitfalls are very painful.

Perhaps if you're working solo on a project it's OK. Part of the problem is that developers aren't used to submodules and it behaves in a way you don't envisage.

okamiueru · on March 3, 2023

> Part of the problem is that developers aren't used to submodules and it behaves in a way you don't envisage.

I would say you correctly identify the problem, but draw the wrong conclusion.

Developers should be able to work with the right tools for the right problems. I've used git submodules in teams, professionally, for more than a decade. Neither I nor my colleagues have ever had a problem with them except when used without the proper understanding.

The solution isn't to replace the tool, but teach developers how to use it.

hobofan · on March 3, 2023

> The solution isn't to replace the tool, but teach developers how to use it.

Or to improve the tool to have it teach developers and be more helpful in case of common errors. I think in many non-submodule parts git has improved on that front, though that has never been its strength.

ofrzeta · on March 3, 2023

Yeah, it works. You just have to remember, if it doesn't, in doubt use "git submodule update", or if this doesn't cut it use "git submodule update --init --recursive". Or maybe you should have run "git submodule sync" first? :)
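
For reference, a minimal sketch of that fallback sequence on a fresh clone, assuming throwaway repos in a temp directory (the names `lib`, `app`, `dep` and `fresh` are invented for the demo). Running `sync` before `update --init --recursive` is the order that generally makes sense, since `sync` only refreshes the URLs that `update` then fetches from:

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway identity so the demo does not depend on global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# A dependency and a superproject that pins it under dep/.
git init -q lib
echo hello > lib/README
git -C lib add README
git -C lib commit -q -m "lib: initial commit"

git init -q app
# The file-protocol override is only needed because the demo URL is a local path.
git -C app -c protocol.file.allow=always submodule --quiet add "$work/lib" dep
git -C app commit -q -m "app: add dep"

# A plain clone records the submodule but leaves its directory empty.
git clone -q app fresh
before=$(ls -A fresh/dep)

cd fresh
# Copy URLs from .gitmodules into .git/config for already-initialized
# submodules (matters when a URL has changed upstream; a no-op here).
git submodule --quiet sync --recursive
# Initialize, clone anything missing, and check out the pinned commits,
# including nested submodules.
git -c protocol.file.allow=always submodule --quiet update --init --recursive
after=$(cat dep/README)
echo "dep/README now contains: $after"
```

Cloning with `git clone --recurse-submodules` in the first place avoids the empty-directory state altogether.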

However, I agree with others in this thread that it's easier to develop a bunch of modules that are used in a top-level project. Just change to the subdirectories, change the files, then when you are done commit the subprojects and then commit the submodules' commit ids to the parent project.

baby · on March 3, 2023

> Neither I nor my colleagues have ever had a problem with them

It really is like we live in different worlds.

junon · on March 3, 2023

I love git submodules, mostly because I don't pretend they're something they're not.

Treat them as read only tags.

conradludgate · on March 3, 2023

I occasionally (less than once a month) work on the rust project. They probably rightfully use submodules to pull in llvm and cargo and other dependencies. However, every time I change branch or rebase, there's some kind of conflict or breakage caused by the submodules.

Just last week I rebased and the cargo submodule was changed to dirty? I didn't touch the folder. I deleted the folder and reran my submodule init/update dance to no avail.

I had to fix it by going into that folder and running `git restore .` because somehow everything got removed?

So yeah. I don't like submodules. They probably make sense but they are not at all intuitive if they are so fragile like that

baby · on March 3, 2023

Same issue here. I've never tried git subtree though, I'm wondering if people who have can chime in.

rwmj · on March 3, 2023

I too loathe git submodules (which I have to use daily).

But there is one case where they're not too terrible: It's where you have some kind of "meta project" that needs to coordinate other large projects together. An example is where you need to combine specific versions of the Linux kernel, glibc and gcc (eg that you have tested and know work together). A git project with one submodule pinned to the tested commit of each of Linux/glibc/gcc seems to work well. You can, for instance, test a new combination of submodules together and if they work push an atomic commit to update them all together.

isaacremuant · on March 3, 2023

I don't think any of these arguments are good enough to not simply put git submodules as yet another option to consider. Probably not your first choice but it could be a perfectly acceptable one (think components).

Git subtrees are definitely not a replacement for what submodules can do. It's actually closer to monorepo.

Package systems are absolutely more robust but could be more painful depending on context.

Git submodules, if correctly understood (which is not trivial, even in git terms) can be just the right tool for a certain scenario.

Qt has used them quite well but it's definitely not as friendly to newcomers as it is to "core Devs"

eru · on March 3, 2023

> Git subtrees are definitely not a replacement for what submodules can do. It's actually closer to monorepo.

Git subtrees allow you to convert between monorepo style and multi-repo style, and keep both working alongside each other.

IshKebab · on March 3, 2023

Never is a bit strong.

They definitely have flaws, and as usual Git goes out of its way to make the UX extra awful (why isn't --recursive the default??!). But they are pretty great for third party dependencies that you might need to patch a bit, especially if you can't use something like Cargo (or don't want to set up a private registry etc.)

But git subtree has massive flaws too. You squash by default so you lose all git blame support. It's way more janky to pull from remotes and submit patches etc. You have to manually ensure you separate changes to the subtree out into different commits.

Honestly I think no existing solution is very good. A monorepo plus submodules for external dependencies is probably the best option at the moment.

Really Git could do a lot to make submodules better, but I guess - like LFS - if the kernel developers don't need it then screw you we're not putting any effort into it.

bartq · on March 3, 2023

How submodules should be done properly:

  1) Define somewhere in the repo that it's a "compound repo", a "workspace", etc.; the name is arbitrary. This is our repo "R".
  2) For certain paths in the repo, mark those paths as aliases to other repos identified by a repository URL/path. These are our "r" repos.
  3) For every git command executed inside repo "R", run the appropriate commands in the background for each "r" repo, but only if repo "r" was affected by changes initiated in repo "R".
  4) If you made changes to repo "r" directly and then returned to "R", after "git pull" you should see nothing other than standard git diffs, conflicts, etc. You should not have to run anything like "sync"/"refresh". Only git pull/rebase/merge, etc.
  5) A commit in repo "R" that is only responsible for bumping "r" repos should be handled by the git submodule system transparently for the user of "R". I'm not enough of a git expert to tell what kind of commit should be used here. Any ideas? You should commit seeing diffs, of course, not just commit hashes.
  6) THAT'S IT.

Everything should work recursively, i.e. you should be able to do 10 layers of "r" repos. Each n-th "r" repo acts as the "R" repo for the (n+1)-th level repo. A ten-fold commit should work like a transaction, e.g. if any of the layers between 1 and 10 has a failing pre-commit hook, the whole operation should fail.

Please help me find potential problems. thx!

rahoulb · on March 3, 2023

This is timely, as I used submodules years ago and got very confused with them.

But now I have an Ask HN:

I'm building a ruby gem (package) that will be shared across a few of my applications. The apps are separate and proprietary, but I want to open source the gem, so a monorepo doesn't fit.

When the gem is stable, I can just publish it and then reference it in my Gemfile like any other gem - rebundling every time there's a new release.

But at the moment, I'm actively developing the gem and making frequent changes to the API - so I want to reference it directly from the container app and edit both together as different edge cases arise.

The standard Ruby on Rails way to do this is to stick it in /lib or /vendor - then work on it and once done, extract it and publish the gem. But I'm working on several apps that share this gem and will all influence how it works. So I don't want multiple vendored copies scattered across different source trees. And I don't want to be rebundling the app every time I make a minor change to the gem - so referencing the git repo is out.

On paper a submodule sounds like the perfect solution - main app is a repo, with a submodule repo in vendor. I can make changes to both, then push them separately to their own repos. If I start working on app 2, I just fetch the submodule to get the latest version of the gem without having to rebundle everything.

As it's just me working on this set of code at the moment, are submodules a good fit?

huntedsnark · on March 3, 2023

Can you not use the `path` option in each Gemfile for local development? https://bundler.io/v1.12/man/gemfile.5.html#PATH-path-

rahoulb · on March 3, 2023

That's what I have been doing, but I wondered if there was something I'm missing with the submodules

adsteel_ · on March 3, 2023

You should be able to switch to a local file path reference for the gem source in your application's Gemfile while you are developing.

You can also set up a dummy Rails app in the gem's test environment, allowing you to write automated tests against a Rails app.

lenzm · on March 3, 2023

In Python, I've done this by installing the package in editable mode from a side project. Something like:

  /package-project/
  /main-project/

Then in the main project

  pip install -e ../package-project

zorr · on March 3, 2023

Since it is mainly for ease of development of your proprietary apps, you could create and develop the gem in a separate repo right away and symlink it into each app's vendor folder.

bmurphy1976 · on March 3, 2023

What about using symlinks?

tatersolid · on March 7, 2023

That’s what made svn:externals so useful; it was just a “subversion symlink” that could point to any path in any repo at any commit or tag.

alex7734 · on March 3, 2023

This is what I use to avoid the pain of submodules:

    [alias]
    box = !cd ${GIT_PREFIX:-.} && git config --get remote.origin.url > .gitboxinfo && git rev-parse --abbrev-ref HEAD >> .gitboxinfo && git rev-parse HEAD >> .gitboxinfo && mv .git .gitbox && git add -f .gitboxinfo && true
    unbox = !cd ${GIT_PREFIX:-.} && mv .gitbox .git && true

I just clone the "submodule" inside the main repo, then use the aliases above to rename the .git folder to .gitbox and save the current remote, branch, and HEAD commit to a file. The .gitbox folder is ignored.

If I want to update the submodule or commit changes on it upstream, I just use the unbox alias to get the .git folder back, cd into it and pull/commit/push/whatever as normal, then use the box alias again.

Since the folder is no longer named .git, it doesn't interfere with the main repo and you can use the main repo as normal: your team doesn't have to know about weird submodule commands or fight the damn thing to do normal stuff, which matters if it was hard to get them into git in the first place.

Plus you can make changes to the submodules in your repo without committing them upstream (or having to create another repo to commit them to) since the submodule contents themselves are committed to your main repo. You also won't lose the files if the submodule remote dies.

I don't have many of them so I don't have any automation to recreate the .gitbox folders from the .gitboxinfo files if necessary but it wouldn't be hard to make.
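For what it's worth, that automation could look roughly like the sketch below. It assumes the three-line .gitboxinfo layout the `box` alias writes (remote URL, branch name, HEAD commit); `BoxInfo`, `parse_gitboxinfo`, and `restore_commands` are made-up names, and the function only builds the commands rather than running them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BoxInfo:
    url: str     # from: git config --get remote.origin.url
    branch: str  # from: git rev-parse --abbrev-ref HEAD
    commit: str  # from: git rev-parse HEAD


def parse_gitboxinfo(text: str) -> BoxInfo:
    """Parse the three lines the `box` alias writes, in order."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) != 3:
        raise ValueError(f"expected 3 lines (url, branch, commit), got {len(lines)}")
    return BoxInfo(*lines)


def restore_commands(info: BoxInfo, box_dir: str, tmp_dir: str) -> List[List[str]]:
    """Build (but do not run) commands that recreate box_dir/.gitbox."""
    clone = ["git", "clone", "--no-checkout"]
    if info.branch != "HEAD":  # rev-parse prints "HEAD" for a detached checkout
        clone += ["--branch", info.branch]
    clone += [info.url, tmp_dir]
    return [
        clone,
        # Mixed reset: moves the branch and index to the recorded commit
        # without writing any files into the working tree.
        ["git", "-C", tmp_dir, "reset", "--quiet", info.commit],
        # Keep only the repository metadata, under the boxed name.
        ["mv", f"{tmp_dir}/.git", f"{box_dir}/.gitbox"],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. After that, the existing `unbox` alias should work as before, with the files committed in the main repo acting as the working tree.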

sam_lowry_ · on March 3, 2023

I used this once to convert git submodules to subtrees. Not sure if it still works, though.

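    # Assumes each .gitmodules entry is a [submodule "<path>"] header followed
    # by a "path = ..." line and then a "url = ..." line, in that order.
    cat .gitmodules | while read -r i
    do
    if [[ $i == \[submodule* ]]; then
        mpath=$(echo "$i" | cut -d\" -f2)
        read -r i; read -r i;   # skip "path = ...", land on "url = ..."
        murl=$(echo $i | cut -d\  -f3)
        # Pinned commit; assumes the status line starts with a space (clean submodule)
        mcommit=$(git submodule status "$mpath" | cut -d\  -f2)
        mname=$(basename "$mpath")
        echo -e "$mname\t$mpath\t$murl\t$mcommit"
        # Remove the submodule, then re-add its tree under the same path
        git submodule deinit "$mpath"
        git rm -r --cached "$mpath"
        rm -rf "$mpath"
        git remote add "$mname" "$murl"
        git fetch "$mname"
        git branch "_$mname" "$mcommit"
        git read-tree --prefix="$mpath/" -u "_$mname"
    fi
    done
    git rm .gitmodules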

stabbles · on March 3, 2023

I've had too many issues trying to (recursively) reset all submodules after switching branches and cloning. Once submodules get in a limbo state it's a massive pain to get back to normal.
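For reference, a commonly used sequence for forcing submodules back to the commits the superproject records can be sketched as follows. This is a minimal sketch rather than a guaranteed cure for every limbo state; the function names are invented for the example, and the last two commands are destructive (they discard local changes and untracked files inside every submodule).

```python
import subprocess
from typing import List


def submodule_reset_commands() -> List[List[str]]:
    """Commands to run from the top of the superproject, in order."""
    return [
        # Re-read submodule URLs from .gitmodules at every nesting level.
        ["git", "submodule", "sync", "--recursive"],
        # Check out the recorded commit in each submodule, initializing any
        # that are missing and discarding local checkout state.
        ["git", "submodule", "update", "--init", "--recursive", "--force"],
        # DESTRUCTIVE: drop local modifications inside every submodule.
        ["git", "submodule", "foreach", "--recursive", "git reset --hard"],
        # DESTRUCTIVE: delete untracked and ignored files, including nested repos.
        ["git", "submodule", "foreach", "--recursive", "git clean -ffdx"],
    ]


def reset_submodules(repo_dir: str) -> None:
    """Run the sequence in repo_dir, stopping at the first failure."""
    for cmd in submodule_reset_commands():
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Calling `reset_submodules(".")` after switching branches runs all four steps. Dropping the last two entries gives a gentler version that leaves uncommitted work in place.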

drewcoo · on March 3, 2023

In this thread I read "you're using it wrong," "they just work," and "they're half-baked" and I think "the blind men and the elephant."

https://en.wikipedia.org/wiki/Blind_men_and_an_elephant

I've used them. I've migrated off of them more than once. I think they're too easy to use wrong.

VLM · on March 3, 2023

The Zephyr project uses west, and it's interesting to read its justification for developing a tool to avoid using git submodules.

https://docs.zephyrproject.org/1.14.1/guides/west/why.html

dezgeg · on March 3, 2023

Android has a similar tool called 'repo': https://gerrit.googlesource.com/git-repo . I have seen it used outside Android and I think it's mostly OK (definitely better than git submodules). My main beef with it is that it's impossible to google anything related to the tool. Some people do seem to hate it though.

I wonder, has anybody used this 'west' outside of Zephyr?

unglaublich · on March 3, 2023

I'm not going to humiliate myself as a "cow in natural environment detector" to satisfy a captcha.

vencx · on March 3, 2023

We're using submodules to share internal OpenAPI specification files between repositories. These are used for code generation and automated testing of APIs.

It requires some manual checking that branches and commits are up-to-date.

Does anyone have any recommendations on how to do this with a multi-repo setup?

bvrmn · on March 3, 2023

Looks like you need a monorepo.

poseva · on March 3, 2023

We are using git-repo [0]. Having a manifest (default.xml) that lists all the packages in use and their versions in a single file brings a lot of benefits.

[0] https://gerrit.googlesource.com/git-repo

_gabe_ · on March 3, 2023

It sounds like the author of this article is trying to use git submodules to replace a package manager. Of course you'll get annoyed with git submodules if you have something like cargo available. Unfortunately, there are languages (C and C++) where package management sucks and git submodules are a godsend.

> Provide an ad-hoc in-tree script to download the dependency

Add this to the list of things I hate. The last thing I want to do when adding a library as a dependency is figure out that it requires Python to download some source, then uses a makefile to build everything, and has a "nice" bash script to tie everything together. If I see that crap, I'm not using your library. This (contrary to what the developer may be thinking) does not make it easier to integrate the library into my project. It makes it so so much more difficult.

Please. Don't.

> Yes, really, git submodule is worse than ad-hoc Makefile runes

Please don't make monstrous makefiles that do a million and 1 things and do them all terribly and don't have any modicum of support for cross platform functionalities. This is terrible advice.

If your language has a package manager, use it. But in my experience, with a language like C or C++, git submodules are the perfect tool for the job and definitely preferable to hacked-together makefiles that do everything except build the source.

Tldr; this sounds like a blog post from somebody that tried to use submodules to replace a package manager. If you have a package manager, use it. Otherwise, thank God for submodules.

baby · on March 3, 2023

In general you can't just use a package manager in many cases. Or you can, but it'll be even more painful.

Imagine that you're working in some sort of monorepo, except that some internal libraries are submodules. Then you'll often find yourself working both outside and inside the submodules to test changes. You can't do that easily with a package manager.

That being said, I believe it should be possible to augment (some) package managers to resolve dependencies from a local path when in dev mode.

iamflimflam1 · on March 3, 2023

The number of people who clone my repos and forget/don’t read the instructions to clone recursively makes me never want to use submodules again.

ihnorton · on March 3, 2023

> Use a package management system, and explicit dependencies

vcpkg strongly recommends using it as a submodule. But at least that would only be one submodule.

dustingetz · on March 3, 2023

Git subtree – can anyone vouch for this as an alternative to submodules for an open source business? Any hidden gotchas or friction points?

joeyh · on March 3, 2023

Again with the gratuitous title changes eh HN?

Nasreddin_Hodja · on March 5, 2023

I prefer keeping dependencies outside of project tree and using symlinks instead.

zoobab · on March 3, 2023

Submodules were broken in 2008, we are in 2023, they are still broken.