
A question for people here saying "use a monorepo", coming from a different direction than the article. Say I want to use a monorepo for all the code I write for personal use and development, but I also have a folder with dozens of projects cloned from GitHub, often with small tweaks and custom branches. Is the solution submodules? Saving patches or just code snippets in the monorepo and keeping the random misc repos isolated? Hardlinking specific files of interest?



I prefer storing the source code of third-party dependencies directly in the monorepo. Treat third-party code as your own code, since it is used in your project in the same way. It may contain bugs and security issues, which are easier to spot and fix if the source code is available in your repository. This also protects you from disaster when third-party code becomes unavailable for some reason (missing internet connection, temporary outage at the third-party code host, permanent deletion or corruption of the third-party code).

Apply custom patches to third-party code in the same way as you apply patches to your own code. Prefer posting these patches to the upstream project you depend on - this improves the quality of the upstream project and reduces the number of custom patches on your side.

The hardest part is to update heavily patched third-party code to new releases with significant changes. Fortunately, this happens rarely in real life. Bugfix and security releases for third-party dependencies are usually easy to apply even to heavily patched code in your repository.
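
Before a big upgrade it helps to know exactly which vendored files you have touched. Here is a rough Python sketch of that idea, assuming you can check out a pristine copy of the upstream release next to your patched copy. The names `list_files` and `patch_surface` and both directory arguments are made up for illustration; they are not part of any tool mentioned in this thread.

```python
import filecmp
import os


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def patch_surface(upstream_dir, vendored_dir):
    """Summarize how the vendored tree differs from pristine upstream.

    Both arguments are hypothetical paths: upstream_dir is an untouched
    checkout of the release you vendored, vendored_dir is your patched copy.
    """
    upstream = list_files(upstream_dir)
    vendored = list_files(vendored_dir)
    # shallow=False forces a byte-by-byte comparison instead of trusting
    # size and modification time alone.
    modified = sorted(
        rel
        for rel in upstream & vendored
        if not filecmp.cmp(
            os.path.join(upstream_dir, rel),
            os.path.join(vendored_dir, rel),
            shallow=False,
        )
    )
    return {
        "modified": modified,                    # patched in place
        "added": sorted(vendored - upstream),    # files you added
        "removed": sorted(upstream - vendored),  # files you deleted
    }
```

If the "modified" list is short, rebasing your patches onto a new release is probably cheap; if it is long, that is a hint to push more of those patches upstream first.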

Real-life example: we at VictoriaMetrics store all the code for third-party deps inside the "vendor" directory. This is very easy because Go provides easy-to-use tooling for this - just run `go mod vendor` and that's it! It is also very easy to upgrade third-party deps by editing the "go.mod" file and then running `go mod tidy && go mod vendor`. After that, it is easy to inspect the changes to third-party code for security issues and bugs with `git diff` before committing them to our repository.
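
A full `git diff` of "vendor" can be long, so a quick summary of which modules changed version can be a useful first look. The sketch below is not part of the Go tooling or of the workflow described above; it assumes the usual layout of `vendor/modules.txt`, where each module starts with a header line like `# example.com/mod v1.2.3`. The function names are hypothetical.

```python
def vendored_versions(modules_txt):
    """Map module path -> version from the text of vendor/modules.txt.

    Assumes header lines of the form "# <module> <version>"; lines starting
    with "##" are annotations and bare lines are package paths.
    """
    versions = {}
    for line in modules_txt.splitlines():
        if not line.startswith("# "):
            continue
        fields = line[2:].split()
        # Skip headers without a version, e.g. "# mod => ../local" replaces.
        if len(fields) >= 2 and fields[1] != "=>":
            versions[fields[0]] = fields[1]
    return versions


def version_changes(old_txt, new_txt):
    """Return {module: (old_version, new_version)} for modules that changed.

    A version of None means the module is absent on that side.
    """
    old = vendored_versions(old_txt)
    new = vendored_versions(new_txt)
    changes = {}
    for mod in sorted(set(old) | set(new)):
        before, after = old.get(mod), new.get(mod)
        if before != after:
            changes[mod] = (before, after)
    return changes
```

One way to use it: take the old text from `git show HEAD:vendor/modules.txt`, read the new text from the working tree after `go mod vendor`, and then review the `git diff` only for the modules that show up in the result.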


If the projects were my own, I'd consider a monorepo. We use this approach for Steampipe samples - https://github.com/turbot/steampipe-samples

If it's a collection of changes, small improvements, etc to existing projects and repos then personally I'd go for separate forked repos. Then you can track your changes relative to the original project source code and (hopefully) contribute back PRs etc more easily.

As always - there are pros & cons to both - just a matter of choosing the approach that feels best 51% of the time :-). Of course, it's minor in general compared to the value of just keeping on moving on your projects and work!


For a C++ or similar project I would recommend fetching and building the depended-on repos using CMake's ExternalProject feature, with patches that you version in your main repo. I would never recommend submodules.



