Why is it that every time there's an article criticizing semver, the author never seems to have bothered looking at the spec?
> 1. Software using Semantic Versioning MUST declare a public API.
That's the very first thing it says. Semver is not meant for general apps or websites or the like, it's for APIs against which dependencies can be declared.
If your software doesn't need to tell package managers about its compatibility, then it doesn't need semver. Complaining that semver is bad for human readability is like saying hammers are an antipattern because you don't use nails.
This was my first thought when reading the article too. I fell into this trap when developing my own software and then realised I was just trying to force something that didn't make sense: my apps don't contain anything whose breaking changes would affect anyone else. Since then I've become more pragmatic about versioning. Now I don't really bother for the web apps I build; the git commit hash is good enough for me.
If you need a version number then a very pragmatic thing to do is to tag the repo with something short and memorable once, e.g. "v1", and then use the output of "git describe --first-parent" as your version number from that point forward.
The benefit is that you get the object ID along with an automatic counter of the # of commits since the last tag, which is basically a nice, automatic version number that you only very rarely need to update (by creating a new tag).
$ git tag -m'app v1' v1
$ git describe --first-parent
v1
# hack hack hack
$ git commit -m'subsystem: some new feature'
$ git describe --first-parent
v1-1-gab2fcd9
# hack hack hack
$ git commit -m'subsystem: some other new feature'
$ git describe --first-parent
v1-2-gc37ed42
The "v1-<number>-" prefix makes it much easier for a human to understand when read in logs, since the numbers are typically incrementing, and the strings are understood by Git as valid object IDs so it's easy to lookup the commit.
I came here to mention exactly this. Being able to correlate artefacts with code through embedded versions that contain the output of git describe is worth tons.
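To make that concrete, here is a minimal sketch (not from the thread; the macro name and build command are illustrative) of baking the git describe output into a C binary so a deployed artefact can be traced straight back to its commit:

/* version.c - assumes the build injects BUILD_VERSION, e.g.
 *   cc -DBUILD_VERSION="\"$(git describe --first-parent)\"" -o app version.c */
#include <stdio.h>

#ifndef BUILD_VERSION
#define BUILD_VERSION "unknown"   /* fallback when the build doesn't inject it */
#endif

int main(void) {
    /* Print (or log) the embedded version so any running or shipped binary
     * can be mapped back to the commit that git describe named. */
    printf("app %s\n", BUILD_VERSION);
    return 0;
}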
I mean, yes, sure. But the spec is not the world, and there is a general trend towards literally forcing things to adopt semver even when it makes little sense.
If you write a project in Rust, for example, you must give it a semver number in your Cargo.toml. You can give it one that makes no sense and maybe never change it, or change it arbitrarily, but the implication of that field is always that it's semver. It's right there in the docs: http://doc.crates.io/manifest.html
This is a real procedural problem that is being created by a "silver bullet" approach to versioning that the semver people can deny they're trying to create, but is happening anyways.
Given that the article did a poor job of actually stating the problem of semver, could you list the issues you encountered when dealing with rust's use of semver?
I worked at Amazon for a year. They use semver for everything. I think that's one of the few pieces of technical superiority I've missed since I left. I am curious about its problems.
The article did not say much. As for the suggestion given in the article, I can't help thinking that one can always derive a release number from the semver for external customers. That does not prevent semver from being used internally at all.
> a general trend towards literally forcing things to adopt semver
This statement assumes that semver is bad if forced universally, which may or may not be the case.
Your comment includes no evidence that this is bad. When you mention Rust, you also do not say why semver is bad there.
I mean, if someone puts an arbitrary number there, and users are screwed because the author fails to maintain the correct contract, at least people can go there and point out that the contract is broken. Removing semver does not do anything good. It just gives the author a possible excuse, i.e., if users are screwed, the author can claim that he/she never meant to make the software backward compatible. For that, semver also provides the same mechanism, i.e., bump the major version number. At least users know that they can expect a breakage.
The article is poor, I'll agree with that. But the project I work on that makes me agree with it (it does not currently use Rust, though that is something we're looking at) is a low-level system component that is continuously deployed across entire clusters. We only ever have two versions active in a cluster at a time (either upgrading, or downgrading when something goes wrong). People do not really 'consume' us except in so far as the system as a whole does expected things. We do have some "API"s to our system, and we do version those, but those protocols change extremely infrequently and are only consumed by the same or adjacent versions.
Our release version numbers are literally Jenkins build numbers. Anything else is not worth the administrative overhead for our needs (even were that 'small', it's also weirdly political in a bikeshed sort of way).
And I don't think this is really all that uncommon even for higher level things. I don't think semver makes any sense for a rails app, for example, but in rubyland you only version things you publish. Rust makes you version things you will never publish.
Don't get me wrong. For the things we consume from outside our team or further abroad, semver is an absolute boon. Sometimes, though, when you point out that people are treating something as a silver bullet, it sounds like you're saying it's outright bad. I'm not saying that. It's not a problem with semver that semver is not universally applicable.
For this scenario you seem more opposed to the idea of version numbers in general than to Semver. Does your build process tag the build numbers in your code repository? The benefit of Semver (and any sensible versioning system) is that whatever you deploy in production gets a (human readable) version assigned to it, and that version is tagged in your git/hg/svn repo.
The nice thing about version numbers such as Semver is that whenever you do need to maintain more than one branch of your library/tool/application, it's just a matter of branching the code and assigning a higher major version to the newer branch, whilst keeping the older major version for the stable older branch (which will still receive fixes).
With build numbers you lose a way to distinguish these two, with Semver you have an old branch (v1.0.0, v1.0.1, v1.1.0, …) and a newer branch (v2.0.0, v2.0.1, …) that can both receive new updates and version numbers. Because you usually can't predict which internal project will by necessity get split into two parallel development branches, you might as well just use version numbers everywhere.
I don't quite get the administration argument you mention. A lot of modern build software (Rust is mentioned, but Java does this as well with Maven) use some sort of versioning to facilitate this out of the box, so why go out of the way to not use them?
Version numbers seem to exist purely so proprietary vendors can obfuscate their source code a bit.
If I have the commit hash I can go straight to the exact source code version.
Taken to its logical end goal, really I want three things: commit, upstream branch, and where to find the code.
With those 3 and a little bit of tooling I should be able to go from a version number to the commit log of the current compatible upstream (I guess I'm implying that a branch == backwards compatibility, but getting the commit log is really what I want).
We don't need semantic versioning in open source - we need a common immutable web which lets us search on the things which matter.
A commit hash indicates no sequence or logical ordering. With a version number I know that libfancy-1.0.1 was released after libfancy-1.0.0, and in the release notes (as well as the source code repository where the corresponding commit is tagged with the version number) I can easily answer the question “What changed between version 1.0.0 and 1.3.4?”.
Just the SHA1-hash makes this all needlessly opaque — I'm a human, not a machine.
> Version numbers seem to exist purely so proprietary vendors can obfuscate their source code a bit.
Whatever makes you think that? From what I have seen free software projects tend to apply their version numbering schemes with a lot more reliability than proprietary software vendors. And of course version numbers don't obfuscate; any developer worth his salt using version numbers will tag the corresponding commit in the source code repository, and any build environment worth working with will provide a 'release' command that does this automatically.
Ordering is very different from what semantic versioning is about. Semantic versioning proposes that those numbers have meaning and that it's important.
Whereas again, what you and I are both saying is "who cares, I want the commit".
Commit + Branch means I can answer the question "is my problem fixed (or are things broken) in the newest version of the code" - which is the actual question semantic versioning proposes to answer but can't.
Adding a release number for human readability is fine (i.e. to imply some quick ordering so you can go "oh, this isn't the newest version"). But even then: when putting together systems, that's not a question I ever really find myself asking - the one I'm asking is "are these things exactly the same, and what's the latest I can upgrade to without breaking it (or conversely: which version was it working fine on)?"
Which is why I point out the need for an immutable web at the end there: if we accept version ranges are useless for doing reliable ops work (I believe they are - i.e. you will be testing everything anyway), then in order to know anything about what does and doesn't work, I need the source code and commit logs. I probably don't want to include all of those in every release and I do want to be able to get to new versions, thus the three things I need: upstream location, upstream branch, and commit hash. Everything else is irrelevant, or sugar.
EDIT: And we can derive simple ordering numbers anyway - just count commits on the release branch since the product started. Accomplishes the same thing.
EDIT 2: It's also worth considering, though, that incrementing numbers introduce subtle bias - people always want the highest number, and assume it must be better. There's always an urge to upgrade to version 2, even if version 2 might be a complete rewrite, less developed than 1, with an uncertain future. How different it might be if we only talked about branch names and commits - we're on the "beach" branch, but the company has got a new one for the product called "forest". How different might discussions about "upgrading" be, and consideration of security fix deployment and status.
> Whereas again, what you and I are both saying is "who cares, I want the commit".
I (not the grandparent poster) usually don't just want the commit. I want to be able to look at two versions and have a good idea if they're source and/or binary compatible.
If I do need to deal with the source of the app, hopefully they're doing me a solid and tagging commits with the version number.
> Commit + Branch means I can answer the question "is my problem fixed (or are things broken) in the newest version of the code"
Sure, if you want to sift through a VCS log, which I don't want to do. I want a changelog file that tells me what's in the version I'm running, and all versions previous. That answers that question too, and maintaining a changelog (even if you just dump the output of git's shortlog into it every release) is just a responsible part of release engineering.
> ...which is the actual question semantic versioning proposes to answer but can't.
No it doesn't. Semver exists to help answer the API/ABI compatibility question. It has nothing to do with whether or not your bug is fixed or not.
>> ...which is the actual question semantic versioning proposes to answer but can't.
> No it doesn't. Semver exists to help answer the API/ABI compatibility question. It has nothing to do with whether or not your bug is fixed or not.
But it has everything to do with whether other things are broken (i.e. incompatible) which is what determines whether I can upgrade or need to triage. So yes, the question is "whether my bug is fixed and I can use the fix".
The problem is compatibility is a useless theory when you actually need to deploy to production - it either works or doesn't, and if going from version 1.0.0 to version 2.0.0 requires the same amount of testing as going from 1.0.0 to 1.1.0 then I don't know what's been accomplished. Same story with changelogs.
If your issue is with the format and the contract that semver enforces, then you can easily number your builds <jenkins build number>.0.0. That's valid semver, and it follows the contract. Major versions may break compatibility or may not. There's no contract that enforces that non-breaking changes can only go into minor or patch releases. The only baggage you'll be carrying around will be the two trailing zeroes. You're unlikely to run out of available major version numbers any time soon.
Er, I don't know anything about rust but cargo is a package manager, right? And the version field you're referring to is what will get checked if someone declares a dependency against your package, right? And you're saying semver makes little sense for that?
Yes, cargo is the package manager, but it is also the build tool. Whether it's a library or a terminal project (one that has nothing depending on it) you use cargo to build it and it has a Cargo.toml. Even if nothing will ever use your package you will always have to have a version field defined in it.
Which makes sense. First of all, cargo can't know that nothing will use your package. And second of all, it is possible that this will change once someone sees how great a package you've built. OTOH it costs you nothing to put a number there and never change it if there is no need.
Are you suggesting that the field should be optional, or that having the field conform to semver should be optional? If the former, then fair enough but that's orthogonal to the thread. If the latter, that would destroy the package manager's ability to resolve dependencies, right? (Unless new extra metadata was added, or somesuch.)
Either way, requiring semver formatting here sounds like it's entirely justified.
I agree with you, semver doesn't make sense for every project. But it still makes sense for a lot of executables, not only libraries; let me illustrate that.
Let's say you create a CLI tool with a given set of command-line parameters. These parameters are the API your program is exposing. I can use your program from a shell script, for instance, and I will use the existing interface. If, in the next release, you change the options your CLI program accepts, you are actually breaking compatibility with existing integrations, including my shell script. Hence this is a breaking change and you should bump the major version.
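A minimal sketch of that point in C (the tool and its flag are hypothetical): the option string handed to getopt is effectively the program's public API, so dropping or renaming an accepted flag is a breaking change under semver.

/* mytool.c - a CLI whose flags are its API */
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv) {
    int opt;
    /* 1.x accepts "-o <file>", so scripts call `mytool -o out.txt`.
     * Removing or renaming "-o" would break those scripts, which under
     * semver means the change belongs in 2.0.0, not in a 1.x release. */
    while ((opt = getopt(argc, argv, "o:")) != -1) {
        if (opt == 'o')
            printf("writing to %s\n", optarg);
    }
    return 0;
}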
I didn't say "it doesn't make sense for executables". I said it makes sense for terminal projects on which nothing has a versioned dependency. This is usually (maybe always) executables, but not all executables fall under that heading.
edit: Oh I see. I can see how you'd interpret what I said as implying that [all not libraries] are terminal. I wasn't being exhaustive. :)
Are you saying one can't write a Rust program that is just a simple, unversioned OS native executable without any Rust-this attached, or being embroiled in Rust-that?
Yes. To be clear, I am not saying you can't do anything without cargo, but if you depend on any crates it makes very little sense to try to not use it. It only really makes sense to try if you have no external rust dependencies, which I think usually either means you're writing a kernel or a toy.
Also, btw, even though I'm saying something vaguely negative about cargo I love that both rust and cargo exist and I've been aware of the work you've been doing since your rails days and always liked what you've done and had to say. Just thought I may as well throw that out there. :)
If knowing every detail of something was required for criticising it, writing a massive, incomprehensible document would be an effective way to dismiss any criticism out of hand.
This post completely ignores one of the most important features of semver, which is dependency management. Being able to do this is really great:
some-library>=1.0.0,<2.0.0
It means I can include a library and get non-breaking changes and all security updates until the next major release without worrying that a change is going to break me randomly. It doesn't mean I don't need to do testing, but it makes it a lot more likely that I won't accidentally break in some way I didn't expect due to a change upstream.
When I want to upgrade to a breaking change release I know I need to go look at the release notes to see what the breaking change is, and then make modifications to my application.
If a breaking change is made in a minor or point release I can open a bug with the project so they can revert this change in the next minor or point release.
(edited from <=2.0.0 to <2.0.0, as this was my original intent :) )
Possibly if you have very few dependencies, and those also have very few dependencies it will work out OK. But in a lot of circumstances, this is a recipe for shooting your foot off. For example, if you are building a server application in Node and the total number of included packages can reach into the thousands. And then if you deploy to your server and for some reason it has to update NPM, suddenly you have a potential of any and all of those thousands of packages to break, or interact poorly, or whatever. Since you have specified that anything between 1.0.0 and 2.0.0 is perfectly fine, you have absolutely no idea what secret sauce is necessary to get a working application again. So your website is down for hours while you struggle to figure out a working configuration.
If you deploy unknown and untested packages to production, semver is not your problem. The utter lack of testing is. When you actually push to production, your packages should be locked to what you tested.
This doesn't prevent the use of version ranges during development or testing.
Edit: To be clear, I do not work with Node. If this is standard practice for Node shops, that's terrible.
This often results in security patches not being pushed. IME most companies are awful when it comes to maintaining dependencies; things are all too often locked until someone adds a package.
I understand that view but I think that the reality is that for many customers, downtime is worse than a potential security breach. Customers would rather downtime than a definite breach, but the chance of a breach is not always as bad as definite downtime.
I'm not saying that view is necessarily good, but I think it's common. I also think it's quite common that people assume this is the client view and act accordingly.
I definitely think skipping testing with the goal of getting security patches out sooner is a terrible plan except maybe in the case of an active exploit. You can get both frequent patches and high availability.
There is also lots of Node software running in environments where downtime is unacceptable. And having uncontrolled dependency churn is a great way to break your CI and demoralize your team. I know I've spent way more time than I care to admit debugging failures in test and production because for some reason there was no subdependency freeze. It's the kind of problem that makes you very frustrated.
This discussion is based on a false dichotomy. Yes, shops can be bad at updating dependencies. That's a matter of culture. Technically, I shouldn't have to constantly scan my dependencies for updates, much less security updates. That's what CVEs and release notes are for. Not freezing the dependency tree in CI and production on account of security updates is a bad pattern.
If you specify that anything between 1.0.0 and 2.0.0 is fine and your application breaks - then some-library's developer isn't following SemVer anyways, and you are in dependency hell. If some-library's developer doesn't care about compatibility, then it doesn't matter what versioning scheme you use as you will have to whip up that secret sauce anyways.
However, your issue isn't a problem with SemVer, but a problem with NPM/node. SemVer is working as intended, but getting the "secret sauce" to get everything working can be confusing as you try to figure out the differences between "devDependencies" and "peerDependencies". If every module just managed its own dependencies, it could be easier (on your development experience, but not on your hard disk and build times).
> If some-library's developer doesn't care about compatibility, then it doesn't matter what versioning scheme you use as you will have to whip up that secret sauce anyways.
It does matter. Semver implies semantics; arbitrary versioning does not.
Your problem has nothing to do with semver. The solution to your problem is to use your package manager's dependency freeze utility (npm shrinkwrap in your case), and also to select your dependencies carefully, so you don't end up with thousands of subdependencies (I know some npm packages suffer from dependency incontinence - this is a good reason not to use them).
IME this problem is particular to NPM, due to JavaScript's historically-anemic standard library which made a culture of microlibraries inevitable. The lesson is that programming languages should provide rich operations on their standard types so that users can avoid having to transitively depend on ten different versions of left-pad.
Dependency hell isn't unique to NPM, but it's also not unique to semver. The primary driving feature of semver is that it provides the framework for a social contract between library author and library user about how automatic upgrades can be performed. This works great until the number of transitive dependencies is "in the thousands", which I've never seen on any project save for JavaScript projects.
This. You define your loose dependencies, then you compile them down to a lock file that you use to test your service and you deploy with the completely locked down set of dependencies. When you want to upgrade your deps, you re-run the compilation and get updated libraries, which you test again. If you have floating deps in production you're doing it wrong.
> And then if you deploy to your server and for some reason it has to update NPM, suddenly you have a potential of any and all of those thousands of packages to break, or interact poorly, or whatever. Since you have specified that anything between 1.0.0 and 2.0.0 is perfectly fine, you have absolutely no idea what secret sauce is necessary to get a working application again.
That's not how you're supposed to use npm. You should never perform an npm update directly in production exactly because of what you describe.
The proper way to manage dependency updates in npm is to perform the updates in your build process when building your pre-production/staging build and then use the shrinkwrap[1] command of npm to generate a file with all dependencies pinned for production. This way, for a given production deployment, you know exactly which version of which dependency was being used, and you can roll back easily if an update breaks something.
My biggest beef with semver is that it is pushed by people encouraging this concept that additions are always safe. Or the lie that people will do security and other updates on older versions.
By and large, especially in an org, if you make a change that requires another place to change, you should go ahead and make that change. Period. Thinking people can stay on the old version till they want to update is BS. They will update when they have to. And it is rarely easy, then.
It used to work that way, though; software in the 90s and early 2000s did have sane semantic versioning, updates were backported to earlier major versions, and revision updates generally didn't break anything in production.
I bet semantic versioning is mostly championed by older veterans who have worked in maintenance or systems administration roles and are trying to get the industry to return to a versioning system that made sense once upon a time.
For my part ... while I'd dearly love to see semantic versioning come back, along with a few other good habits from a long time ago, I've given up on that and now just accept that updates will break things sometimes and there's always something to update and so something will be broken most of the time and that's just how it is so charge accordingly.
I half-agree with you, because at some point, software versioning and the notion of stable apis went all post-modern (especially with open source).
But we used to pay software vendors a lot of money to care about boring things like backwards compatibility, legacy support, etc. Semantic versioning as a standard is a way of applying social pressure on authors to do the 'dirty work', regardless of whether they are being paid to do so. And if they don't feel like it, the result is usually "too bad for you". There's no real good answer for this that I can think of. Nobody wants to pay for leftpad.js Enterprise Edition.
I think there used to just be less things versioned. Conway's law is probably hiding here. If there is an org to maintain it, it will be maintained. If you have more things to maintain than you do org structures, something is not getting maintenance.
A lot of people are still doing that, including Linux distributions.
Keeping earlier major versions alive and backporting security fixes is very useful.
I'm not sure I understand your argument. If someone's not going to backport a v2.1 fix to v1.6, why would they backport a v20170106-1009 fix to v201610-1005?
I'm not sure what you are saying. My argument is that it is a myth that most things will get only security updates.
That is, the version scheme states that the point is to somehow be in a position to get only security updates. However, unless there is an org specifically for maintaining a library, your only hope of getting updates is to take all updates. So, to stay on the latest.
Have you got an example where pure addition breaks the existing compatibility?
> the lie that people will do security and other updates on older versions.
Semver does not say anything will be supported. You still have to know what's going on with your dependencies as far as big releases and patches go. It simply says that if old versions are still supported, their updates will preserve compatibility.
> Have you got an example where pure addition breaks the existing compatibility?
Surely they're mutually exclusive terms. I think GP was talking about cases where the developer releases a change thinking it's just an addition, but it actually breaks compatibility.
Of course, that boils down to saying that software updates sometimes have bugs. What that has to do with semver, I have no idea.
Sorry for the very late response. I should have written it as "additions are safe." I was reflecting more on additions being more liability (commonly called "tech debt") that you have to carry.
I think others covered the edge cases where additions literally break. The sibling that said I was likely referring to just plain bugs hits it pretty well, I think. I was not going for esoteric scenarios.
And yes, semver doesn't say anything about whether a version is supported. However, it is a strong indication that it is, in a social-expectations sort of way. If you don't plan on supporting old versions of what you are doing... use a new name for the new stuff. Nobody expects Google Guava to support Apache Commons, even though they do a lot of the same things. We do expect, however, that Apache Commons will make sure any bugs in old versions are dealt with.
> Have you got an example where pure addition breaks the existing compatibility?
With C/C++, additions can break binary compatibility, can't they? Adding a field to a struct, for instance, will mean existing code that does a malloc for that struct won't allocate correctly. I think there are a million other cases to consider too.
Normally in a library you'd provide functions that allocate and free structures defined by that library. That's how successful C libraries do it.
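A minimal sketch of that pattern in C (all names hypothetical): the public header only forward-declares the struct and exposes allocation, accessor and free functions, so the library can add fields in a minor release without breaking binary compatibility.

/* fancy.h - public header: the layout stays hidden from callers */
struct fancy;
struct fancy *fancy_new(void);
void fancy_set_width(struct fancy *f, int width);
void fancy_free(struct fancy *f);

/* fancy.c - private definition: new fields can be appended safely,
 * because no caller ever does sizeof(struct fancy) or malloc()s it. */
#include <stdlib.h>
struct fancy { int width; int height; };
struct fancy *fancy_new(void) { return calloc(1, sizeof(struct fancy)); }
void fancy_set_width(struct fancy *f, int width) { f->width = width; }
void fancy_free(struct fancy *f) { free(f); }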
I think semver covers this pretty well actually. The public API is not just the function signatures, but basically anything that makes up your API - whether it's code or documentation. So either:
- your structs in the library are opaque and the library handles all the memory management itself - addition of fields is backwards compatible, or
- your structs are open and your functions expect the structs from outside - addition of fields is not backwards compatible
The second one still falls under "Major version X (X.y.z | X > 0) MUST be incremented if any backwards incompatible changes are introduced to the public API."
To make a comparison in a dynamic language: you've got public function `f(d)` where `d` is some dictionary with elements `a` and `b`. Now in the next version you start to require `c` - same signature in the code, but it's not backwards compatible. It's still a public API, even if defined by the documentation.
your structs are open and your functions expect the structs from outside - addition of fields is not backwards compatible
Sometimes it's possible to handle this case (if you've planned ahead) by including unused space within the original struct defintion that you can then repurpose later.
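A short hedged sketch of that planning-ahead trick (names invented for illustration): pad the v1 layout so later fields can be carved out of the reserved bytes without changing the struct's size or its existing offsets.

/* fancy_opts.h, as shipped in 1.0.0 */
struct fancy_opts {
    int width;
    int height;
    unsigned char reserved[24];  /* unused in 1.0.0; callers must zero it */
};

/* A later 1.x release can repurpose part of the padding:
 *     int depth;                    (new field)
 *     unsigned char reserved[20];   (shrunk by sizeof(int))
 * sizeof(struct fancy_opts) and the existing field offsets are unchanged,
 * so old callers that allocate the struct themselves keep working; without
 * the padding, the added field would grow the struct and break them. */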
I don't do C++ normally, but the situation seems similar. There's http://wiki.c2.com/?PimplIdiom for opaque objects, so you don't have to make their members part of the public API. But for things you do want to make public it's the same solution - make it part of your public API by documenting it. If you know it breaks compatibility, it requires a major version bump.
And no, after compilation, I don't believe the symbol order matters.
Qt uses it extensively. For big, complex objects like QTextEdit it's probably the least of your concerns, but they do avoid it for simple value types like QPoint.
A good chunk of these questions I'm asking are because I half remember Qt's binary compatibility guidelines from when I did an internship at Trolltech many years ago :) A similar list is here: https://community.kde.org/Policies/Binary_Compatibility_Issu... . It looks like the virtual function restrictions are what I was talking about, that's where pure additions can break compatibility.
It's an issue if the methods on that object are short-lived enough that the overhead of indirection is significant.
It's also an issue if very performance-sensitive routines need to read data from the object. In which case they already know exactly what's in there, so "pimpl" hiding has no benefit in the first place.
Implementations should be hidden if they are likely to change. In other words, there are non-obvious data or method members and/or it's likely that functionality will be added in the future. These are the high-level architectural situations, not the leaf-level ones where you optimize out every cycle possible.
Linux's support for ELF symbol versioning makes maintaining binary compatibility of C libraries easier than perhaps any other platform or language. It's patterned after Solaris' ELF symbol versioning, but more powerful.
glibc uses it extensively, but alas few other people know about it. Theoretically, languages like Rust that compile to ELF executables could leverage the capability. But none do.
There is a series of posts from a libvirt developer that explains some of the mechanics.
I personally don't bother with semantic versioning at all. Where ABI compatibility matters, I'll commit to maintaining backward ABI compatibility in perpetuity using symbol versioning.[1] And maintaining backwards API compatibility is something I try to commit to as a matter of course; if I can't then I make sure builds will break loudly.
I don't begrudge people using semantic versioning, I just don't think it's the best approach for most cases. It's just one of several crappy choices. But at least with C on Linux, ELF symbol versioning is a slam dunk when you're serious about the matter.
[1] I usually avoid depending on proprietary platform features. But this is the one area I'm comfortable and eager to lean on Linux.
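For the curious, a hedged sketch of what that looks like with the GNU toolchain (the library, symbols and version nodes are all made up): two implementations of foo() live in one libfoo.so, old binaries keep resolving the 1.0 symbol, and new links get the 2.0 default.

/* foo.c */
int foo_compat(int x) { return x + 1; }     /* frozen 1.0 behaviour */
long foo_current(long x) { return x + 2; }  /* current behaviour */

/* Bind each implementation to a version node; "@@" marks the default
 * used for new links, "@" keeps the old binary interface satisfied. */
__asm__(".symver foo_compat, foo@LIBFOO_1.0");
__asm__(".symver foo_current, foo@@LIBFOO_2.0");

/* Paired with a linker version script (libfoo.map):
 *   LIBFOO_1.0 { global: foo; local: *; };
 *   LIBFOO_2.0 { global: foo; } LIBFOO_1.0;
 * and built roughly as:
 *   gcc -shared -fPIC -Wl,--version-script,libfoo.map foo.c -o libfoo.so.1 */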
I agree, SemVer is a technical solution for what is often a human/corporate problem.
A good solution to this is to have an "edge" build that will upgrade dependencies and let you know about breakages early, but that still requires companies to acknowledge that all software needs maintaining.
It's more of a technical tool for implementing a certain human/corporate behavior. If your organization is actually following a versioned process, then semver-aware tools are really handy; if you're not, then those tools are at best irrelevant and at worst just get in your way.
IIRC, Amazon deprecates old versions after some time period. That solves the "no one wants to update" problem. It also shows clearly when a major version will no longer be maintained regularly.
For a smaller org, sure, especially if you have a monolithic build that would break if you made a change like that.
For a larger org, it's basically impossible to do that if you want to make forward progress.
But very much agreed that it's usually not easy to upgrade farther down the road when many things have changed. I've been trying to spread a culture of forethought and not changing things for the sake of "making things pretty", but it's an uphill battle.
> Or the lie that people will do security and other updates on older versions.
counterpoint: rails. At least two versions are supported at any time, sometimes older versions still receive fixes for critical security issues. Rails follows semver and they're doing fairly well. I rarely see breakages from upgrading anything but major releases.
> Thinking people can stay on the old version till they want to update is BS. They will update when they have to.
You can just about get away with this within an organisation, but you cannot do it to your customers if you have a competitor which doesn't impose this workload on them.
I'd like to kill off a decade-old toolchain I have to target, but people are still using the hardware it's required for. They don't want to spend the money on replacements when the current hardware still works, and why should they?
I realized too late for editing it in, but I do want to add that I do not think it is solely pushed by people encouraging this. I honestly think that most proponents of this are incredibly well intentioned.
Which just highlights, to me, that semver is ultimately good intentions masquerading as a mechanism.
> Being able to do this is really great: some-library>=1.0.0,<2.0.0
No, no, no, no it isn't great; it's terrible.
It makes the version used non-deterministic, and that is a recipe for build hell. The whole reason we have lockfiles, shrinkwrap, etc. is because people are specifying version ranges.
You're still allowed to use lockfiles and shrinkwraps.
Anyone deploying an application to a server or distributing an application to end users should have the dependencies shrinkwrapped and the application tested with them.
Upgrading dependencies to get bug and security fixes in minor and point releases is as easy as deleting the shrinkwrap, installing the dependencies again, and making a new shrinkwrap. Then you put your application through all of its tests with the new dependencies, commit the new shrinkwrap, and then you can release/deploy a new version of the application.
You may lock library versions for a build pretty strictly. But the looser the coupling, the more elbow room you have in this regard, so you can upgrade your nginx without breaking your WSGI apps, or your database without breaking anything — when done within a reasonable range of versions.
Let's just say it: a lot of web development is done by producing a blob of code and frameworks of the week, getting paid, and not caring about security and maintainability.
The poor practices and tooling are a consequence of this mindset.
Having a non-deterministic version when you build is terrible. So don't do that. Having deterministic build dependencies is also a goal, but a different goal.
Having a non-deterministic version, so you can easily upgrade all your dependencies and get the latest and greatest compatible version of everything, is great.
To defend the author here, I think they were very clear that semver may not make sense for a very specific type of CD, always-on application. The article supports the thesis as stated in the text, but the title is too broad.
Ehhh... at best it argues that it's over-complicated for something that only ever has a single version running everywhere. Like a web service that you control, for which there will never be a way to request "use that old version of the code instead".[1]
Which is true. But then why even bother with YYMM.xxxx? Just use a single incrementing integer. Or none - nothing can ever go backwards, so callers are required to ignore the version in practice.
[1]: an API presenting an old external interface is something different - there's still only a single "version" of the code that's actually running.
Yeah okay. I think there is a very large group of developers who work in just such an environment: developing web applications that have a "production" version, whether they are public internet or internal to companies, where the previous version would literally _never_ be consumed. The reason to track an increment in that kind of environment is as a code to align feature requests to a particular release (usually by date); this is useful for all sorts of internal tracking questions like: was the feature I requested live by 4/1/2016?
I personally dislike using semver for these instances (my own opinion only) because it feels overcomplicated, as you stated. I also learned a long time ago that _any_ kind of date designator was a recipe for answering why the "April" release was actually live on June 1st ;). Recently, I've dropped back to the Git SHA-1, which aligns to a specific version of code released AND can be tracked by date and deploy tools.
So I agree with the thesis, disagree with the solution, but still think ryan_lane did not effectively refute it above.
Yeah, the blog's context isn't about libraries. And agreed, tons of developers work in environments like this.
I pretty much exclusively deal with SHAs too, since tons of tooling understands it, and there's no implied order just by looking at it - which is a good thing. A SHA might be out, but waiting to ramp up to 100%. It does tend to mean "end of (nearly) all dev work" though, which is the actually important part for day-to-day developing.
In the past I've used YYYYMMDDXX as a version number. In my experience there's no real problem with confusion about what the version number means as long as it's automatically generated so there's a clear definition. For me it was "feature merges to master, and master has passed tests". Nothing about actual releases at all, because the version of the code says nothing about what you do with it.
The version deployed is 20160215.2.142001 (on 2016-02-15, there were three deploys, with the latest build affecting them being #142001). There's your date, which is additionally free of the uncertainty whether you meant mm/dd/yyyy or dd/mm/yyyy. How is that incompatible with semver? (Nowhere does it say that the numbers have to be single digits. Integers.)
semver conveys information about a released product. If you're running an internal project that isn't being released to the outside world and that project isn't a library being used by other services in your environment, then you can number the project any way you want. In fact, you don't even need to number it (you can just use the git SHAs, for instance).
semver is useful even if you're releasing something that isn't a library because it can be used to provide information about backwards incompatibility in things like database schema, cache incompatibilities, configuration file incompatibilities, that'll cause a service to break if updated.
That's a fair counterpoint to the article but your example above only demonstrates the library case. You should provide an example that demonstrates the value of semver in a CD, always-on top-level application to give information about schema, cache and configuration incompatibilities.
To use the web-service example: if you present an API for others to call, even if you never version yourself, you probably still version your API. Other posts on that blog mention doing so.[1]
Well, if you release a ton of versions of your API, some of them are going to be easier to transition between than others. As soon as that occurs, you conceptually have at least major and minor versions, but in a url-based API instead of in code. Or you choose to not hint your consumers if moving from v23 to v24 is likely to be simple or not.
The author has actually no clue what semantic versioning is. He keeps on saying that with small, frequent and well tested changes you constantly produce "stable" builds and therefore don't need the complexity of semver, but he doesn't seem to understand that this has absolutely nothing to do with semver.
Even the smallest and most stable change can break backwards compatibility or fix a bug or add a new feature and semver's only purpose is to meaningfully express what kind of change it is. You build a 3rd party package or a library which is used by other people? You better use semver. It makes everyone's life easier and not more difficult. You build some REST API which is consumed by other applications? You better use semver. It not only helps to visually display what kind of change it was, but it also triggers an additional thought process during development and CI. For example if the major version doesn't change, then the CI system can reliably replace the existing API during the deployment process. However when the major version changed, then you might want to think about a side-by-side deployment to support backwards compatibility, at least for a certain grace period. It makes life in many ways easier.
And again, this has nothing to do with how stable a release is. Let's say I have a public facing REST API and I make a 1 line change where I decide that some field of a JSON object must be int64 because in some edge cases an int32 wasn't unique enough and would have caused a bug. Now this change is fixing the system, making it even more stable but essentially breaking existing integrations. Semver helps me to easily communicate the change to my API consumers, setting up the right expectations and helping them and myself by reducing frustration, tech support and unwanted side effects. It also helps to deploy it in a way that allows a smooth transition to the newer API.
Furthermore, his suggestion of using a date is useless. What does a date tell me? Nothing... if I want to know the date I right-click the files of a build and look at date + time, or I look at the commit time in git or the build time in the CI tool. Having it in the version is stupid.
I read the article and feel it is very poorly written:
unfounded claims everywhere, no context, no reasoning, no logical deduction, no clear conclusion ...
> Even the smallest and most stable change can break backwards compatibility
This I care about.
> fix a bug or add a new feature
Not sure why I should care about these, though.
> essentially breaking existing integration
Not sure why you need 3 numbers to express that. Am I right that your consumers don't get bug fixes by default because the API patch version changes? How does this work IRL?
> suggestion of using a date is useless
Dates are just easier for humans to scan than monotonically increasing numbers.
>> fix a bug or add a new feature
> Not sure why i should care about these tho.
I'm using feature X. Feature X was introduced in version X.Y. Therefore I can't use any version less than X.Y.
>> essentially breaking existing integration
> Not sure why you need 3 numbers to express that. Am i right in that your consumers dont get bugfixes by default because api patch version is changing? How is this working irl?
No. When you fix a bug you increment the patch version, unless the fix changes the API in which case you wouldn't want it automatically updated anyway.
> But it would work the same way with monotonically increasing version. no?
Kinda. If I can use 1.4 then I can use 1.7 but I can't necessarily use 2.1, because 1.7 only added things to 1.4 but 2.1 changed something about the public interface.
> I am just trying to understand how semver applies to REST in your case. Never seen anything but single number versioning for endpoints.
Sorry, I didn't realise we were talking about REST.
Based on reading this it seems like you haven't worked on a large application where there are lots of modules and libraries developed by independent teams all coming together to form a final product (shipping with CD or not).
I work on a platform team developing many different libraries used by many different teams throughout the company. If we didn't leverage semver (or some versioning scheme that at least differentiates between breaking and non-breaking changes) I don't know how we would do it. We either a) wouldn't be able to release 'patch' updates to a particular library in a way that consumers get them for free/automatically without changing anything, or b) wouldn't be able to release breaking changes without automatically breaking consumers' builds.
Semver may not be useful for the final build or end product that you end up shipping. But it is a very useful tool for all the parts (dependencies) that make up that final product.
Semver is a sweet spot for libraries (tools, utilities, APIs, etc.). When a library I use isn't on board with Semver, I am often skeptical, and almost always disappointed.
For programming languages it is less clear to me that there are benefits. The same goes for living production end-services, which are the topic of the article. Said production system might (should?) be separate in version churn from any API built on top of said system, or client libraries used to interoperate with said API.
It is likely that I agree with Mr. Gillespie on the topic with respect to "always on" systems, but I do not feel strongly about that agreement? I tend to look for and focus on Semantic Versioning in its sweet spot, that is, libraries.
Having said where I agree, there was one quote of the article that stood out to me:
>it feels a little bit overly pedantic
In my experience, that's the best part, perhaps even the entire point, at least with libraries. Semver is just a method of communication, but its standardization and lack of ambiguity (relative to other systems), means that communication is all the more clear. I need a new feature and so am bumping a library to a new minor version? I'll be sure to check for other new things and for any deprecations, but won't need to worry that existing behavior was removed. Very clear.
When I am engineering a system on top of various libraries/tools, then growing and changing said system over time, I want to be able to pedantically assert what is known (as well as what is not known!) about the system's dependencies. Semver is a good tool for this job.
The clarity also makes it possible to build linters that check whether you accidentally broke your semver guarantees (at least for detectable API changes). That's awesome.
@williamle8300
>He's making good points, but what alternative is he offering?
Your comment is dead, but to answer the question - the alternative proposed is to encode the contract of your library in itself, and never break it, only extend it.
If you need to change the API, you don't change the behaviour or signatures of your functions while preserving the names; you just add a completely new namespace, while letting people use the old functions if they still have reason to. You don't have to keep actively supporting the old namespaces (i.e. no new bugfixes), you just need to not break the stuff that already worked.
There are some scenarios where this is hard - for example when it's a web API and supporting it is costing you $$. But for things like that, devs are generally aware that what they're consuming might get shut down at some point; so doing a new API, in a new namespace, and shutting down the old one at some point is not completely out of the question.
I feel like semver works great in the right context. For example, if I'm building an API/library which other developers are using, it helps them to know whether there's been a major version bump, a breaking change or just general patches.
However, if it's just software for the end user, such as a web browser, it doesn't make as much sense. What is defined as a breaking change for someone browsing the web? And will they care? I've seen it used for websites - OK, great, but who's on the receiving end of these version change numbers?
At the moment everyone is trying to do semver even in situations when it's not needed.
As a developer I really like to know which version of a browser a user is running to be able to reproduce a specific bug. It does matter if they are using a five year old browser or the latest evergreen update of Firefox. Having the version number available somewhere (Help » About …) facilitates communication a lot.
Also, for corporate customers it is sensible to agree on the version range supported for a web browser if you are offering a SaaS solution, especially if security matters.
Sure, I'm not saying remove version numbers as a whole, more that semver isn't needed for something like a browser.
What you've described doesn't need semver specifically
If you are working in a really shitty software shop with no discipline or process control, this can be a great way to start establishing both, so I say no.
If you are developing your own personal website with a static page and bootstrap, probably over kill.
We don't need to have a polarized opinion about everything.
This is punishingly stupid. If you are working on a large enough project (many libraries, many developers, many organizations, daily or nightly builds because it takes 4 hours to run all the unit tests even in parallel due to dependencies) you use semantic versioning.
If you're some pissant shop writing CRUD apps then yeah, don't bother. As soon as you outgrow that stage, there won't be any more questions.
Monolithic standalone apps or tiny projects don't need semver. Practically all collaborative software in between benefits massively from it.
The author has no understanding of semantic versioning. You use semantic versioning for dependencies. You DO NOT use it for your deployed products. I mean, sure you could apply it to your deployed products but that's just an internal number to help you keep track of things.
Semver is all about creating reliable builds from dependencies.
> With modern CD systems, we want to simplify. Changes should be small and frequent, which leads to stable and gradual improvement without much pain. But this frequency itself becomes a counter pattern to what semantic versioning is all about, which is the plodding software development process of old.
What. No, that doesn't follow at all. This is the equivalent of "hate leads to suffering".
Just increment the last number on each deploy. X.Y.123456789 is perfectly acceptable. Or if that strikes you as a distasteful abuse of the "patch" value, add another layer to get X.Y.Z.123456789. Semver systems will be perfectly happy with a 4 layer option, because none of them are stupid enough to be useless when handling edge cases.
I'm a big fan of really simple versioning schemes.
<rewrite number>.<release number>.<commit number>
Any version ending in a "0" is stable, e.g.:
- 1.0.0
- 0.1.0
.....
- 999.999.0
Whenever you are doing a "release" you have tested every feature fully to make sure the software is up to spec. Whenever you are doing a "rewrite" you can assume that the software's APIs have changed and you will need to readjust for compatibility (ideally APIs never break, but this is one example).
With this you get the following:
- Whether this is stable to use in production
- What software can work with what other software (by looking at the major version it requires)
- Between which two commits functionality broke (if 1.1.157 works and 1.1.158 doesn't, then you know commit 158 broke something)
With the author's versioning system:
- The month, year and # of commits you're on
- If someone else gave us this code (pr-)
- If there is active development on the code (dev-)
I prefer the former but I can see some of the benefits of the latter. I'm just more of a fan of using flat and simple numbers and not relying on "dev-", "pr-" and the infinite number of tags that will follow if you start allowing them.
I don't see an advantage over SemVer - SemVer is an accepted standard while this is not, and you still have the human error component, arguably the weak spot for SemVer. And to be honest: this is by no means simpler than SemVer.
I like `git describe --always --tags`. It gives me a string like "v0.4-10-gd7b8", assuming I have tagged my repo with a semver-style tag like "v0.4". More precisely, that string means 10 commits on top of tag "v0.4", and you can check out revision d7b8 (the "g" prefix just means Git) to get it.
The only purpose of SemVer is to express whether a change is breaking or not. It fails at that. Nobody can predict whether a change is breaking. Because code is complicated, and code with dependencies is exponentially complicated.
At best SemVer communicates from one human to another the _expectation_ of whether a change is breaking. The expectation can be, and frequently is, broken.
All changes are breaking changes. Vendor your dependencies by hash. Humans cannot be trusted.
Software development is about communication. Quality is about discipline. So yes humans are imperfect, but SemVer works better than anything else i have encountered so far...
Sometimes being in a silo warps your view of the world. While semver isn't necessarily the most optimal for a given type of development, and there are many alternative ways to version, it works pretty well for almost all kinds of software development, which is why it is a pretty good standard; no matter what software you end up doing, it should work for you.
The author seems to be completely missing the boat here. SemVer isn't meant for Continuous Delivery production systems. Why would you even attempt to use it for that? SemVer is for libraries, and specifically for making dependency management easy. If you're not writing a library that gets distributed to other people, then use whatever versioning system you want.
I'm not sure what you're trying to say. I explicitly said SemVer is for libraries and specifically for dependency management. So yeah, package managers use SemVer for dependencies. In theory you could also extend this to programming languages, but most programming languages don't do SemVer. I suppose other build tools also qualify.
In any case, none of the things you use SemVer for are Continuous Delivery always-on production systems. They're software releases.
What does a package manager "using" SemVer for apps look like in any meaningful way? The package manager never has to do any dependency resolution against the version number of an app, so, I would assume (and know, for the package managers I'm familiar with), the version number is just an arbitrary sequence of three numbers. A version number only becomes semantically relevant to the machine when other things depend on (aka, use the public API of) it.
"package managers" as in apt, yum, npm, pip, etc ? They do not "use semver" for anything. Your versioning scheme is completely irrelevant to package managers. (as long as the format and separators match of course)
Just because a package manager can use dotted numbers for versions doesn't mean it's relying on SemVer. Rubygems and apt, among many others, allow a wider format than X.Y.Z, and predate the SemVer spec.
Author is obsessed with CD and this is the source of his misconceptions about SemVer. He probably has never dealt with really complex multi-component systems, library development or desktop off-the-shelf software.
CD is good for code you control. But for your dependencies and platforms it's the opposite. Just imagine CD of kernel patches right onto your production servers. Or CD of Java updates (or Python or whatever your app is powered with). And guess how long your system will stay "always-on".
The comparison doesn't really make any sense - in what way is the Doge election protocol comparable to semver? It also seems weird to call something bad by comparing it to an election protocol that has been shown to be incredibly robust and was used in a prosperous city state that was successful for a very long time.
Yeah, I didn't get it either. It feels to me like he had this Italian election analogy in his pocket and couldn't wait to use it. It feels very forced.
Funny you should mention Windows - while only the major version is used, there's always a build number associated with an RTM version: 7.0.7601, 8.0.9200, 8.1.9600, 10.0.14393
;)
I agree that an average user only cares for the major component.
On a tangentially related note, this recent r/haskell thread [1] contains a lot of discussion of the differences between Haskell's Package Versioning Policy and SemVer, and particularly the downsides of SemVer.
Recently I discovered a problem I was completely unaware of with dependency management using semantic versioning patterns. My (incorrect) understanding was that if I specify a dependency pattern:
foo-lib >= 1.0, < 2.0
I will always update to the newest 1.X.Y version of the foo-lib.
This is not the case: such a pattern can silently install one of the older 1.X.Y releases because of the requirements of other libraries that my software uses and that are out of my control.
This is pretty dangerous: the pattern treats all 1.X.Y releases as equally good for me, but X and Y updates can contain important, often security-related bug fixes.
A safer approach would be to be able to say: I want the newest 1.X.Y; fail if that is not possible, so I can investigate.
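A hypothetical illustration with pip (foo-lib and bar-lib are made-up packages; the point is the resolution behaviour, not the exact output):
# suppose foo-lib has 1.2.5 and 1.6.0 on the index, and bar-lib itself requires foo-lib<1.3
$ pip install 'foo-lib>=1.0,<2.0' bar-lib
$ pip freeze | grep foo-lib
foo-lib==1.2.5
# silently older than the newest 1.x, even if 1.6.0 contains security fixes;
# pinning exactly what you tested at least turns the mismatch into a visible conflict:
$ pip install 'foo-lib==1.6.0' bar-lib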
I don't understand why some developers insist on using semver but not following the spec; in my opinion that is the only 'problem' with semver.
If you need a version and don't care about following the spec, just use a timestamp or datetime, substitute it where required with a sed command, and move on. But please don't claim to use semver if it's just an arbitrary number to you.
And if you're in some context where you're forced to use SemVer (npm, cargo, etc) but you can't be arsed to think about backwards compatibility, just bump the major version with every release.
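npm even makes that a one-liner (sketch; assumes an existing package.json in a clean git checkout):
$ npm version major    # e.g. 2.3.1 -> 3.0.0, plus a commit and a v3.0.0 tag
v3.0.0
$ grep '"version"' package.json
  "version": "3.0.0",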
I don't like the scheme described in the article, but I do think that X.Y.Z is overkill. X.Y provides plenty of information and doesn't make me second-guess whether I should upgrade or not. Z often creates a false sense of confidence that shit won't break. Changes in X always mean breakage; changes in Y mean potential, but unintended, breakage.
The patch version is there for very minor things. Adding docs, adding tests, changing metadata, _maybe_ minor bug fixes. Hopefully no one has to depend on a patch version, and hopefully they're usually ".0". I've seen arguments that patch versions should go away, but revving minor versions for some things is often too much churn - people try to understand what's in the new release when it's not even worth their time.
Minor versions are things you reasonably might depend on like a new feature or significant bug fix. You _should_ be able to upgrade seamlessly though one person's bug fix is another's breaking change. I don't think it's reasonable to get rid of these and treat everything as a breaking change.
So I don't know what's simpler for libraries. Semver seems to have boiled it down pretty well.
Semantic versioning works great for software like Java. When you deliver an application with java, you want to ensure that the vast majority of your users have a compatible Java version. You also want to make sure that a new version of Java doesn't break your software. Semver encodes both these properties. Since the first number in Java's semantic version is always one, you know it is always backwards compatible. Whenever the second number increments, you know new features have been added that require a new VM. The last number changing is mostly irrelevant to most people, but it might signify a bug fix or optimization.
When both those pieces of information - backwards compatibility and the need to upgrade - are important, semantic versioning should be used. If you are comfortable removing features without leaving an out for users, and you can reliably upgrade all users at the same time, encoding this information is much less important.
Continuous deployment doesn't have anything like traditional releases, so why bother with semantic versioning at all? You're typically just dealing with dates or monotonically increasing build numbers; there's no reason for anything else. Public APIs (including libraries) have traditional releases, and there the distinction between backwards-compatible and incompatible versions really does matter.
That said, a three-part number isn't really useful. Just increment the minor version number in all backwards-compatible cases.
Also, there is an antipattern with releases: naming them. It's complete bullshit, because the names are opaque, with rarely any rhyme or reason. It's branding, plain and simple. If you need to upgrade anything, you end up having to google what the actual version number or release history is and then go off of that. It's a complete waste of time, and so twee.
SemVer has nothing to do with CD. It's for package management. With a package or lib I don't need (multiple) daily increments as I would not update a lib I use that frequently.
It’s tricky. On the one hand, if you could RELY on sane definitions of “breaking changes”, it can be quite simple to set up flows that evolve quickly and safely based on little more than the versions of dependencies. On the other hand, by nature software has too many combinations to test completely so you can never be sure that a change is going to be non-breaking for YOU; and ultimately you are responsible for anything that does break.
I find the right thing to do is to set up a completely parallel “beta flow” that mirrors production almost exactly, allowing you to freely dump in new changes and see what happens. Have a way to swap working beta-flow changes into production, and a way to swap it back. Then you can do anything you want in beta.
SemVer is sometimes very annoying (pre-release and metadata rules specifically) but also just solves versioning. You have to jump through some hoops to use `git describe` in a sane way with it, but it's worth it.
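One of those hoops, roughly: output like v1.2.3-4-gabcdef isn't valid SemVer as-is (the hyphenated part would parse as a pre-release, which sorts before 1.2.3), so it usually gets rewritten into build metadata. A sed sketch, assuming tags of the form vX.Y.Z:
$ git describe --tags
v1.2.3-4-gabcdef
$ git describe --tags | sed -e 's/^v//' -e 's/-\([0-9]*\)-g/+\1.g/'
1.2.3+4.gabcdef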
Maybe I'm missing something, but to me more frequent builds make proper semver more important, not less.
An easy way of knowing which versions you should be able to safely update to without breaking changes seems hugely valuable.
He talks about API versioning, but APIs are only a small part of the problem. How about the hundreds of libraries that my site is built with, for example? Semver allows me to specify rules in my package file about whether to take only the exact version that I want, the latest patch, or the latest minor version. I didn't spot anything in the article that showed a better way of doing that.
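In npm's range syntax, for example, those three rules look roughly like this (foo-lib and the versions are placeholders):
# in package.json "dependencies":
#   "foo-lib": "1.4.2"    -> exactly 1.4.2
#   "foo-lib": "~1.4.2"   -> latest patch  (>=1.4.2 <1.5.0)
#   "foo-lib": "^1.4.2"   -> latest minor  (>=1.4.2 <2.0.0)
$ npm install 'foo-lib@^1.4.2'   # records the caret range in package.json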
The issues Rich Hickey tore into SemVer over seem to be ones SemVer wasn't concerned with solving, and his alternative doesn't solve the main issue SemVer was created for. The problem of "can I reasonably upgrade this library without breaking my own code, and if possible do so in an automated fashion" isn't solved by chronological versioning. If I need to upgrade my application due to a security vuln, SemVer lets me know whether I can just "drop in" the upgrade or whether I need to do more work.
Knowing whether 1.3 vs. 3.7 was written 10 days ago or 10 years ago doesn't seem useful. Rarely has software I've written depended on how recently a library was released.
Ultimately, the talk tears into version numbers in general, but not semantic versioning - he doesn't discuss anything particular to SemVer. He could have been talking about Postgres' versioning conventions and the talk would still hold.
I think this talk makes his opinion on how breaking changes in libraries should be handled (in the context of the JVM ecosystem) very clear:
A) avoid breaking API changes
B) if you have to make large breaking API changes that will probably affect a lot of dependent projects, make it a new artifact / namespace / library that can live side-by-side with the old version
B is actually pretty common in large Java library projects (rxjava -> rxjava 2, retrofit -> retrofit 2, dagger -> dagger 2 all have independent Maven artifacts and I think also packages) and imho this approach makes a lot of sense. It's also the more important part of this talk compared to his critique of semver.
Isn't semver the best way to do what he's advocating, then?
I mean, it's not like people delete 1.3.0 from their packaging system when they release 2.0.0. Incrementing the major version number is semver's way of declaring a new namespace, and once declared, it lives alongside the old ones.
It is about treating new versions with major API breakage as if they were completely new libraries, not as a "maybe, possibly drop-in replacement, iff we use just the right feature subset". E.g. RxJava changed its Maven artifact ('io.reactivex:rxjava:1.y.z' -> 'io.reactivex.rxjava2:rxjava:2.y.z') and the package name in which its code lives ('rx' -> 'io.reactivex'). This makes it possible for both versions to be used at the same time while transitioning, without breaking dependencies that rely on the older version and without having to bother with automatic multi-versioning in build tools.
With that in place, it is questionable what the actual advantage of semver over a simpler monotonic versioning scheme (like a build number or UTC date) is.
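Concretely, both sets of coordinates mentioned above can sit in the same build while callers migrate (a sketch of what the Gradle side might look like; the exact versions are illustrative):
$ grep rxjava build.gradle
    compile 'io.reactivex:rxjava:1.3.0'            // old API, package rx.*
    compile 'io.reactivex.rxjava2:rxjava:2.1.0'    // new API, package io.reactivex.*
# different group IDs and different Java packages, so both can live on the classpath at once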
> Ultimately, the talk tears into version numbers in general, but not semantic versioning
He does talk explicitly about the trouble of making major changes in SemVer[0]. The gist of his argument was that minor changes in semver are relatively useless, while major changes have a high probability of breaking your software. Major changes in semver are backwards incompatible and update the program's API in place. This leads to downstream breakage and fear around doing upgrades.
> If I need to upgrade my application due to a security vuln, SemVer lets me know if I can just "drop-in" the upgrade, or if I need to work more.
I think the point he was trying to make was that upstream developers could change internals of a library but keep the API consistent so that downstream devs would never have to worry about scary updates. As you said with SemVer, if the security upgrade is a significant change, then you can expect breakage in the library. What he was advocating was patching issues like security vulns under the hood while keeping everything backwards compatible. Major upgrades could even add new namespaces, functions, and arguments but there's no real point to deleting or mutating old code, that just creates breakage. He wants software libraries to be immutable to take care of dependency issues and versioning to better reflect the changes made in the code.
> As you said with SemVer, if the security upgrade is a significant change, then you can expect breakage in the library.
In practice that's very uncommon. If someone is actually doing security releases, then either they release a minimal change to supported versions, or the distributions do that for them. Actual security upgrades are normally single patches which take great care not to do any API or behaviour changes.
Create a new namespace for your new API, or your new Interface, or your new version, or whatever you want to call it. But don't delete your old namespace. If your code is someone else's dependency, you should leave the old namespace in place for those who need it.
Left unresolved is how to deal with the accumulation of old code. I assume there will eventually be automated ways of turning it into a library, so the old code no longer needs to appear in the code that you are actually working on. Some automated solution seems like the kind of thing that the Clojure crowd will come up with. I think it would only take me a day to write a macro to recreate namespaces from a library. Zach Tellman has some code that does half of this (Potemkin).
But I want to push back on one point: the claim that versioning APIs would be simpler than versioning a component/lib.
If a lib exposes one API, separating the version numbers would result in two.
If we look at the exposed APIs and differentiate by purpose (a good idea imho), we would have multiple version identifiers: one for each API and one for the component.
I have thought about API versioning and testing (a special kind of unit test to verify the contract an API promises), but I don't think we are prepared for such complexity.
I never really liked semver, as it's a mixture of a release number and an abstraction/categorization of risk that nobody can agree on, making it hard to tell how stable an update will be. It was fine back when you only had a release or two a year, but now the abstraction is no longer good enough.
I prefer the first number to be a simple release number, and the second to be the amount of risk relative to the previous release. That way, for projects with a dozen or two releases, one can normalize the risk values between projects to get everyone more or less on the same page without them having to change their behavior, and it introduces the ability to manually correct the risk for different libraries (e.g. library X is super critical for me, so double the risk).
Automated tools for calculating risk are possible with this scheme, and additional risk can be added for the update (e.g. the tool sees you are using method X or API call Y that was changed, so the risk is increased).
With release.risk you can say "automatically update when total risk is < X" and actually have it be tolerable to you. Even when manually updating, you can get an idea of how much trouble it's going to be, and possibly have a tool generate a list of areas you should put extra attention towards (e.g. library X has a risky update, check whether it breaks your corner cases).
What do you mean by risk? How would you quantify it?
As I understand it, all semver is trying to tell you is when backwards-compatible changes happen, and when backwards-incompatible changes happen.
If the project developer wants to add some sort of indication that "this package contains changes that are alpha quality, and may not respect semver for the next few releases", then that developer can append a pre-release identifier to the version, as described in the semver spec. Once that identifier goes away, the risk that a package violates semver should be gone, and you should be free to update based on the semver relation to the previous release.
Ultimately, it's up to the project to verify that its releases don't violate the semver spec by being diligent with respect to its public API. If you find that a project isn't disciplined in documenting API changes (semver or otherwise), then coming up with new rules to convey the information they're already not conveying isn't going to help anything.
Well, backwards-compatibility is sometimes a probability.
For example, a change in the public signature of a function to accept a wider class of arguments might be considered backwards-compatible since the previous calls to the function will still work.
But if your language has type inference, now the previous calls have to actually specify the type of their arguments since the type inference algorithm can no longer resolve all the types.
So you can break SOMEONE'S build by accepting more types of arguments. It will be a one line fix, but if that fix has to be made upstream...
That's an interesting point. If you're using a language that adds more rules to the API compatibility than the language that a module is released in, it's probably not possible to rely on semver in the general case. Unless the author is aware of those issues and versioning accordingly, any minor release could be breaking.
Even in the same language, how do you prove that your changes didn't break things like type inference? Any change to the signature of a function (even accepting MORE types than before) can potentially break someone's code.
If you can't definitively prove that a class of changes won't break things, then you probably have to treat that class of changes as breaking and version accordingly, no?
Risk is simply an integer expressing how likely the project is to keep working if you upgrade from the previous release. Anything else is left up to the project to determine what things get a specific amount of risk. Or, another way of putting it: the only requirement is that the risk number be correlated with the probability of running into issues running that release instead of the previous one. Risk can even go negative to signal that the previous release had a showstopping bug that was fixed and that the new release should be used in all cases over the previous.
Also, some of the risk calculation can be pushed down onto tooling and the community. As an example, if more than the normal amounts of issues are reported against a release, the risk for the release can be automatically increased without anyone needing to be diligent in updating it and things can get automatically rolled back if the change was found to be too risky.
Semver requires a project to be more diligent than many projects are willing or able to be, and requires all changes to fit neatly into the semver categories. Too many changes don't, especially when libraries are used across platforms and programming languages. Say a project switches from GCC to Clang: semver has no official way to express that. Release.Risk can handle it by assigning an amount of risk to the update, and pretty much any other unforeseen corner case too, as it doesn't try to categorize the risk. Also, different projects have different risk profiles, which semver completely ignores (glibc is more risk-averse than the JS library of the day). Semver also doesn't indicate any difference between a new feature that is self-contained and one that touches a core piece of code; the former is much less risky than the latter.
So it sounds like you're trying to quantify the runtime stability of a library with the risk metric: e.g. if I update this package, what are the chances that the foobaz function I'm using in that package will start giving me the wrong answer, or start crashing with the inputs I'm providing. You're right that semver does not try to solve this problem: by design it is only concerned with how to quantify changes to a public API.
If there's a risk that a package's functionality could break between releases it can be signaled through other channels. For example, you can keep unstable code in an "experimental" module, or hide unstable functionality behind a compiler flag. The binary compatibility (ABI) of a compiled package can be signaled with a SOVERSION or with symbol versioning.
I guess I would hope that the packages I use that do publish a public API don't go changing the guts of the package in possibly unstable ways, at least not without lots of testing and maybe a few alpha/beta releases to let the experimental changes stabilize with early adopters. If I was using a library where core functionality was breaking every few releases for unclear reasons, I would argue that it's probably better to just find a more stable library to use than to put the effort into quantifying instability somehow.
No, but the use case he criticizes it for (a first-party service) is outside the scope of SemVer, so applying SemVer there would be an antipattern. That's generally true of applying a good pattern outside of its explicit domain.
You use SemVer when you are distributing a software package supporting a public API to third parties. If you are just running a service that you develop in-house supporting a public API, you don't need SemVer. Arguably, SemVer is less useful if the software supports multiple public APIs for which independent version specs are available, though it is still meaningful and may be worthwhile in that case; simple sequential versioning with a statement on the API version compliance of each product version is probably more valuable there.
Ultimately, it's a matter of choosing the right tool for the job, and SemVer is well-suited to the job it is intended for.
I think both approaches are fine, depending on context.
This will definitely work fine for software that's delivered to the users. I don't really care about Chrome's or Gmail's version, especially since the update process is transparent to me. The APIs should be versioned on their own and everything is fine.
On the other hand, I really appreciate when semver is used by libraries (jars, gems etc.). From developer's perspective it really helps with updating these dependencies in your own code - a quick look tells you whether it's just a simple bugfix/patch that probably doesn't affect you or something more serious that you should be careful with when bumping dependency version.
Because of that, I wouldn't call it an anti-pattern. I'd say it's rather one of possible approaches that will work fine in some cases and not so well in other.
For dependencies semantic versioning is great because it gives a reasonable basis of compatibility. I say "reasonable basis" because just because it's a minor/patch change doesn't mean it won't break your code. It just means it's not supposed to!
For end-user software, i.e. big $$$ enterprise software, there's no point in using it. Unless the app itself is a platform for other apps (in which case you're really an API/library anyway), it's beyond overkill. Heck, it's downright confusing to most business users. Arguably any sizable UI change would be a "major version" because you moved a button across the screen. For that type of software I go for <major>.<minor>, which are chosen to coincide with marketing campaigns.
Sometimes a "software package" encompasses multiple concurrent "API versions", which means that depending on whether you care about the API in question, the package may or may not be changing something you care about. When it's unified to a single package-wide version then you have no idea. As Rich Hickey put it in his talk on the subject[0], all you know is that "something changed."
IMO, semantic versioning (at least the x.y part) makes sense for libraries but not for applications but this distinction is rarely made and sometimes blurry.
You can also just git tag 1.0.0 and git push --tags. Then when you git describe you get 1.0.0-###-gHASH, where ### is the number of commits past 1.0.0 and HASH is the short hash of the last commit (I think). It's really useful for tracking versions across branches and also for always having a unique version which points to an exact commit.
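Spelled out (the hash is made up); note that plain `git describe` only considers annotated tags, so either create the tag with -a/-m or pass --tags:
$ git tag -a 1.0.0 -m 'first release'
$ git push --tags
# ...42 commits later...
$ git describe
1.0.0-42-g8f3c21a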
In the end adhering to semantic versioning indicates a commitment to maintaining parallel lines of development. One that breaks APIs, one that maintains them (but may introduce other changes), and one that involves only bug fixes.
These days though the mantra seems to be "push to prod", and thus you may as well increment the major number on every push...
The problem with semver is the lack of tooling built around it. Humans aren't robots; a bug fix might actually be a breaking change without the author noticing it. What do you do in that case? Yet semver is better than nothing; all the other versioning schemes are not that useful.
I'd say that if you can't anticipate whether something is a breaking change or not, then you have an architecture problem. Likely a violation of the single-responsibility principle.
Your software should define what the inputs are and what the outputs are. It should be clear when you're breaking an interface.
I feel like this is a bunch of hand-waving that does not really address anything beyond saying that because we have CD we do not need to care about versions. It shows a lack of experience with a solid development process in large-scale code bases.
CD itself posits that a lot of what was previously assumed to be a necessary part of large scale software delivery is just cargo culting that gets in the way.
I started using semver about two years ago in two of my projects---one a library and another an application. For my library, the versions run (output from "git tag -n"):
6.3.0 Bump version number to 6.3.0.
6.3.1 The "Remake the Makefile" Version
6.3.2 Bug fix---use $(RM) instead of /bin/rm
6.3.3 Bug fix---the "all" target does not depend upon "depend".
6.3.4 Bug fix---add restrict to some parameters.
6.3.5 Bug fix---fix compiler warnings from CLang.
6.3.6 Bug fix---guard against some possibly undefined defines.
6.3.7 Bug fix---update dependencies in Makefile
6.4.0 The "Trees For The Nodes" Version
6.5.0 The "PairListCreate()" Version
6.6.0 The "Go for Gopher URLs" Version
6.6.1 Bug fix---use c99 instead of gcc
6.6.2 Bug fix---use $(DESTDIR) when installing
6.6.3 Bug fix---replace malloc()/memset() pair with calloc()
6.7.0 The "Christmas Cleanup" Version
6.8.0 The "Breaking Up Is Hard To Do" Version
6.8.1 Bug fix---potential buffer overwrite.
6.8.2 Bug fix---add missing headerfile.
No APIs have changed, but each X.Y.0 has added new functions (with the exception of 6.8.0, which changed the source layout but not the API one bit), and each .n release has been a bug fix. I don't have much of an issue with semver for library code.
For the application, I've found semver not to be much of a win. The versions:
v4.6.0 The 'XXXX FaceGoogleMyTwitterPlusSpaceBook' Version
v4.6.1 Bug fix---double free
v4.6.2 Bug fix---if not using email notification, code doesn't compile
v4.6.3 Bug fix---don't use _IO_cookie_io_functions_t
v4.6.4 Bug fix---potential double free (yet again).
v4.6.5 Bug fix---encoded entries via email mess things up.
v4.6.6 Bug fix---unauthorized person posting via email leads to double fclose()
v4.6.7 Bug fix---a NULL tumbler crashes the program.
v4.7.0 The 'Tumblers Reloaded' Version
v4.7.1 Bug fix---date checking on exiting tumbler_new() was borked.
v4.7.2 Bug fix---previous and last calculations were borked.
v4.7.3 Bug fix---check tumbler date(s) against last entry, not the current time
v4.7.4 Bug fix---current link was wrong
v4.7.5 Bug fix---the assert() when comparing dates was wrong
v4.8.0 The 'Constant Upgrade' Version
v4.9.0 The 'Unused API Paramters' Version
v4.9.1 Bug fix---getline() called with incorrect parameters.
v4.9.2 Bug fix---dependencies got out of whack.
v4.9.3 Bug fix---used the wrong name when generating the tarball.
v4.9.4 Bug fix---removed compiler warnings under different compiler options.
v4.9.5 Bug fix---assert() was too assertive.
v4.9.6 Bug fix---I was a bit too assertive in the callback code.
v4.9.7 Bug fix---fix header guards.
v4.10.0 The 'Spiffier Email' Version
v4.11.0 The 'Go For Gopher' Version
v4.11.1 Bug fix---potential memory leaks fixed
v4.11.2 Bug fix---notify_emaillist() was borked
v4.11.3 Bug fix---memory corruption
v4.12.0 The "Somewhat Arbitrary Christmas Release" Version
v4.13.0 The "Target Advertisers" Version
In actual use, the "version numbers" could very well be 6.0, 6.1, 6.2, 11.1, 11.2, etc. for all the meaning the "4" has (largely---this is the code base after the 4th major reworking of the code---it's a 17-year-old code base). I could see a separate semver standard for applications---basically an X.P model---version, bug fix. Or perhaps a D.X.P model---data format, version, bug fix. If the saved data format changes, change the D number.