Post-Commit Reviews (medium.com/copyconstruct)
46 points by henrik_w on July 20, 2020 | 35 comments



I've worked with pre-commit code review, post-commit code review, and no code review, and I strongly prefer pre-commit. Surprisingly often, there are issues that come up in code review that require the change to be completely rethought, in which case you really don't want the change to have already been submitted. People, especially junior reviewers, are also much less willing to ask for substantial changes post-commit, because it's so much additional work.

Post-commit review is a good fit for situations where code review is just a quick check that there's nothing seriously wrong with the commit; but if you're using code reviews as the excellent teaching / mentoring tool they can be, they can't be a rubber stamp.


> code reviews aren't a rubber stamp

I still remember my first "real" job. The team was on TFS and my code was basically

    if (condition) {
        if (condition) { // the same condition, again
            // do something
        }
    }

and a senior developer laughed and said mcny is testing whether we actually look at the code and not just rubber stamp his changes.

It was actually an honest mistake.

At the same place though, we had F5 load balancers and actual integration with AS/400 systems but development/test environments didn't have a load balancer in front of them :(


> testing whether we actually look at the code and not just rubber stamp his changes

I've actually put some, er, "brown M&Ms" in my code before sending it off to review, because I was sure the people reviewing my code were like "oh, he wrote it, so it's probably good". I think what would've helped in that case would be anonymized code reviews & comments, where the names of the author / commenters only pop up after the change has been approved & merged.

Actually I think that should be a feature everywhere, e.g. in Github issues. That way it becomes harder to judge comments / issues / commits based on the person who made them. Pseudonyms are nice, but you still build up a 'reputation' over time. It'll probably have to be an opt-in option per user, though.


> At the same place though, we had F5 load balancers and actual integration with AS/400 systems but development/test environments didn't have a load balancer in front of them :(

That sucks; I really wish people would make a better effort at staging-production parity. Like, it's one thing to run dev on a smaller cluster or older hardware (although still potentially risky!), but changing out things like how connections are passed to the environment seems like asking for incompatible changes to get pushed through and break prod.


> That sucks; I really wish people would make a better effort at staging-production parity. Like, it's one thing to run dev on a smaller cluster or older hardware [...]

I can't speak for everybody, but honestly I've always fought for parity of dev and prod; usually it comes down to cost, though. If infrastructure is 10% of the operating budget of something and you need a dev environment for every team (or every dev!), then that cost is going to explode.


But how will our developers unlock a sense of pride and accomplishment if they can't rely on the underlying infrastructure to be consistent between PROD and QA...


Saving commits is good for code archaeology, e.g. "the initial implementation did not work because...".


Those commits do not necessarily need to be in the history. The commit with the working implementation could explain that X alternate solutions were considered, and why they didn't work. Of course, that assumes those other solutions were tried, tested, and found not to work before the change was merged and deployed to production.
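A hypothetical commit message in that style (every detail here is invented for illustration):

    Use per-item retry with backoff in the payment worker

    Two alternatives were considered and rejected: a blanket lock around
    the queue (deadlocked under load in testing) and shorter timeouts
    (starved slow-but-valid requests). Retry with backoff avoids both.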


On the flipside, git is based on a full DAG. There's a lot of power within that data structure to capture a deep, nested understanding of the workflow/process/history/archaeology.

Often the issue is not that the DAG can't hold all the alternates and piecemeal sub-solutions in useful ways that we can come back to and revisit as necessary, but that our visualization tools on top of the DAG aren't great at letting us meaningfully interact with them (showing the detail when we need it, hiding it when it is noise relative to the signal we are looking for). And we have the power to build better defaults for our visualization tools, and/or better visualization tools.
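For instance, stock git can already show the whole DAG, side branches included; the alias name here is just an arbitrary choice:

    # show the full DAG, including merged side-branches that record
    # abandoned or intermediate attempts
    git log --graph --oneline --all

    # make that view a convenient default
    git config --global alias.dag "log --graph --oneline --all"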


Code is more precise than prose. Someone could say "we already tried X", and then you can test it yourself. Repeating an experiment could be as easy as a git checkout.
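Something like this, where the SHA and the test command are placeholders:

    # check out the abandoned attempt exactly as it was committed
    git checkout abc1234

    # rerun the experiment
    ./run-tests.sh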


Initial versions, and the review comments that led to their abandonment, can be saved in your code review system.


What if you need to change code review systems? Do you just abandon all that information? What if your code review system becomes a central point of failure for information that you need about your process/workflow?

To use a DVCS in a distributed fashion, doesn't it make more sense to store that data in the DVCS itself?
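For what it's worth, git does ship a (rarely used) mechanism for this: git notes, which attaches extra data to commits without rewriting them. A sketch, with placeholder SHA and message:

    # attach a review comment to an existing commit
    git notes add -m "review: concerned about lock ordering here" abc1234

    # notes live under their own ref and are shared explicitly
    git push origin refs/notes/commits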


> What if you need to change code review systems? Do you just abandon all that information?

You choose one of: (a) migrate the data, (b) leave the old system running, (c) archive the data. Same as if you need to change version control systems, or any other important system.

> What if your code review system becomes a central point of failure for information that you need about your process/workflow?

Could you elaborate? I thought we were talking about needing to answer questions like "what were they thinking when they decided to make a specific change, at a level of detail too fine for the change description?" What would be critical to your workflow, but only available in past review discussions?

> To use a DVCS in a distributed fashion, doesn't it make more sense to store that data in the DVCS itself?

Nowhere I've worked has used the version control system in a distributed way, beyond someone being able to work while they're on their laptop without wifi.


> Could you elaborate? I thought we were talking about needing to answer questions like "what were they thinking when they decided to make a specific change, at a level of detail too fine for the change description?" What would be critical to your workflow, but only available in past review discussions?

Those discussions are an "oral history" of the project. While we may hope for 100% documentation, it isn't uncommon that everything from implicit best practices to "hidden" decision processes to even customer complaints only ever gets recorded in code reviews. "Why did we decide to do it this way?" is often the most common question in an archeology dig, and the answer is sometimes critical to a given workflow. If you want to fix the bug without breaking the user scenario that introduced it, that can be very important information to have.

> Nowhere I've worked has used the version control system in a distributed way, beyond someone being able to work while they're on their laptop without wifi.

So? Just because few do it today, or think they "need" it outside what they believe to be rare offline scenarios, does that imply it isn't worth striving for more distributed systems as an ideal?


> it isn't uncommon that everything from implicit best practices to "hidden" decision processes to even customer complaints only ever get recorded in code reviews

That sounds like something to fix. If something is important enough that people will want to know it when working on the code later, it belongs in either a comment or a change description. When I have to reference old code review discussions it is extremely helpful to have them, but ideally you need them only very rarely. We both recognize that these are valuable; where we differ is that I don't think they are so valuable that they need the kind of uptime you get from including them in a DVCS.

> striving for more distributed systems as an ideal?

Yes, I don't think this is very important. If it later becomes important it would be reasonable to integrate it into the DVCS.


> If something is important enough that people will want to know it when working on the code later, it belongs in either a comment or a change description.

Which is indeed what I'm arguing for. I think a lot of what gets done in code review should be directly reflected in the source control change comments. So much so that I think a good code review system should be not much more than a thin veneer over a commit log.

I agree it's probably a distinction of degree where we disagree. I think the goal of a source control system is to encode source changes in all their forms (including reviews/revisions). Whether or not that is for DVCS "uptime", pushing things out to an exterior code review system seems, at least to me, like a failure of your source control system (or of your usage of it), because I see review as such a fundamental part of what constitutes "source control".

Git's DAG as it exists already provides a lot of tools to give both "shallow visualizations" at a high level and "deep visualizations" into more complicated back-and-forth processes such as code reviews. Rather than solve these "visualization problems" by outsourcing components to external systems, I'd rather see that effort invested in the DVCS itself and/or the tools we use with it. (To some extent I think the tools are already there for git; they just have the wrong defaults to encourage making more general use of the DAG to communicate than most projects currently do.)


I... don't think I like this idea. "The code works and the tests agree" is not high enough a bar for me for merging.

I want a human to read through it and ensure that the code is readable and written in a way that makes it hard to write bugs. Even if there are no bugs (well, no bugs that the tests picked up) right now, there are a ton of code structure errors I see all the time that will make it easy for someone to introduce bugs later.

I want a human to verify that the code change isn't spaghetti. Any new constructs or interfaces should be vetted by someone with an eye toward future maintenance and extensibility.

It's also possible that the entire approach taken by the committer is wrong, and someone with better context and domain knowledge will catch that. Ideally an issue like that is caught well before it gets to PR/code review time, but often that's not the case.

While I like the idea that issues can be fixed in future pull requests, that's a hard culture to push for. Many times people are pushed to get things out and then move onto the next thing. If a manager or product owner asks me why a particular thing isn't out to production yet, I can say "it's still in review, and there are issues we're working through to get it into shape to be merged" and that's usually sufficient. If they instead ask (after the feature has been merged), "why haven't we moved onto the next thing?", and I answer "we had some issues with the prior feature that need to be cleaned up post-merge", the response I will get is, "we don't have time for that; you already shipped the feature, so if there are cleanups, create tickets for them and we'll find time to deal with them later". And of course "later" never arrives, at least not until those unfixed issues inevitably cause production downtime, and then suddenly, big surprise, they're a huge priority (and the engineering team looks bad regardless).

You might say that much of what I described in the previous paragraph sounds like quite a dysfunctional organization, but, well... from my experience and from what I hear, most of us work within dysfunctional orgs, and we have to do things to work around some of the garbage... like requiring pre-merge reviews.


Same; post-commit reviews about things like style and naming will add churn to the repository, whereas pre-commit / pre-merge reviews allow the developer to fix all issues before merging, keeping history clean and code churn low.
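With pre-merge review you can even fold the fixes back into the original commits so they never appear as churn; a sketch, assuming the mainline branch is called main and abc1234 is the commit being amended:

    # record a review fix as a fixup of the offending commit
    git commit --fixup=abc1234

    # squash fixups into their targets before merging
    git rebase -i --autosquash main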


That's pretty much my experience as well. We use code reviews in pull requests from the feature branch (not deployed anywhere yet) to the staging branch (deployed), and the PO/PM gets that these are important and that fixes take time. If something is deployed and not utterly broken, well, there's more important stuff to be done. Additionally, we have caught some really nasty stuff in code reviews that had the potential to actively break the staging system, which would have, at the least, caused some hectic work to fix the db there. Some things are hard and impractical to test, e.g. whether a migration just destroys your data. I have worked at a company before that did not have code reviews, and when I tried to introduce them they were basically "I need you to press this button" rubber stamps where most of the approvers never even looked at the code. Not going back there...


I always thought, effectively, commit to master === intent to deploy that code (assuming automated tests pass)

If you want to commit unreviewed (or even reviewed) code to a pre-deployment branch, then make it just that, no? Have developers commit to a pre-deployment branch and merge accepted PRs from there into master (see the sketch below).

At some point you have to say “this code is going to trigger deployment”.
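Concretely, something like this, with hypothetical branch names:

    # unreviewed work integrates on a pre-deployment branch
    git checkout predeploy
    git merge feature/foo

    # once reviewed and accepted, the same change is merged to master,
    # and that merge is what triggers deployment
    git checkout master
    git merge feature/foo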


Two developers merge to master; the second gets their code reviewed and now wants to deploy, but the first's code is not yet reviewed.

How is that not a problem for preventing logical bugs that tests didn't catch in the first developer's code?


I don't know what kind of products these strategies work well on. I've only had experience with larger, older systems where multiple devs are making changes to fix specific issues that the other devs are not completely in the loop on. We might know dev A is working on this ticket, but we are busy with our own tickets and can't always have collaborative design reviews for every item. Maybe that isn't ideal, but it is reality for a lot of teams.

The pre-commit code review is our opportunity to stay in the loop and a bulwark against technical debt and silly mistakes before they ever touch a core branch.


Post-commit review worked quite well when I worked on a product with quarterly release cycle.

The master had to stay green (or blue, actually) and all commits had to be reviewed before release. Each feature had its own short-lived branch. Architectural issues and larger refactorings were of course discussed in advance. Because the code wasn't released that often, it didn't matter much whether the review was done in master or in the feature branch, and reviewing in master minimized conflicts.

I wouldn't recommend it for a system with frequent deployments and more than a handful of developers.


Some of the pros as listed in the article taken point by point:

> It also encourages the developer to keep pull requests small, ensuring a feature is developed in many small pull requests (or at the very least comprising of many atomic commits).

I am of the opinion that pre-commit reviews also encourage small pull requests, if only to make it easier on reviewers -> easy review hopefully leads to fast approvals.

> It also allows the reviewer to batch reviews, so they are reviewing a number of changes in one fell swoop, as opposed to being constantly interrupted for code reviews.

This I can see being a potential benefit depending on the workflow. You could technically do the same pre-commit by working on a longer branch and putting that up for review as well.

> In my own experience, this reduces nitpicking on the reviewer’s part, and in the event the reviewer does nitpick, it allows the developer to address the minor concerns at their leisure.

Nitpicks are what they are, and shouldn't block a PR in the first place, so not sure how much benefit this would be.

> If a design document has been clearly fleshed out and agreed upon before the code is implemented, the implementation will not comprise of any major surprises to the reviewer.

That depends - the design may have been agreed upon by someone, but not necessarily by the reviewer. There may still be major surprises, which will be harder to fix post merge than pre merge.

> This means that the reviewer can have a fair degree of confidence that the change is functionally correct and has undergone extensive testing before they review it.

This seems brittle when there are lots of changes going on at once. It may be hard to narrow down if something works or not.

> Building trust and learning how to work well as a team takes time.

I think this line makes it clear to me where my difference in opinion is. I don't even necessarily trust my own code!


> Nitpicks are what they are, and shouldn't block a PR in the first place, so not sure how much benefit this would be.

I will often make a hedged critical comment, but approve the PR. I am capable of identifying my own "nitpicks" that are not functionally relevant. This allows me to provide my opinion, and track examples of where my opinion on a particular subject could have been applied, while not mucking up the process with a minority view.


I try to just make it explicit. I'm quite happy to review something and say "I don't like how you implemented A but that's a personal preference. Section B is a code smell but non-blocking. You need to fix C because it'll break prod in a way that tests don't yet catch. I'm willing to let you leave D as-is if you can explain why you did it like that."


I also split my comments by issue to allow for more nuanced discussion, and when there's a mix of feedback I try to clearly mark the nitpicks as nonblocking.


I'm not sure I understand how post-commit reviews handle the case when two people are both working on the same repository. For me, the reason to invest in testing and reviewing before merging to main is that once any person has merged, the rest of the team is subject to any issues in their code.

Nor do I understand the argument that delaying reviews lets a developer iterate faster. If you are going to continue iterating on your work without waiting for feedback, why does it matter whether the work is merged in already or still on your side branch?


> once any person has merged, the rest of the team is subject to any issues in their code.

Put another way, problems are more likely to be found before release.

Actual example: Plenty of times things have passed code review on my previous teams, and only after merging to trunk and being forced to use the changes have the other devs encountered problems that needed to be dealt with.

It does slow down the other dev, but unless it was something really obvious the two of them would typically pair on it until everything was good.


I always assumed that part of the reason you mandate pre-commit code reviews was to protect the business from a rogue developer. If you have a developer who is empowered to merge code that is continuously deployed to production before a code review is done, that developer could code something that exfiltrates data and could do a lot of harm before it's rolled back.

This insider attack is clearly viable since something like this happened with the recent Twitter hack of verified accounts.


I think this point is addressed in the article: they mention that you can (should?) implement reviews post commit, but before the code reaches production.


This is basically git flow except peer review happens before cutting a release instead of before integrating changes.

Even if I worked with a hall of fame team of developers, I don't want to integrate your crappy first draft that hasn't been peer reviewed or tested into my work. Generally, if you wouldn't deploy it to Prod as-is, I don't want to build on top of it.

When you allow merging of half-done work, it places additional cognitive overhead on each developer: they now all have to keep up with and be aware of the timeline of each other's changes, and they are potentially blocked from releasing until all the merged-but-unfinished work is done.

There is also the thinking that "developer velocity" = how fast you can merge stuff into a shared integration branch. This is dated, low-ownership thinking that I wouldn't want any team I am on to take part in.

The developer is not "done" until it's running in Prod. So you want a process that makes that as frictionless as possible, while still having all the peer-review goodness.

The best flow I have experienced is dead simple and it's:

feature branch -> create PR -> peer-review/manual testing/business sign off -> merge to mainline -> CI/CD runs tests and deploys to Prod (from mainline)
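In plain git terms that flow is roughly this (branch names hypothetical; the CI/CD wiring is whatever your pipeline provides):

    git checkout -b feature/widget       # feature branch
    # ...commit work, open a PR, get review / testing / sign-off...

    git checkout main
    git merge --no-ff feature/widget     # merge to mainline
    # CI/CD watching main runs the tests and, if green, deploys to Prod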


This is a bad idea that does not bring any benefits.

All code that goes to the master branch must have been vetted (code reviews and tests).

It is perfectly possible to submit code to a review branch and then to merge it onto the master branch once reviewed. This can even be automated when used with code review tools, e.g. Gerrit.
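With Gerrit, for example, pushing for review is a single command, and Gerrit merges the change to master once it's approved:

    # upload the local commit to Gerrit as a pending change targeting master
    git push origin HEAD:refs/for/master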

Btw, dupe: https://news.ycombinator.com/item?id=23870194


It feels to me that it's pretty much the same. You still have to do the review at some point, which means the engineer will be distracted, and it will probably require immediate attention since it's already merged: other things can depend on it, and reverting in a high-velocity team can be a challenge on its own. Not saying that it's bad per se; it just feels like you get the same amount of work and "distraction" anyway.


The only time I have seen WIP merges reviewed is when requested by the author, or for a more junior engineer who may not understand the spec or codebase well. They might actually improve developer velocity in this case, with _pair programming_ until the engineer is up to speed - that, or a more involved planning phase.

Other than WIP reviews, I don't see any difference between pre-commit or post-commit reviewing - especially if they're both pre-deploy as described.



