A lot of this is (at least aspirationally) table stakes for modern software development, but for those of us who have seen the sausage get made, a) a lot of the Well-Known Best Practices (TM) are not actually observed at a lot of companies, even with arbitrarily high levels of sophistication and resources available, and b) they really do save metric tonnes of effort at scale. One of the reasons why seemingly trivial, clearly beneficial changes don't get adopted at some companies is that, at certain scales, it's like trying to turn an aircraft carrier. (Any process improvement which takes an engineer one day's worth of productivity to adjust to costs Intuit several million dollars. Something which a startup could decide on in a week -- "We should switch to git and then have per-team branches with integration at..." -- is probably an $X00 million project there and if it failed would be catastrophic. You might blow the shipping schedule on Quickbooks, and if that misses tax season, "oh dear.")
It's always interesting how the larger Fortune 500 software companies adopt practices associated with teeny-tiny little software firms, and vice versa.
I fundamentally agree with your point: changing processes at large software teams is really, really hard, risky, and expensive.
But if I may pick a nit about something I think you were describing as a potential process improvement: per-team branches have been tried at scale many times before, both with and without Git. Many smart people I know who've had to clean up the subsequent messes believe, counterintuitively, that the larger and more complex you get, the more branches hurt you, because they encourage you to delay integration, and integration is painful. Git's merge capability, while excellent, can't save you if two teams independently decide to rename a method that both rely on; that's still a conflict. I've heard of several other large shops relying successfully on toggles with a single mainline.
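For concreteness, here's a minimal sketch of the kind of toggle I mean, on a single mainline. Everything in it (FeatureFlags, the "new_tax_engine" flag, the rates) is hypothetical; a real system would load flags from config or a flag service rather than hardcoding them.

    // Minimal feature-toggle sketch: both code paths live on mainline, and the
    // unfinished one ships "dark" until the flag is flipped.
    #include <iostream>
    #include <string>
    #include <unordered_map>

    class FeatureFlags {
    public:
        void set(const std::string& name, bool enabled) { flags_[name] = enabled; }
        bool enabled(const std::string& name) const {
            auto it = flags_.find(name);
            return it != flags_.end() && it->second;
        }
    private:
        std::unordered_map<std::string, bool> flags_;
    };

    double calculateTax(const FeatureFlags& flags, double income) {
        if (flags.enabled("new_tax_engine")) {
            return income * 0.25;  // new path: integrated continuously, off by default
        }
        return income * 0.20;      // old path: still what users get (rates made up)
    }

    int main() {
        FeatureFlags flags;
        flags.set("new_tax_engine", false);  // flip to true to exercise the new path
        std::cout << calculateTax(flags, 1000.0) << "\n";
    }

Both teams commit to the same trunk every day, so the rename-collision class of conflict surfaces immediately instead of at merge time.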
I'll buy that -- I was just picking an example of a process tweak which sounded like a potential improvement that, absent scale, would be accomplished in two days with a project roadmap that would fit in a blog post.
I've been at workplaces where we had multiple epic branches that diverged for three or four iterations at a time. In that scenario, it doesn't matter what VCS you happen to be using; integration is going to be painful and error-prone.
More recently, the branch-and-pull-request-into-master model has been working really nicely on a per-story basis, as long as people are considerate and merge back from master before initiating the request.
Overall though, I think the only branch that gets deployed anywhere should be master (or trunk if you prefer). Like you say, it's a terrible substitute for a system built of decoupled components.
Nice article btw. The approach to integration recommended at the end seems exactly like something that might make the management of such a large code base easier.
Emphasis on the "aspirationally" -- it really appears not to be how most companies do things, despite overwhelming evidence that doing these things works.
Totally agreed. Incidentally, from working with savvier smaller companies, talking to one-man shops, and hanging out with startups, I think the community feels that there is a body of "everybody does X" practices which are actually quite rare.
Examples I'm intimately familiar with include A/B testing, using usage metrics to drive decisions, customer development, lifecycle email marketing, etc. Or for more dev-focused stuff: unit tests, Selenium ("the best technology that I'll never use"), code reviews as a routine practice, reproducible server setup/deploys, etc.
> A/b testing, using usage metrics to drive decisions, customer development, lifecycle email marketing
Yes, but I've been pleasantly surprised at the clueyness of more than a few big-name clients recently.
> Or for more dev focused stuff, unit tests, Selenium ("the best technology that I'll never use"), code reviews as a routine practice, reproducible server setup/deploys, etc.
For deployments at least I've been blown away by the range of configurations I've seen. Everything from "git push heroku master" to "ssh into these three machines, run svn up, restart apache, merge dev couchdb to live, copy paste these 80 sql patches out of a file into the terminal, sacrifice a hamster to Amon Ra and cross your fingers."
> A/B testing, using usage metrics to drive decisions, customer development, lifecycle email marketing... unit tests, Selenium [testing], code reviews as a routine practice, reproducible server setup/deploys, etc.
Many companies favor short-term wins or money over technical quality, because few people have the skill to quantify, in an Excel spreadsheet during the meeting between the business and tech sides (Product Manager and CEO vs. Team Lead/Dev Manager), the amount of money the company would have saved by doing things "right" the first time.
Basically it comes down to quantifying this quality work against developing new features.
I had the pleasure of working in a very small, tight-knit team where you have to have a bulldog to bark at people when they're slacking on defense (i.e., writing unit tests, making sure the build script is up to date).
Results: on time, within budget, with 1 Saturday, 1 statutory holiday, and 2 days of OT until 10:30 PM (all reimbursed afterwards), for a rather ambitious project that went from zero to shipped in 6 months and live 2 months afterwards.
Oh, and we had 1 performance bug after the release (an easy one to solve) and 1 requirements bug 3 months after release. The rest (it's been live for about a year) was as smooth as baby skin. We rotated the BlackBerry (on-call phone), but we never had it buzz.
Our saving graces were:
0) Management Support
1) Continuous Integration
2) Unit-Testing
We use these unit tests mostly as a REPL, so we don't have to re-compile and re-deploy the _whole_ app to GlassFish.
Basically, the people above the developers trust our judgment to work on these non-feature tasks: fixing the build script (we use Maven, but when it was first set up it didn't handle deployment to multiple environments that well) and improving the components (we were a newly created team, hence the earlier reference to a bulldog that barks) by either re-writing them or adding more unit tests.
The management almost never questioned us, except in one or two situations.
I have seen some of this code (worked at Intuit briefly) and it is some of the most horrible Java (for web products) and C++ (desktop apps) code I have seen in my life. Admittedly this was a few years ago, but I've never ever seen such convoluted code anywhere, with (literally) thousands of cut-and-pasted fragments and poke-your-eyes-out code. You could look at a fragment of code and discover more than one error every other line. I once asked my boss to select a random file of code from one of the flagship products and discovered 40 errors in 5 minutes (he was not happy, long story). It is a wonder these apps work as well as they do.
At the time they had this weird home brew mix of PSP and "agile" as their 'methodology' and while it generated a lot of meetings and paperwork, it didn't improve the code one bit. I am glad to know that they are moving towards better practices, but Intuit is the most technically inept organization I've seen, so I wouldn't be too hopeful of the end results. OTOH they do understand their customers (and marketing to them) really really well. I learned a lot from Intuit about these aspects of product dev. Which goes to prove you can make billions with totally screwed up engineering.
There's no way 10M LoC can be improved without taking a loss on the balance sheet.
No Way.
Because in order to do that, you effectively need to stop writing new code and just put your head down and improve things. The thing is, if the codebase is tightly coupled, you've got tons of work to do, as typically you can't improve one thing without changing another. In the absence of unit testing, nobody has the guts to refactor the code without a high risk of breaking the product.
I haven't even mentioned the cultural (human) issues that need to be fixed/changed as well.
So people truck forward, like in any software development shop :).
Speaking of which, thanks for sharing your insight from working at Intuit. The article never mentioned the actual quality of the code, instead focusing on the things he improved :) (not a small feat, but they probably don't improve the quality of the code).
The cultural issues are usually the largest one (q.v. Conway's Law). You don't need to stop writing new code, but you do need to stop adding crap code (or at least stop adding it faster than you can clean it up).
It's why I'm a big fan of code ownership as a negative reinforcement tool. Despite it being a really bad idea, it does tend to rein in the cowboys - because they're too busy fixing the bugs in their code to add new ones.
Absolutely they have to take a loss to fix it, but they are already suffering a massive loss in software engineering productivity and are unlikely to retain any talent. This is technical debt at its worst.
Having lots of lines of code isn't a badge of honor, nor is it a particularly good measurement to begin with. I wish that publications would stop stating this number like it means anything. I could probably think of 10 ways off the top of my head to skew the lines of code that a function would require without even affecting the size of the binary (heck, just change the style of braces...). I could change one line and create a buggy mess, or remove 1000 lines and create a brilliant masterpiece of efficiency.
Something that might mean more is growth in code complexity over time. For instance, if measured every month for a year without changing the compiler version or platform, how much bigger or smaller did the compiled binaries become? How much more or less memory and runtime were consumed? How many "severe" bugs were found? When overlaid with information on the features that were added or removed during that time, it should become a lot clearer how large the code base really is and how maintainable it will be.
Saying that a company has 10,000 lines of code is just as relevant as saying they have a design that's 10,000 pixels big. It could hold more data, or it could be something super simple just made really poorly and very bloated. "Lines of code" is not a good unit of measurement for complexity.
LOC is not an exact measurement, but it can be (with some context like which language the code is in), a useful approximation of complexity. Your analogy with image size is flawed - it's not a black/white case, 'useful' or 'not useful'. LOC is somewhere in between - useful in some cases, even if not exactly quantified, because other measurements are unavailable or are too complex for a first-order assessment.
Let's compare with a movie, one that is constantly updated and worked on by many people, if such a thing exists. Is length of such a movie an exact measure of the complexity of that undertaking? No, but managing such a process for a 10 hour movie will probably be harder than that for a 2-minute one, even if it's just because there are probably many more people working on the 10 hour one.
LoC isn't a perfect metric but it is very easy to relate to. Given some extra context like the type of application, the size of the company or the age of the codebase one can mentally account for some of the weaknesses of using LoC as a metric.
If this were a study using LoC as a sole metric with no other context I'd agree with you but LoC seems perfectly adequate in this case.
Halstead complexity? Cyclomatic complexity? Those are just a couple of famous, old-school ones.
Software metrics is a major research area, and dates back to the 60s. The paper "Software Metrics: A Roadmap" has a good summary of the state-of-the-art as of 2000.
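For readers who haven't met these metrics, here's a toy illustration (mine, not from the paper): cyclomatic complexity is roughly the number of decision points plus one.

    #include <iostream>
    #include <vector>

    // Three decisions below (the loop condition and two ifs), so the cyclomatic
    // complexity of positiveSum is 4. Trivial to count here; much less so across
    // a 10M-line codebase, which is what the tooling is for.
    int positiveSum(const std::vector<int>& xs, int cap) {
        int sum = 0;
        for (int x : xs) {              // decision 1: another element, or done?
            if (x <= 0) continue;       // decision 2: skip non-positive values
            sum += x;
            if (sum > cap) return cap;  // decision 3: clamp at the cap
        }
        return sum;
    }

    int main() {
        std::cout << positiveSum({3, -1, 4, 1, 5}, 10) << "\n";  // prints 10
    }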
Could you, off the top of your head, give a rough idea of what cyclomatic complexity or Halstead complexity corresponds to a "large" project? In fact, given even a very simple code snippet, could you state on a cursory examination what these complexity values would be? Could most people reading this discussion?
If not, your alternatives don't serve the required purpose. Everyone gets that a project with 10 million lines of code is big. Whether it's bigger in any useful sense than another project of 8 million lines isn't really the point, and any alternative that doesn't have an immediate intuition for people reading the discussion isn't helping much.
You're right; in this case it probably is adequate. I'm by no means an expert programmer (alas, I come from the business development/marketing side of things), and I used to measure things in LoC as if it were a be-all end-all metric. Then I started programming and realized how ridiculous that is.
It's probably the best metric to use; I just wish there were something better.
Sounds a fair bit like the code base I work on. We have in the neighborhood of 5 million lines of (mostly C++) video game middleware that my department works on, all of which must be backwards compatible, all of which can't be significantly overhauled easily. It also builds on 15 different target platforms with 2-5 configuration variations per platform.
Similarly, we use Perforce. Similarly we have a CI system that builds our code on a farm of VMs, though ours does fairly extensive unit testing on every build. We also have a significant number of version permutations we test on a fairly regular basis as well.
We regularly, automatically, run Valgrind and it's on my to-do list to hook up the Microsoft static analyzer and investigate code coverage tools.
One saving grace is that the code base is very modular and has decently quick iteration times in general. Sometimes the legacy and the degree of cross-platform support can be frustrating but for the most part the infrastructure works so well now that it's no big deal.
"Unit testing isn't a big part of QuickBooks for Windows — the bulk of the codebase was written before unit testing was acknowledged as a best practice."
If there was ever an application that would benefit greatly from unit testing, this is it.
Code bases where fundamental aspects (models, database interaction, custom UI libraries, etc.) were developed and solidified far before unit testing was the norm or expected are a tough nut to crack, but we are making improvements every day. Within the last three months I've solved compilation and linker hurdles and we're now integrated with googletest and googlemock, which are fantastic libraries!
Also, while our list of C++ unit tests is small (but growing), the use of NUnit, etc., was championed from Day One on C# projects, stretching back to our usage of .NET 1.1.
It's interesting how much working with Ruby for the last 7 years has affected my C++ development. I'm dual-wielding Avdi Grimm's Objects on Rails and Michael Feathers's Working Effectively with Legacy Code with great results.
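In case it helps anyone facing a similar codebase, here's the flavor of characterization test (in the Feathers sense) you can write with googletest: pin down what the legacy code currently does before touching it. The function and header here are hypothetical stand-ins, not actual product code.

    #include <gtest/gtest.h>

    #include "invoice.h"  // hypothetical legacy header declaring roundInvoiceTotal()

    // Expected values are captured from observed behavior, not from a spec;
    // the tests document the status quo so refactoring can proceed safely.
    TEST(InvoiceCharacterization, RoundsHalfCentsUp) {
        EXPECT_DOUBLE_EQ(roundInvoiceTotal(10.005), 10.01);
    }

    TEST(InvoiceCharacterization, NegativeTotalsPassThrough) {
        EXPECT_DOUBLE_EQ(roundInvoiceTotal(-5.0), -5.0);
    }

    int main(int argc, char** argv) {
        ::testing::InitGoogleTest(&argc, argv);
        return RUN_ALL_TESTS();
    }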
I would agree with what you said if you replaced "an application" with "a kind of application". This is a 10MLoC project that was mostly not written for unit testability. As stated in the article, they are doing automatic integration testing on a functional level to catch regressions. They are enforcing zero warnings and run static analysis software. And most importantly, their code has worked in the field for a long time. I strongly suspect that refactoring for unit testability would be a very significant risk in their case. The quote does not say that they don't unit-test new code.
Agreed - I didn't mean for it to be criticism of the project, but that line just jumped out at me. Like you say, for new projects they are using unit testing.
That was almost a blast from my past. They were using a lot of the tools I used at a previous company. That company's UI code base was 3 MM SLOC or so, and we used Perforce and Silk, though we didn't use any build accelerators, as I had the UI build down to under 1 hour from scratch and the backend was under 3 hours, if I recall. And this was 8 years ago or so.
That codebase was a beast, and there was code from 1979 floating around in some of the core Fortran routines. The scary part was that the app ostensibly had backward compatibility with files created in the 80s, though it was well designed enough to have anticipated most of the cases.
I'm sort of surprised their build takes so long without accelerators, but then again I'm not: without precompiled headers, I think our build was several hours longer. If they couldn't structure their projects to take advantage of pch files, and the source is very #include-heavy, I can see it. How external code references are handled is another reason interpreted languages rock and Java/.NET build systems are superior to C and C++ ones, in my experience.
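For anyone who hasn't used them, the gist of a precompiled header setup (file names illustrative; the relevant flags are /Yc and /Yu on MSVC, or -x c++-header with GCC/Clang):

    // pch.h -- one header of heavy, rarely changing includes that the compiler
    // parses once and reuses across every translation unit in the project.
    #pragma once
    #include <map>
    #include <string>
    #include <vector>
    #include <windows.h>  // enormous and stable: the classic pch candidate

    // Every .cpp in the project then starts with:
    //   #include "pch.h"
    // so the expensive parsing happens once per build instead of once per file.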
I work in investment banking and this article is no surprise. Most of the banks I have worked at so far have old legacy spaghetti code all around, and like someone mentioned, it is extremely difficult to work on refactoring which does not add any benefit for the business in the short term. I have had many conversations with managers, and even though everyone knows, no one wants to be the guy who changed the status quo.
How do you manage to write 10 million lines of code for accounting software? I haven't had a proper look at it yet, but that seems like overkill. That's an entire OpenSolaris' worth of lines in an accounting package, just to put this into perspective.
But that's a separate product - QuickBooks Payroll.
Also, instructions for payroll deductions are pretty simple[1]. Usually everything you need is on one page, plus a very short list of exemptions (401(k), HSA, FSA, cafeteria). Nowhere near 11,000 pages of tax code.
QuickBooks Payroll is an add-on whose code lives side-by-side (and is deployed with) the QuickBooks Desktop code. I am the developer that maintains the integration of the TurboTax tax processing code and engines technology with QuickBooks' accounting and payroll functionality.
"Payroll deductions are pretty simple" is not as true when you require nationwide support. The Yonkers residency tax, the Indiana counties payroll tax, etc. are not as simple as getting the employees' states and running calcs. Add in non-tax deductions (401(k), wage garnishments, worker's comp) and you've now found yourself in an interesting world.
"You wouldn't guess it," Burt says, "but on each platform, all those products come out of a single codebase. The Windows version is about 80,000 source files, 10+ million lines of C++ code plus a little C# for the .NET parts.
Well, at a guess, a table-driven rules engine for tax is probably a whole programming language with an optimizer... and then it has to generate custom UIs based on the local tax rules. I can imagine this being bloated.
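To make "table-driven" concrete, here's a minimal sketch of what I mean; the jurisdictions, predicates, and rates are all made up, not real tax data. Real engines add effective dates, wage caps, priorities, and per-rule UI metadata, which is where the line count explodes.

    #include <functional>
    #include <iostream>
    #include <string>
    #include <vector>

    struct Employee {
        std::string state;
        std::string county;
        double wages;
    };

    // Each row of the table pairs a predicate ("does this rule apply?") with
    // an action (the rate to withhold).
    struct TaxRule {
        std::string jurisdiction;
        std::function<bool(const Employee&)> applies;
        double rate;
    };

    // The "engine" is just a table walk; the complexity lives in the table.
    double withhold(const Employee& e, const std::vector<TaxRule>& rules) {
        double total = 0.0;
        for (const auto& r : rules) {
            if (r.applies(e)) total += e.wages * r.rate;
        }
        return total;
    }

    int main() {
        const std::vector<TaxRule> rules = {
            {"Federal", [](const Employee&) { return true; }, 0.10},
            {"County surtax", [](const Employee& e) {
                 return e.state == "IN" && !e.county.empty(); }, 0.02},
        };
        const Employee e{"IN", "Marion", 1000.0};
        std::cout << withhold(e, rules) << "\n";  // 100 + 20 = 120 with these made-up rates
    }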
Yes, it's a common flaw among those who don't have any experience with this kind of business software to think that anything "purely technical" (like an operating system) must be more complex. I work on a piece of business software and the database layer, interprocess communications layer, etc all pale in comparison to the code dealing with such mundane items as orders and invoices.
Both! I think the complexity stems from the fact that an order (or an invoice) is a "human" and "fuzzy" artifact, while a relational database or TCP/IP stack is a mathematical-technical artifact and therefore much more amenable to the computer model. (I don't want to imply that a database isn't complex, but as Intuit shows: modeling accounting on a computer is also pretty damn complex.)
In the past, Burt says, when a build took 4 hours instead of 45 minutes...
Wow! And they've apparently spent a tremendous amount of time and resources optimizing the build process.
It's been 10 years since I worked on a project where a build took more than a few minutes. Back then, a project I worked on took over 3 hours to compile on a 4-processor Solaris box. The PC technology improved so drastically that in a few years we were able to get it cross-compiling to Solaris from a Linux box in under 20 minutes.
Not terribly surprising. Back when I worked at MSFT, I spent ~3 months (and had two other developers and a couple of build engineers working with me) taking our 3 _day_ build and getting it down to ~12 hours. But, that was several times larger than this system (even then), and it was about 7 years ago.
And it had to support all sorts of weird things such as "which compiler do you use to build the compiler; the compiler you just built, the compiler you last used, or the compiler we last shipped?" for a very large configuration of compilers, runtime platforms, etc.
When I worked there a couple years ago, requesting a custom Windows build still took overnight, if you wanted more exhaustive testing than just xcopying a couple binaries.