Ask YC: What do you do when a small project keeps getting larger?
13 points by chez17 on Aug 18, 2008 | 20 comments
I am working on what started as a simple web scheduling program. Since its inception, the client has added different types of customers, different types of jobs, job tracking, schedule editing, etc. Because the system has expanded well beyond what it was originally supposed to do, the code has gotten extremely messy and hard to maintain. This is all my fault of course, but I am not the super programmer that many of you experts here are. I don't know what to do at this point: small changes are taking forever, and some of the things they want done seem almost impossible now. Any help is extremely appreciated. Help me News.YC, you're my only hope.



There was a thread on reddit just a couple of days ago about maintenance. Many useful ideas to be found there:

http://www.reddit.com/comments/6wn6j/maintenance_work_blows_...

The short answer is that it sounds like it's time for you to begin taking code quality seriously (better to have done so from the start, but we all fail at that most of the time). Whatever your language of choice is, find the resources that cover good coding practices, and begin making use of them. I work in Perl, and the definitive work is Perl Best Practices (and perlstyle covers a few bits and pieces)...there may not be such a definitive work in your preferred language, but I'm sure there's something. JavaScript has Crockford's new book which, based on reviews, is excellent--I trust that's true, as his lectures and his website are truly wonderful resources.

If you don't have tests, start writing them, and include test development costs in your estimates. If you don't have good code documentation, for yourself and future maintainers, start writing it, and include the cost of writing it in your estimates. It doesn't take that much longer to write a test while you're working on a piece of code--it's fresh in your mind, so you know exactly what you need to test (or vice versa...some folks write the tests and then write the code...I flip back and forth based on whether I know how the code is going to work yet or not).
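
To make that concrete, here is a minimal sketch of a test written alongside the code it covers. Everything in it is an assumption for illustration: the thread never names a language, so Python and its standard `unittest` module are stand-ins, and the `jobs_overlap` helper is a hypothetical piece of a scheduling program, not anything from the actual project.

```python
import unittest
from datetime import datetime, timedelta


def jobs_overlap(start_a, end_a, start_b, end_b):
    """Return True if two scheduled jobs share any time (hypothetical helper)."""
    return start_a < end_b and start_b < end_a


class JobsOverlapTest(unittest.TestCase):
    """Edge cases written down while the code is still fresh in mind."""

    def setUp(self):
        self.nine = datetime(2008, 8, 18, 9, 0)
        self.hour = timedelta(hours=1)

    def test_back_to_back_jobs_do_not_overlap(self):
        ten = self.nine + self.hour
        eleven = ten + self.hour
        self.assertFalse(jobs_overlap(self.nine, ten, ten, eleven))

    def test_partially_overlapping_jobs_are_detected(self):
        ten = self.nine + self.hour
        half_past_nine = self.nine + timedelta(minutes=30)
        self.assertTrue(jobs_overlap(self.nine, ten, half_past_nine, ten + self.hour))
```

Saved in a file, tests like these can be run with `python -m unittest`. The back-to-back case is the kind of edge case that is easy to pin down today and easy to forget in a year.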


Thanks for the reddit thread.

Code quality seems to be something I have never really learned. I understand some of the principles of it, but executing them over a long period of time on a project that keeps expanding is certainly something I am not very good at. One thing I should have specified in my explanation above is that this project has been going on for years. They come up with some changes, I finish them, nothing happens for a couple months, and then a phone call with "What if it did this...". The system has gotten so big and bloated over that time.


One thing I should have specified in my explanation above is that this project has been going on for years.

I'm working on a nearly 11-year-old codebase. The age of the code makes no difference (actually, it probably does...you probably need to be more respectful of old code, because it probably knows about stuff, like edge cases, that you've long forgotten). You can always gradually improve your code. I get a vague feeling that you're looking for someone to tell you, "A rewrite is just what the doctor ordered!" And I can assure you that no one with more than passing experience in the field will suggest it (and I note that others have already pointed out good resources on why not...but they mostly boil down to: you'll still hate the code when it's rewritten, and you'll be months or years behind where you would have been had you not rewritten from scratch).

The system has gotten so big and bloated over that time.

That's what living systems do. If it were useless, it could remain tiny and pretty. But real world systems have ugly problems and sometimes uglier solutions.

Your job is to use it as a learning experience. Every time something new is added, require yourself to make adding that new feature easier, and then add it. e.g., if you need to add support for a new data format, take it as an opportunity to abstract out the import/export routines so adding additional data formats is trivial. Once you've abstracted it out, and you have the original data format code working in the new model, then you can add the new data format.

If Lispers have one really undeniably great notion, it is that the tools should match the job, so they build the tools for doing the job, and then they do the job with the custom tools. Most languages aren't quite as malleable as Lisp (though Perl, Ruby, and to a somewhat lesser degree, Python, come pretty close on many fronts), but you can still think of your task as being, "What would a library or subroutine look like that would make adding this new data format really easy?" And then you build that library or subroutine rather than jumping right into writing the data format handling code.
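
As a rough sketch of what such a library could look like, here is a tiny format registry. It is only an illustration under stated assumptions: Python is a stand-in language, and `FORMATS`, `register_format`, `import_records`, and `export_records` are hypothetical names invented for this example. The existing format (CSV here) is moved into the registry first; only then is the new format (JSON here) added.

```python
import csv
import io
import json

# Hypothetical registry: format name -> (importer, exporter).
FORMATS = {}


def register_format(name, importer, exporter):
    """Make a data format available to the rest of the program."""
    FORMATS[name] = (importer, exporter)


def import_records(name, text):
    """Parse text in the named format into a list of dicts."""
    importer, _ = FORMATS[name]
    return importer(text)


def export_records(name, records):
    """Serialize a list of dicts into text in the named format."""
    _, exporter = FORMATS[name]
    return exporter(records)


def _csv_import(text):
    return list(csv.DictReader(io.StringIO(text)))


def _csv_export(records):
    if not records:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


# Step 1: the original format, now working in the new model.
register_format("csv", _csv_import, _csv_export)

# Step 2: the new format becomes a one-line addition.
register_format("json", json.loads, lambda records: json.dumps(records, indent=2))
```

The rest of the program only ever calls `import_records` and `export_records`, so the next format the client asks for touches one place instead of many.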

And, of course, every time you have to touch a file, leave it prettier than you found it. Even if it's just mild improvements. Slow and steady wins the race, and such.


This is what I needed to hear. Thanks for the great advice.


I guess a line needs to be drawn.

Have you been paid for a general set of deliverables and only now are the specific requirements coming out?

If you haven't been paid yet, will they not pay until they're "satisfied"?

Is the site even online yet?

That line being drawn is between the original contract and the hourly rate which you'll charge for further enhancements.

(I don't know if you're doing this on contract, as a favor for a small set amount, or for free.)

If maintaining the codebase is becoming unbearable, you probably can't throw away the whole thing and start over.

Maybe you could start the reorganizing effort with the enhancements and work in changes to your yucky, "legacy" code as time permits.

Warning: The customer may not see changes made purely to ease future maintenance as requirements, or as time they should pay for.


The site is online and live. They use it daily. The changes have been made over a couple of years, so I have been paid for the work that has been completed, which is everything except this last round of changes.


So you just never expected it to get this complicated and you didn't plan for it? Are you looking for help on how to fix it, or how to approach your client and let them know that it's become unworkable?


More about the philosophy of quality code. I am going to do the work, that's not the issue, but I was asking for advice on how to proceed with the coding.

I'm sure this has happened to people before and will happen again. I was also thinking that a thread on something like this would be very useful to many people besides myself.


There are all kinds of things we could discuss about this. It might be fun if you mentioned some specific examples and people could offer suggestions. In general, though, I'd suggest you begin paying a lot more attention to what, and when, you're duplicating in your code. There are lots of ways for duplicate code or data to creep into a system. Developing awareness of what they are, and discipline to do something about it, is probably your best bet to start with. You'd be surprised at how far that alone will get you.
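
One small, hedged illustration of catching duplication: suppose the same time-slot formatting had been pasted into both a customer view and a job view. The function names below are hypothetical and Python is only a stand-in language; the point is that the display rule now lives in exactly one function.

```python
from datetime import datetime


def format_slot(start, end):
    """The single place that decides how a time slot is displayed."""
    return f"{start:%Y-%m-%d %H:%M} to {end:%H:%M}"


def customer_summary(name, start, end):
    # Previously this might have carried its own copy of the formatting logic.
    return f"{name}: {format_slot(start, end)}"


def job_summary(job_title, start, end):
    # Same here: both callers now share one definition.
    return f"{job_title} ({format_slot(start, end)})"
```

If the client later wants a 12-hour clock, that is one edit in `format_slot` rather than a hunt through every screen that prints a time.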

Edit: I'll add something about my own practice. When making changes to a system, I nearly always bundle the visible changes together with invisible changes to clean up problems with that portion of the code (if I see any). I don't ask for permission to do this and I don't do big cleanup projects as standalone endeavors. Rather, I amortize the cost of design improvement as evenly as possible over the lifetime of the project. Think of it as paying off your credit card bill each month, rather than waiting for the collection agency to come after you.

In other words, there's basically never a day when I'm not correcting or improving code in addition to adding it. If I were you, I'd start doing this now, and I'd start out with low-hanging fruit (obvious, easy improvements) rather than attempting anything too grand.

You may experience a slowdown in your progress if you switch to working this way - but your throughput will be greater in the long run. Besides, it sounds like you've hit the point, with this codebase, where your changes are taking longer no matter what you do.


If you haven't yet, then please have someone else take a look at your code. I'm thinking they'll be able to give you some pointers.

I can only guess what a mess your codebase must be; spaghetti code with sparse, little "helper" functions, perhaps?

Is there duplicate code?

1) Opening and querying databases. 2) Presentation/UI. 3) Dynamic menuing systems; it's all static behind the scenes.

Can you give us more of an idea of how it's so difficult to work within your project?


More about the philosophy of quality code.

I can offer the following vague advice:

- When writing new code, do the best you can with the information you have. Resist taking shortcuts (copy & paste instead of a minor refactoring, etc.)

- Refactor old code as you go along, but do it cautiously, and only when it's worth it (very clear performance or maintainability advantages, for example).


Don't rewrite from scratch: http://www.joelonsoftware.com/articles/fog0000000069.html

Can't you refactor the bits you are working on? For example, when they ask you to add feature X, you refactor parts of the code to make it easier to add feature X. I don't know whether reading the Refactoring book would help; I think its main point was to cover your refactorings with unit tests.


For every absolute rule there must be an exception somewhere... ;-)

Several years ago I was the tech & operations dude for a little organic grocery store. They were good people, with not a lot of money to throw around, so the cheap & quick fix-it-now solution was always pretty attractive.

Well, one of the things they needed was a program that would take two input files -- one a current list of products' wholesale costs, and the other a current list of what was in their point-of-sale database and their retail prices -- and produce a report containing all the price changes they needed to make that week.

VBScript will make most programmers' blood run cold, but it was a good way to solve that problem at the time. Zero initial cost, simple to hack out, ran on their aging Windows point-of-sale server, etc. So that's what I did.

Then the boss came back and said, "This is really cool. Can you make it do X?" And I said, "Sure", and that process repeated for almost two years.

The VBScript program has long since been pushed beyond all of the boundaries that any programmer with any good sense would ever push it, and they're still using it today. I have gently refused to update the program any further, or to fix any of the odd bugs it exhibits occasionally, mostly because updating the program is now too expensive.

It really does need to be rewritten from scratch in something more useful, like Python, with a clearer idea of all of the features that they'll want in the end.

I think my point is, sometimes you start with something simple, and use the methods that make sense at the time, and then it grows beyond the limits of its actual architecture. It's often impossible in a situation like that to just refactor portions of the code in a useful way. So, then it might be time to (carefully) consider rewriting the whole mess. You have the benefit of having a clearer idea of what you need to create, right from the beginning, and you can avoid the architectural mistakes that were made with the first system.


Great article. I had been thinking about many of the things he mentions, and I was considering the whole rewrite thing, but this really convinced me not to. Thanks.


Maybe this article can help: http://nedbatchelder.com/text/deleting-code.html

Also, do you have a clear data model of your application? If you have a clear data model of your application, with the right foreign keys among tables, then almost everything the application does is INSERT, UPDATE or DELETE. When you can have the data model in your mind, then you can understand what the application should do and start deleting code.
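
A minimal sketch of that idea, using SQLite through Python's standard `sqlite3` module. The `customers` and `jobs` tables and their columns are assumptions made up for a scheduling application; the real schema is not described anywhere in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked to, per connection.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical data model: every job must belong to an existing customer.
conn.executescript("""
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE jobs (
    id            INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL REFERENCES customers(id),
    description   TEXT NOT NULL,
    scheduled_for TEXT NOT NULL
);
""")

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ann')")
conn.execute(
    "INSERT INTO jobs (customer_id, description, scheduled_for) VALUES (?, ?, ?)",
    (1, "paint fence", "2008-08-18 09:00"),
)

# The database, not application code, rejects a job for a missing customer.
try:
    conn.execute(
        "INSERT INTO jobs (customer_id, description, scheduled_for) VALUES (?, ?, ?)",
        (99, "orphan job", "2008-08-19 09:00"),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Once the constraints carry rules like this, any hand-written checks in the application that duplicate them become candidates for deletion.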

Edit: Is the problem that the client keeps extending the requirements? Is it clear in the contract what was going to be in the application and what was not?


Are they expanding the requirements and still expecting to pay only for the original? Or were they happy to pay, but you didn't expect the growth, the code is now a mess, and what they expect to pay for as a little change is actually a huge change for you to make?


I agree with SwellJoe, gruseom and gasull: this is a code maintenance issue, and the way to sell it to the customer (and to yourself, so you can get started) is bundling some cleanup with each feature.

I ran into some ancient and crufty code at my last job, and learned some things in the process of dealing with it.

1. Make sure you're using sane version control -- SVN or a DVCS. If you're on a Unix or have access to cygwin, getting started looks like:

  cd ~/projects/bigco/thisone
  git init
  git add .
  git commit

This will give you the courage and safety net to make big changes to your code.

2. Look at your build file (Makefile?) and see what it says about the project's organization. You probably have some idea of what's wrong already, but this might point out things like files that are never compiled, the same few files used to build everything else -- even though one or more files really should only do one specific thing.

3. Start cutting dependencies. Starting with the file where you're adding the newest feature, look into each of the files/modules included/imported and see if the dependency is really necessary. Maybe you only depend on one function or data definition in that other file, and it would make sense to merge that code with another existing dependency. Also take a hard look at your "utils" file/folder (every project has one) and see if some parts can be merged elsewhere, or even eliminated by using existing features of your platform.

4. Clean up the configuration file. Are some options irrelevant now? Are the defaults sane? Look at the code that touches it. Are there more defaults hidden in the code, when it should all be in the config file? Are there settings at the top of some files that could be moved to the config file? Don't go overboard, though: you usually don't need a configuration option to tell one piece of code about a choice you made in another piece of code in the same project. Clean that up, delete some if/else branches, and maybe find and clear some obsolete code in the process.
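
As a hedged sketch of point 4, here is one way to keep every default in a single visible place and let the config file override it, using Python's standard `configparser`. The section and option names (`schedule`, `slot_minutes`, and so on) and the `load_config` function are invented for illustration; whether defaults belong in a table like this or in a shipped default config file is a judgment call.

```python
import configparser

# Hypothetical defaults, gathered in one table instead of scattered
# through the code as magic numbers.
DEFAULTS = {
    "schedule": {
        "slot_minutes": "30",
        "first_hour": "8",
        "last_hour": "18",
    }
}


def load_config(text=""):
    """Return a parser holding the defaults, overridden by the given config text."""
    parser = configparser.ConfigParser()
    parser.read_dict(DEFAULTS)
    parser.read_string(text)
    return parser


# Usage: the config file only needs to mention what differs from the defaults.
config = load_config("[schedule]\nslot_minutes = 15\n")
slot_minutes = config.getint("schedule", "slot_minutes")
first_hour = config.getint("schedule", "first_hour")
```

Code elsewhere then asks the parser for a value and never carries its own fallback, which makes obsolete options easier to spot and delete.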

Also, we might be able to give more specific tips if you tell us more about the platform you're using. If the project is compiled, static analysis tools are worth a shot. If you're on Unix/Linux, it can be useful to break off pieces of code into standalone scripts, or better, use an existing (but perhaps obscure) command-line utility to replace old code or base a new feature on. Maybe use cron instead of your homespun scheduling system to launch these separate processes. But at least the customer is supporting this effort, right?


Offtopic/

Gotta love the community for this: thoughtfulness, mutual understanding and respect.


chez17, do you do contract programming?


Yes. I am a freelancer and this project has come from one of my best sources for work. These changes have come over the course of a year or two so the issue is more about maintainability than initial planning. There was no way to know what the system would become when it was originally created.



