Code Janitor: nobody's dream, everyone's job

diiq · on Sept 30, 2012

Am I alone in enjoying "adaptive" and "perfective" code maintenance? Dashing off new code is fun, but making bad code into good code is immensely challenging and satisfying to me.

I think it's hard to write beautiful code straight off -- it very rarely happens to me, anyway. So for me, writing beautiful code is mostly perfective maintenance anyway -- or do I misunderstand?

pilif · on Sept 30, 2012

Agreed. I love cleaning out warts, fixing broken APIs and removing unneeded code. I love looking at the output of "git diff --staged" and seeing lots and lots of red and very little green.

But that's just me - and maybe other developers on my team. For everybody else - marketing, sales, customers, end users - this is completely irrelevant. Nobody really understands this kind of work, much less actually sees any value in it.

Add a blinking logo somewhere and you're the hero of the day. Replace 500 lines of bad spaghetti with 20 lines of beautiful art and you're wasting everybody's time.

But because I love this kind of work so much, I don't care. I'm in a position where I can do such things for fun and pleasure, so I'm doing them.

coderintherye · on Sept 30, 2012

I like it, because I seem to have a knack for it and mostly because I enjoy working on things that no one else wants to touch. It's nice to have a degree of autonomy and not have to deal with working on a big team, even if it means not getting to work on new things.

JamesLeonis · on Sept 30, 2012

Fixing old code can be a very satisfying experience. However this is usually tempered by management apathy for code maintenance. It's an invisible problem; nobody can see when you fix a piece of bad code unless it's causing a bug. This becomes aggravated when "25% of the people doing maintenance are students and up to 61% are new hires".

To management it's, at best, an "if it ain't broke don't fix it" attitude. At worst they consider programmers prima donnas. It's immensely frustrating trying to convince them of the utility of code maintenance, even when doing so pays dividends in future feature development flexibility and bug fixing.

I would love to see some strategies for convincing management of the virtues of maintenance.

avar · on Sept 30, 2012

You do that by understanding some of the tradeoffs involved and showing that you understand that maintenance for the sake of maintenance is usually a waste of time.

* If it was causing some immediate issue or had a bug that needed to be fixed you could justify it that way, but all too often someone wants to refactor something to make it "nicer"

* Code that's rewritten is going to have all new bugs of its own, and those are going to end up wasting more time than the time spent on the initial rewrite.

* A lot of programmers will get the urge to rewrite something before they fully understand the system they're rewriting (and thus aren't qualified to do so). You might be 90% through rewriting it to be nicer only to discover that the remaining 10% doesn't fit with your design. There was some design requirement that you missed which caused the initial code to be so convoluted and "nasty" in the first place, because it was solving a tricky problem in some non-obvious way.

* In a growing business, requirements change all the time. Maybe you'll spend 6 months now rewriting some system to be nicer, but 2 years down the line it turns out that the system was inherently broken and has to be thrown away and replaced. So it would have been better to commit 1 month of time over those 2 years to keep it on life-support.

* Speaking of resource allocation: is now the right time to do this? You probably have 100 tasks with the manpower to accomplish 10 of them. Maybe this is better done in half a year when you have more hires, or not at all if some of those other 90 things are more important.

But finally, if it's really the right thing to do, just do it incrementally as part of other tasks. If it's really code you're working on all the time, and fixing certain issues saves you time down the line, just make those fixes along with other tasks.

ollysb · on Sept 30, 2012

The only time I perform maintenance is when I'm adding a new feature. Before I add any new code I'll inspect the code that will be touched by the new code. If it's hard to understand or the abstractions are a bit wonky for what I want to add then I'll spend some time refactoring it. In this way your maintenance work is always done while delivering observable value to the business. You don't need to provide any justification for it. If it means you're going to take a bit longer to deliver a feature than expected you can just tell your boss that the code wasn't in the state that you expected when you gave an estimate. If you wrote the code in the first place it's fine as well. You can just say that the priority at the time was to get the feature out fast so there wasn't time to shine it up.

qznc · on Sept 30, 2012

Refactoring is important and management should be convinced of this. Most managers also insist that the (physical!) desktops are kept clean. For the same reason code bases should be kept clean. The reason in both cases: Dirt slows you down in the long run.

lifeisstillgood · on Sept 30, 2012

Thank you. I can never write the code I want first time - I always end up creeping to a solution, being unhappy with the way it works and so trying something else till it all fits right.

Plays hell with source comments

ZoFreX · on Sept 30, 2012

Give me a giant codebase I've never seen before that started bad and turned into a spaghettified mess, and you'll have a very happy developer on your hands :)

mononcqc · on Sept 30, 2012

I now enjoy it, but from my understanding, it depends on the attitude the team and management have towards it. Maintenance during the earlier phases of a product is also very different from doing it in the very late phases, when you know all your efforts are going toward just delaying its inevitable death.

JoeAltmaier · on Sept 30, 2012

All coding efforts are just delaying the product's inevitable death. Software lifecycles are shorter and shorter. A style of shoes has more staying power than most software.

br0ke · on Sept 30, 2012

I'm a big fan of it, myself. We call it 'code gardening' on my team, and I find it has a similar zen-like aspect.

clueless123 · on Sept 30, 2012

Once at a wedding dinner in Asia I got in a lot of trouble for describing my job as "computer janitor/computer fireman". The table was filled with people in my same line of work who were very proud of their jobs. My comment/silly joke was not taken nicely and got me into the dog house with the wife (Asian) for a good while!

Since then I've been proudly describing my job as software gardener.

mononcqc · on Sept 30, 2012

What an insulting attitude towards janitors they have!

reinhardt · on Sept 30, 2012

Absolutely. One of the most satisfying tasks for me is transforming a big steaming pile of spaghetti code (be it procedural or, more often these days, design pattern infested OO code that involves a dozen classes just to print "hello world") to a simpler, elegant codebase with half the LOC.

zem · on Oct 1, 2012

same here. it's not only immensely satisfying, but you are guaranteed to have something beautiful to show for it without having to worry if the product itself is going to be worthwhile.

cellularmitosis · on Sept 30, 2012

For those of you (like me) whose only exposure to the acronym "OTP" is in reference to "one time passwords", I'm guessing he's referring to this: http://learnyousomeerlang.com/what-is-otp

mrj · on Sept 30, 2012

   It's obvious that making changes to the set-up in the second pictures will take less time than modifying the first picture's set-up.

I'm not so sure. I had to snake a single bad cable out once and redoing all the pretty bundles took half the afternoon. :-)

thwarted · on Sept 30, 2012

If only those pictures were of the same thing. The first one is obviously the front of a rack, where all the cables are meant to be non-permanent to afford you flexibility.

The second picture is the back of a rack of punch-down blocks, which are meant to be permanent. The only time these change is when there is a bad wire; vs the front of the rack, where you change that wiring all the time. That being said, I've seen bad wiring, rather than be pulled out of the bundles, just cut and a new, supplementary cable put in its place, and not strictly part of the established bundles (but still neatly run).

The back of the racks are neat purely so you can be flexible on the front. Now, obviously, one can create neat front-rack cable layouts too, and potentially run into the problem you had redoing all the bundles, but the neatness is directly related to the intended permanence of the wiring.

AngryParsley · on Sept 30, 2012

The more OTP you know, the less time you expect it takes to adapt to a code base.

I don't think your conclusion is as concrete as you make it out to be. One's knowledge of OTP likely correlates with general programming ability and experience. Better, more-experienced programmers probably find it easier to learn new code bases, so they don't need as much ramp-up time from management.

Overall though, good post. Kudos for gathering data!

cheez · on Sept 30, 2012

This is an excellent post, thank you for sharing.

I find it mind-blowing that multi-million dollar companies who are struggling with growth cannot see that the quality of their software engineering actually prevents growth. They try rewrites, but with the same approach to software engineering as always, so the rewrites are doomed to fail.

clueless123 · on Sept 30, 2012

I think Code Gardener is a better name.

Software is like a garden. It can be master planned, well manicured, organized or it can be the result of uncontrolled organic growth over years and years.

Either way, my job is to keep it healthy and serving its purpose. It is my job to water, fertilize, pull out weeds as they show up and sometimes to uproot whole trees and plant them somewhere else as they are necessary for the system.

Continuing with the analogy, it is key to any good gardener to have good tools to do the job, because they make all the difference on how hard you have to work and how much you can get done in a season.

CGamesPlay · on Sept 30, 2012

One of my favorite things about the code base I work in (with over 1,000 engineers) is the codemod culture. Whenever we create a new abstraction, we mass-convert the code over to it. This means that for many simple things like renaming methods and classes, we can deprecate methods (by renaming them to oldSlowWay_DEPRECATED), and this is an automatic flag to any engineer that they need to dig into the code and find the newFastWay.

If you believe in self-documenting code, I feel like you need to do things like this because it's part of keeping your documentation up to date. (In our code base, we maintain separate documentation, but reading the code is essential to getting the total picture.)
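
To make the renaming idea above concrete, here is a minimal sketch of what such a codemod might look like. It is an assumption for illustration only, not the tooling the commenter's team uses: the function name `deprecate_method` is hypothetical, and it does a plain regex rename on text, whereas a production codemod would more likely work on a syntax tree.

```python
import re


def deprecate_method(source: str, old_name: str) -> str:
    """Rename whole-word uses of old_name to old_name_DEPRECATED.

    Hypothetical helper: a text-level rename, not a syntax-aware one.
    """
    new_name = f"{old_name}_DEPRECATED"
    # \b keeps us from touching longer identifiers that merely contain
    # old_name, and from re-renaming names that already carry the suffix
    # (there is no word boundary between "y" and "_").
    pattern = re.compile(rf"\b{re.escape(old_name)}\b")
    return pattern.sub(lambda _match: new_name, source)


if __name__ == "__main__":
    before = "x = oldSlowWay(1)\ny = myoldSlowWay(2)\n"
    print(deprecate_method(before, "oldSlowWay"))
```

Running the rename twice leaves the code unchanged the second time, which matters if a mass conversion gets re-run over a tree that is already partly converted.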

michaelochurch · on Sept 30, 2012

This is an excellent post. I'm now excited about learning OTP. I've seen so many code-quality problems, some of which have destroyed companies, that I'd really love to see if there's a way to solve this problem. "No silver bullet" seems to be the going attitude. Code sucks, it always turns to garbage, and no one reads the shit anymore (that's why we have IDEs! Wheee!) I really want to disagree with this. Readable code shouldn't be a rarity.

Apologies in advance for what's going to seem like self-promotion. Many of my opinions I've already plopped onto the internet in my blog so I'm going to be doing a lot of blog linking.

Maintenance has a nasty paradox. To do it right, you need political power. You need to be able to say, "I am fixing this and that's that and any problems it causes for you are your problem." The level of political clout you need to do maintenance decently is manager-level. You need help from a lot of people and if you don't have the ability to force them to prioritize helping you in this gnarly task, they probably won't. (Feature freeze? I need a Bob-ruling before I do that.) In most companies, maintenance is not rewarded, and it's not fun, so people with manager-level clout generally avoid such work. (http://michaelochurch.wordpress.com/2011/09/23/taxation-with...) The politically enabled people tend to take the "shiny new" work (and since no one has the clout to oppose them on anything material, they often make bad decisions and produce shoddy results) and the underclasses end up maintaining the garbage after they launch, get promoted, and flee the system. So maintenance ends up being a workball that gets dropped on people who don't have the power to do it right, don't have the expertise to know how to do it, and don't have any good reason to care.

People underestimate complexity density and how much it can vary, and also how high it can be in source code. How long does it take to read "a book"? Well, it's between 2 and 400 hours for me, depending on the material. 10,000 lines of code is a book. Sometimes it's boilerplate that you can scan, but more often, it's closer to the 400-hour end of that extreme.

IDEs don't solve this problem very well, because they tend toward "four-wheel drive" syndrome: they get you stuck in more inaccessible places. I think read-only IDEs are indispensable, but if I were running a company, I'd be tempted to disallow IDEs for new development. One of the perks of maintenance is that you get to use an IDE. If you're on new development, then as long as the code is good I don't care and you can use whatever tools you want, but if I see you writing obvious "IDE code" I am taking that shit away.

One thing that I think is at the root of the problem is a denial about what programming really should be. People say "math is hard", but math is actually easy. At least, it's an only-somewhat-hard way to do things that would otherwise be impossible. Imagine what it would be like to build a physics engine if calculus and linear algebra didn't exist. How would you generalize linear regression to N dimensions without matrix calculus? Quite often, the mathematical abstraction is the simplest (and most terse) way of representing and solving a class of problems. It's hard, because of the intrinsic complexity. Mathematicians are also familiar with the experience of taking a whole day to read and understand a 4-page paper. They know, from painful experience, how much complexity density can exist in 100 innocent-looking lines of code.

The FactoryFactory crowd exists because there's a class of programmers who really don't belong in the industry because they despise or can't hack math (and another class of programmers who have the ability but have been misled into thinking that FactoryFactory bullshit is actually the right way to do things). So instead of mathematical abstractions, they use pointless business-y shit made up to turn the tables against smart people with taste. ("What? You've never used a MetaSelectorVisitorSingletonFactoryFactory? And you call yourself a programmer?") Of course, even though the individual pieces that commodity programmers create are less difficult to understand than appropriate mathematical abstractions, the overall tangle that must be generated in order to have a chance at solving the problem becomes incomprehensible quickly. But (channeling a commodity programmer) "no one reads code" anyway. What, you expected insight into the problem in that shit? We just wrote that because our bosses told us to do it.

At any rate, your insight into the sociological problems associated with code maintenance is right-on. The work is given to the people who are least equipped to do it. So can't companies just allocate maintenance work to the best programmers? The answer is no. First, if the best programmers do maintenance work in a company that doesn't value that stuff, they're seen as just screwing around doing junior-level work. Second, no one good will tolerate being "allocated" to maintenance. Good people will happily maintain code they care about, but if you tell a good programmer that she has to maintain someone else's legacy or find a new job, she'll take the latter. So you need a culture in which good engineers choose maintenance and have the political power to do it right. It's really hard to make that happen.

Google is better than the vast majority of companies in this regard, but they still have a culture where it's well understood that promotions usually come from launches. Maintenance is relatively well-respected (by industry standards) at Google, but new invention is still better for your career. You can get promoted on maintenance (which makes Google better than most places) but your odds still aren't as good, and there's more downside to maintenance, where the worst-case outcome is macroscopic non-accomplishment (i.e. you beat your head against a wall and have nothing to show for it). There are large bonuses (sometimes 6- or 7-figures) for people who take on important legacy rescues, which is a step in the right direction-- an acknowledgment that it costs major dough to get anyone good to maintain someone else's mess-- but those usually go to managers and tech leads who oversee the rescue efforts; the engineers are "just doing their job" and usually try to transfer to something with more upside as soon as they can.

So the second point about developing a decent maintenance culture is that you need open allocation (http://michaelochurch.wordpress.com/2012/09/03/tech-companie...). You can't "allocate" good engineers to maintenance. They have to choose it, which means you need an environment where people are encouraged to work on what's important to them and to the company. I've taken to using a 2-by-2 matrix to explain this. First category work is interesting, important stuff: core machine learning algorithms, search at Google. Allocating this work is no problem. Second category is the important but undesired work. Third category is interesting stuff that hasn't become important yet: R&D-type projects best suited for "20% time". Fourth category is unimportant and undesirable work. Important legacy rescues are 2nd-category work. Under open allocation, companies have to create genuine incentives (not just "do it or I'll fire you") for people to do 2nd-category work, and 4th-category work doesn't get done (management isn't willing to pay or promote for it, because it's not actually important). But 4th category work shouldn't be getting done in the first place. It pisses people off and adds minimal business value.

By the way, closed-allocation shops generate huge amounts of 4th-category work (often the majority of the workload of a closed-allocation shop is 4th-category). If you have a closed-allocation shop, the 2nd and 4th category work gets glommed together in "that ball of work we (management) allocate to people we dislike" and not done well. This is a mistake; 2nd-category work actually is important (by definition) and shouldn't get the 4th-category treatment. It needs to be done by people who have the ability and actually care.

Third subrant, and then I'm done (for now). I don't know what OTP is, but these insights about modularity and small-program methodology are right-on. The Unix philosophy works. The only case I've seen of a program departing from the small-program philosophy and adding value is in the database world. Databases have very stringent requirements related to performance, transactional integrity, durability, and concurrency, and all the pieces have to work together. So, they become large systems that macroscopically function as huge, featureful programs. It has also taken some of the best minds in computer science decades to get this stuff right. Databases are hard, yo. Do you have the best minds in CS on your business-logic codeball? Do you have decades? If no, then avoid big-program methodologies outright.

If you use the big-program methodology, your programs end up becoming parochial because their surface area is huge and requirements get barfed on it and the code becomes a pile of political injections, not real software. You end up with Java Shop politics (http://michaelochurch.wordpress.com/2012/04/13/java-shop-pol...). However, to be fair, this problem has little to do with Java itself and is more accurately described as a problem afflicting large-program philosophies in general.

nahname · on Sept 30, 2012

What is "IDE code"?

politician · on Sept 30, 2012

Visual Studio 2012 tries to provide IntelliSense for JavaScript, but it'll only show symbols it can find at design-time (statically-reachable) instead of all of the possible symbols available at run-time. The IDE has the same problem with C# dynamics. Anyway, C# developers raised on perfect IntelliSense from versions past can stumble when relying on IntelliSense for JavaScript. This can lead to code that's written to appease the IDE instead of the idioms of the language.

Personally, I prefer the way Sublime Text 2 implements auto-complete (for dynamic languages anyway) -- it simply indexes the words previously typed in the file.
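
For illustration, the word-indexing approach described above can be sketched in a few lines. This is only a guess at the general idea, assuming completion candidates are simply the identifiers already present in the buffer; it is not Sublime Text's actual implementation, and the function name `complete` is hypothetical.

```python
import re


def complete(buffer_text: str, prefix: str) -> list[str]:
    """Suggest words already in the buffer that start with prefix.

    Hypothetical sketch: no language analysis, just the words on the page.
    """
    # Treat anything shaped like an identifier as a candidate word.
    words = re.findall(r"\b[A-Za-z_]\w*", buffer_text)
    seen = set()
    matches = []
    for word in words:
        # Skip the prefix itself and keep first-seen order without repeats.
        if word.startswith(prefix) and word != prefix and word not in seen:
            seen.add(word)
            matches.append(word)
    return matches


if __name__ == "__main__":
    text = "var userName = getUser(); userName.trim(); var userId = 1;"
    print(complete(text, "user"))
```

Because it never asks what is statically reachable, this kind of completion cannot claim a symbol does not exist; it only offers what has already been typed.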

nahname · on Sept 30, 2012

ReSharper/WebStorm do a much better job wrt JavaScript. I still don't feel like I understand what you mean by "IDE code". Is all code written in an IDE "IDE code"? I was left with the impression that it is some inferior subset.

Would you write C#/Java outside of an IDE to prevent "IDE code"? What would the difference be?