Please don’t separate Deployment from Development

I didn’t highlight this point before, but will do so now.

A number of organizations strictly separate the deployment of software from the development of that software, assigning each to a different team.  Different rationales are offered for why this is important.  Some are ‘organizational’ in nature (‘Legal’ says we need separation of duties); others are ‘conceptual’ in nature.  Since there’s usually not much one can do when ‘Legal’ says something, I’ll focus on the latter.

One useful thing that I learned from my Philosophy background is that it is best in the long run to understand a position you disagree with as well as you can, for a couple of reasons.  One, once you try to think through how someone could come to believe the position, you might change your mind.  Two, it allows you to build the best criticism you can of the position.

It’s a useful thing, but it’s also difficult to do.  It’s a lot easier (and more fun) to just call someone an idiot.

Anyway, as best as I can tell, the ‘conceptual’ rationale for separating the individuals/teams responsible for development and deployment is something like the following.  To make something up: suppose Billy is in charge of the Account Automation process.  He wrote the code, he knows what it does and when it does it, and he knows how to deploy it.

You immediately run into the ‘hit by a bus’ problem: if Billy gets hit by a bus, you are going to have problems when it comes to changing the Account Automation process.  Additionally, Billy is the only one who is aware of any ‘manual steps’ (also often known as ‘hacks’) that are required to get it deployed.

By requiring Billy to put together a detailed description of how the software needs to be deployed so that it can be followed by someone else, he’s (conceptually) more likely to make sure it is as detailed and automated as possible.  This effort (conceptually) allows more people to be involved, thus limiting the ‘bus’ problem.

All of this is well and good.  But here are some of the reasons why the idea is conceptually flawed:

Suppose a deployment fails (and let’s face it, they will from time to time).  Who’s responsible for fixing it?  Under a strict separation of duties, it isn’t clear.  Deployments can fail for all sorts of reasons, and the response can quickly devolve into a lot of finger pointing, where “who’s responsible” turns into “who’s at fault” and then into “who’s to blame.”  Even if it doesn’t devolve that way, you no longer have a center of responsibility, a clear directive of who should take the lead in resolving the issue.  Ideally, everyone vaguely related gets involved as a team, but if you’ve ever worked in a large organization, you know this ideal situation is rarely achievable (for reasons good and bad).

Because Billy knows the system and how it should be deployed, he knows what to look for if the document he produced ends up leading to a deployment failure.  The deployment team doesn’t necessarily have the expertise to know what to look for (though one would hope their expertise would grow), which leads to a focus on documentation as opposed to problem solving.  (I once had a support team request that I ‘list every possible scenario in which the software could fail, and describe the solution.’  Really?  Every possible scenario?  Like if terrorists nuke the data center?)

As Facebook hit their 500 millionth user, a lot of posts came out, one of which I’ve already noted, and from which I’ll quote extensively:

“None of the previous principles work without operations and engineering teams that work together seamlessly, and can handle problems together as a team. This is much easier said than done, but we have one general principle that's enormously helpful:
The person responsible for something must have control over it.

This seems terribly obvious but it's often not the case. The classic example is someone deploying code that someone else wrote. The person deploying the code feels responsible for it but the person who wrote it actually has control over it. This puts the person deploying the code in a very tough spot - their only options are to refuse to release the code or risk being held responsible for a problem, so they have a strong incentive to say no. On the other side, if the person who wrote the code doesn't feel responsible for whether or not it works, it's pretty likely to not work.

At Facebook we push code to the site every day, and the person who wrote that code is there to take responsibility for it. Seeing something you created be used by half a billion people is awe inspiring and humbling. Seeing it break is even more humbling. The best way we know of to get great software to these 500 million people is to have a person who understands the importance of what they're doing make a good decision about something they understand and control.”

This, of course, isn’t a formal proof of anything, but I tend to agree with it.  Good teams will find a way to produce good results more often than not, but certain structures will help or hinder those efforts, and arbitrarily separating Deployment from Development will more likely hinder them.

posted on Monday, August 09, 2010 12:22 PM