Now that I’ve successfully completed a 6+ month project, I thought I’d write down some of the things I learned from it (or some of the things that happened anyway), as much for future reference as anything else.
In part because of NDA blah blah, and in part because it isn’t relevant, I’ll leave out some of the specifics.
Having said that, here’s the skinny. The ultimate end client is a trading team, buying and selling blah blah for the people who have hired them to manage their portfolios. In the grand scheme of things, the volume of the trading is not insignificant. It isn’t real-time hedge fund type volume, but on a daily basis, it’s the sort of money that matters to people. The reason why this is important to bring up is that it sets a level of risk.
Not that your standard CRM tool isn’t important, but if you, for example, have a production outage that brings down your system for a day, not having a CRM tool for a day matters less than not being able to trade on the market for a day. There’s a real risk there, not only in terms of market exposure, but also in the more general sense that people aren’t going to want you to manage their money if you can’t. So, there is a built-in nervousness about changing anything, even if the current system sucks, because a sucky system that allows people to trade (however inefficient and painful it might be) is better than a ‘better’ system that fails.
Briefly stated, the overall goal of the project was a backend change, one of those things that is totally unglamorous, but important. In this case, it was a change from using SQL Server as a backend store to using Oracle. The overall system that this project was a part of actively managed trading, but the particular piece this project was trying to change involved taking data from multiple sources, manipulating that data in certain ways, and then, ultimately, passing that data along to an external vendor. Because it was part of a trading system, it had two main components: one that sent the important data to the external vendor overnight before the market opened, and one that sent data to the external vendor during the day while the market was open.
Besides the backend change, the project also involved changing the existing legacy system to one using more recent technology. Now, ‘legacy’ is a term that often simply refers to any existing system that you’ve never worked on before, but in this case, it was a true legacy system in the sense that it was cobbled together over a long period of time (not sure how many years, but at least two or three) with no central architecture or guiding principles. It was put together by a wide range of people, including subject matter experts who were very good business analysts, but not quite so good software developers.
In short, we needed to change a system that included a mix of C# 1, SQL Server DTS (yes, I said DTS), and VBScript (yes, I said VBScript) to C# 3.
Don’t fix anything
Since backend change projects are of a special sort, there is a key question that needs to be addressed up front. Should the project try to fix any of the known issues? Every project has its own parameters, but, in general, the correct answer here is no. In our case, the ultimate output of the project was a set of data that was being sent to an external vendor, with all the constraints that entails. At the end of the day, you know that your backend changes are successful if you can validate the data the new system is sending against the data you used to send. If you begin to introduce ‘fixes’ as part of the project, then you have the problem of determining whether the data is different because you successfully introduced a fix or because you incorrectly changed the backend.
It is important to note that it is generally impossible to make the data you send 100% identical to what you sent before. There will almost always be differences (in this case, the fact that SQL Server and Oracle act differently in certain cases introduces unavoidable differences). But it is important to narrow down the range of differences so that they are explainable and acceptable, and so the best path is almost always to avoid changing the behavior of the system in question.
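To make the idea of ‘explainable and acceptable’ differences a bit more concrete, here is a minimal sketch in Python of the kind of comparison I have in mind. It is not the code from the project: the record layout, the column names and the `compare_extracts` and `normalize` helpers are all made up for illustration. The one engine difference it accounts for is a well-known one, namely that Oracle stores an empty string in a VARCHAR2 column as NULL while SQL Server keeps it as an empty string. Whether that particular difference is acceptable to your vendor is an assumption you would have to check.

```python
def normalize(value):
    """Collapse a known, accepted engine difference before comparing.

    Assumption for this sketch: '' (SQL Server) and NULL (Oracle) are
    treated as the same value.
    """
    return None if value == "" else value


def compare_extracts(old_rows, new_rows, key):
    """Compare legacy output with new output, record by record.

    old_rows / new_rows: lists of dicts, one dict per record sent out.
    key: hypothetical column that uniquely identifies a record.
    Returns (missing, extra, changed): keys only in the legacy output,
    keys only in the new output, and per-column (old, new) pairs for
    records present in both whose values differ after normalisation.
    """
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}

    missing = sorted(old.keys() - new.keys())
    extra = sorted(new.keys() - old.keys())

    changed = {}
    for k in sorted(old.keys() & new.keys()):
        columns = set(old[k]) | set(new[k])
        diffs = {
            col: (old[k].get(col), new[k].get(col))
            for col in columns
            if normalize(old[k].get(col)) != normalize(new[k].get(col))
        }
        if diffs:
            changed[k] = diffs
    return missing, extra, changed


# Made-up sample data standing in for one day's extract from each system.
legacy = [
    {"id": 1, "symbol": "ABC", "qty": 100, "note": ""},
    {"id": 2, "symbol": "XYZ", "qty": 250, "note": "hold"},
    {"id": 3, "symbol": "DEF", "qty": 75, "note": ""},
]
migrated = [
    {"id": 1, "symbol": "ABC", "qty": 100, "note": None},
    {"id": 2, "symbol": "XYZ", "qty": 260, "note": "hold"},
    {"id": 4, "symbol": "GHI", "qty": 10, "note": None},
]

missing, extra, changed = compare_extracts(legacy, migrated, key="id")
print("missing from new output:", missing)
print("unexpected in new output:", extra)
print("changed values:", changed)
```

With this sample data, record 1 is not reported even though its note went from an empty string to NULL, because that difference has been declared acceptable. Everything that is reported (record 3 missing, record 4 unexpected, the quantity on record 2) needs an explanation. If the project had also been ‘fixing’ things along the way, there would be no way to tell from this report which of those were fixes and which were bugs.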
Accept that which you cannot change
I am a very firm believer in a small number of things when it comes to software development. Some of them involve how code is written, some involve how projects should be managed, and some involve any number of other things. Other than this small number of things, I am also a very firm believer that when you are involved in a project like this, you need to determine the things you cannot change, even if you don’t like them, and learn to live with them and work with them the best that you can.
Generally speaking, I believe that certain agile practices, especially project management practices like Kanban, are best. But, with this client, and in this situation, that wasn’t going to happen. Waterfall and BDUF prevailed (although I found ways around that at times).
Generally speaking, I believe that as a software developer, you write the best code when you have full and free rein to use any and all software tools, libraries and techniques that you deem proper. But, with this client, and in this situation, that wasn’t going to happen. Certain software tools, libraries and techniques were pretty much mandated. More importantly, certain things were excluded.
Generally speaking, I believe that certain deployment practices are crucial to successful software projects. But, with this client, and in this situation, that wasn’t going to happen. Certain deployment practices were already in place and not going to change (although I found ways to cheat).
For some people, these sorts of restrictions are unacceptable. I can understand that perspective, but I can also see it as unprofessional. Regardless, as the consultant with this client, in this situation, I learned to discover that which I had to accept and let it go.
Detailed Design, to a point
Every experienced software developer knows that detailed designs are only as good as the paper that they are written on, which is to say, not worth much. Unless you have an infinite amount of time, you know that no matter what you put down on paper as to the design you are going to implement, the minute you start to do actual coding, you will find out that you missed a requirement, or forgot an assumption, or what have you.
At the same time, when you are working on the sort of project that requires capital expense justification, you have to do it. There’s no way around it.
So what you learn to do, is cheat.
I had the luxury of needing to do a ‘spike’ solution to what was going to be implemented. As with all spike solutions, you cut corners to test an end result. Since I knew I had to produce a detailed design to present to the (then) development PM and the rest of the team, I produced a set of documents based entirely on the spike solution, knowing full well that it would change. I produced a nice PowerPoint presentation, I created pretty class diagrams based on the spike solution, and packaged it all up in a pretty package.
I’ve often made a joke that I have the design skills of a drunken ferret, but this worked. The (then) development PM actually called it ‘a work of art.’
It wasn’t that the spike solution was totally off-base. I fully intended to take that as my starting point, since that’s the entire point of doing a spike solution. But, I also knew that once coding started, there would be, oh, a bit of drift.
I don’t recommend this as a practice unless you know you can pull it off.
Once coding starts, all bets are off
Shortly after the detailed design was approved, the (then) development PM left the company. Regardless, once you start coding a real implementation, you have to fill in all of the corners you knowingly cut in the spike, and you also find all of the areas where you simply got it wrong.
As a consultant, you are often working with systems that you have little to no knowledge of. In a perfect world, you have some sort of training or ramp up on them, but often times, you really don’t. This is part of what being a consultant involves. What that means more often than not is that once you start coding the actual implementation, you find out all sorts of things that you never would have even thought to include in your detailed design.
You also tend to find out all of the dependencies that you didn’t anticipate. To give one example, since this project involved working with Oracle, I had a dependency on other teams that managed the various Oracle installations. Sounds fine, until you discover, for instance, that if you need a change to an Oracle installation while you are working off-hours, and the Oracle team isn’t tasked to work the same off-hours, you are dead in the water until they get back to work. Likewise, if your code relies on access to data from another company system, and that system goes offline because of other projects they happen to be working on at the same time, you are dead in the water.
So, you find yourself in a position where you can’t get the work you need to get done in the time frame you promised/estimated it would get done in, over and above what you end up having to rework because your initial design was wrong in the first place.
Related to the above, it is important to keep in mind that deployment of your code, especially across multiple teams, matters, and matters a great deal.
Almost nothing can block progress more than a failed deployment, except maybe a deployment that doesn’t get done, either because the team you need to do the deployment isn’t kept up to date on what you need to deploy and when, or because they are too busy with deployments for other projects that take priority over what you happen to be doing at the time.
You really have to keep them involved, informed and up to speed.
Burning bridges doesn’t usually help
I’ve never met a bridge I couldn’t burn, and have often done so, but it doesn’t help.
Sometimes you come across a blocking issue that is due to sheer incompetence. The people on that other team really haven’t a clue as to what they are doing. But, often times, that isn’t really the issue. They could be overworked. They could have been directed by their superiors not to pay the slightest bit of attention to your project, because that other project they are working on has been given priority above everything else. It’s really important not to take the ‘escalate to their managers’ route unless you really need to, because you will find that the next time you really need something, the escalation route won’t work as well.
Love your PM
When all else fails, hope that you are working on a project that has a good PM (yes, there is such a thing), or at least a PM that is listened to and respected by upper management. As a developer, there is nigh on nothing you can’t work around if you can explain it to the PM, because they can then explain it to the people up above that, at the end of the day, really matter. This is really, really important. The (then) development PM left at the start of this project and I was lucky enough to work with the next PM, whom I’d known previously, and so was able to use him as a filter with the higher-ups to explain and remove roadblocks (amongst other things).
I’ve talked very little about code-specific things here, and there’s a reason for that.
I couldn’t use NHibernate or even Entity Framework, so I figured out a different way to do data access.
Though I don’t use mocking much anymore, I couldn’t use Rhino Mocks, but had to use NMock instead.
I couldn’t use StructureMap, but had to use Unity instead.
The reason I haven’t talked much about those things is that they don’t really matter. The actual writing of code is about the least important part of a major software project, and as a senior developer, you should be able to write good code regardless of which tools you are handed. What matters more is determining what you can’t change, cultivating relationships with external groups that you have dependencies on, and working with the PM to resolve blocking issues.
That’s how a 6 month software development project can end up being a success.