April 2010 Blog Posts
Porcupine Tree, 04/30/2010, Riviera Theater, Chicago IL

It was a strategic risk to get tickets to PT during the playoffs when the Pens were playing, and sure enough, Game One conflicted with tonight’s concert.  I was prepared to skip it, since I’m going to the concert in Cincinnati tomorrow.

Luckily, the Pens were up by two going into the third period, and I decided that it would be better to go to the concert than to be at home watching the third if they blew it.  Even better, a good friend texted me throughout the third period, so I knew the Pens won.

Anyway, here’s the set list:

The Incident (the entire thing)
<10 minute break>
Hatesong
Lazarus
Russia on Ice
Anesthetize, pt. 2
Stars Die
Normal
Bonnie the Cat
<Encore>
Blackest Eyes
Trains

Supposedly at one time in their career, PT did a lot of improv type stuff, but since I’ve seen them the last couple of years, they tend to do every song they play just about exactly as they are on their official releases.  Since I like them, this is fine by me.

I don’t like that they now only play about half of Anesthetize (as good as the half is), but I was glad they played ‘Normal’ as it is one of my favorites. 

SW made some sort of reference about the September show at Radio City Music Hall (which I have a ticket to), and made it seem like it would be one of their last shows for a while.  I’ll pay more attention tomorrow in Cincy about that, but maybe after near constant touring the last two or three years, they are taking a break.

A good show.

posted @ Friday, April 30, 2010 11:31 PM | Feedback (0)
pmk – poor man’s kanban - an interlude – Kanbanand Guide

Okay, so apparently I got completely off-track, so this is a major interlude here.  I actually have the next post in the pipeline.

Anyway, I came across this and thought it was a really good read and explanation of something similar to what I think.

There are some differences.  For instance, since I dislike TDD so much, I’d change all references to Context/Specification testing.

But, it is a very concise and clear summary.  Take a look.

posted @ Friday, April 30, 2010 6:40 PM | Feedback (0)
Code cannot and should not replace technical documentation

I’ve written recently about technical documentation and the good and bad about it.

Over at ElegantCode, new member John Sonmez has started out with some posts about eliminating comments from code, and most recently, about how unit tests can replace the need for technical documentation for developers working on a system.

I would go ahead and read what he has to say.  It’s well-written, and he’s obviously given some thought to the topics.  Unfortunately, I think the advice he gives is wrong, and would like to explain why.

As developers, we almost always run into the problem of trying to document what a system is doing and why.  And as I’ve mentioned previously, as developers, we know that detailed technical documents that describe code become stale almost the instant they are written, since the actual code, and the documents that describe that code, are almost never kept in sync, and for obvious reasons.  As developers, we look for ways to make our code better, and in good situations, we create unit tests that prove our code is better.  But we don’t always keep our technical documents absolutely up to date to match our code, and for good reason, as the effort required to do so is often too costly.  I think this is a given and John states this quite well:

“Our simple goal is to replace the “dead” documentation (meaning that it does not update automatically with the thing it is documenting), with “living” documentation in the form of unit tests.”

The problem is that with any system of any semblance of complexity, unit tests cannot document how that system works.

Some of this is for purely technical reasons.  A common infrastructure that I’ve dealt with involves a combination of C#, Java, SQL, Oracle, and 3rd party vended applications.  Understanding the entire system involves understanding how all of the various parts interact in production.  No unit tests can provide this understanding to a developer.  To understand how the system works involves understanding the entire flow of the system, and by their very nature, unit tests cannot provide this, as they only provide an understanding (at best) of how tiny pieces of the system work.

Maybe it is just me (though anecdotal evidence suggests it isn’t), but as a developer, I actually don’t really care about understanding an API when learning how a system works.  An API is all about implementation details, and so when writing specific code to implement specific functionality, I do care a bit about that, but that’s (ironically) about the least important thing.  What I want to know is, at a high level, what does the system need to do?  What data needs to move from source A to destination B, and why?  What is the business functionality I am trying to perform?  The developer’s need to, e.g., refactor and make class names and functions more transparent is vitally important, but only after I understand what the code is for.

Unit tests can’t capture this.  They aren’t designed for it (and unit tests as understood in the TDD sense are harmful here, which is one of the reasons why I constantly point out that TDD sucks).

More importantly, unit tests cannot capture why code was written the way it was written, as opposed to the other code that could have been written, but wasn’t.  This sounds more mystical than it needs to be.  One of the best code comments that I’ve ever read (which should have been in a technical document) had an entire block of code that was commented out above the code that actually functioned, with an explanation of why that code was originally used and discarded.  That it was placed in the code itself was obviously not a great thing, except that it was in the perfect place for a developer to see it. 

Existing code can be described very well by unit tests that cover it.  I do not dispute that.  But it cannot tell you why it was chosen.  What discussions with the end users or BAs or project managers led to that code being written as opposed to alternatives?

I’m sure almost everyone has dealt with a situation like something I’ve dealt with recently.  A piece of code referenced a lookup table that defined a threshold level.  Not that it matters, but the level was “12000.”  I suppose one can imagine that someone could create a unit test with a name like “orders_will_be_placed_in_the_processing_queue_within_a_threshold_limit_of_12000_orders_because_the_underlying_MQ_implementation_will_block_otherwise” or some nonsense, but I think it is apparent that this doesn’t really work.  As a developer, I don’t actually want to look at code to know things like this.  I want a technical document that lays out details like this.
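
To make the point concrete, here is a minimal sketch in Python of what such a test might look like.  This is not the client’s code: `QUEUE_THRESHOLD` and `can_enqueue` are hypothetical names I made up for illustration, and the real value lived in a lookup table.  The tests pin down *what* the threshold is, but the *why* (the MQ blocking) has nowhere to live except a name or a comment.

```python
import unittest

# Hypothetical stand-in for the value read from the lookup table.
QUEUE_THRESHOLD = 12000


def can_enqueue(pending_orders: int, threshold: int = QUEUE_THRESHOLD) -> bool:
    """Return True while the processing queue is still under the threshold."""
    return pending_orders < threshold


class ProcessingQueueThresholdTest(unittest.TestCase):
    # These tests document the behavior, but nothing in them can say why
    # 12000 was chosen or which alternatives were considered and rejected.
    def test_orders_are_accepted_below_the_threshold(self):
        self.assertTrue(can_enqueue(11999))

    def test_orders_are_rejected_at_the_threshold(self):
        self.assertFalse(can_enqueue(12000))
```

Run it with `python -m unittest` and both tests pass, and a new developer still has to ask somebody where 12000 came from.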

When dealing with heterogeneous systems, unit tests cannot explain the boundaries of the systems, of what limitations are there to be dealt with.  Knowing, e.g. that the current Charles River implementation can only handle some level of orders of a certain format is totally outside of anything you can ‘document’ in unit tests.

Developers sometimes have a love affair with code.  My experience with senior level developers has been that, for the most part, they don’t actually care about APIs.  Given some business requirement like “starting with situation A, the end result should be situation B,” there are almost always multiple ways to get from A to B, and the details don’t matter.  Do I need, e.g., to take this code in C# to produce an output that gets processed in the Java part of the system?  Okay.  Tell me what the end goal is, and I’ll write the code that I think does it best.

digression: this is one of the reasons why I believe senior level developers hate code reviews so much in certain instances.  Unless you are dealing with high-performance situations, it really doesn’t matter all that much how you get from business process A to business process B.  It really, really doesn’t matter.

A well written technical document that explains the workflow of a system will pretty much always be more valuable than unit tests.  Unit tests have their own intrinsic value, but for other reasons.

posted @ Tuesday, April 27, 2010 11:53 PM | Feedback (0)
NoSQL links of interest

As I’ve mentioned before, one of the great things (at least in terms of how it relates to software development) about the Internet in general, and the blogosphere in particular, is that it offers a tremendous opportunity for someone to ‘fast-forward’ their skills if they know where to look (being vaguely intelligent and able to read quickly also helps).

This definitely applies when it comes to NoSQL.  Beyond the mindless advocacy of some folks that think NoSQL applies everywhere, and before I have to create a system that needs to deal with the scalability issues of Amazon, I want to know how it applies to, well, the sorts of systems that I’ve dealt with, where RDBMS has played a central and crucial role, and quite nicely. 

digression….Well, except for all of the ways in which RDBMSs suck.  I think the whole object/relational impedance mismatch thing is largely a load of crap (until, of course, I have to deal with developers who don’t understand basic concepts like indexes and can’t figure out why their query takes two minutes to return, but I digress), but there is a lot about traditional relational databases that is time-consuming and annoying.  And don’t get me started about programming in .NET with ‘raw’ ADO.NET.  But I digress (again).

How, for instance, would you handle a typical order processing system using NoSQL instead of SQL Server?  When it comes to ‘just’ storing data, I get that (I think), and I get why it scales to the heavens and what not.  But what about the day to day things that a typical DBA (or non-retarded developer) has to deal with?  How do you do those things?

And maybe it’s because I’m older or have grown even more lazy/stupid than ever, or maybe it’s because the NHL playoffs are on-going, but I also don’t want to have to learn it all ‘from scratch’ right now.  I don’t have the time.  I need my sleep.  Reading about other people’s experiences doesn’t replace actually having gone through them, but it helps (Even better, I need sample applications, but I digress.  Again.).  I’m spiking out code to make sure that I am familiar with the APIs involved with various implementations, but that only goes so far.

So one of the things I’ve been looking for are discussions of NoSQL that get to the details.  What isn’t good about it?  What are the problems you are going to face?  Since I’m going to be building some systems using it, what exactly am I getting myself into?  Because of my Philosophy background (Ph.D, University of Miami, “Hi Jeremy!”), I can do the whole theoretical debate thing till the cows come home (where were the cows that they needed to come home?), but what I really want to know is how it’s going to bite me tomorrow (and not in a good way).

Here are some of the things I’ve found:

The Dark Side of NoSQL - I really like this article, because it asks hard questions, and gives a nice description of how I sometimes feel when reading NoSQL advocacy:

“There is a dark side to most of the current NoSQL databases. People rarely talk about it. They talk about performance, about how easy schemaless databases are to use. About nice APIs. They are mostly developers and not operation and system administrators. No-one asks those. But it’s there where rubber hits the road.”

It then goes on to talk about some of the issues with NoSQL implementations: ad hoc data fixing, ad hoc reporting, and data export.  There are so many different NoSQL ‘platforms’ (for lack of a better term) out there that some of these are undoubtedly more or less problematic, but it is a theme that I’ve read in a couple of places now.  As someone who has spent a lot of time doing either ad hoc data fixing or ad hoc reporting (exporting is usually something I deal with in larger ETL type projects), especially in an operational role, the idea that you can’t easily do some equivalent of “join this table to that table to that table with this group by order by blah blah blah” is worrisome (although, I realize, part of the point).  Depending on the client and the situation, this is something you typically need to do *all the time*.  Users expect it.  I know that in the ‘typical’ (there is no such thing, of course) situation, you might have your NoSQL event store and your SQL query store (to use CQRS terminology), but what about when you need to find out why the transform from the event store to the query store didn’t do what you expected?  In production?  And you need to fix it now?
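
To show the kind of thing I mean, here is a small sketch in Python.  The tables, columns and data are all made up for illustration, SQLite stands in for whatever RDBMS you use, and the list of plain dicts is only a stand-in for a schemaless store, not any particular NoSQL product’s API.  The SQL half is the “join this to that with a group by” query you can bang out on the spot; the second half is the same question answered by hand.

```python
import sqlite3

# Hypothetical order-processing schema, held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
    CREATE TABLE order_lines (order_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (10, 1, 'FAILED'), (11, 1, 'OK'), (12, 2, 'FAILED');
    INSERT INTO order_lines VALUES (10, 50.0), (10, 25.0), (11, 10.0), (12, 5.0);
""")

# The ad hoc question: which customers have failed orders, and for how much?
AD_HOC_QUERY = """
    SELECT c.name, COUNT(DISTINCT o.id) AS failed_orders, SUM(l.amount) AS total
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    JOIN order_lines l ON l.order_id = o.id
    WHERE o.status = 'FAILED'
    GROUP BY c.name
    ORDER BY total DESC
"""
rows = conn.execute(AD_HOC_QUERY).fetchall()

# The same data as denormalized documents (plain dicts as a stand-in).
documents = [
    {"customer": "Acme", "status": "FAILED", "lines": [50.0, 25.0]},
    {"customer": "Acme", "status": "OK", "lines": [10.0]},
    {"customer": "Globex", "status": "FAILED", "lines": [5.0]},
]

# Without a query language, the filter, grouping and ordering are hand-written.
totals = {}
for doc in documents:
    if doc["status"] != "FAILED":
        continue
    count, total = totals.get(doc["customer"], (0, 0.0))
    totals[doc["customer"]] = (count + 1, total + sum(doc["lines"]))

by_hand = sorted(
    ((name, count, total) for name, (count, total) in totals.items()),
    key=lambda row: row[2],
    reverse=True,
)
```

Both `rows` and `by_hand` end up with the same answer.  The difference is that the first is something a BA can type into a query window, and the second is a program somebody has to write, at 2 AM, in production.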

NoSQL, meh – written by someone with prior experience with object-oriented databases, who offers a cautionary tale.  And which includes the following funny comic strip that others have linked to:

[image: fault-tolerance comic strip]

Apparently, ad-hoc querying is more difficult than I thought.

Anyway, it raises a point that deserves elaboration.  As a SQL ‘snob’ (see previous reference in this post about developers who don’t know anything about indexes), I know that most people really kind of suck at dealing with SQL.  Well, except for the fact that even your generic BA these days knows enough about T-SQL (or whatever your flavor is) to know how to get the data that they need to do their jobs, without having to wait on a DBA to do it for them.  Sure, they might create a Cartesian Product every once in a while, but what the hell, there are tradeoffs everywhere in life and in business and in software development.  SQL is well known enough for many people now.

A NoSQL implementation that prevents this sort of (really, really) ad hoc (and probably inefficient, but who cares) querying would really affect a lot of people.

NoSQL: If Only It Was That Easy – this post lists out a lot of the more popular NoSQL technologies, and how (the blogger in his estimation thinks) they scale, in comparison with RDBMS (well, in comparison with MySQL, which is a toy…</ ZING!!!!!>).  What I really like about it is that it is based on real-world research done with an eye towards producing a real-world solution for a real-world project.

NoSQL Déjà Vu – the blogger talks about his previous experience with working with object databases, and how he thinks it relates to the current NoSQL movement.  A lot of what he talks about relates to ‘political’ issues, which are things as a developer you don’t really want to think about, but eventually have to.  I’ve never had to deal with a dick DBA (other than myself, obviously), and so don’t normally think about those sorts of issues when thinking about how to implement technical solutions, but they are important.  I like this point:

“The geeky programmer in me (that loved working on that CRM project) is rooting for NoSQL databases. The recovering DBA in me cringes at the thought of battling data corruption with inferior, unfamiliar tools.”

I feel the same way.

In the end, there are some things that become clear.

NoSQL is proven technology, especially when it comes to the ‘insanely scalable’ systems of companies like Amazon and Google and Facebook and….etc. etc. etc.  Theoretical arguments are fine and dandy, but the stuff works.

When it comes to ad hoc querying and tooling support, NoSQL (in general) seems to lag behind what you can do with any common RDBMS.  In certain instances, this is a show-stopper, but as time moves on, I think this will be rectified.  Why?  Because the need for ad hoc querying and tooling support is a business requirement, not just a ‘nice to have’, and so the market will produce what is needed.

I just hope it happens sooner rather than later, especially since I’d rather not have to deal with the growing pains myself.  I expect that in the meantime, I’ll have to suffer through those pains.  And it’s just weird to get rid of things like lookup tables and normalization and the like.

Be sure to read the comments on all of these posts, as there is great information there as well.

posted @ Sunday, April 25, 2010 10:30 PM | Feedback (0)
kd lang – Love is Everything (Live)

Yes, I know I just posted this song.

The thing is, kd just nails this song.  Jane Siberry has more vocal tics than you can shake a stick at, whereas kd has the pipes.  She changes the lyrics at certain points, and I’m not sure why (for the non-Canadians in the audience, there is a law that all female Canadian singer-songwriters must collaborate at some point (this is why Joni Mitchell is serving prison time currently), and kd and Jane have done so; did she not check with Jane what the lyrics were?), but she really, to use technical terminology, sings the fu*k out of this.

Mild digression: was she separated at birth from Mario Lemieux?  Just wondering.

posted @ Friday, April 23, 2010 9:50 PM | Feedback (0)
Jane Siberry – Love is Everything (Live)

Jane Siberry follows in a long line of Canadian singer-songwriters, and here is a beautiful live rendition of one of her best songs.

The title of the song is ‘Love is Everything’ which means, of course, that it is about love ending.  Huh, go figure.

posted @ Wednesday, April 21, 2010 10:36 PM | Feedback (0)
cqrs for dummies – an interlude – other interesting blog series

I’ve come across a couple of posters who have blogged regularly about CQRS and find their stuff pretty interesting.

Pal Fossmo has written a number of posts and has this great image of how CQRS might work (included with permission):

[image: diagram of how CQRS might work]

His post category on CQRS can be found here.

Bjarte Naess has been posting quite a bit after taking one of Greg’s two day courses, and you can find his stuff here.

Check them out, good stuff.

posted @ Sunday, April 18, 2010 10:35 PM | Feedback (0)
Odd Comment Spam

Because I moderate comments (though very loosely as described here), I get very little comment spam (a couple a day), as people try to find some usually older post to try and get one in, at which point they open the floodgates (last time it happened, I started getting hundreds a day).  It’s annoying but manageable.

The topics tend to rotate.  For a while, it was selling term papers to college students.  Recently, it’s been about small business loans.  Really boring stuff.

Recently, I got one on one of my CQRS posts:

This is all good advice for purchasing a "used car." However, I would like to add that buying a car "sight unseen" could be risky since you are only seeing a couple of photos and a description of what the buyer's version of the "quality" of the car they are trying to "unload."

Not exactly what you might consider on-topic.  Best thing about it?  It was for a backgammon site.

Nice.

posted @ Wednesday, April 14, 2010 8:16 PM | Feedback (0)
Hello SSIS My Old Friend

A few years ago, Ayende had a post about why he hated SSIS, which brought about some responses (here and here and here) from some other folks.  It was a lot of fun had by all (sarcasm).  At the time, I happened to be prototyping SSIS to see whether it could be used to replace some of the Perl-based implementations we had in place.  After only a little bit of effort, we clearly determined that SSIS couldn’t hack it (which is ironic, since getting SSIS to do many common tasks required a lot of hacking, but I digress).

digression: I won’t post the particular comment since it would be even more of a digression than I am wont to do, but what I loved about a response or two was the point that the problem was that Ayende didn’t RTFM.  Anyone who knows Ayende knows what a lazy bastard he is, always stopping every time he finds a problem and tossing it off for others to fix (in case anyone doesn’t get it, this is sarcasm).  That is, it was true that an issue or two he raised could have been gotten around by reading some documentation, but really….But I digress.

Due to various considerations, I’m working on yet another proof of concept utilizing SSIS.  On the surface, it is in its sweet spot:  I need to combine data from Sybase, Oracle, SQL Server and flat files, and get all that sweet data into Oracle (source Oracle data is on different servers from destination).  This is just what something like SSIS should be able to do, and I imagine that it is the sort of scenario in which it hopes to compete with something like Informatica.  And that would be great, since licensing Informatica is expensive.

One of the key things is that the need to consider SSIS is driven by non-technical requirements, which is to say that some upper-level guy got SSIS approved as a technology, and so there’s a desire to have it implemented to validate that effort.

digression: I actually consider this a valid reason to consider SSIS.  When working with larger corporation type people, these sorts of considerations are part and parcel of the gig.

There’s a concept called the ‘pit of success’ where the idea is that when working with a technology or pattern or whatever, the default experience when working with them is to find success easily.  Working with SSIS is like working with the ‘Mountain of Success’ where almost nothing works the first time you try it, and requires a Google search to find all of the other people who tried to accomplish the same basic task you were trying to accomplish and also failed, and who couldn’t figure out why.

A lot of it has to do with Oracle (parameterized query anyone?  and no, all those fixes/workarounds that apply to an OLE DB Source don’t work with an OLE DB Command).  But then there are the fun things like trying to get data from a SQL Server view and having it hang for no reason if you are using a non-data reader source, until you change the source to be a data reader (which I’m sure isn’t reproducible in other environments, but I don’t care, since I care about the environment I’m working in).  Or setting up a lookup component to hit an Oracle table and it fails to work as expected, so you change it to use a query, and that works, and then for the hell of it, you change the component back to using the table exactly as it was before and it now works.  This latter thing leads to a common SSIS developer practice of changing radio buttons from a failing option to another option and then back, just because it fixed it before (which from Google appears to be fairly common…“Change the option, close the editor, open it, and change it back…it will work now”….well, isn’t that nice).

The dream scenario is a nice one, and if you are myopic, compelling.  “You don’t have to write code, you drag and drop these components and set some properties and drag arrows from the output to another component, and it just works.  No code.  Awesome.”  It is awesome.  When it works.  The problem is that it rarely ‘just works.’  Instead, you find a need to tweak some minor thing and it turns out that the solution is untweakable. 

So, I’m well on the road to once again rejecting SSIS as an option, and in a scenario where it should be a shining light.

I guess I should just RTFM (if only the manual showed how to use parameterized queries with Oracle).

posted @ Tuesday, April 13, 2010 12:08 AM | Feedback (0)
Porcupine Tree – The Sound of Musak (Live)

One of their best songs, with some of SW’s most cynical (yet accurate) lyrics.

The music of rebellion
Makes you wanna rage
But it's made by millionaires
Who are nearly twice your age

posted @ Sunday, April 11, 2010 11:14 PM | Feedback (0)
Good Blog Series – Strengthening your domain

Over at LosTechies, Jimmy Bogard is blogging about fundamental OO and DDD patterns and practices, and the series can be found here.

Two posts so far might be of particular interest to people interested in CQRS, one on avoiding setters (which I talked about also in my post about the command layer) and another on domain events (which I’ll be talking about whenever I finish the next post in the series about the event store layer).

His writing is very clear, and the code examples are concise and precise.  Good stuff.

Check it out.

posted @ Thursday, April 08, 2010 7:51 PM | Feedback (0)
High Scalability Web Site

A really good place to find a heck of a lot of interesting posts about how many different companies handle high scalability issues is here.  What’s really interesting to me is that almost every technology you think might not be able to scale can be made to scale with enough ingenuity/desperation.

Good stuff.

posted @ Wednesday, April 07, 2010 11:15 PM | Feedback (0)
Good Reading – The End of an Architectural Era

Continuing on with the whitepapers I’ve been (re)reading, another very interesting read is entitled The End of an Architectural Era (It’s Time for a Complete Rewrite) that can be found here and which was written by a whole bunch of people, including (again) Pat Helland from Microsoft.  A very good overview of the article is here and is better than anything I’m going to add now (though that won’t stop me).

I’m sure that many people who are part of the NoSQL movement haven’t read this, but I’m also sure that many people have, as it hits on many of the points that will resonate.  It’s a fairly technical article, but not overwhelmingly so.  It argues for an elimination of RDBMS in favor of newer technologies that (supposedly) overcome the technical limitations of RDBMS.

If you are a fan of RDBMS, then a lot of the more ‘heated’ rhetoric will probably strike you as a bit over the top, but that shouldn’t stop you from finding a lot of value in it.  The fact that the proposed H-Store implementation achieves an 80x improvement on TPC-C benchmark tests should pique your interest (the fact that there doesn’t seem to be a lot of movement with H-Store itself is a mitigating factor, but one could argue that the flourishing projects that attempt to accomplish the same things make this irrelevant).  It is interesting that the approach they suggest advocates stored procedures, which goes against a lot of what people seem to think a rejection of RDBMS entails.

RDBMS aren’t going anywhere, for a multitude of reasons.  One could argue that Microsoft’s Azure SQL stuff attempts to have things both ways, but unless and until Microsoft and Oracle introduce systems that embrace NoSQL ideas, they won’t take off in the market.  And the simple fact of the matter is that the vast majority of applications are served well by ‘antiquated’ systems like SQL Server and Oracle.

Nevertheless, anyone with an open (yet still skeptical and critical) mind really should take a look at this.  Certainly, anyone who is interested in CQRS and the ways in which you can implement CQRS-inspired systems that don’t rely (solely or at all) on RDBMS should be interested in the concepts.

Take a look.

posted @ Wednesday, April 07, 2010 10:59 PM | Feedback (0)
Good Reading – Life Beyond Distributed Transactions by Pat Helland

One of the things I’ve been doing the last few months is to go over the vast reams of printouts that I have of articles, blog posts, etc., find the things that are more immediately valuable, and chuck the rest.  I got into a habit of printing out anything that might possibly be useful, and ended up with a lot of stuff that, well, I just didn’t need.  So YAGNI to it.  If it doesn’t possibly apply to a current active project of mine, chuck it (which is really only kinda-YAGNI, but it eliminates 80+% of stuff, so good enough).

digression: yes, I print out blog posts.  I like to mark up things that are of most interest, occasionally add comments if it is something that I know I will come back to, and so on.  It is why there has never been, and will never be, a paperless office.  But I digress.

Some articles don’t directly fit this plan though, and are of general value above and beyond any particular project, and Pat Helland’s paper “Life beyond Distributed Transactions: an Apostate’s Opinion” is one of them.

The purpose of the paper, not surprisingly, is to talk about the practical ways in which very large-scale applications get around using distributed transactions, and deal with the fact that as these sorts of applications grow in size, the sheer amount of data makes it impossible to use them.  Anyone interested in CQRS or NoSQL will find the paper interesting, as Helland attempts to talk about common patterns and give them names so that people can converse about them (I don’t know that this part has been all that successful).

I won’t go over it in any detail, that’s what the link is for, but you can find interesting topics, such as:

  • the idea of an almost-infinite scaling application, and how it changes your mindset in thinking about how to code in such a situation
  • “At least once” messaging as the main paradigm of communication (as opposed to synchronous calls, of any sort)
  • how messages seem to necessitate ‘activities’ which are entity-to-entity partnerships that can ensure that all messages are idempotent (even ones that continually change state).  For whatever reason, I think this part of the paper is the most important.  Or maybe I just find it the most interesting.
  • I take that back, the most important point of the paper, and one which has appeared elsewhere, is to point out that business processes inherently deal with uncertainty all the time, and that software can and needs to handle uncertainty in much the same way: “The semantics of uncertainty is wrapped up in the behavior of the business functionality.  While more complicated to implement than simply using distributed transactions, it is how the real world works….”
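
For what it’s worth, here is a rough sketch in Python of how I read the “at least once” and idempotence points together.  It is my own illustration, not code from the paper: `Message`, `AccountEntity` and the per-sender `seen` record are hypothetical names, and the `seen` record is only a loose stand-in for what Helland calls an activity.  The entity remembers which messages from each partner it has already applied, so a redelivered message changes nothing.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    message_id: str  # unique per logical message; redeliveries reuse the same id
    sender: str      # the partner entity that sent the message
    amount: int      # the state change being requested


@dataclass
class AccountEntity:
    balance: int = 0
    # Per-partner record of message ids that have already been applied.
    seen: dict = field(default_factory=dict)

    def handle(self, msg: Message) -> bool:
        """Apply the message once; return False for a duplicate delivery."""
        processed = self.seen.setdefault(msg.sender, set())
        if msg.message_id in processed:
            return False  # already applied, so the redelivery is a no-op
        self.balance += msg.amount
        processed.add(msg.message_id)
        return True


# At-least-once delivery: the transport may hand us the same message again.
first = Message("m-1", "orders", 10)
second = Message("m-2", "orders", 20)
account = AccountEntity()
results = [account.handle(m) for m in (first, first, second, first)]
```

After the four deliveries the balance reflects only the two distinct messages, even though the messages themselves change state.  In a real system the `seen` record would have to be stored together with the entity’s own state, which is a large part of why this is harder than it looks.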

It’s a good read, regardless of whether you care about things like CQRS or NoSQL or not.  Check it out.

posted @ Tuesday, April 06, 2010 8:12 PM | Feedback (0)
The Paradox of Technical Documentation

At the end of his Tractatus Logico-Philosophicus, Wittgenstein has a famous quotation:

6.54     My propositions serve as elucidations in the following way: anyone who understands me eventually recognizes them as nonsensical, when he has used them – as steps – to climb up beyond them.  (He must, so to speak, throw away the ladder after he has climbed up it.)

More or less, I feel the same way about technical documentation.

Sometimes, you will hear someone say that all you need when it comes to documentation is the code itself, but this is, obviously, silly.  Besides the fact that project managers, business analysts, and others who need to have access to technical documentation generally can’t read the code itself, even if they could, they won’t (nothing about BDD solves this, BTW).

And even if they would or could read code, it wouldn’t solve the basic problem.  As a developer/architect/whatever, when I come on to a client, I’m usually dealing with a codebase of significant enough size that it isn’t enough to just try to read the code.  What is needed is some sort of high level document that explains how things work and why.  I need to know *where* in the code to look and why.

The paradoxical nature of technical documentation comes out in different ways:

  • The developer who doesn’t need it is the one who needs to write it, and knows it is technically obsolete the moment it is written.
  • The developer who does need it doesn’t know how to write it, and once he understands it, doesn’t need it because he knows it was technically obsolete the moment it was written.
  • The developer who needs to write it doesn’t have time to write it because it takes away from his immediate deliverables, even though he knows in the long run it will save him time.
  • The manager who can’t or won’t read code insists that technical documents contain actual code, because he doesn’t understand how technical documentation is useful unless it contains code, even though the developer who doesn’t want to write it and the developer who has to read it don’t usually want to see actual code, because they know that any code included is obsolete the moment it is written (this is the one that really kills me…”I need any developer to be able to read this document and start coding”…um, 6 months ago, I would have wanted a document like this, and I wouldn’t have wanted code in it…”Yeah, but it has to have code in it, otherwise, how could a developer use it?”).

The idea is to find a happy medium of some sort, and the obvious solution to me is some sort of high-level overview of the system.  Perhaps one piece of it would be specification-type documentation of most use to the BAs and PMs in the group, and another piece would tell you how the system works and where in the system to look for the details.
posted @ Monday, April 05, 2010 9:01 PM | Feedback (2)
How much of your developer career is due to Google?

I don’t know about you, but I don’t memorize syntax.  I mean, when I’m doing typical daily development work, the speed at which I can get stuff done depends pretty heavily on my short-term memory, that is, on whether I’m writing code that is similar to code I’ve been writing recently.  If it involves code of a sort I haven’t worked on in over a month, there’s a bit of friction (as the kids like to say).

Anyway, this all came to mind the past few days.  For various reasons, I had to write a fairly simple console application.  Basically, it required that I create two files for an external vendor to pick up from our FTP site.  The data for these files came from two sources: a view on top of a SQL Server database and a file produced from a 3rd party vendor application.  For the file produced from the SQL view, there was no business logic involved.  For the file produced from the vended file, there was some pretty basic business logic to aggregate data as well as a couple of other things.  Nothing complicated.

What I found vaguely interesting was how much I had to use Google to write the code.   Here’s what I had to do:

  1. For the data coming from the SQL view, I had to use ‘raw’ ADO.NET as the client did not support an ORM I could use.  I didn’t have to do too much searching here, only a little.
    1. Do you create a SqlConnection and then a SqlCommand, or the other way around?  What’s the best way of doing that using TransactionScope?
    2. Since I was using a DataReader, I remembered seeing some code from Derick Bailey about how to work around checking for DBNull and all that, so I modified that slightly.
    3. Writing out a file from a List<T>, how do you do that? 
  2. For the data coming from the vended file, I had a bit more to do.
    1. I had the command that I would run from a command prompt, so how do I ‘shell out’ and run that?
    2. It turns out that the vendor application process exits before the extract file it produces shows up in the proper location.  So, how do I sleep for the proper amount of time to allow the file to be produced before looking for it?
    3. A typical way of processing a file you read from disk is to process it line by line, but for various reasons, I needed instead to load the extract file into memory as a byte array, and then process it, so how do you do that?
    4. To produce the proper resulting file, I needed to calculate the last business day of the previous month.  Given that I had a source of all holidays for the year, how do I do that?
    5. Processing the extract file produced a collection of nearly a million records.  This ended up causing obvious performance issues:
      1. I started with basic LINQ to Objects code to implement the aggregation logic I needed.  Which worked, except it took nearly an hour to run, and that’s not good.
        1. Did I write the LINQ to Objects code wrong?  Have other people noted that LINQ to Objects code has performance issues?
        2. Once it was determined that operating over a collection with LINQ to Objects has issues, were there better collections to use?
        3. Given that I didn’t have a unique key that I could use for each record (eliminating a lot of options), what can I do here?  I ended up using Lookup, which reduced the processing time from almost an hour to less than a minute.
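To make the Lookup step a little more concrete, here is a minimal sketch in Python rather than the .NET code the post is about.  It assumes (and this is only a guess, since the original query isn't shown) that the slow version rescanned the whole collection for every key, which is one common way this kind of aggregation ends up taking an hour.  The record shape (a key and an amount) and the function names are made up for illustration.  The second version builds a one-to-many map in a single pass, which is roughly what a Lookup gives you when keys are not unique.

```python
from collections import defaultdict


def aggregate_by_scan(records):
    """Slow version: for every distinct key, rescan the entire list."""
    keys = []
    for key, _ in records:
        if key not in keys:  # linear search on every record
            keys.append(key)
    return {k: sum(amount for key, amount in records if key == k) for k in keys}


def build_lookup(records):
    """One pass over the data: map each key to all of its amounts.

    Keys do not have to be unique, which is the property that makes a
    lookup usable when a plain dictionary keyed by record is not.
    """
    lookup = defaultdict(list)
    for key, amount in records:
        lookup[key].append(amount)
    return lookup


def aggregate_by_lookup(records):
    """Fast version: group once, then sum each group."""
    return {key: sum(amounts) for key, amounts in build_lookup(records).items()}


if __name__ == "__main__":
    # Hypothetical sample data: (account, amount) pairs with repeated keys.
    sample = [("a", 1), ("b", 2), ("a", 3)]
    print(aggregate_by_lookup(sample))
```

Both functions return the same totals.  The difference is that the scanning version does work proportional to the number of records times the number of distinct keys, while the lookup version touches each record once.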

I’ve probably left out a few steps, and I didn’t actually need to use Google for every single one, but you get the point.
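As an example of the kind of thing I mean, the "last business day of the previous month" step from the list can be sketched like this.  This is Python rather than .NET and not the code I actually wrote.  It assumes weekends are Saturday and Sunday and that the holiday source has already been loaded into a set of dates.  The function name and parameters are hypothetical.

```python
from datetime import date, timedelta


def last_business_day_of_previous_month(today, holidays):
    """Return the last weekday of the previous month that is not a holiday.

    `holidays` is assumed to be a set of datetime.date values.
    """
    # The day before the 1st of this month is the last calendar day
    # of the previous month.
    day = today.replace(day=1) - timedelta(days=1)
    # Walk backwards past weekends (Saturday=5, Sunday=6) and holidays.
    while day.weekday() >= 5 or day in holidays:
        day -= timedelta(days=1)
    return day


if __name__ == "__main__":
    # Hypothetical holiday list containing only Memorial Day 2010.
    holidays_2010 = {date(2010, 5, 31)}
    print(last_business_day_of_previous_month(date(2010, 6, 15), holidays_2010))
```

Starting from the end of the month and walking backwards avoids any special handling for month lengths or year boundaries, since the date arithmetic takes care of both.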

Setting up the solution to use the ‘DDD-Lite’ structure I prefer (console calls services which call repositories which call data access or file access) required almost no thought.  Because of my cqrs-junkie status, I actually implemented a query layer between the service layer and repository layer, but I don’t like how I did that, so I need to refactor.  I encapsulated calls to app.config with an IConfiguration interface, but I don’t like that I pass in an IConfiguration in too many layers, so I obviously did that wrong, so I need to refactor.

But all of that stuff took less time than me ‘remembering’ how to shell out to run the 3rd party vendor application, for instance, since I just haven’t done that in quite a while.
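For what it's worth, the shape of the "shell out, wait for the extract file, load it as bytes" steps looks something like the sketch below.  Again, this is Python rather than .NET, and the function names, timeout, and polling interval are all assumptions for illustration.  Rather than sleeping for a fixed amount of time, it polls for the file until a deadline passes.

```python
import subprocess
import time
from pathlib import Path


def wait_for_file(path, timeout=60.0, poll_interval=0.5):
    """Poll until `path` exists or `timeout` seconds pass.

    Returns True if the file showed up, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if Path(path).exists():
            return True
        time.sleep(poll_interval)
    # One last check in case the file appeared during the final sleep.
    return Path(path).exists()


def run_and_wait(command, expected_file, timeout=60.0):
    """Run an external command, wait for its output file, return its bytes.

    `command` is a list such as ["vendor_tool", "--extract"] (hypothetical).
    """
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(command, check=True)
    if not wait_for_file(expected_file, timeout):
        raise TimeoutError(f"{expected_file} did not appear within {timeout}s")
    # Load the whole file into memory as bytes rather than line by line.
    return Path(expected_file).read_bytes()
```

One caveat: the file existing doesn't guarantee the vendor application has finished writing it.  If that turns out to be a problem, checking that the file size has stopped changing between polls is one option.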

Now, admittedly, I am lazy and/or stupid.  But without Google, I would have been stuck on basic syntax issues. 

I suspect I’m not the only one who has been in a situation like this.  Or perhaps that’s wishful thinking.

posted @ Thursday, April 01, 2010 6:42 PM | Feedback (0)