Wednesday
07May

Ruby Influenced C#

Before joining my current project I spent about 4 months working with Ruby every day, the first time I’d done so for a few years. It was a glorious time: uncluttered syntax, closures, internal iterators, and with open classes, the ability to extend the ‘core’ at will.

Today I’m working with C# and .NET, and I’ve noticed that those 4 months with Ruby have changed the way I’ve been writing code. Most noticeably, I’m using anonymous delegates a lot more. But that’s not all.

I’ve found myself aching to use List<T>’s ForEach method; I’m now wired to use Ruby’s internal iterators, where instead of

List<Person> people = FindAllPeople();
foreach (Person person in people) {
   ...
}

I can instead do

List<Person> people = FindAllPeople();
people.ForEach(delegate(Person person) {
  Console.WriteLine(person.Name);
});

But frequently I’m put off by the surrounding guff that’s needed to express the same thing, and have almost always gone back to the more traditional external iterator-based approach. It’s simply too high a price to pay.

One of the largest smells I’ve noticed recently (to my mind) appears to be driven by not having open classes and internal iterators. If they were there, I’m sure people would use them. The result: all across the codebase, whenever you need to convert from one type to another you’ll see

List<Person> people = FindAllPeople();
List<String> firstNames = new List<String>();

foreach (Person person in people) {
  firstNames.Add(person.Name);
}

This smells to me. But it really, really smells from having used Ruby where I would previously have written something as succinctly as this:

find_all_people.collect {|person| person.name}

(I’m sure other languages could do equally good things- but I’m familiar with Ruby, before the Pythonists pounce :p)
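To make the comparison concrete, here’s a minimal, self-contained sketch of the two styles in Ruby. The Person struct, the sample names and the two helper methods are made up for illustration; only collect itself comes from the snippet above.

```ruby
# Hypothetical stand-in for the Person type used in the C# snippets.
Person = Struct.new(:name)

# Hypothetical data source, standing in for FindAllPeople / find_all_people.
def find_all_people
  [Person.new("Alice"), Person.new("Bob")]
end

# External-iterator style: we drive the loop and build the result by hand.
def names_with_loop(people)
  names = []
  for person in people
    names << person.name
  end
  names
end

# Internal-iterator style: the collection drives the iteration,
# we only say what each element should become.
def names_with_collect(people)
  people.collect { |person| person.name }
end
```

Both methods return the same list; the second just leaves the bookkeeping to the collection.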

Well, turns out that you can get nearly there with C# 2.0 and .NET 2.0 with the almost certainly underused ConvertAll method (also part of List<T>).

List<Person> people = FindAllPeople();
List<String> names = people.ConvertAll(delegate(Person person) {
  return person.Name;
});

There’s still a fair bit of accidental complexity remaining - lots of delegate and type declarations.

C# 3.0 introduced lambda expressions, and we can use them to bubble our soup down to a nice intentional broth even more. We can get rid of the delegate bumf and let the compiler infer the types (we are still statically typed after all)

List<Person> people = FindAllPeople();
List<String> names = people.ConvertAll(person => person.Name);

Next step, we can also infer the types for our local variables:

var people = FindAllPeople();
var names = people.ConvertAll(person => person.Name);

Pretty nice. Most of the code is focused on the task at hand, and on expressing the necessary complexity (what it means to convert people to names). Guess learning a new language each year has its benefits.

I’ve got another bit of Ruby influenced C# refactoring to cover (a somewhat declarative way of removing switch statements), hopefully I’ll get that posted tomorrow!

Thursday
01May

Prioritising Work

I was involved in the inception work (and the resulting delivery) for a project late summer last year. We estimated the total work to be nearly 500 units- too much to complete in the time we had. So, working with the client we cut it down to a reasonable scope (this client rocked at that!) of around a third.

What was really cool, however, was that a few months in after that initial scope had been delivered, we looked ahead for what we’d do next. According to our original inception, there was still more than double left to go, but, instead what we planned to do next was radically different. What we’d believed to be important 3 months ago, no longer was. The result was we delivered about 30% more, and about 50% overall was not part of that initial (500 unit) scope.

The real key was that our client had a very definite focus on work that was important (that is, work that delivers the most value) and avoided getting drawn into unimportant but urgent work, or leaving valuable work until it became urgent. If you’re always working on urgent things you’re working at breaking point, and you miss the opportunity to work on higher-value items that are less urgent.

Here’s a diagram to show the two

[Diagram: Covey’s grid]

(More can be read about Covey’s grid on Wikipedia)

The right-most two quadrants (coloured blue and purple) are the key. Both represent sections that involve working on things that are valuable, i.e. those that are important to do. In contrast, 1 and 2 (the uncoloured boxes) are relatively unimportant and should demand less attention.

The rub (of course) is that urgent things tend to be shouty and demand attention. I would also say it’s often easier to measure how urgent something is rather than how important it is; as a result, urgency is an easier (and thus more likely) benchmark for prioritising tasks - despite ignoring whether the work is worth doing at all.

Some of the best people I’ve worked with (including our sponsor at our client last year) have a remarkable ability to cut through the context and spot what’s really important now; as opposed to just reacting to what’s demanding our attention now.

Monday
07Apr

Poor Man's C# Singleton Checker

Paul Hammant wrote a nice article about how to refactor the “nest-of-singletons design” towards using dependency injection using Google’s Guice IoC Container.

Whilst waiting for one of my many builds to finish today, I figured I’d satisfy a curiosity - roughly how many singletons are there defined within this codebase. I fired up Cygwin and used the following

find . -type f -name "*.cs" | xargs cat | grep "public static [A-Za-z]\{1,100\} Instance" | wc -l

Result:

180

Yikes!
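If you don’t have Cygwin to hand, roughly the same count can be sketched in Ruby. This is only an approximation of the pipeline above: the constant and method names are my own, and it takes file contents as strings so it can be tried without a real codebase.

```ruby
# Same pattern as the grep above: a public static member named Instance.
SINGLETON_PATTERN = /public static [A-Za-z]{1,100} Instance/

# Counts the lines that match the pattern across all the given source
# strings (one string per file), like grep piped into wc -l.
def count_singleton_lines(sources)
  sources.sum do |source|
    source.each_line.count { |line| line.match?(SINGLETON_PATTERN) }
  end
end

# Against a real tree you might feed it file contents, for example:
#   count_singleton_lines(Dir.glob("**/*.cs").map { |path| File.read(path) })
```

Like the original one-liner, it is a poor man’s check: it will miss singletons that don’t follow the Instance naming convention.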

For Java, you can always use the Google Singleton Checker which also has some nice stuff about why they’re controversial.

(Update) Paul Hammant pointed out that his article wasn’t about refactoring out singletons, rather, breaking away from using the service locator to dependency injection. Apologies for muddling it up a little :)

Monday
11Feb

Declarative Programming with Ruby

During the most recent ThoughtWorks away-day (a chance for the office to get together, catch-up, drink etc.), George and I presented on a number of Ruby and Rails lessons we learned from our (now previous) project. One of the most interesting sections (to us anyway) was on declarative programming, specifically, refactoring to a declarative design.

I guess much like DSLs, it’s easier to feel when you’re achieving something declarative as opposed to necessarily defining what makes it. But, I’ll try my clumsy best to define something.

Almost every language I’ve turned my hand to (save for Erlang) is an imperative language - where programs are written as sequences of operations, with changes of state. You determine the what and the how of the system - what to do and how to do it.

Declarative programming is an alternative paradigm, whereby code is expressed as what’s. How the system executes is someone else’s responsibility.

So, for the purposes of this discussion let’s consider that application code can be split into two groups


  1. Logic- the rules, the guts of things- the what’s.

  2. Control- statements about execution flow.

Interestingly, one of the principles listed in Kent Beck’s most recent book (Implementation Patterns) includes “Declarative Expression” - that you should be able to read what your code is doing, without having to understand the wider execution context.

Declarative languages are all around us, with most developers I’m guessing using them almost daily.

Think of SQL: when you write a statement such as SELECT [Name], [Age] FROM [Person] WHERE [Age] > 15 you’re making a statement about what you’d like, not how you get there- that’s up to the database engine to figure out. And a good thing too! Have you ever taken a look at an execution plan for modestly complex queries?

Closer to home - think of .NET attributes and Java annotations - where you can decorate constructs with additional behaviours. They look and feel like core extensions to the language, but are programmable and can be used to adapt the runtime behaviour of the system.

Before working for ThoughtWorks, I worked on a system where we used attributes to allow us to add validations to properties, allowing us to re-use code and extend easily.

[LengthMustBeAtLeast(6)]
public string FirstName
{
get { ... }
set { ... }
}

Everything was nicely decoupled, read well, and reduced the amount of clutter in our code. More importantly, our validation code was not spread throughout every setter of every property. We could isolate responsibilities making code easier to digest and understand, and test!

Onto Ruby. A substantial part of our project involved reading lots of CSV files from different sources to update our deal information. The answer was a kind of anticorruption layer (borrowing heavily from domain-driven design) for each different feed.

Quickly, we ended up with a few concrete classes and a base class that co-ordinated effort. Dependencies were shared both ways; imagine a number of template methods that are called in sequence to accomplish their work. Over time it grew complex, and with Ruby it’s a little tougher (than with languages like Java or .NET) to navigate and browse around the code without strong IDE support.

It was starting to get a little too complex and we felt we needed to change things, so we did (bolstered somewhat by having Jay with us).

Onto the code.

So imagine we have a class representing a Feed of information (read from a CSV file), and we want to be able to ask that Feed to provide us with a number of Deals. Internally, it will iterate over the items in the Feed, creating a Deal for each one (if possible).

Our first solution looked a little like this, firstly in the ‘abstract’ base class:

def create_deal
Deal.create(:network_name => network_name)
end

and in the feed’s concrete implementation:

FIELD_INDICES = {:name => 1, :network_name => 2}

def network_name
read_cell(FIELD_INDICES[:network_name])
end

Our main feed class asks our implementation class for the network_name - a template method. Not bad, but now build that up to tens of attributes and it gets a bit longer. Since we’ve defined column indices in a constant, we also now have to navigate up and down lots to determine where we’re reading from. Add in a few other tables with look-ups for other bits that we need to do the mapping and it can get a little complex pretty quick.

Our code is not only made up of the stuff determining what it is to translate between a CSV representation of our Deals to an object model one, but also all of the code necessary to find out which CSV column we’re in, how we map that column etc. They were essentially just reading values from cells, no translation needed. Most of the code we had was infrastructural, and the logic (the what of our application) was hidden amongst the noise.

This complexity, combined with the split of flow between abstract and concrete classes, made it difficult to follow and understand. Our goal was to try and reduce each concrete implementation to a single page on our screens.

These little ‘mapping’ translation methods were our first target - reduce the amount of code for each of these to one line that described the mapping, rather than how we get it all.

Firstly, we introduced a convention - every attribute of a Deal could be retrieved by calling a deal_attribute_my_attribute style method. So, we renamed all our methods, ran the tests, and then started to make steps towards having each deal_attribute_blah method defined dynamically.

We went from:

def network
...
end

to

def deal_attribute_network
...
end

to … nothing.

Well, not quite. Instead, what we wanted was to have a method constructed by adding a class method to a module that we could mix-in. Then, we could just define the mapping and Ruby would wire up the rest. We settled on the following syntax

deal_attribute :name => 'NAME'

Neat. A little class_eval and instance_eval magic later we were able to push our infrastructural code out of our concrete feed class and into the co-ordinator.
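I don’t have the original code to hand, so here’s a minimal sketch of how that kind of class method could be wired up. It isn’t what we actually wrote: it uses define_method rather than class_eval, and the DealAttributes module, the ExampleFeed class and the read_column helper are all hypothetical names, with a row modelled as a simple hash of column name to value.

```ruby
# Hypothetical mix-in that adds the deal_attribute class method.
module DealAttributes
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # deal_attribute :name => 'NAME' defines an instance method called
    # deal_attribute_name that reads the 'NAME' column.
    def deal_attribute(mapping)
      mapping.each do |attribute, column|
        define_method("deal_attribute_#{attribute}") do
          read_column(column)
        end
      end
    end
  end
end

class ExampleFeed
  include DealAttributes

  deal_attribute :name => 'NAME'
  deal_attribute :network_name => 'NETWORK'

  # Assumption: the current row is a hash keyed by column heading.
  def initialize(row)
    @row = row
  end

  # Assumed infrastructure hook; in the post this lived in the co-ordinator.
  def read_column(column)
    @row[column]
  end
end

feed = ExampleFeed.new({ 'NAME' => 'Super Deal', 'NETWORK' => 'Orange' })
feed.deal_attribute_name # => "Super Deal"
```

The feed class only states the mapping; how a column gets read stays in one place.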

Our code is now more declarative. We’re stating what we need to do our work, rather than worrying about how we get at it. Not only that, for the common case (where we may be just moving values from one place to another) there’s no need to do anything more than describe that relationship. Declarative programming makes it much easier to express important relationships. It’s now much easier to see the relationship between the name attribute of a Deal and the NAME column for this CSV feed.

Notice also how we’re pushing our dependencies up to our caller- the coupling is now one-way. We state what we need from our caller (the main deal feed class) - our caller is able to then pass the information on. We’re just answering questions, rather than answering questions and asking questions (of our caller).

The syntax also lends itself to explaining a dependency, that an attribute of our deal is read from ‘NAME’ (for example).

That’s great, next step was to tidy up some of the slightly more complex examples where we do some additional translation. For example, where we take the name of something and we need to pull back an object from the database instead. Let’s say we keep track of the Phone that a Deal is for.

So, from

def deal_attribute_phone
Phone.find(deal_attribute_brand, deal_attribute_model)
end

to

deal_attribute(:phone => ['MAKE', 'MODELNAME']) do |brand, model|
Phone.find(brand, model)
end

We’ve extended the syntax to reveal that this translation needs values from both the ‘MAKE’ and ‘MODELNAME’ columns. From our perspective, we’re pushing responsibility up and keeping our code focused on what we need to map this attribute.

This is a little more complex to achieve since we’re also passing arguments across (and our deal_attribute translator methods also sometimes need to access instance variables) so we need to use instance_exec instead of the standard instance_eval.
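Here’s a rough sketch of how the block form might work, under the same caveats as before: the module, class and helper names are hypothetical, a row is just a hash, and a plain string stands in for the Phone.find lookup so the example runs on its own. The part that matters is instance_exec, which runs the block with the feed instance as self while also passing the column values in as arguments.

```ruby
# Hypothetical module providing deal_attribute with an optional translator block.
module TranslatedDealAttributes
  def deal_attribute(mapping, &translator)
    mapping.each do |attribute, columns|
      define_method("deal_attribute_#{attribute}") do
        # Accept either a single column name or an array of them.
        values = Array(columns).map { |column| read_column(column) }
        if translator
          # instance_exec passes the values as block arguments and
          # evaluates the block in the context of this feed instance.
          instance_exec(*values, &translator)
        else
          values.first
        end
      end
    end
  end
end

class TranslatingFeed
  extend TranslatedDealAttributes

  deal_attribute :name => 'NAME'
  deal_attribute(:description => 'DESC') { |desc| cleanse_description(desc) }
  deal_attribute(:phone => ['MAKE', 'MODELNAME']) do |brand, model|
    "#{brand} #{model}" # stand-in for Phone.find(brand, model)
  end

  # Assumption: the current row is a hash keyed by column heading.
  def initialize(row)
    @row = row
  end

  def read_column(column)
    @row[column]
  end

  # Hypothetical clean-up step; here it only trims whitespace.
  def cleanse_description(desc)
    desc.strip
  end
end
```

Because the block runs against the instance, it can call cleanse_description or touch instance variables, which a bare call to the block could not do.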

The end result was feed classes that looked as follows

class MySpecialFeed
deal_attribute :name => 'NAME'
deal_attribute(:description => 'DESC') {|desc| cleanse_description(desc) }
deal_attribute(:phone => ['MAKE', 'MODELNAME']) do |brand, model|
Phone.find(brand, model)
end
...
end

In total, it took us probably just over a day to refactor the code for all of our classes, most of which we managed to get down to some 40 or 50 lines in total. We didn’t refactor all the code, so there’s still potential for exploiting it further but it was definitely a very exciting thing to see happen.

Thursday
15Nov

Copying Classes

From across the desk, George asks “can you copy classes in Ruby?”. We talk about it quickly and reason that since everything’s an Object (even classes), you probably can. Since the constant isn’t changed or duplicated (you’re essentially assigning a new one) then it ought to be possible.

Turns out it is!

class First
def initialize
@value = 99
end

def say_value
@value
end
end
First.new.say_value # => 99

Second = First.clone
Second.class_eval do
define_method :say_value do
@value + 100
end
end
Second.new.say_value # => 199

Neat.
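A couple of follow-up checks are worth sketching, since they show how a clone differs from a subclass. This repeats the example above so it runs on its own; the extra assertions are my own additions, not something we tried at the time.

```ruby
class First
  def initialize
    @value = 99
  end

  def say_value
    @value
  end
end

# Clone the class and redefine say_value on the copy only.
Second = First.clone
Second.class_eval do
  define_method :say_value do
    @value + 100
  end
end

First.new.say_value               # => 99, the original is untouched
Second.new.say_value              # => 199
Second.superclass                 # => Object, same superclass as First
Second.ancestors.include?(First)  # => false, so it's a copy, not a subclass
```

So the copy shares First’s methods at the moment of cloning, but there’s no ongoing relationship between the two classes afterwards.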

I’m not sure quite why you would want to clone a class to take advantage of re-use - rather than extract to a module (and share the implementation that way) or, if there’s a strong relationship that doesn’t violate the LSP etc. then look for some kind of inheritance-based design.

But, I guess you could work some kind of cool ultra-dynamic super-meta system from it. Perhaps someone with way more of a Ruby-thinking brain than me could offer some thoughts?