Saturday, December 31, 2005

Egoless development

When you’re the only one you know working with a programming language it’s easy to ignore style. There’s no one to criticize anything so it’s easy to just focus on the problem at hand and use it to get the job done any old way you can.



That’s the way I’ve always felt about working with Ruby, at least until now. I’ve used Ruby off and on since 2001 but I’ve never known anyone else interested in it. Also, I’ve only used it for small side projects so I’ve never developed great fluency. All of sudden now, though, it seems like the whole damn world is interested and I’m starting to think I need to hone up my Ruby skills. This includes trying to actually code like a 'Real' Ruby coder.




So that was my mindset as I sat down last night to write a little chunk of Ruby code to process the comment file created by the PHP comment system I use in my blog. The file is called comments.txt and has a very simple format. Each line is a comment in the form:




ID | DATA | TIME | NAME | EMAIL | HOMEPAGE | CONTENT




My goal was to load this file into an array of hashes, so my first chunk of Ruby was super simple and looked like this:





def load_comments()
ra = Array.new
IO.foreach("comments.txt") do |line|
a = line.split('|')
h = Hash.new
h[:id] = a[0];
h[:date] = a[1];
h[:time] = a[2];
h[:name] = a[3];
h[:email] = a[4];
h[:web] = a[5];
h[:comment] = a[6];
ra.push h
end
ra
end

comments = load_comments()




This worked fine and if I had had a deadline or still wasn't concerned with style, I would have lived with this, but now a nagging little voice was creeping into my thoughts telling me there had to be a better way. If someone fluent in Ruby saw those seven lines mapping the array elements into a hash they would laugh. I knew, there had to be a Ruby way to deal with this type of repetition. So I put on my thinking cap and read some ruby code on the web and came up with this solution.




class Array
def to_h_for_keys(keys)
hash = {}
counter = 0
each do |value|
hash[keys[counter]] = value;
counter = counter + 1
end
hash
end
end

class String
def split_to_h(delim, keys)
split(delim).to_h_for_keys(keys)
end
end

def load_comments()
ra = Array.new
IO.foreach("comments.txt") do |line|
h = line.split_to_h('|', [:id, :date, :time, :name,:email, :web, :comment])
ra.push h
end
ra
end

comments = load_comments()




With this technique I extended the Array class by adding a method that created a hash from the contents of the array. The method took an array of keys that mapped to the receiving array's offset. At first I liked this solution, but it seemed like a lot of code to solve something fairly simple. In addition I just hated how I had that counter value in Array.to_h_for_keys. It just seemed wrong to mix iteration with indexed based access. So I started to think how I could solve that and I came up with the following.




class Iterators
def Iterators.each(c1, c2)
l1 = c1.length
l2 = c2.length
count = if l1 > l2 then l2 else l1 end
count.times() {|x| yield c1[x], c2[x]}
end
end

class Array
def to_h_for_keys(keys)
hash = {}
Iterators.each(self, keys) do |v1, v2|
hash[v2] = v1;
end
hash
end
end

def load_table(filename, seperator, columns)
a = []
IO.foreach(filename) do |line|
a << line.split(seperator).to_h_for_keys(columns)
end
a
end

comments = load_table('comments.txt', '|', [:id, :date, :time, :name,:email, :web, :comment])




Here I've built a parallel iterator and used that to implement to_h_for_keys(). I also made a few other changes. I got rid of String's split_h method, it didn't seem worth cluttering the String classes namespace with a method that wouldn't be that useful in the long run. I also pulled some of the constants of load_comments and turned it into a more generic load_table.


I liked this better from an engineering standpoint, but again, it was a lot of code. I might have kept this too, If I hadn't been reviewing the documentation for Array where I noticed an iterator method I had forgotten about: each_index(), that iterates a collection by indexes instead of values. I immediately realized I could have used this to implement my previous version without the ugliness of mixing the value iteration with indexed offset access. So as much as I liked my parallel iterator, I rewrote the code once again to look like the following:





class Array
def to_h_for_keys(keys)
hash = {}
each_index() do |x|
hash[keys[x]] = self[x];
end
hash
end
end

def load_table(filename, seperator, columns)
a = []
IO.foreach(filename) do |line|
a << line.split(seperator).to_h_for_keys(columns)
end
a
end

comments = load_table('comments.txt', '|', [:id, :date, :time, :name,:email, :web, :comment])




Now this is feeling better, but the to_h_for_keys() method is bugging me now. Much like the String.to_h method I expunged earlier, I now feel to_h_for_keys needs to go as well. So now my final version follows:





def load_table(filename, seperator, columns)
a = []
IO.foreach(filename) do |line|
parts = line.split(seperator)
hash = {}
parts.each_index() do |index|
hash[columns[index]] = parts[index]
end
a << hash;
end
return a
end

comments = load_table('comments.txt', '|', [:id, :date, :time, :name,:email, :web, :comment])



OK, so now I have 14 lines of code that does the same work as the original 18 but is more flexible and hopefully easier to maintain and reuses; I've also learned a thing or two on the way. So why did I tell this story?



There were two reasons. First, I wanted to document how screwy learning a programming language can be. Second, the whole experience is troubling me. I really just wanted to build something and I got diverted by the whole 'doing it right' thing.



The fact is we talk a lot about egoless programming when discussing team programming, but I'm thinking we should apply the idea to individual development as well. While it's good to strive to do things right, there's a line that can be crossed were you descend into creeping elegance. In this case the effort was probably worth it since I needed to learn a few things, but I know that wont always be the case; at some point the rewrites will be more out of ego than necessity. The problem is that just gets in the way of the real goal - building cool things.

Post a Comment
 
The Out Campaign: Scarlet Letter of Atheism