I recently had a lot of fun coding the Hamming distance via exercism.io. Firstly, I want to caveat that I totally recognize that my approach to this problem was done in a bubble. That is, I don’t have business constraints or tech constraints, just newbie brain constraints. ;p
I got exercism.io up and running and, after reading the README.md, ran the first test.* The exercism.io exercises come with tests that need to pass, which is how you end up writing the program – yeah TDD!
Before I started coding I decided that my overall approach was going to be Kent Beck’s Make it Work, Make it Right, Make it Fast. This meant I wasn’t going to worry about the next test or the previous test. I just wanted to get the easiest thing working.
Step 1: Get it working!
The tests are 42 lines of code. My first solution was 61 lines of code. And all the tests passed. It was ugly but it worked, like using a fork to comb your hair* – it sort of hurts but it gets the job done.
There were three areas of refactoring that I could think of:
- readability
- duplication
- nested conditionals
Step 2: Refactor for readability
The first thing that jumped out at me was how I was converting a string into an array. Basically, I had an empty array for each string, split each string on every letter, and pushed each of those letters into its respective array. Although it was explicit, it was also a bit taxing to read how it was being split, e.g. .split(//). (.split(//) is powerful as it allows strings to be split in a number of ways, but I just needed a basic split by each character.)
Then, I learned about the .chars method.
OMG .chars! (Yes it’s similar to my name but that’s not why I’m excited.) .chars returns an array of characters for a string (Ruby docs). Basically str.each_char.to_a. Or in my mind, delicious magic.
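Here is a minimal sketch of the three spellings side by side. The strand is a made-up example, not one from my solution; the point is only that all three return the same array.

```ruby
# Hypothetical example strand, used only for illustration.
strand = "GATTACA"

by_split     = strand.split(//)       # split on an empty pattern
by_chars     = strand.chars           # the shorter spelling
by_each_char = strand.each_char.to_a  # what .chars is shorthand for

p by_chars # ["G", "A", "T", "T", "A", "C", "A"]
puts by_split == by_chars && by_chars == by_each_char # true
```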
Down to 43 lines of code.
Step 3: Refactor explicit duplication
The easiest thing I noticed next was that I declared the same variable 4 times. That’s at least 3 times too many.
Since all my conditionals are in the same method, they can all access the variable. And since Ruby reads code from top to bottom, I’ll extract it to the top of my method, outside of all the conditionals. (There’s also a pattern to extract variables in The Refactoring Catalog.)
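To show the idea in isolation, here is a small sketch of hoisting a repeated declaration. The method names, the count variable, and the two branches are all hypothetical stand-ins, not my actual solution.

```ruby
# Before: every branch declares the same starting value (hypothetical shape).
def mismatches_before(first, second)
  if first.length > second.length
    count = 0
    second.chars.each_with_index { |letter, i| count += 1 if letter != first[i] }
    count
  else
    count = 0
    first.chars.each_with_index { |letter, i| count += 1 if letter != second[i] }
    count
  end
end

# After: declared once at the top, so every branch below can see it.
def mismatches_after(first, second)
  count = 0
  if first.length > second.length
    second.chars.each_with_index { |letter, i| count += 1 if letter != first[i] }
  else
    first.chars.each_with_index { |letter, i| count += 1 if letter != second[i] }
  end
  count
end

puts mismatches_before("GGACG", "GGTCG") # 1
puts mismatches_after("GGACG", "GGTCG")  # 1
```

Both versions behave the same; the second just has one fewer thing to keep in sync.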
Note: it’s at this stage that I realised there’s a bug within my code (lines 26 & 33). I’m submitting a pull request to add tests to exercism.io and for the purposes of this blog it should be fine.
Alright, 38 lines!
Step 4: Refactor simple functional duplication
The two conditionals on lines 21 and 27 look very similar. They both:
- check if one strand is longer than the other (lines 21, 27)
- trim off the extra letters of the longer strand so that it’s the same length as the shorter strand (lines 22, 28)
- create an array for the new strand (lines 23, 29)
- create a combined array of arrays that contains the paired elements from the two arrays (lines 24, 30)
- compare the paired elements and count the ones that do not match (lines 25, 31)
So really, if one strand is longer than the other, regardless of whether it’s the first or second strand passed, what we’re really saying is that the two strands aren’t the same length.
I can use .min to determine the minimum length of the two strands and then I can trim both strands to that minimum length while zipping them into one array that contains arrays of paired elements from the two strands.
Lastly, I can iterate through those paired elements and check if they are the same value. I can then count the number of times they do not match, which is the Hamming Distance.
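The min, trim, zip, and count steps above can be sketched roughly like this. It is an illustration under my own assumptions: hamming_distance, shortest, and pairs are hypothetical names, and this is not the exact code from my solution.

```ruby
# Hypothetical method name; a sketch of the steps, not the original solution.
def hamming_distance(first, second)
  # .min picks the length of the shorter strand.
  shortest = [first.length, second.length].min

  # Trim both strands to that length, then zip them into pairs of letters.
  pairs = first[0, shortest].chars.zip(second[0, shortest].chars)

  # Count the pairs whose two letters differ.
  pairs.count { |a, b| a != b }
end

puts hamming_distance("GGACG", "GGTCG") # 1
puts hamming_distance("AGGX", "AGA")    # 1 (the extra "X" is trimmed off)
```

Because both strands are trimmed to the shorter length first, it no longer matters which one was passed first.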
31 lines!
Step 5: Refactor functional duplication
I wrote the conditional on line 5 more explicitly. But I noticed that it’s doing the same function as the conditional on line 21. (There’s the beauty of Ruby – many ways to do something!) However, at RailsConf this year I went to Sandi Metz’s talk “All the Little Things”. She mentions that nested conditionals are good candidates for refactoring. And let’s be honest, while I do love that it’s very much step-by-step, it is a bit long and there’s mental overhead to remember where I am in the condition.
(As a sidenote, this refactoring made me realize the potential tradeoff (especially as a newbie) between clarity and readability, specifically:
- laying out all the steps seems clear, however it takes mental effort to remember all the steps and where I am in those steps
- collapsing code makes the code more dense and that also takes effort to read.)
It’s at this point that I suspect (ever so cautiously) that the conditional on line 21 might pass all my tests. (For the record, this is like waiting in line for a roller coaster or trying out a new recipe and waiting for it to bake in the oven – you are really excited and hopeful but also a bit scared that it might just go all wrong. And my imposter syndrome rears its head too.)
So I move the magic conditional to the top of my method, remove its conditional-ness, and comment out all the rest of the code. (#fearofcodecommitment)
…and run my tests.
…and listen to:
It worked! I’m a wizard! Actually I’m a paladin! Someone high five me or fist bump me!
7 lines of code!
5 steps. Nearly 90% less code (61 lines down to 7).
You can nitpick me here. (On line 4, I do think there’s one debatable improvement for readability and one less debatable improvement for removing duplication.)
*Thanks to exercism.io for inadvertently teaching me about skip in MiniTest.
*Yes I’m referencing Ariel in The Little Mermaid.