Meeting the needs of your business from a distance

Code Kata 6-10 of 21

by mark shiffer 5. September 2011 05:23

I came across this site quite a while back and have been meaning to blog about it. The site is CodeKata, and the premise is essentially to give reasonably interesting problems that can be solved using code in order to practice and hone your skills. Not everything is a coding problem, but where I can see these coming in handy are using the katas to learn new languages. I always learn better when I have something to implement as I learn. Anyway, my original intention was to just link to the site, however it looks like the site hasn’t been updated in 4 years! It concerns me that it might not be around if I merely link to it, so I am going to copy the katas here for future reference. I hope Dave doesn’t mind.

Kata 6 – Anagrams

Back to non-realistic coding this week (sorry, Martin). Let's solve some crossword puzzle clues.

In England, I used to waste hour upon hour doing newspaper crosswords. As crossword fans will know, English cryptic crosswords have a totally different feel to their American counterparts: most clues involve punning or word play, and there are lots of anagrams to work through. For example, a recent Guardiancrossword had:

  Down:
    ..
    21. Most unusual form of arrest (6)

The hint is the phrase ‘form of,’ indicating that we’re looking for an anagram. Sure enough ‘arrest’ has six letters, and can be arranged nicely into ‘rarest,’ meaning ‘most unusual.’ (Other anagrams include raster, raters, Sartre, and starer)

A while back we had a thread on the Ruby mailing list about finding anagrams, and I’d like to resurrect it here. The challenge is fairly simple: given a file containing one word per line, print out all the combinations of words that are anagrams; each line in the output contains all the words from the input that are anagrams of each other. For example, your program might include in its output:

  kinship pinkish
  enlist inlets listen silent
  boaster boaters borates
  fresher refresh
  sinks skins
  knits stink
  rots sort

If you run this on the word list here you should find 2,530 sets of anagrams (a total of 5,680 words). Running on a larger dictionary (about 234k words) on my OSX box produces 15,048 sets of anagrams (including all-time favorites such as actaeonidae/donatiaceae, crepitant/pittancer, and (for those readers in Florida) electoral/recollate).

For added programming pleasure, find the longest words that are anagrams, and find the set of anagrams containing the most words (so "parsley players replays sparely" would not win, having only four words in the set).

Kata Objectives

Apart from having some fun with words, this kata should make you think somewhat about algorithms. The simplest algorithms to find all the anagram combinations may take inordinate amounts of time to do the job. Working though alternatives should help bring the time down by orders of magnitude. To give you a possible point of comparison, I hacked a solution together in 25 lines of Ruby. It runs on the word list from my web site in 1.5s on a 1GHz PPC. It’s also an interesting exercise in testing: can you write unit tests to verify that your code is working correctly before setting it to work on the full dictionary.

 

Kata 7 – How’d I Do?

The last couple of kata have been programming challenges; let’s move back into mushier, people-oriented stuff this week.

This kata is about reading code critically—our own code. Here’s the challenge. Find a piece of code you wrote last year sometime. It should be a decent sized chunk, perhaps 500 to 1,000 lines long. Pick code which isn’t still fresh in your mind.

Now we need to do some acting. Read through this code three times. Each time through, pretend something different. Each time, jot down notes on the stuff you find.

  • The first time through, pretend that the person who wrote this code is the best programmer you know. Look for all the examples of great code in the program.
  • The second time through, pretend that the person who wrote this code is the worst programmer you know. Look for all the examples of horrible code and bad design.
  • The third (and last) time though, pretend that you’ve been told that this code contains serious bugs, and that the client is going to sue you to bankruptcy unless you fix them. Look for every potential bug in the code.

Now look at the notes you made. What is the nature of the good stuff you found? Would you find similar good stuff in the code you’re writing today. What about the bad stuff; are similar pieces of code sneaking in to your current code too. And finally, did you find any bugs in the old code? If so, are any of them things that that you’d want to fix now that you’ve found them. Are any of them systematic errors that you might still be making today?

Moving Forward By Looking Back

Perhaps you’re not like me, but whenever I try this exercise I find things that pleasantly surprise me and things that make me cringe in embarrassment. I find the occasional serious bug (along with more frequent less but serious issues). So I try to make a point of looking back at my code fairly frequently.

However, doing this six months after you write code is not the best way of developing good software today. So the underlying challenge of this kata is this: how can we get into the habit of critically reviewing the code that we write, as we write it? And can we use the techniques of reading code with different expectations (good coder, bad coder, and bug hunt) when we’re reviewing our colleagues code?

 

Kata 8 – Conflicting Objectives

Why do we write code? At one level, we’re trying to solve some particular problem, to add some kind of value to the world. But often there are also secondary objectives: the code has to solve the problem, and it also has to be fast, or easy to maintain, or extend, or whatever. So let’s look at that.

For this kata, we’re going to write a program to solve a simple problem, and we’re going to write it with three different sub-objectives. Our program is going do process the dictionary we used in previous kata, this time looking for all six letter words which are composed of two concatenated smaller words. For example:

  al + bums => albums
  bar + ely => barely
  be + foul => befoul
  con + vex => convex
  here + by => hereby
  jig + saw => jigsaw
  tail + or => tailor
  we + aver => weaver

Write the program three times.

  • The first time, make program as readable as you can make it.
  • The second time, optimize the program to run fast fast as you can make it.
  • The third time, write as extendible a program as you can.

Now look back at the three programs and think about how each of the three subobjectives interacts with the others. For example, does making the program as fast as possible make it more or less readable? Does it make easier to extend? Does making the program readable make it slower or faster, flexible or rigid? And does making it extendible make it more or less readable, slower or faster? Are any of these correlations stronger than others? What does this mean in terms of optimizations you may perform on the code you write?

 

Kata 9 – Back to the Checkout

Back to the supermarket. This week, we’ll implement the code for a checkout system that handles pricing schemes such as "apples cost 50 cents, three apples cost $1.30."

Way back in KataOne we thought about how to model the various options for supermarket pricing. We looked at things such as "three for a dollar," "$1.99 per pound," and "buy two, get one free."

This week, let’s implement the code for a supermarket checkout that calculates the total price of a number of items. In a normal supermarket, things are identified using Stock Keeping Units, or SKUs. In our store, we’ll use individual letters of the alphabet (A, B, C, and so on). Our goods are priced individually. In addition, some items are multipriced: buy n of them, and they’ll cost you y cents. For example, item ‘A’ might cost 50 cents individually, but this week we have a special offer: buy three ‘A’s and they’ll cost you $1.30. In fact this week’s prices are:

  Item   Unit      Special
         Price     Price
  --------------------------
    A     50       3 for 130
    B     30       2 for 45
    C     20
    D     15

Our checkout accepts items in any order, so that if we scan a B, an A, and another B, we’ll recognize the two B’s and price them at 45 (for a total price so far of 95). Because the pricing changes frequently, we need to be able to pass in a set of pricing rules each time we start handling a checkout transaction.

The interface to the checkout should look like:

   co = CheckOut.new(pricing_rules)
   co.scan(item)
   co.scan(item)
       :    :
   price = co.total

Here’s a set of unit tests for a Ruby implementation. The helper method price lets you specify a sequence of items using a string, calling the checkout’s scan method on each item in turn before finally returning the total price.

  class TestPrice < Test::Unit::TestCase

    def price(goods)
      co = CheckOut.new(RULES)
      goods.split(//).each { |item| co.scan(item) }
      co.total
    end

    def test_totals
      assert_equal(  0, price(""))
      assert_equal( 50, price("A"))
      assert_equal( 80, price("AB"))
      assert_equal(115, price("CDBA"))

      assert_equal(100, price("AA"))
      assert_equal(130, price("AAA"))
      assert_equal(180, price("AAAA"))
      assert_equal(230, price("AAAAA"))
      assert_equal(260, price("AAAAAA"))

      assert_equal(160, price("AAAB"))
      assert_equal(175, price("AAABB"))
      assert_equal(190, price("AAABBD"))
      assert_equal(190, price("DABABA"))
    end

    def test_incremental
      co = CheckOut.new(RULES)
      assert_equal(  0, co.total)
      co.scan("A");  assert_equal( 50, co.total)
      co.scan("B");  assert_equal( 80, co.total)
      co.scan("A");  assert_equal(130, co.total)
      co.scan("A");  assert_equal(160, co.total)
      co.scan("B");  assert_equal(175, co.total)
    end
  end

There are lots of ways of implementing this kind of algorithm; if you have time, experiment with several.

Objectives of the Kata

To some extent, this is just a fun little problem. But underneath the covers, it’s a stealth exercise in decoupling. The challenge description doesn’t mention the format of the pricing rules. How can these be specified in such a way that the checkout doesn’t know about particular items and their pricing strategies? How can we make the design flexible enough so that we can add new styles of pricing rule in the future?

 

Kata 10 – Hashes vs. Classes

If we’re programming business applications in a language such as Java or C#, we’re used to constructing and using classes to manipulate our business objects. Is this always the right way to go, or would a less formal approach serves us well sometimes?

This week’s topic doesn’t involve a coding challenge. Instead, we’re thinking about design and tradeoffs.

Imagine that you’ve been asked to write an export utility for a large and complex database. The export has to read data from 30 or so tables (perhaps 100 columns are potentially written to each export record). Some of the exported data is written exactly as read from the database, but other exported data must be calculated. In addition, if certain flag fields have specific values, then additional data must be read from the database to complete an export row.

The export data must obviously be correct, but the client is also asking for a flexible solution; their world changes a lot.

One solution is to use existing business objects and existing persistence mechanisms, and to use higher-level classes to aggregate their results into a form that can be used to generate export rows. This higher level object could perform the calculations necessary for the virtual fields, and read in additional business objects if the flag fields dictate.

An alternative solution might be to read the data row at a time into a Hash using ad-hoc queries, keying the hash on the field names. A separate pass could then be made to perform any necessary calculations, storing the results back in to the same hash. Additional data could be read from the database if the flag fields are set, again storing the results in the hash. The contents of the hash are then used to write the export record, and we loop back to do the next row.

This kata is a thought experiment. What are the top three advantages and top three disadvantages of the two approaches? If you’re been using classes to hold data in your business applications, what would the impact be if you were to switch to hashes, and vice versa? Is this issue related to the static/dynamic typing debate?

Tags:

Comments

Add comment


(Will show your Gravatar icon)

  Country flag


  • Comment
  • Preview
Loading



Copyright © 2001-2012 MS Consulting, Inc. All Rights Reserved.