Thursday 2 September 2010

Baby steps in Haskell

After lots of reading and sometimes comprehending over an embarrassingly long time, I now have a very limited but actually working piece of code in a functional language.

Usually, when trying to learn a new language, the hardest thing is deciding on a vaguely pointful thing to build as the learning experience. Not this time. I wanted to create a library for dealing with DICOM medical imaging data. The format's complex enough (lots of optional stuff, various encoding schemes, heirarchical object layout, etc) to be interesting and there are few really good parsers out there for it in procedural languages I'm used to.

There probably still won't be a good one for Haskell when I've finished but I'm hoping to have learned a lot. It's certainly been very hard going getting even this far but the following is a fragment of the output from my first efforts:


Prelude: 0000000000000000000000000 ... 000
Magic: DICM
Metadata:
[ (0002,0000) UL [4] "\192\NUL\NUL\NUL"
, (0002,0001) OB [2] "\NUL\SOH"
, (0002,0002) UI [26] "1.2.840.10008.5.1.4.1.1.4\NUL"
, (0002,0003) UI [56] "1.3.12.2.1107.5.99.2.1951.30000009050612303648400002557\NUL"
, (0002,0010) UI [20] "1.2.840.10008.1.2.1\NUL"
, (0002,0012) UI [20] "1.3.12.2.1107.5.99.2"
, (0002,0013) SH [16] "XXXXXXX"
]
DICOM:
[ (0008,0005) CS [10] "ISO_IR 100"
, (0008,0008) CS [22] "ORIGINAL\\PRIMARY\\M\\ND "
, (0008,0012) DA [8] "2009XXXX"
, (0008,0013) TM [14] "140340.156000 "
, (0008,0016) UI [26] "1.2.840.10008.5.1.4.1.1.4\NUL"
.
.
.
]


Doesn't look like much but it's decoded the group and element info (0002,0001) etc, the value representation (UI, CS, OB, etc) and correctly parsed the right number of bytes for the value. It's also split the file into the encapsulation (prelude, magic and metadata) and the actual DICOM object.

Ok, there's no error handling, it only knows one encoding scheme, doesn't grok heirarchical objects and there's no error handling but it's only 140 lines of code. A good third of that formats it nicely for printing on the console though so under 100 lines for a parser just blows me away. For what it does, that's probably excessively verbose but first it needs to work, then it can be refined and finally, if necessary, it can be made faster.

I don't think I've ever been so pleased with so little code before :o Haskell is very expressive but OMG has it been hard work learning to write even this little bit :/

1 comment:

  1. Looks good. Would it be possible for you to share the code? Thanks,
    MR

    ReplyDelete