Clojure Soundex
2012-08-28
In need of a quick program to force myself to dive into Clojure, I chose to implement a Soundex program that I had once written in C++. It was a fun exercise to step back and see how my thought process changed with the language I used. I hope you find this useful.
;steps
;1 keep first letter
;2 replace consonants with digits
;3 remove w and h
;4 collapse adjacent identical codes; codes separated only by h or w also count as adjacent
;5 remove vowels
;6 pad or truncate to 1 letter and 3 digits
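(use 'clojure.contrib.str-utils) ; provides re-gsub

;; Replace each consonant with its Soundex digit.
(defn trnsfrm [word]
  (->>
    (re-gsub #"(?i)[fbvp]" "1" word)
    (re-gsub #"(?i)[cgjkqsxz]" "2" ,,)
    (re-gsub #"(?i)[dt]" "3" ,,)
    (re-gsub #"(?i)[l]" "4" ,,)
    (re-gsub #"(?i)[mn]" "5" ,,)
    (re-gsub #"(?i)[r]" "6" ,,)))

;; Drop w and h, encode the consonants, then collapse runs of the same character.
(defn replace-adjacent [word]
  (->> (re-gsub #"(?i)[wh]" "" word)
       trnsfrm
       (re-gsub #"(?i)([a-z0-9])\1+" "$1")))

;; Pad with zeros and truncate to four characters.
(defn pad [word] (subs (str word "0000") 0 4))

;; Keep the first letter, then strip vowels from the encoded remainder.
(defn do-soundex [word]
  (pad (str (first word)
            (re-gsub #"[aeiouy]" "" (subs (replace-adjacent word) 1)))))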
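For anyone on a newer Clojure, where the old monolithic clojure-contrib is no longer available, here is a minimal sketch of the same pipeline using clojure.string/replace from the standard library. It is a port under assumptions, not a drop-in replacement: the names encode-consonants, collapse-adjacent and soundex are made up for this sketch, and because clojure.string/replace takes the string as its first argument, thread-first (->) stands in for ->>.

```clojure
(require '[clojure.string :as string])

(defn encode-consonants
  "Replace each consonant with its Soundex digit (case-insensitive)."
  [word]
  (-> word
      (string/replace #"(?i)[bfpv]" "1")
      (string/replace #"(?i)[cgjkqsxz]" "2")
      (string/replace #"(?i)[dt]" "3")
      (string/replace #"(?i)l" "4")
      (string/replace #"(?i)[mn]" "5")
      (string/replace #"(?i)r" "6")))

(defn collapse-adjacent
  "Drop w and h, encode consonants, then squeeze runs of the same character."
  [word]
  (-> word
      (string/replace #"(?i)[wh]" "")
      encode-consonants
      (string/replace #"(?i)([a-z0-9])\1+" "$1")))

(defn pad
  "Pad with zeros and truncate to four characters."
  [code]
  (subs (str code "0000") 0 4))

(defn soundex
  "Hypothetical name: first letter plus the encoded, vowel-free remainder."
  [word]
  (pad (str (first word)
            (string/replace (subs (collapse-adjacent word) 1) #"[aeiouy]" ""))))

(soundex "Ashcroft") ; => "A261"
(soundex "Pfister")  ; => "P236"
```

Like the code it is ported from, this sketch also strips a leading w or h before dropping the first character, so a name such as "Wright" comes out as "W230" rather than the usual "W623", and an empty string throws in subs.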
UPDATE 1: Refactored version
Not quite happy with the example above, I decided to see if I could refactor it. Below is what I came up with (four fewer lines of code).
(use 'clojure.contrib.str-utils)

;; Each regex maps a group of consonants to its Soundex digit.
(def re-map {#"(?i)[fbvp]" "1", #"(?i)[cgjkqsxz]" "2", #"(?i)[dt]" "3",
             #"(?i)[l]" "4", #"(?i)[mn]" "5", #"(?i)[r]" "6"})

;; One copy of the word per rule, each with only that rule applied.
(defn trns [word] (map #(re-gsub (key %1) (val %1) word) re-map))

(defn pad [word] (subs (str word "0000") 0 4))

(defn rm1 [word] (apply str (drop 1 word)))

(defn do-soundex [word]
  (pad (str (first word)
            (->> (re-gsub #"(?i)[^aeiou\d]" "" (apply str (apply interleave (trns word))))
                 (re-gsub #"(?i)([a-z\d])\1+" "$1")
                 rm1
                 (re-gsub #"(?i)[a-z]" "")))))
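The interleave step in do-soundex is easier to follow with the intermediate values in view. The sketch below is an illustration, not the code above: it assumes clojure.string/replace instead of re-gsub, and it uses an ordered vector of pairs so the trace is deterministic. The names rules, variants, interleaved and squeeze are hypothetical.

```clojure
(require '[clojure.string :as string])

;; Ordered [regex digit] pairs (an assumption of this sketch; the post uses a map).
(def rules
  [[#"(?i)[bfpv]" "1"] [#"(?i)[cgjkqsxz]" "2"] [#"(?i)[dt]" "3"]
   [#"(?i)l" "4"] [#"(?i)[mn]" "5"] [#"(?i)r" "6"]])

(defn variants
  "One copy of the word per rule, each with only that rule applied."
  [word]
  (map (fn [[re digit]] (string/replace word re digit)) rules))

(defn interleaved
  "Weave the variants together character by character."
  [word]
  (apply str (apply interleave (variants word))))

(defn squeeze
  "Keep only vowels and digits, then collapse runs of the same character."
  [word]
  (-> (interleaved word)
      (string/replace #"(?i)[^aeiou\d]" "")
      (string/replace #"(?i)([a-z\d])\1+" "$1")))

(variants "Rubin")    ; => ("Ru1in" "Rubin" "Rubin" "Rubin" "Rubi5" "6ubin")
(interleaved "Rubin") ; => "RRRRR6uuuuuu1bbbbbiiiiiinnnn5n"
(squeeze "Rubin")     ; => "6u1i5"
```

Each consonant position keeps only its digit, each vowel position keeps six copies that the squeeze collapses to one, and w, h and y vanish entirely. Dropping the first character and the remaining letters leaves "15", which pads to "R150".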
UPDATE 2: Always one to revisit my previous work, I thought I would try a different approach.
Now for the test cases
;;;Start test
(= (do-soundex "Ashcroft") "A261")
(= (do-soundex "Ashcraft") "A261")
(= (do-soundex "Tymczak") "T522")
(= (do-soundex "Pfister") "P236")
(= (do-soundex "lukaskiewicz") "l222")
(= (do-soundex "Rubin") "R150")
(= (do-soundex "Rupert") "R163")
(= (do-soundex "Robert") "R163")
(= (do-soundex "Vazquez") "V220")
;;;end test
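Evaluating each comparison by hand gets tedious, so here is a small table-driven sketch that reports only the mismatches. The names expected and failures are hypothetical, and the function under test is passed in as an argument so the same table can be run against any of the versions.

```clojure
;; Expected codes copied from the test cases above.
(def expected
  {"Ashcroft" "A261", "Ashcraft" "A261", "Tymczak" "T522",
   "Pfister" "P236", "lukaskiewicz" "l222", "Rubin" "R150",
   "Rupert" "R163", "Robert" "R163", "Vazquez" "V220"})

(defn failures
  "Return [word expected actual] for every case where soundex-fn disagrees."
  [soundex-fn]
  (for [[word code] expected
        :let [actual (soundex-fn word)]
        :when (not= code actual)]
    [word code actual]))
```

Calling (failures do-soundex) should return an empty sequence when every case passes; anything else lists the word, the expected code and the code actually produced.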