Tag Archives: Clojure

Arnoldc interpreter using Clojure instaparse

Wow, I didn’t realize I had taken such a long sabbatical since my last post. I haven’t let off the coding gas while I was away 🙂 One of the many things I have been working on is writing a Clojure Arnoldc interpreter. If you are unfamiliar with Arnoldc, check out https://github.com/lhartikk/ArnoldC. I came across Clojure's instaparse library during Advent of Code 2015, used it to solve day 7, and thought it was the perfect tool to accomplish my goal. Per the README on the project page, “Instaparse aims to be the simplest way to build parsers in Clojure” and I couldn’t agree more.

The project is made up of the following Clojure files, each covered below: lexr.clj (the parser) and interpreter.clj.

 

The lexr.clj (Parser)

The lexr file contains the code utilizing the instaparse library to transform Arnoldc strings into hiccup structures. Below is a brief description of the clojure vars and functions within the lexr and interpreter.

NOTE: I left out code that supported my need to capture the actual thrown parse exception instead of the default implemented within the library. Refer to http://stackoverflow.com/questions/26338945/how-to-test-for-texts-not-fitting-an-instaparse-grammar-clojure for further details.

ADDITIONAL NOTE: I use ,,, in place of an ellipsis, as commas are whitespace within Clojure 🙂 I also bold parentheses when they are part of Clojure code in the hope that it adds clarity.

 

(def tokens ,,,)

This contains a map of all the tokens of the Arnoldc language. I created it to show how easy it is to change the language's surface syntax while keeping the same functionality. At the end of the post I describe an example of this with a pig latin Arnoldc lexr and its test.

(defn arnold-grammar ,,,)

This function creates the string representation of the Arnoldc grammar. As clojure core doesn’t contain string interpolation, it is not the greatest looking… Combining the tokens with the string template yields the complete grammar string.
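
A minimal sketch of the idea, not the project's actual code: demo-tokens and demo-grammar are hypothetical names, and the three-rule grammar is made up for illustration. It shows how str can splice token values into a grammar string when no string interpolation is available.

```clojure
;; Hypothetical two-token map standing in for the real tokens var.
(def demo-tokens
  {:true  "@NO PROBLEMO"
   :false "@I LIED"})

(defn demo-grammar
  "Builds a grammar string by concatenating rule text with token values."
  [tokens]
  (str "bool = true | false\n"
       "true = '" (:true tokens) "'\n"
       "false = '" (:false tokens) "'\n"))

(println (demo-grammar demo-tokens))
```

Because the function only looks values up by key, any map with the same keys produces a grammar with the same shape.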

(def arnoldc (insta/parser arnold-grammar))

This is where the parsing magic happens.  Passing in a string will parse it into hiccup structures according to the Arnoldc grammar that I defined.

example usage is here
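
To give a feel for what instaparse returns, here is a small sketch using a made-up grammar (demo-parser and its rules are assumptions for illustration, not the Arnoldc grammar). It assumes instaparse is on the classpath.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical mini-grammar; angle brackets hide the literal keywords
;; from the parse tree.
(def demo-parser
  (insta/parser
    "assignment = <'SET '> name <' TO '> number
     name = #'[a-z]+'
     number = #'[0-9]+'"))

;; A successful parse is a hiccup-style vector tree.
(demo-parser "SET x TO 42")
;; => [:assignment [:name "x"] [:number "42"]]

;; A failed parse returns a failure object rather than throwing.
(insta/failure? (demo-parser "oops"))
;; => true
```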

(def transform-ops ,,,)

This map was created to transform the parsed elements that match clojure keywords. Early versions of the transform simply returned the values 0 and 1 for @I LIED and @NO PROBLEMO, which were not in the hiccup style [:key value]. I needed these specific elements to conform to a hiccup shape so that I could evaluate them in a consistent way when I wrote my interpreter. For clarity:

  • I originally returned 0, then changed the transform to return [:false 0]
  • I originally returned 1, then changed the transform to return [:true 1]

I found that the original version (returning just the value 0 or 1) made the interpreter much more complicated, as I had to check which kind of structure I was receiving after recursing down to the returned value. By adjusting the transform to return a hiccup structure, I was able to remove those complex validation checks, because the keyword tells me what I am dealing with. This allowed me to keep relying on multimethods that are guaranteed to get the correct structure and to use the same recursive function for all the tokens.
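
As a rough sketch of this kind of transform (demo-transform-ops and the :condition node are hypothetical, not the project's transform-ops), insta/transform replaces each matching node with whatever the mapped function returns, so returning a vector keeps the tree hiccup-shaped. It assumes instaparse is on the classpath.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical transform map: each function receives the node's children
;; and returns a replacement node that is still a [:key value] vector.
(def demo-transform-ops
  {:true  (fn [& _] [:true 1])
   :false (fn [& _] [:false 0])})

(insta/transform demo-transform-ops
                 [:condition [:true "@NO PROBLEMO"]])
;; => [:condition [:true 1]]
```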

 

(defn lights-camera-action ,,,) 

This wraps the Arnoldc instaparse parser and is what should be used to parse Arnoldc strings. 

 

The interpreter.clj

The interpreter is mostly a multimethod that dispatches on the first element within a hiccup structure (a clojure keyword). A global symbol-table is used to hold variables and state. I will describe some of the more interesting items below.

(def symbol-table (atom {}))

This atom is a map that holds the state for an Arnoldc program.

DISCLAIMER: I made a trade-off in my design with how I implemented garbage collection… I haven’t done it yet. Completing a program will clear the symbol-table; however, running a long enough loop that calls a function with parameters will eventually cause problems. Details on how variables are created can be found below in the (defn transform-method-variables ,,,) section.

(defmulti run (fn [s] (nth s 0)))

This multimethod is the engine that powers the interpreter. It dispatches on the keyword of each hiccup structure (a clojure vector), which is always at position 0 of the vector.
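
A small self-contained sketch of the same dispatch pattern, using a hypothetical evaluate multimethod and made-up node types rather than the interpreter's real run methods:

```clojure
;; Dispatch on the keyword at position 0 of each hiccup vector.
(defmulti evaluate (fn [node] (nth node 0)))

(defmethod evaluate :number [[_ n]] n)
(defmethod evaluate :true   [[_ v]] v)
(defmethod evaluate :false  [[_ v]] v)

;; Compound nodes recurse through the same multimethod.
(defmethod evaluate :plus [[_ left right]]
  (+ (evaluate left) (evaluate right)))

(evaluate [:plus [:number 41] [:true 1]])
;; => 42
```

Every node has the same shape, so the recursive calls never need to ask what kind of value they were handed.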

(defmethod run :Program ,,,)

The Arnoldc language allows methods to be declared before, after, or both before and after the main program. The let form (see below) separates out all the method declarations via the clojure group-by function so that I can define them (via run-statements, which dispatches to the multimethods) before they are called within the main Arnoldc program (via (run bmain)). bmain gets the statement for which the group-by predicate returns true (the :begin-main node), while method-des gets all the method declarations.

I reset the symbol-table when the program completes so that I don’t pollute subsequent runs. To see what can happen, comment out the (reset! symbol-table {}) line and run the tests :). You will find that the symbol table keeps the state around and causes problems between tests that use the same method declarations and variables.

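(defmethod run :Program [[e & rest :as statements]]
  ;; group-by splits the statements: key true -> the :begin-main node,
  ;; key false -> every method declaration
  (let [{[bmain] true method-des false}
        (group-by #(= :begin-main (first %)) rest)]
    (try
      (run-statements method-des) ; define the methods first
      (run bmain)                 ; then run the main program
      (catch Exception e (throw e))
      (finally
        ;; always clear state so later runs start clean
        (reset! symbol-table {})))))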
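
The group-by plus map-destructuring trick in the let form can be tried on its own. This is a sketch with made-up statements; split-program and demo-statements are hypothetical names.

```clojure
(def demo-statements
  [[:method-declaration "first"]
   [:begin-main "main body"]
   [:method-declaration "second"]])

(defn split-program
  "Separates the single :begin-main node from the method declarations."
  [statements]
  ;; group-by returns {true [...] false [...]}; destructure both keys,
  ;; taking the first (only) element of the true group.
  (let [{[bmain] true method-des false}
        (group-by #(= :begin-main (first %)) statements)]
    {:main bmain :methods method-des}))

(split-program demo-statements)
;; => {:main [:begin-main "main body"],
;;     :methods [[:method-declaration "first"] [:method-declaration "second"]]}
```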

(defn arithmetic-helper ,,,)

This function handles the Arnoldc logic and arithmetic operators. The thing to note is the code below: case is wrapped in a second set of parentheses in order to immediately call the function returned from choose-logic-op or choose-op. That function is invoked with operand and the result of (run varnum-node) as arguments.

In other words, case returns a function, which is then passed operand and the result of a multimethod dispatch against varnum-node. Higher order functions FTW!

(recur ( (case arith-key
           :logical-op (choose-logic-op operator)
           :arithmetic-op (choose-op operator)) 
         operand 
         (run varnum-node)) 
  rest)
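
Here is the same pattern in isolation, as a sketch: demo-choose-op and demo-arithmetic are hypothetical stand-ins, not the project's choose-op or arithmetic-helper.

```clojure
(defn demo-choose-op
  "Returns the arithmetic function for a (hypothetical) operator keyword."
  [operator]
  (case operator
    :plus  +
    :minus -
    :times *))

;; The extra pair of parentheses calls the function that the inner
;; expression returns.
((demo-choose-op :plus) 2 3)
;; => 5

(defn demo-arithmetic
  "Folds [operator operand] pairs into a running value, left to right."
  [start pairs]
  (reduce (fn [acc [operator operand]]
            ((demo-choose-op operator) acc operand))
          start
          pairs))

(demo-arithmetic 1 [[:plus 4] [:times 3]])
;; => 15
```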

(defn transform-method-variables ,,,)

This function is called by the :call-method multimethod. I used gensym and the method name to create a prefix to be passed to the transform-method-variables function. I did this because I am storing all the variables in the “global” symbol-table atom. This resolved an issue I encountered in recursive test cases, where I would stomp on variables declared in a method as it was called multiple times. I created an issue to clean this up.
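
A minimal sketch of the prefixing idea, under the assumption that variables are stored under string keys; demo-call-prefix, demo-scoped-key and demo-table are hypothetical and only illustrate how gensym keeps recursive calls from overwriting each other.

```clojure
(defn demo-call-prefix
  "Returns a fresh, unique prefix for one call of method-name."
  [method-name]
  (str (gensym (str method-name "_"))))

(defn demo-scoped-key
  "Qualifies a variable name with a call prefix."
  [prefix var-name]
  (str prefix "/" var-name))

(def demo-table (atom {}))

;; Two calls of the same method store `n` under different keys,
;; so the second call does not stomp on the first.
(let [call-1 (demo-call-prefix "factorial")
      call-2 (demo-call-prefix "factorial")]
  (swap! demo-table assoc (demo-scoped-key call-1 "n") 3)
  (swap! demo-table assoc (demo-scoped-key call-2 "n") 2)
  (count @demo-table))
;; => 2
```

Note that nothing in this sketch ever removes the entries, which is the same growth problem the disclaimer above describes.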

(defmethod run :call-method ,,,)

Most of the complexity of this code is in the handling of arguments passed to the method (if/when they are passed). new-meth-args gets the same treatment as the variables described in the transform-method-variables section above and gets a prefix.

(defn roll-credits ,,,)

This function is the preferred way to interpret Arnoldc code. See the tests for an example.

(roll-credits  "IT'S SHOWTIME
         HEY CHRISTMAS TREE var
         YOU SET US UP 123
         YOU HAVE BEEN TERMINATED" )

Pig Latin Arnoldc lexr

As mentioned in the (def tokens ,,,) section above, we have finally reached the section where I describe why I defined the tokens outside of the arnold-grammar function. The key to the transformation is the update-map function, which transforms all the values within a map.

(def pig-latin-arnoldc ,,,)

The following code will:

  1. use the update-map function to take the arnoldc tokens map and return a map with the same keys, but with new pig latin arnoldc values (the quotes that make up the language). The translate-to-pig-latin function splits a string on whitespace, maps the pig-latin function over the resulting strings, and then joins them together to form a pig latin string.
  2. pass the “new” map of the pig latin arnoldc language (described in #1) to the arnold-grammar function. Since the keywords stay the same, the function simply pulls out the new values mapped to the original arnoldc keys and inserts them into the grammar.
  3. finally, have insta/parser return an executable parser based on the pig latin grammar.
(def pig-latin-arnoldc
  (-> (update-map arnie/tokens translate-to-pig-latin);#1
      (arnie/arnold-grammar);#2
      (insta/parser)));#3
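
The helper functions are not shown in this post, so here is one way they could look. This is a sketch under assumptions: demo-update-map, demo-pig-latin and demo-translate are hypothetical names, the :print key is made up, and the pig latin rule (move leading non-vowels to the end and add AY, or add WAY when the word starts with a vowel) is inferred from the error message used later in the post.

```clojure
(require '[clojure.string :as string])

(defn demo-update-map
  "Applies f to every value of m, keeping the keys."
  [m f]
  (into {} (map (fn [[k v]] [k (f v)]) m)))

(defn demo-pig-latin
  "Moves leading non-vowels to the end and adds AY; words that start
  with a vowel get WAY instead."
  [word]
  (let [[_ lead remainder] (re-matches #"([^aeiouAEIOU]*)(.*)" word)]
    (if (empty? lead)
      (str word "WAY")
      (str remainder lead "AY"))))

(defn demo-translate
  "Splits on whitespace, translates each word, and joins with spaces."
  [s]
  (string/join " " (map demo-pig-latin (string/split s #"\s+"))))

(demo-update-map {:print "TALK TO THE HAND"} demo-translate)
;; => {:print "ALKTAY OTAY ETHAY ANDHAY"}
```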

(defn ights-camera-actionlay ,,,)

This function is the pig latinified lights-camera-action function from the arnoldc lexr.

  1. pass the pig latin parser and the pig latin string(s) represented by expr to the arnoldc lexr’s parser function.
  2. transform the parsed hiccup data structures that match the keys within the transform-ops from the arnoldc lexr. This continues to work without modification because the map keys remain the same as the original arnoldc definitions (we only updated the values :D).
  3. catch any thrown parse errors
(defn ights-camera-actionlay
  "interpret pig-latin text"
  [& expr]
  (try (->> (arnie/parser pig-latin-arnoldc (clojure.string/join expr)) ;#1
            (insta/transform arnie/transform-ops)) ;#2
       (catch Exception e ;#3
         (throw (Exception. (str "EREWHAY ETHAY UCKFAY IDDAY IWAY OGAY ONGWRAY?" (.getMessage e)))))))

Conclusion

I was really satisfied with how quickly instaparse allowed me to create an interpreter. The hiccup structures were easy to work with and clojure is a joy to code in. I will certainly be keeping it within my toolbox and I encourage you to try it out.

 

Clojure Soundex

In need of a quick program to force myself to dive into Clojure, I chose to implement a soundex program that I had once written in C++. It was a fun exercise to step back and look at how my thought process changed based on the language I used. Hope you find this useful.

 

;steps
;1 keep first letter
;2 replace consonants
;3 remove w and h
;4 two adjacent are same, letters with h or w separating are also the same
;5 remove vowels
;6 continue until 1 letter 3 nums
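; clojure.contrib.str-utils was retired after Clojure 1.2, so define an
; equivalent re-gsub (regex, replacement, string) on top of clojure.string
(require '[clojure.string :as string])
(defn re-gsub [re replacement s] (string/replace s re replacement))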

(defn trnsfrm [word]
  (->>
    (re-gsub #"(?i)[fbvp]" "1" word)
    (re-gsub #"(?i)[cgjkqsxz]" "2" ,,)
    (re-gsub #"(?i)[dt]" "3" ,,)
    (re-gsub #"(?i)[l]" "4" ,,)
    (re-gsub #"(?i)[mn]" "5" ,,)
    (re-gsub #"(?i)[r]" "6" ,,)))

(defn replace-adjacent [word]
  (->> (re-gsub #"(?i)[wh]" "" word)
       trnsfrm
       (re-gsub #"(?i)([a-z0-9])\1+" "$1")))

(defn pad [word] (subs (str word "0000") 0 4))

(defn do-soundex [word]
  (pad (str (first word)
            (re-gsub #"[aeiouy]" "" (subs (replace-adjacent word) 1)))))
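
To make the steps in the comments concrete, here is a trace of "Ashcroft" through the same pipeline. It is a sketch that uses clojure.string/replace directly; demo-codes, demo-encode and the step-N vars are hypothetical names introduced only for this walkthrough.

```clojure
(require '[clojure.string :as string])

;; Same letter groups as trnsfrm above.
(def demo-codes
  [[#"(?i)[bfpv]" "1"] [#"(?i)[cgjkqsxz]" "2"] [#"(?i)[dt]" "3"]
   [#"(?i)l" "4"] [#"(?i)[mn]" "5"] [#"(?i)r" "6"]])

(defn demo-encode
  "Replaces every coded consonant in word with its digit."
  [word]
  (reduce (fn [w [re digit]] (string/replace w re digit)) word demo-codes))

(def step-1 (string/replace "Ashcroft" #"(?i)[wh]" ""))       ; "Ascroft"
(def step-2 (demo-encode step-1))                              ; "A226o13"
(def step-3 (string/replace step-2 #"(.)\1+" "$1"))           ; "A26o13"
(def step-4 (string/replace (subs step-3 1) #"[aeiouy]" ""))  ; "2613"

;; Keep the first letter, append the digits, pad, and cut to 4 characters.
(subs (str "A" step-4 "0000") 0 4)
;; => "A261"
```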

Update: Refactored version
Not quite happy with the above example, I decided to see if I could refactor my code. Below is what I came up with (four fewer lines of code).

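; clojure.contrib.str-utils was retired after Clojure 1.2, so define an
; equivalent re-gsub (regex, replacement, string) on top of clojure.string
(require '[clojure.string :as string])
(defn re-gsub [re replacement s] (string/replace s re replacement))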

(def re-map {#"(?i)[fbvp]"     "1"
             #"(?i)[cgjkqsxz]" "2"
             #"(?i)[dt]"       "3"
             #"(?i)[l]"        "4"
             #"(?i)[mn]"       "5"
             #"(?i)[r]"        "6"})

(defn trns [word] (map #(re-gsub (key %1) (val %1) word) re-map))

(defn pad [word] (subs (str word "0000") 0 4))

(defn rm1 [word] (apply str (drop 1 word)))

(defn do-soundex [word]
  (pad (str (first word)
            (->> (re-gsub #"(?i)[^aeiou\d]" "" (apply str (apply interleave (trns word))))
                 (re-gsub #"(?i)([a-z\d])\1+" "$1")
                 rm1
                 (re-gsub #"(?i)[a-z]" "")))))

 

UPDATE 2: Always one to revisit my previous work, I thought I would try a different approach.

Now for the test cases

;;;Start test
(= (do-soundex "Ashcroft") "A261")
(= (do-soundex "Ashcraft") "A261")
(= (do-soundex "Tymczak") "T522")
(= (do-soundex "Pfister") "P236")
(= (do-soundex "lukaskiewicz") "l222")
(= (do-soundex "Rubin") "R150")
(= (do-soundex "Rupert") "R163")
(= (do-soundex "Robert") "R163")
(= (do-soundex "Vazquez") "V220")

;;;end test

POP3 Gmail access with Clojure and JavaMail

I recently had the need to access Gmail using Clojure. I used JavaMail to accomplish this via POP3. Below is some code that I wrote to help me get emails. Hope you find it useful. Enjoy 🙂


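; clojure.contrib.duck-streams was retired after Clojure 1.2; this helper
; stands in for its read-lines using clojure.java.io
(require '[clojure.java.io :as io])
(defn read-lines [in] (line-seq (io/reader in)))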
(def props (System/getProperties))
; Get the default Session object.
(def session (javax.mail.Session/getDefaultInstance props))

; Get a Store object that implements the specified protocol.
(def store (.getStore session "pop3s"))

;Connect to the current host using the specified username and password.
(.connect store "pop.gmail.com" "username@gmail.com" "password")

;Create a Folder object corresponding to the given name.
(def folder (. store getFolder "inbox"))

; Open the Folder.
(.open folder (javax.mail.Folder/READ_ONLY ))
; Get the messages from the server
(def messages (.getMessages folder))

(defn getFrom [message] (javax.mail.internet.InternetAddress/toString (.getFrom message)))
(defn getReplyTo [message] (javax.mail.internet.InternetAddress/toString (.getReplyTo message)))
(defn getSubject [message] (.getSubject message))

;print out the body of the message
(for [m messages] (read-lines (.getInputStream m)))

;;;;;code for sending an email

(def props (System/getProperties))
(. props put "mail.smtp.host", "smtp.gmail.com")
(. props put "mail.smtp.port", "465")
(. props put "mail.smtp.auth", "true")
(. props put "mail.transport.protocol", "smtps")

(def session (javax.mail.Session/getDefaultInstance props nil))
(def msg (javax.mail.internet.MimeMessage. session))
(. msg setFrom (javax.mail.internet.InternetAddress. "sender@gmail.com"))
(. msg addRecipients javax.mail.Message$RecipientType/TO
"receiver@gmail.com")

(. msg setSubject "i am the subject")
(. msg setText "I am the body!!!")

(. msg setHeader "X-Mailer", "msgsend")
(. msg setSentDate (java.util.Date.))

; send the email
(def transport (. session getTransport))
(. transport connect "smtp.gmail.com" 465 "sender@gmail.com" "password")
(. transport sendMessage msg (. msg getRecipients javax.mail.Message$RecipientType/TO))
(. transport close)

(def Bonjour-Clojure “Welcome to functional programming”)

After the briefest of introductions to functional programming in college (à la Lisp) and dabbling with Scala, I took the functional plunge and started using Clojure recently. At this point, I have only written a couple of small programs and haven’t formed much of an opinion on how it stacks up against my current favorite language (Groovy). This post follows my usual format of getting-started snippets for a new language. I plan to write more entries as I get more familiar with the language. On to the code!


;binding
user=> (def Bonjour-Clojure "Welcome to functional programming")
#'user/Bonjour-Clojure
user=> Bonjour-Clojure
"Welcome to functional programming"

;items in a vector can be separated via a comma or white space
user=> (= [ 1 2 3] [1,2,3])
true

;count the number of consonants in a string
(defn count-consonants [string] (count (re-seq #"[^aeiouAEIOU\s]" string)))
user=> (count-consonants "writing code is fun")
10

;count the number of vowels in a string (whitespace is not counted)
(defn count-vowels [string] (count (re-seq #"[aeiouAEIOU]" string)))
user=> (count-vowels "lukaskiewicz")
5

;read a file into a list.. any suggestions on other ways are welcome 🙂
;usage (file-lines "string_path_to_file") or to read a webpage (file-lines "http://javazquez.com")
;doall realizes the lines before the reader closes, keeping order and duplicates
(defn file-lines [file] (with-open [rdr (clojure.java.io/reader file)] (doall (line-seq rdr))))

;view an object's class
user=> (class "Im a string")
java.lang.String

;length of string
user=> (count "I am 18 chars long")
18

user=> (range 1 9)
(1 2 3 4 5 6 7 8)

;repeat a digit
user=> (repeat 4 3)
(3 3 3 3)

;list comprehension
user=> (for [fruit ["apple" "orange" "grape"]] (str fruit))
("apple" "orange" "grape")

;use map to create a new list… #() is a shortcut for an anonymous function
user=> (map #(* 2 %1) [1 2 3 4])
(2 4 6 8)

; also an anonymous function
user=> (map (fn [item] (* 2 item)) [1 2 3 4])
(2 4 6 8)

;simple filter example on a list using odd?
user=> (filter odd? [1, 2,3,4,5])
(1 3 5)

;factorial using reduce
user=> (reduce * [1 2 3])
6

;if statement
user=> (if true (str "i am true") (str "i am false"))
"i am true"
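
The reduce snippet above multiplies a fixed vector. As a small sketch, it can be generalized into a function over any n; the name factorial is my own choice here, not something defined earlier in the post.

```clojure
(defn factorial
  "n! as a reduce over 1..n; (reduce * []) is 1, so 0! works too."
  [n]
  (reduce * (range 1 (inc n))))

(factorial 3)
;; => 6

(factorial 5)
;; => 120
```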