Code and Data
Later: Color Notations
Earlier: (oodles of delicious atoms)
Introduction
This article was adapted from discussions we had at the weekly Clojure study group at OpinionLab, in June, 2018.
Clojure is a Lisp. Code is data in Lisp, and data can also be code. Lisp code consists of lists of symbols (or atoms, as they are called in other Lisps), other data primitives such as numbers and keywords, or (nested) lists of the same.1
The first item in an expression (list) is the name of an operation,
usually a function; the rest of the items are the arguments. When
lists are nested, the inner expressions are evaluated first: in (+ 1
(* 2 3))
, the multiplication happens before the addition.
Somewhat less commonly, instead of function names, the first item in
the list may be the name of a macro or so-called "special form." In
these cases, the exact method of evaluation depends on the
implementation of the macro or form. For example, in (dotimes [_ 10]
(println 'hi))
, the println
is evaluated 10 times; if dotimes
were a function, println
would be evaluated exactly once.
To prevent evaluation of an expression, one can quote
it:
(quote (+ 1 (* 2 3)))
;;=>
(+ 1 (* 2 3))
The single quote (') character is an abbreviation for quote
:
'(+ 1 (* 2 3))
;;=>
(+ 1 (* 2 3))
The REPL will evaluate expressions unless you quote them; if you
decide later you want to evaluate an expression, you can use eval
:
(def expr '(+ 1 (* 2 3)))
(eval expr)
;;=>
7
Quoting and evaluation are inverse operations:
(eval (eval (eval (eval ''''(+ 1 (* 2 3))))))
;;=>
7
Here are some examples you should try at the REPL, to help build an understanding of these two operations:
(+ 1 1)
'+
+
'(+ 1 1)
(cons '+ '(1 1))
(eval (cons '+ '(1 1)))
(first '(+ - * /))
(cons (first '(+ - * /)) '(1 1))
(eval (cons (first '(+ - * /)) (rest '(not-a-number 1 1))))
(->> '(not-a-number 1 1)
rest
(cons (first '(+ - * /)))
eval)
Try to understand each of these as best you can, and try some
variations. The last expression is the same as the one prior to it,
just rewritten using the thread-last (->>
) macro.
Arithmetic of the Fittest
The close relationship of code and data suggests making new code on the fly to solve problems, which is a concept used in genetic algorithms. Here is an example inspired by GAs. Let's write a function which finds expressions involving the four arithmetic operators that return the number 2. Examples:
(+ 1 1)
(- 4 2)
(* 1 2)
(/ 2 1)
First, let's make our expression generator (type or copy-paste into your REPL):
(defn oper [] (rand-nth '(* / - +)))
(defn digit [] (rand-int 10))
(defn numlist []
(let [numnums (rand-int 10)]
(repeatedly numnums digit)))
(defn expr-fn []
(cons (oper) (numlist)))
Trying these out:
(oper) ;;=> *
(oper) ;;=> /
(numlist) ;;=> (6)
(numlist) ;;=> (6 9 3 4 4 2 4 3 2)
(expr-fn) ;;=> (/ 6 1 3 7 0 3 0 9)
(expr-fn) ;;=> (- 5 7 2)
Our expression generator selects an arithmetic operator at random, and tacks it on to the beginning of a random list of 0-9 digits.
Evaluating some example functions:
(eval (expr-fn)) ;;=> 0
(eval (expr-fn)) ;;=> 41
(eval (expr-fn)) ;;=> 7/1152
(eval (expr-fn)) ;;=> java.lang.ArithmeticException stacktrace!?!?!
This last one shows a problem: some of our expressions will not be
valid mathematically (for example, (/ 1 0)
). Let's catch any exceptions:
(defn eval-with-catch-exception [expr]
(try
(eval expr)
(catch Throwable t
(-> t .getClass .getSimpleName symbol))))
This function attempts to evaluate expr
and, if the code throws an
exception, simply returns the exception name as a symbol.
We are now in a position to try our code generator out:
(dotimes [_ 10]
(let [e (expr-fn)]
(println e (eval-with-catch-exception e))))
which gives
(/ 9 8 7 0 0 7 2 2 6) ArithmeticException
(+ 1 3 3 0 7 7 6) 27
(+ 4 2) 6
(- 4 1 2 1 0 0 0) 0
(/ 5 7 0 7) ArithmeticException
(* 0 4 6) 0
(* 8 6 7 2 4 1) 2688
(* 7 4 6 9 8 6) 72576
(*) 1
(- 7) -7
(*
called with no arguments returns 1; +
with no arguments returns 0. This may or
may not make intuitive sense to you, depending on how much math you've had in
school.)
Let's generate some that return the desired result, 2:
(distinct
(for [_ (range 1000)
:let [f (expr-fn)
result (eval-with-catch-exception f)
fitness-fn #(= result 2)]
:when (fitness-fn)]
f))
;;=>
((+ 2)
(- 9 1 1 1 4)
(- 2 0)
(/ 4 2)
(/ 2 1)
(* 2))
(Note that (+ n)
and (* n)
both just return n
.)
Macros
Macros are an advanced topic, but they are relevant because they allow
(among other things) selective evaluation of arguments. Arguments to a
macro can be evaluated zero, one, or many times; this makes macros
especially helpful for implementing control flow: loop
, for
,
while
etc. are all macros.
Macros rely on code-as-data: they essentially functions which generate code, which is then evaluated. Most or all modern Lisps have macros, the equivalent of which are very rare in non-Lisps. These macros play a large role in making Lisp so expressive.
Exercises
Exercise 1: Creating and evaluating code on the fly in this manner is not unique to Lisp, but Lisp provides a very natural way to do so. Consider how to do the above examples in JavaScript, Ruby, or other mainstream languages. What, if any, are the differences between Clojure/Lisp and those languages?
Exercise 2: Set a different arithmetic goal (other than 2
) and
generate expressions which satisfy it.
Exercise 3: Adapt your code generator to include nested expressions,
so that some of the numbers are generated expressions in their own
right. Example: (+ 1 4) -> (+ 1 (* 2 2))
.
Exercise 4: Instead of generating expressions, adapt expr-fn
to
return functions of no arguments, and adapt the rest of the code to
call and evaluate those functions. I.e., (+ 1 2 3)
becomes (fn []
(+ 1 2 3))
.
Exercise 5 (advanced): Both the generated functions and the "fitness function"
fitness-fn
can be made more complex; for example, in addition to the
operators + - * /
and the numbers 0-9, you could add an "argument"
variable x
and change the fitness function to expect expressions
that return the square of x
; some acceptable outputs would
then be (* x x)
and (* 1 x x 1)
.
Change your expression generator and fitness functions to implement the squaring function.
Exercise 6 (advanced): try adding other Clojure core functions and data types to the list of possible operators and arguments; try other fitness functions as well.
Exercise 7 (more advanced): develop a way of combining expressions
and evaluating the fitness of the resulting expression. To build on
our previous example, if you have the two lists (+ 1 1)
and (* 2
3)
, you could choose terms at random from each list, e.g. (* 1 3)
;
you must decide how to handle lists with different lengths. The
fitness function could just be fitness-fn
from our example, or a
more ambitious one2. If you want to include nested
expressions, then you have to decide how to combine those.
Rather than success or failure, the fitness function could return a number which reflects how well or poorly the expression performed. A strategy from genetic algorithms is to generate many expressions, evaluate these, take the best performers, and then generate new expressions by combining these. Implement this.
See Also
- This freaking amazing talk3 given by William Byrd and Greg Rosenblatt on program synthesis, given at Clojure/conj 2016.
- Genetic Algorithms on Wikipedia4.
- Colin Jones's excellent book on macros5.
Later: Color Notations
Earlier: (oodles of delicious atoms)