Meta-Programming with Scala: Conditional Compilation and Loop Unrolling

29 10 2008

The kind of comments I keep getting on my static meta-programming with Scala blogs are often along the lines of: “The ability to encode Church Numerals in Scala still seems uselessly academic to me, but cool none-the-less”. In this blog I will show how meta-programming can be applied in practice – at least theoretically.

In my previous blogs I introduced a technique for static meta-programing with Scala. This technique uses Scala’s type system to encode values and functions on these values. The Scala compiler acts as interpreter of such functions. That is, the type checking phase of the Scala compiler actually carries out calculations like addition and multiplication.

In this post I show how to apply meta-programming for two practical problems: conditional compilation and loop unrolling. The examples make use of type level encoding for booleans and natural numbers. While I introduced an encoding for natural numbers before, I use an alternative method which is more powerful in this post. Previously it was not possible to build nested expressions having expressions as operands themselves. The new encoding supports such expressions. However, in general the new encoding depends on the -Yrecursion compiler flag which is experimental and as of now only available in the Scala trunk. The type level encoding for booleans is along the same lines as the one for natural numbers.

Conditional Compilation

Conditional compilation is useful for example for enabling or disabling debugging or logging statements. Ideally code which is excluded by a compile time condition does not have any effect on the run-time behavior of the rest of the code. That is, the rest of the code behaves exactly as if the excluded code were not there at all. Optimizing compilers generally remove code which is unreachable. This is where meta-programming fits in: type level encoded functions (meta-functions) are evaluated at run-time. The result of the evaluation is a type. Now we only need to trick the compiler into compiling a block of code or not compiling it depending on that type.

Lets first define meta-booleans and some operations on them (full code here):

object Booleans {
  trait BOOL {
    type a[t <: BOOL, f <: BOOL&#93; <: BOOL
    type v = a&#91;TRUE, FALSE&#93;
  }
  final class TRUE extends BOOL {
    type a&#91;t <: BOOL, f <: BOOL&#93; = t
  }
  final class FALSE extends BOOL{
    type a&#91;t <: BOOL, f <: BOOL&#93; = f
  }
  trait IF&#91;x <: BOOL, y <: BOOL, z <: BOOL&#93; extends BOOL {
    type a&#91;t <: BOOL, f <: BOOL&#93; = x#a&#91;y, z&#93;#a&#91;t, f&#93;
  }
  trait NOT&#91;x <: BOOL&#93; extends BOOL {
    type a&#91;t <: BOOL, f <: BOOL&#93; = IF&#91;x, FALSE, TRUE&#93;#a&#91;t, f&#93;
  }
  trait AND&#91;x <: BOOL, y <: BOOL&#93; extends BOOL {
    type a&#91;t <: BOOL, f <: BOOL&#93; = IF&#91;x, y, FALSE&#93;#a&#91;t, f&#93;
  }
  trait OR&#91;x <: BOOL, y <: BOOL&#93; extends BOOL {
    type a&#91;t <: BOOL, f <: BOOL&#93; = IF&#91;x, TRUE, y&#93;#a&#91;t, f&#93;
  }

  // aliases for nicer syntax
  type !&#91;x <: BOOL&#93; = NOT&#91;x&#93;
  type ||&#91;x <: BOOL, y <: BOOL&#93; = OR&#91;x, y&#93;
  type &&&#91;x <: BOOL, y <: BOOL&#93; = AND&#91;x, y&#93;
}
&#91;/sourcecode&#93;

The following pre-processor object contains an <a href="http://www.scala-lang.org/node/114">implicit</a> method for converting a value of type <em>TRUE</em> to an <em>Include</em> object whose <em>apply</em> method executes a block of code. Similarly it contains an implicit method for converting a value of type <em>FALSE</em> to an <em>Exclude</em> object whose <em>apply</em> method simply does nothing. The strange line where <em>null</em> is being cast to <em>B</em> is a trick for getting a witnesses of a value of type <em>B</em>.


object PreProc {
  def IF[B] = null.asInstanceOf[B]

  object Include {
    def apply(block: => Unit) {
      block
    }
  }

  object Exclude {
    def apply(block: => Unit) { }
  }

  implicit def include(t: TRUE) = Include
  implicit def exclude(f: FALSE) = Exclude
}

Using these definitions is quite convenient now:

object IfDefTest {
  import PreProc._

  type LOG = TRUE
  type ERR = TRUE
  type WARN = FALSE

  def errTest() {
    IF[(LOG && ERR)#v] {
      println("err")
    }
  }

  def warnTest() {
    IF[(LOG && WARN)#v] {
      println("warn")
    }
  }

  def main(args: Array[String]) {
    errTest()
    warnTest()
  }
}

Running the above code will print err but wont print warn to the console.

Loop Unrolling

Another application for static meta-programming is loop unrolling. When the number of iterations of a loop is small and only depends on quantities known at compile time, run time performance might profit from unrolling that loop. Instead of resorting to copy paste, we can use similar techniques like above.

Again let’s first define meta-naturals and their operations (full code here):

object Naturals {
  trait NAT {
    type a[s[_ <: NAT&#93; <: NAT, z <: NAT&#93; <: NAT
    type v = a&#91;SUCC, ZERO&#93;
  }
  final class ZERO extends NAT {
    type a&#91;s&#91;_ <: NAT&#93; <: NAT, z <: NAT&#93; = z
  }
  final class SUCC&#91;n <: NAT&#93; extends NAT {
    type a&#91;s&#91;_ <: NAT&#93; <: NAT, z <: NAT&#93; = s&#91;n#a&#91;s, z&#93;&#93;
  }
  type _0 = ZERO
  type _1 = SUCC&#91;_0&#93;
  type _2 = SUCC&#91;_1&#93;
  type _3 = SUCC&#91;_2&#93;
  type _4 = SUCC&#91;_3&#93;
  type _5 = SUCC&#91;_4&#93;
  type _6 = SUCC&#91;_5&#93;

  trait ADD&#91;n <: NAT, m <: NAT&#93; extends NAT {
    type a&#91;s&#91;_ <: NAT&#93; <: NAT, z <: NAT&#93; = n#a&#91;s, m#a&#91;s, z&#93;&#93;
  }
  trait MUL&#91;n <: NAT, m <: NAT&#93; extends NAT {
    trait curry&#91;n&#91;_&#91;_&#93;, _&#93;, s&#91;_&#93;&#93; { type f&#91;z&#93; = n&#91;s, z&#93; }
    type a&#91;s&#91;_ <: NAT&#93; <: NAT, z <: NAT&#93; = n#a&#91;curry&#91;m#a, s&#93;#f, z&#93;
  }

  // aliases for nicer syntax
  type +&#91;n <: NAT, m <: NAT&#93; = ADD&#91;n, m&#93;
  type x&#91;n <: NAT, m <: NAT&#93; = MUL&#91;n, m&#93;
}
&#91;/sourcecode&#93;

The pre-processor object defines a trait <em>Loop</em> having an <em>apply</em> method which takes a block of code as argument. Again there are two implicit conversion methods. One which converts the zero type to a <em>Loop</em> with an empty <em>apply</em> function. An another one which convert the type <em>N + 1</em> to a a <em>Loop</em> with an <em>apply</em> function which executes the block once and then applies itself to the type <em>N</em>.


object PreProc {
  def LOOP[N] = null.asInstanceOf[N]

  trait Loop[N] {
    def apply(block: => Unit)
  }

  implicit def loop0(n: ZERO) = new Loop[ZERO] {
    def apply(block: => Unit) { }
  }

  implicit def loop[N <: NAT&#93;(n: SUCC&#91;N&#93;)(implicit f: N => Loop[N]) = new Loop[SUCC[N]] {
    def apply(block: => Unit) {
      block
      null.asInstanceOf[N].apply(block)
    }
  }
}

Again using this is easy and convenient:

object LoopUnroll {
  import PreProc._

  def unrollTest() {
    // The following line needs the -Yrecursion 1 flag
    // LOOP[(_3 x _2)#v] {
    LOOP[_6] {
      println("hello world")
    }
  }

  def main(args: Array[String]) {
    unrollTest()
  }
}

The above code prints the string “hello word” six times to the console.

Conclusion

Scala’s type system is powerful enough for encoding commonly encountered functions. Together with Scala’s strong capability for creating internal DSLs, this results in convenient techniques for static meta-programming. Such techniques can be applied to practical problems – at least in theory. In practice Scala’s support is not (yet?) there. For one the technique presented here depends on an experimental compiler flag (-Yrecursion). Further the types required for meta-programming might causes an exponential growth in compilation time which is not desirable. And finally an analysis with c1visualizerwith showed, that the compiler seems not to remove all unnecessary calls.

Actions

Information

Date : October 29, 2008
Tags: Meta-Programming, Scala
Categories : Uncategorized

14 responses

30 10 2008

Lachlan O'Dea (04:08:28) :

You’re blowing my mind with this stuff. That’s good…

30 10 2008

Daniel Spiewak (16:59:08) :

Wow! Really, solidly interesting stuff. I do see a few problems though…

First, neither of these techniques are actually performed inline. In other words, with the conditional compilation, the compiler will still emit bytecode which creates a thunk for a block (not the cheapest operation since it does require a dedicated class) and then passes that value to a static method. This is hardly a performance improvement over if (LOG) log.write(…). It *could* be, but I think only if the compiler actually recognized the general pattern of calling apply on an object passing a block created on the spot. This wouldn’t be a tough optimization to write, but as you pointed out, it doesn’t really exist yet.

Loop “unrolling” is even worse. You’re sacrificing the flexibility of using a runtime value for your loop and trading it for a statically defined constant. I would argue such a technique is of limited utility (unlike conditional compilation, which I think is great). However, the lack of flexibility isn’t really the bad part. The more serious issue is the loop isn’t really being “unrolled”. Intuitively, your LOOP[_6] { println(“Hello”) } should become:

println(“Hello”)
println(“Hello”)
println(“Hello”)
println(“Hello”)
println(“Hello”)
println(“Hello”)

However, the compiler is actually going to emit something like the following:

val thunk = new PrintlnThunkClass00abc(this)
PreProc.loop(thunk)(PreProc.loop(_)(PreProc.loop(_)(PreProc.loop(_)(…..

While the calls within #loop are *technically* tail-recursive, the compiler can’t optimize it because it is a dynamic dispatch. In other words, not only is this very slow (far slower than a for-comprehension), but it also breaks the contract for a loop in that you could easily overflow the stack if you go too far.

This is all a really intensely cool experiment, but I think it’ll need a little magic from the Scala compiler team before it is seriously useful in the real world. As I said above, it wouldn’t be too hard for scalac to recognize the pattern produced by the conditional compilation trick. Looping is a different matter, and to be honest I’m not sure how feasible it would be to optimize that one.

Props again on a really neat post, but I don’t think we’re quite up to “real-world useful” just yet. 🙂

30 10 2008

michid (19:50:43) :

Daniel,

Thanks for that great and insightful comment! I was pushing the blog out although I should have concluded with a more detailed discussion. You did that for me now 😉

I was aware of the limitations of conditional compilation from my analysis with the c1visualizer tool. I did not do a more detailed analysis for the loop unrolling example though. After having read your comment it is now clear to me that ‘unrolling’ is probably a misnomer.

As I said at the beginning of the post my objective was to show how meta-programming can be applied in practice – at least theoretically. What I find really interesting personally is, that Scala’s type system – with it’s higher order type parametrism – is in principal powerful enough for doing static meta-programming (i.e. encoding values and functions as types). For what I understand it is not as powerful as C++ template meta-programming since it does not seem to be Turing complete. It still covers a quite large class of functions though. So the language itself does have everything for doing “real-world” meta-programming. The tools are not yet there.

30 10 2008

Daniel Spiewak (20:39:11) :

> As I said at the beginning of the post my objective was to show how
> meta-programming can be applied in practice – at least theoretically.

Well, if I may say so, you succeeded. This post was an extremely worthwhile read, even without a detailed performance analysis. I think the question of whether or not Scala’s type system is Turing complete deserves some closer analysis though. Just with the concepts you have presented, it’s certainly *close*. Looping is obviously a huge step in the right direction, but without mutable variables it doesn’t quite qualify. Of course, the type system does have *immutable* values (in the form of types), the trick would be finding a way to work with them in some sort of generalized stack-based pattern. Your loops are really, *really* close to being able to do this, I’m just not sure how to push it the rest of the way to get general-purpose “type functions”.

30 10 2008

michid (22:56:35) :

> the question of whether or not Scala’s type system is Turing complete
> deserves some closer analysis though.

Back to theory then 😉

I wanted to blog about this before but did not yet come around to read up on the theory. The authors of “Towards Equal Rights for Higher-kinded Types” mention towards the end that Scala’s type system corresponds to the simply-typed lambda calculus. I think I can proof that. AFAIK that means that we can encode the primitive recursive functions. So we have bounded loops (like the one I showed).
I however suspect that we can’t get much further. Specifically I think we cannot encode unbounded loops. Proving this seems much harder since we would have to show that some (Turing computable) function cannot be encoded.

1 11 2008

Daniel Spiewak (23:02:14) :

Well, I think I believe that Scala’s type system is equivalent to the simply-typed lambda calculus, but that would actually mean that it is *not* Turing Complete. The simply-typed lambda calculus cannot encode unbounded recursion due to its inability to type-check the fix-point combinator. Slightly more formally:

t ::= fn x: T . t
| t t
| x
| unit

T ::= T -> T | Unit

(type and eval rules elided)

(t t) : T ==> contradiction by simple induction

With that said, it can be proven with the simply-typed lambda calculus that the addition of recursive types is sufficient to allow (t t) : T. Obviously, -Yrecursive enables recursive types in Scala, but what about recursive higher-kinds? These would be the analogue of recursive types in lambda calculus. If we could show that the type system can indeed encode recursive higher-kinds, then I think we could also show that it is Turing Complete.

Incidentally, a Turing Computable function which cannot be encoded without recursion would be a function to compute the nth value of the Fibonacci series. As far as I can see, just using the techniques demonstrated so far, I don’t think that it is possible to encode this function.

4 11 2008

Jesper Nordenberg (14:39:50) :

Scala’s type system is definitely Turing complete when using the -Yrecursion flag. You can encode arbitrary recursive type functions, including the Fibonacci function. The biggest drawback compared to C++ templates is that there is no type specialization, i.e. you can’t create conditionals based on type equality unless you explicitly design your type hierarchy for it using constructs like the visitor pattern. This means that things like type sets (which would be a useful construct) are virtually impossible to create in Scala.

For example in MetaScala I’ve defined a visitor for natural numbers:

trait NatVisitor extends TypeVisitor {
type Visit0 <: ResultType
type VisitSucc[Pre <: Nat] <: ResultType
}

The Nat type then accepts a visitor of this type and subtypes “dispatches” evaluation to the correct NatVisitor#VisitX type. This way you can implement arbitrary recursive functions over the natural numbers. In C++ you wouldn’t have to use the visitor pattern as the C++ compiler support type specialization, so Scala is less powerful in this regard.

5 11 2008

Sarah (23:51:07) :

Actually, I think there’s probably an easier argument that Scala’s type system isn’t Turing complete. If Scala’s type checker is guaranteed to terminate—and I imagine it is, though I don’t know it that well—then it’s not Turing complete.

Otherwise, you have a rather surprising solution to the Turing-machine halting problem. 🙂

13 11 2008

Buddy Casino (13:50:12) :

Impressive! But to be honest, I begin to understand why some people prefer the less-is-more approach of Clojure to the more-is-more approach of Scala.

28 02 2009

asyropoulos (16:10:13) :

I think the term “Turing complete” refers to programming languages and not just type systems. In addition, one should not forget that Turing machines and the untyped λ-calculus cannot deal with interaction. On the other hand, it would be interesting to see how Scala compares to the π-calculus, which is based on the idea that interaction is a fundamental notion.

28 06 2009

Osman (11:05:47) :

I need the following control

if type Y is not tuple:

class Arrow[X, Y] {
    type ActualArrow = Option[X] => List[Option[Y]]
}

otherwise, when Y = (Y1, Y2):

class Arrow[X, Y] {
    type ActualArrow = Option[X] => List[Option[(Option[Y1], Option[Y2])]]
}

Is it possible to implement such metaprogramming with Scala?

29 01 2010

Scala type level encoding of the SKI calculus « Michid’s Weblog (17:59:02) :

[…] type level encoding of the SKI calculus 29 01 2010 In one of my posts on type level meta programming in Scala the question of Turing completeness came up already. The […]

18 06 2010

Type Level Programming: Equality « Michid’s Weblog (16:14:16) :

[…] Level Programming: Equality 18 06 2010 Apocalisp has a great series on Type Level Programming with Scala. At some point the question came up whether it is possible to determine […]

9 01 2015

Jonathan Neufeld (20:34:29) :

Although it’s more intuitive to resolve dynamic code paths using pattern matching in the runtime, resolving code paths during compile-time can, for example, shave hours off of an ETL process. I don’t think that’s uselessly academic, I’d say that’s mission critical in some circumstances.

Michid’s Weblog