Union types

12 06 2011

In his recent blog post Miles Sabin came up with an ingenious way of expressing union types in Scala. A union type is the union of some types: its values are the union of the values of each of the individual types.

In a nutshell he first defines the negation of types as

type ¬[A] = A => Nothing

and then the union of two types via De Morgan’s law

type ∨[T, U] = ¬[¬[T] with ¬[U]]

With the following auxiliary constructs

type ¬¬[A] = ¬[¬[A]]
type |∨|[T, U] = { type λ[X] = ¬¬[X] <:< (T ∨ U) }

union types can be used in a very intuitive way

def size[T: (Int |∨| String)#λ](t: T) = t match {
    case i: Int => i
    case s: String => s.length
}

scala> size(3)
res0: Int = 3

scala> size("three")
res1: Int = 5

scala> size(4.2)
:13: error: Cannot prove that ((Double) => Nothing) => Nothing >: Nothing with (java.lang.String) => Nothing) => Nothing.
       size(4.2)
           ^

… and beyond

With type negation and disjunction from above, it becomes possible to express all types whose set of values can be expressed by a term in propositional calculus. But can we do better? That is, is it possible to express types which don’t have a corresponding term in propositional calculus?

Generalizing the type constructor ∨[T, U] to some arbitrary acceptor

type Acceptor[T, U] = { type λ[X] = ... }

it becomes apparent, that all types for which there is a corresponding type level acceptor function are expressible. Since type level calculations in Scala are Turing complete, it should be possible to find an acceptor for any recursive function. This means that – in theory at least – Scala’s type system is powerful enough to express any type whose set of values is recursive.





Regular expression matching in <100 lines of code

6 12 2010

The recent discussion about the Yacc is dead paper on Lambda the Ultimate sparked my interest in regular expression derivatives. The original idea goes back to the paper Derivatives of Regular Expressions published in 1964 (!) by Janusz A. Brzozowski. For a more modern treatment of the topic see Regular-expression derivatives reexamined.

The derivative of a set of strings with respect to a character, is the set of strings which results from removing the first character from all the strings in the set which start with that character. Let for example S = \{foo, bar, baz\} then \partial_b S = \{ar, az\} . It turns out that regular languages are closed under derivatives. That is, any derivative of a regular language is again a regular language. Furthermore, it is possible to extend the notion of derivatives to regular expression such that given a regular expression r which generates the language \mathcal{L}(r) and a character c , one can derive a regular expression \partial_c r such that \mathcal{L}(\partial_c r) = \partial_c(\mathcal{L}(r)).

This is a key ingredient for a very elegant regular expression matching algorithm: to match a string against a regular expression repeatedly calculate the derivative of the regular expression for each characters in the string. When no character is left, check whether the last derivative accepts the empty string. If so we have a match and otherwise not.

The exact algorithm for finding whether a regular expression is nullable (i.e. accepts the empty string) is given in Regular-expression derivatives reexamined as is the algorithm for calculating derivatives of regular expressions. Below is a direct implementation of that algorithm in Scala (with a slight modification to allow for strings instead of individual characters).

trait RegExp {
  def nullable: Boolean
  def derive(c: Char): RegExp
}

case object Empty extends RegExp {
  def nullable = false
  def derive(c: Char) = Empty
}

case object Eps extends RegExp {
  def nullable = true
  def derive(c: Char) = Empty
}

case class Str(s: String) extends RegExp {
  def nullable = s.isEmpty
  def derive(c: Char) =
    if (s.isEmpty || s.head != c) Empty
    else Str(s.tail)
}

case class Cat(r: RegExp, s: RegExp) extends RegExp {
  def nullable = r.nullable && s.nullable
  def derive(c: Char) =
    if (r.nullable) Or(Cat(r.derive(c), s), s.derive(c)) 
    else Cat(r.derive(c), s)
}

case class Star(r: RegExp) extends RegExp {
  def nullable = true
  def derive(c: Char) = Cat(r.derive(c), this)
}

case class Or(r: RegExp, s: RegExp) extends RegExp {
  def nullable = r.nullable || s.nullable
  def derive(c: Char) = Or(r.derive(c), s.derive(c))
}

case class And(r: RegExp, s: RegExp) extends RegExp {
  def nullable = r.nullable && s.nullable
  def derive(c: Char) = And(r.derive(c), s.derive(c))
}

case class Not(r: RegExp) extends RegExp {
  def nullable = !r.nullable
  def derive(c: Char) = Not(r.derive(c))
}

Having these constructors we need a way to match strings against regular expressions.

object Matcher {
  def matches(r: RegExp, s: String): Boolean = {
    if (s.isEmpty) r.nullable
    else matches(r.derive(s.head), s.tail)
  }
}

Here are some pimps to make usage of the regular expression constructors more convenient.

object Pimps {
  implicit def string2RegExp(s: String) = Str(s)

  implicit def regExpOps(r: RegExp) = new {
    def | (s: RegExp) = Or(r, s)
    def & (s: RegExp) = And(r, s)
    def % = Star(r)
    def %(n: Int) = rep(r, n)
    def ? = Or(Eps, r)
    def ! = Not(r)
    def ++ (s: RegExp) = Cat(r, s)
    def ~ (s: String) = Matcher.matches(r, s)
  }

  implicit def stringOps(s: String) = new {
    def | (r: RegExp) = Or(s, r)
    def | (r: String) = Or(s, r)
    def & (r: RegExp) = And(s, r)
    def & (r: String) = And(s, r)
    def % = Star(s)
    def % (n: Int) = rep(Str(s), n)
    def ? = Or(Eps, s)
    def ! = Not(s)
    def ++ (r: RegExp) = Cat(s, r)
    def ++ (r: String) = Cat(s, r)
    def ~ (t: String) = Matcher.matches(s, t)
  }

  def rep(r: RegExp, n: Int): RegExp =
    if (n <= 0) Star(r)
    else Cat(r, rep(r, n - 1))
}

And finally here is how to use it:

object Test {
  import Pimps._

  val digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
  val int = ("+" | "-").? ++ digit.%(1)
  val real = ("+" | "-").? ++ digit.%(1) ++ ("." ++ digit.%(1)).? ++ (("e" | "E") ++ ("+" | "-").? ++ digit.%(1)).?

  def main(args: Array[String]) {
    val ints = List("0", "-4534", "+049", "99")
    val reals = List("0.9", "-12.8", "+91.0", "9e12", "+9.21E-12", "-512E+01")
    val errs = List("", "-", "+", "+-1", "-+2", "2-")

    ints.foreach(s => assert(int ~ s))
    reals.foreach(s => assert(!(int ~ s)))
    errs.foreach(s => assert(!(int ~ s)))

    ints.foreach(s => assert(real ~ s))
    reals.foreach(s => assert(real ~ s))
    errs.foreach(s => assert(!(real ~ s)))
}

Now that’s 48 + 6 + 32 = 86 lines of code for a regular expression matching library!





So Scala is too complex?

24 08 2010

There is currently lots of talk about Scala being to complex. Instead of more arguing I implemented the same bit of functionality in Scala and in Java and let everyone decide for themselves.

There is some nice example code in the manual to the The Scala 2.8 Collections API which partitions a list of persons into two lists of minors and majors. Below are the fleshed out implementations in Scala and Java.

First Scala:

object ScalaMain {
  case class Person(name: String, age: Int)
    
  val persons = List(
    Person("Boris", 40),
    Person("Betty", 32),
    Person("Bambi", 17))

  val (minors, majors) = persons.partition(_.age <= 18) 
   
  def main(args: Array[String]) = {
    println (minors.mkString(", "))
    println (majors.mkString(", "))
  }
}

And now Java:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class Person {
    private final String name;
    private final int age;

    public Person(String name, int age) {
        super();
        this.name = name;
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        else if (other instanceof Person) {
            Person p = (Person) other;
            return name == null ? p.name == null : name.equals(p.name)
                    && age == p.age;

        }
        else {
            return false;
        }
    }

    @Override
    public int hashCode() {
        int h = name == null ? 0 : name.hashCode();
        return 39*h + age;
    }

    @Override
    public String toString() {
        return new StringBuilder("Person(")
            .append(name).append(",")
            .append(age).append(")").toString();
    }
}

public class JavaMain {

    private final static List<Person> persons = Arrays.asList(
        new Person("Boris", 40),
        new Person("Betty", 32),
        new Person("Bamby", 17));

    private static List<Person> minors = new ArrayList<Person>();
    private static List<Person> majors = new ArrayList<Person>();

    public static void main(String[] args) {
        partition(persons, minors, majors);
        System.out.println(mkString(minors, ","));
        System.out.println(mkString(majors, ","));
    }

    private static void partition(List<? extends Person> persons,
            List<? super Person> minors, List<? super Person> majors) {

        for (Person p : persons) {
            if (p.getAge() <= 18) minors.add(p);
            else majors.add(p);
        }
    }

    private static <T> String mkString(List<T> list, String separator) {
        StringBuilder s = new StringBuilder();
        Iterator<T> it = list.iterator();
        if (it.hasNext()) {
            s.append(it.next());
        }
        while (it.hasNext()) {
            s.append(separator).append(it.next());
        }
        return s.toString();
    }

}

Impressive huh? And the Java version is not even entirely correct since its equals() method might not cope correctly with super classes of Person.





Type Level Programming: Equality

18 06 2010

Apocalisp has a great series on Type Level Programming with Scala. At some point the question came up whether it is possible to determine equality of types at run time by having the compiler generate types representing true and false respectively. Here is what I came up with.

trait True { type t = True }
trait False { type t = False }

case class Equality[A] {
  def check(x: A)(implicit t: True) = t
  def check[B](x: B)(implicit f: False) = f
}
object Equality {
  def witness[T] = null.asInstanceOf[T]
  implicit val t: True = null
  implicit val f: False = null
}

// Usage:
import Equality._
    
val test1 = Equality[List[Boolean]] check witness[List[Boolean]]
implicitly[test1.t =:= True]
// Does not compile since tt is True
// implicitly[test1.t =:= False]  

val test2 = Equality[Nothing] check witness[AnyRef]
// Does not compile since ft is False
// implicitly[test2.t =:= True]  
implicitly[test2.t =:= False]

Admittedly this is very hacky. For the time being I don’t see how to further clean this up. Anyone?





Working around type erasure ambiguities (Scala)

14 06 2010

In my previous post I showed a workaround for the type erasure ambiguity problem in Java. The solution uses vararg parameters for disambiguation. As Paul Phillips points out in his comment, this solution doesn’t directly port over to Scala. Java uses Array to pass varargs, Scala uses Seq. Unlike Array, Seq is not reified so Seq[String] and Seq[Int] again erase to the same type putting us back to square one.

However, there is another way to add disambiguation parameters to the methods: implicits! Here is how:

implicit val x: Int = 0
def foo(a: List[Int])(implicit ignore: Int) { }
  
implicit val y = ""
def foo(a: List[String])(implicit ignore: String) { }

foo(1::2::Nil)
foo("a"::"b"::Nil)




Working around type erasure ambiguities

30 05 2010

In an earlier post I already showed how to work around ambiguous method overloads resulting from type erasure. In a nut shell the following code wont compile since both overloaded methods foo erase to the same type.

Scala:

def foo(ints: List[Int]) {}
def foo(strings: List[String]) {}

Java:

void foo(List<Integer> ints) {}
void foo(List<String> strings) {}

It turns out that there is a simple though somewhat hacky way to work around this limitations: in order to make the ambiguity go away, we need to change the signature of foo in such a way that 1) the erasure of the foo methods are different and 2) the call site is not affected.

Here is a solution for Java:

void foo(List<Integer> ints, Integer... ignore) {}
void foo(List<String> strings, String... ignore) {}

We can now call foo passing either a list of ints or a list of strings without ambiguity:

foo(new ArrayList<Integer>());
foo(new ArrayList<String>());

This doesn’t directly port over to Scala (why?). However, there is a similar hack for Scala. I leave this as a puzzle for a couple of days before I post my solution.





[ANN] Talking at Scala Days 2010 in Lausanne next Thursday

11 04 2010

I’ll be talking at Scala Days 2010 in Lausanne on April 15th about the Scala scripting engine for Apache Sling. While my talk at Jazoon 09 was mainly about using Scala from Sling, this session will be more focused on internals of the Scala scripting engine.

Unfortunately (or fortunately depending on the point of view) the conference is sold out already. Watch my Scala for scripting page for the session slides and other upcoming support material.





Scala type level encoding of the SKI calculus

29 01 2010

In one of my posts on type level meta programming in Scala the question of Turing completeness came up already. The question is whether Scala’s type system can be used to force the Scala compiler to carry out any calculation which a Turing machine is capable of. Various of my older posts show how Scala’s type system can be used to encode addition and multiplication on natural numbers and how to encode conditions and bounded loops.

Motivated by the blog post More Scala Typehackery which shows how to encode a version of the Lambda calculus which is limited to abstraction over a single variable in Scala’s type system I set out to further explore the topic.

The SKI combinator calculus


Looking for a calculus which is relatively small, easily encoded in Scala’s type system and known to be Turing complete I came across the SKI combinator calculus. The SKI combinators are defined as follows:

Ix \rightarrow x ,
Kxy \rightarrow x ,
Sxyz \rightarrow xz(yz) .

They can be used to encode arbitrary calculations. For example reversal of arguments. Let R \equiv S(K(SI))K . Then

R x y \equiv
S(K(SI))K x y \rightarrow
K(SI)x(Kx)y \rightarrow
SI(Kx)y \rightarrow
Iy(Kxy) \rightarrow
Iyx \rightarrow yx .

Self application is used to find fixed points. Let \beta \equiv S(K\alpha)(SII) for some combinator \alpha . Then \beta\beta \rightarrow \alpha(\beta \beta) . That is, \beta\beta is a fixed point of \alpha . This can be used to achieve recursion. Let R be the reversal combinator from above. Further define

A_0 x \equiv c for some combinator c and
A_n x \equiv x A_{n-1} .

That is, combinator A_n is the combinator obtained by applying its argument to the combinator A_{n-1}. (There is a bit of cheating here: I should actually show that such combinators exist. However since the SKI calculus is Turing complete, I take this for granted.) Now let \alpha be R in \beta from above (That is we have \beta \equiv S(KR)(SII) now). Then

\beta\beta A_0 \rightarrow c

and by induction

\beta\beta A_n \rightarrow \beta\beta A_{n-1} \rightarrow c.

Type level SKI in Scala


Encoding the SKI combinator calculus in Scala’s type system seems not too difficult at first. It turns out however that some care has to be taken regarding the order of evaluation. To guarantee that for all terms which have a normal form, that normal form is actually found, a lazy evaluation order has to be employed.

Here is a Scala type level encoding of the SKI calculus:

trait Term {
  type ap[x <: Term] <: Term
  type eval <: Term
}
  
// The S combinator
trait S extends Term {
  type ap[x <: Term] = S1[x] 
  type eval = S
}
trait S1[x <: Term] extends Term {
  type ap[y <: Term] = S2[x, y]
  type eval = S1[x]
}
trait S2[x <: Term, y <: Term] extends Term {
  type ap[z <: Term] = S3[x, y, z]
  type eval = S2[x, y]
}
trait S3[x <: Term, y <: Term, z <: Term] extends Term {
  type ap[v <: Term] = eval#ap[v]
  type eval = x#ap[z]#ap[y#ap[z]]#eval
}

// The K combinator
trait K extends Term {
  type ap[x <: Term] = K1[x]
  type eval = K
}
trait K1[x <: Term] extends Term {
  type ap[y <: Term] = K2[x, y]
  type eval = K1[x]
}
trait K2[x <: Term, y <: Term] extends Term {
  type ap[z <: Term] = eval#ap[z]
  type eval = x#eval
}
  
// The I combinator
trait I extends Term {
  type ap[x <: Term] = I1[x]
  type eval = I
}
trait I1[x <: Term] extends Term {
  type ap[y <: Term] = eval#ap[y]
  type eval = x#eval
}

Further lets define some constants to act upon. These are used to test whether the calculus actually works.

trait c extends Term {
  type ap[x <: Term] = c
  type eval = c
}
trait d extends Term {
  type ap[x <: Term] = d
  type eval = d
}
trait e extends Term {
  type ap[x <: Term] = e
  type eval = e
}

Eventually the following definition of Equals lets us check types for equality:

case class Equals[A >: B <:B , B]()

Equals[Int, Int]     // compiles fine
Equals[String, Int] // won't compile

Now lets see whether we can evaluate some combinators.

  // Ic -> c
  Equals[I#ap[c]#eval, c]
  
  // Kcd -> c
  Equals[K#ap[c]#ap[d]#eval, c]

  // KKcde -> d
  Equals[K#ap[K]#ap[c]#ap[d]#ap[e]#eval, d]
  
  // SIIIc -> Ic
  Equals[S#ap[I]#ap[I]#ap[I]#ap[c]#eval, c]

  // SKKc -> Ic
  Equals[S#ap[K]#ap[K]#ap[c]#eval, c]

  // SIIKc -> KKc
  Equals[S#ap[I]#ap[I]#ap[K]#ap[c]#eval, K#ap[K]#ap[c]#eval]

  // SIKKc -> K(KK)c
  Equals[S#ap[I]#ap[K]#ap[K]#ap[c]#eval, K#ap[K#ap[K]]#ap[c]#eval]

  // SIKIc -> KIc
  Equals[S#ap[I]#ap[K]#ap[I]#ap[c]#eval, K#ap[I]#ap[c]#eval]

  // SKIc -> Ic
  Equals[S#ap[K]#ap[I]#ap[c]#eval, c]
  
  // R = S(K(SI))K  (reverse)
  type R = S#ap[K#ap[S#ap[I]]]#ap[K]
  Equals[R#ap[c]#ap[d]#eval, d#ap[c]#eval]

Finally lets check whether we can do recursion using the fixed point operator from above. First lets define \beta .

  // b(a) = S(Ka)(SII)
  type b[a <: Term] = S#ap[K#ap[a]]#ap[S#ap[I]#ap[I]]

Further lets define some of the A_ns from above.

trait A0 extends Term {
  type ap[x <: Term] = c
  type eval = A0
}
trait A1 extends Term {
  type ap[x <: Term] = x#ap[A0]#eval
  type eval = A1
}
trait A2 extends Term {
  type ap[x <: Term] = x#ap[A1]#eval
  type eval = A2
}

Now we can do iteration on the type level using a fixed point combinator:

  // Single iteration
  type NN1 = b[R]#ap[b[R]]#ap[A0]
  Equals[NN1#eval, c]

  // Double iteration
  type NN2 = b[R]#ap[b[R]]#ap[A1]
  Equals[NN2#eval, c]

  // Triple iteration
  type NN3 = b[R]#ap[b[R]]#ap[A2]
  Equals[NN3#eval, c]

Finally lets check whether we can do ‘unbounded’ iteration.

trait An extends Term {
  type ap[x <: Term] = x#ap[An]#eval
  type eval = An
}
// Infinite iteration: Smashes scalac's stack
  type NNn = b[R]#ap[b[R]]#ap[An]
  Equals[NNn#eval, c]

Well, we can 😉

$ scalac SKI.scala
Exception in thread "main" java.lang.StackOverflowError
        at scala.tools.nsc.symtab.Types$SubstMap.apply(Types.scala:3165)
        at scala.tools.nsc.symtab.Types$SubstMap.apply(Types.scala:3136)
        at scala.tools.nsc.symtab.Types$TypeMap.mapOver(Types.scala:2735)




Implementing Java Interfaces and Generics

30 08 2009

In an earlier post I asked for an implementation of the following Java interface in Scala:

public interface Iterator2 extends java.util.Iterator {}  

The solution was to use a combination of self types and existential types. However, some comments made me aware of Ticket #1737 where a more general form of the problem is discussed. Given the following Java code:

public interface A<T> {
    void run(T t);
    T get();
}

public abstract class B implements A {}

How could B be implemented in Scala? My previous approach is not sufficient here since for implementing the run() method, one needs to be able to refer to the existential type in run()‘s parameter list:

class C extends B {
  this: A[_] =>
      
  def run(t:???) {} // What should go here for t's type?
  def get = "hello world"
}

So what we need is a way to name the existential type used in the self type (that is the type parameter to A) such that we can refer to it in the parameter list of the run() method. Here is my shot:

class D {
  type Q = X forSome { type X; }
}

class C extends B {
  this: A[D#Q] =>
      
  def run(t:D#Q ) {}
  def get: D#Q = "hello world"
}

Unfortunately this results in a compilation error with Scala 2.7.5

error: class C needs to be abstract, since method run in trait A of type (T)Unit is not defined

and in a AssertionError for Scala 2.8.0.r18604-b20090830020201

Exception in thread "main" java.lang.AssertionError: assertion failed: michid.iterator.A[T]
at scala.Predef$.assert(Predef.scala:107)
at scala.tools.nsc.symtab.Types$TypeRef.transform(Types.scala:1417)
at scala.tools.nsc.symtab.Types$TypeRef$$anonfun$baseTypeSeq$3.apply(Types.scala:1588)
...

I created Ticket #2091 for these problems.





Scala for Sling @ Jazzon 09

26 06 2009

Yesterday I gave a presentation at Jazoon 09 about using Scala for scripting RESTful web applications with Apache Sling.

In the session I showed how to take advantage of Scala to create RESTful web applications with Apache Sling. I demonstrated how to uses its DSL capability and support for XML literals to create type safe web site templates. In contrast to conventional web site template mechanisms (e.g. JSP), this does not rely on a pre-processor but rather uses pure Scala code.

There are Session slides and support material available here. The support material contains a fully workable demo application. A Scala scripting bundle for Sling is also included.