Scala Pattern Matching, from a Java developer’s perspective. Part 1.

Pattern matching is one of the most used Scala features. Open a random Scala file and chances are good that you will find a couple of match blocks. From a Java developer’s perspective, pattern matching may look like a switch statement, but it is much more powerful than that. Syntactically, a Scala match block is made up of:

  • a sequence of different alternatives, each starting with the case keyword
  • followed by a pattern,
  • then the arrow sign =>
  • and finally, one or more expressions.

But there are so many things you can do with it: matching against different data types, matching using case classes (objects that can be decomposed), matching against patterns with alternatives or wildcards and much more. In this article series I will explore the different features of pattern matching in Scala with examples and additional explanations when I feel that they are needed.
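
As a minimal sketch of the anatomy described above (the `describe` function and its strings are made up for illustration and are not part of the article's examples), each alternative consists of the `case` keyword, a pattern, the arrow `=>`, and the code to evaluate:

```scala
// Hypothetical example: `describe` only illustrates the syntax of a match block.
def describe(n: Int): String = n match {
  case 0 =>                 // `case` keyword, the pattern `0`, then the arrow
    "zero"                  // a single expression
  case 1 | 2 =>             // a pattern with two alternatives
    val label = "small"     // several statements: the last expression is the result
    s"$label number"
  case _ =>                 // wildcard: matches anything else
    "something else"
}

println(describe(2)) // prints "small number"
```

The whole match block is itself an expression, which is why it can be used directly as the body of `describe`.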

Literal pattern matching

This is the form closest to a Java switch statement. In the example below the patterns are made up of String literals. For every case you can specify several alternatives separated by |. Patterns are tried in order: when the first one matches, its right-hand side expression is evaluated and becomes the result of the match block. In other words, pattern matching in Scala uses a first-match policy, in contrast to the fall-through mechanism of Java’s switch.

def matchMonth(month: String) = month match {
  case "March" | "April" | "May" => "It's spring"
  case "June" | "July" | "August" => "It's summer"
  case "September" | "October" | "November" => "It's autumn"
  case "December" | "January" | "February" => "It's winter"
}
print(matchMonth("November")) // will print "It's autumn"
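
To make the first-match policy concrete, here is a small sketch in which more than one pattern could match the same value. The `classify` function is a hypothetical example, not part of the original code. Only the first matching case, in source order, is evaluated, and there is no `break` to forget:

```scala
// Hypothetical example illustrating first-match semantics.
def classify(value: Any): String = value match {
  case "May"     => "the month of May"   // tried first
  case _: String => "some other string"  // would also match "May", but is never reached for it
  case _         => "not a string"
}

println(classify("May"))  // prints "the month of May"
println(classify("June")) // prints "some other string"
println(classify(5))      // prints "not a string"
```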

Decompiling the bytecode generated by the Scala compiler, we can see how literal pattern matching is implemented: a series of nested if-else statements. I used the cfr decompiler here. We will look at the decompiled code for some of the examples, just to get a better view of what is going on under the hood.

private static final String matchMonth$1(String month) {
    String string;
    String string2 = month;
    boolean bl = "March".equals(string2) ? true
      : ("April".equals(string2) ? true
      : "May".equals(string2));
    if (bl) {
        string = "It's spring";
    } else {
        boolean bl2 = "June".equals(string2) ? true
          : ("July".equals(string2) ? true
          : "August".equals(string2));
        if (bl2) {
            string = "It's summer";
        } else {
          ...
        }
    }
    return string;
}

Literal pattern matching against different data types

We can mix literals of different types, as the example below shows. What is more, we can match against whole data types, like Int and String: _: Int matches any integer, _: String matches any string, and _ on its own is the wildcard pattern.

    def matchMonth(month: Any) = month match {
      case 3 | 4 | 5 | "March" | "April" | "May" => "It's spring!"
      case 6 | 7 | 8 | "June" | "July" | "August" => "It's summer!"
      case 9 | 10 | 11 | "September" | "October" | "November" => "It's autumn!"
      case 12 | 1 | 2 | "December" | "January" | "February" => "It's winter!"
      case _: Int => "Invalid int month"
      case _: String => "Invalid string month"
    }
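
A typed pattern can also bind the matched value to a name instead of discarding it with `_`. The following sketch is a shortened variation on the example above; the `describeMonth` name and its messages are assumptions made for illustration:

```scala
// Hypothetical variation: typed patterns that keep the matched value.
def describeMonth(month: Any): String = month match {
  case 12 | 1 | 2 | "December" | "January" | "February" => "It's winter!"
  case i: Int    => s"Invalid int month: $i"    // i has type Int here
  case s: String => s"Invalid string month: $s" // s has type String here
}

println(describeMonth(1))     // prints "It's winter!"
println(describeMonth(13))    // prints "Invalid int month: 13"
println(describeMonth("Foo")) // prints "Invalid string month: Foo"
```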

There is something that we’ve ignored until now: what happens if none of the patterns match? Let’s decompile the code again and look at the last statement: it throws a MatchError exception. We can also see that it uses the instanceof operator for matching against data types, and that int literals are first boxed to Integer before a special Scala equality check (BoxesRunTime.equals) is applied.

  String string;
  Object object = month;
  boolean bl =
    BoxesRunTime.equals(BoxesRunTime.boxToInteger((int) 3), object) ? true
    : (BoxesRunTime.equals(BoxesRunTime.boxToInteger((int) 4), object) ? true
    : (BoxesRunTime.equals(BoxesRunTime.boxToInteger((int) 5), object) ? true
    : ("March".equals(object) ? true
    : ("April".equals(object) ? true
    : "May".equals(object)))));
  if (bl) {
    string = "It's spring!";
   ...
  if (object instanceof Integer) {
    string = "Invalid int month";
  } else if (object instanceof String) {
    string = "Invalid string month";
  } else {
    throw new MatchError(object);
  }
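
The MatchError can also be observed directly from Scala. In this sketch, which assumes a shortened version of `matchMonth`, the value `2.5` is a `Double`: it equals none of the literals and is neither an `Int` nor a `String`, so no pattern matches. `scala.util.Try` is used here only to capture the exception:

```scala
import scala.util.Try

// Shortened version of the example above, kept small for illustration.
def matchMonth(month: Any): String = month match {
  case 3 | 4 | 5 | "March" | "April" | "May" => "It's spring!"
  case _: Int    => "Invalid int month"
  case _: String => "Invalid string month"
}

val result = Try(matchMonth(2.5)) // no pattern covers this Double value

println(result.isFailure)                           // prints "true"
println(result.failed.get.isInstanceOf[MatchError]) // prints "true"
```

Adding a final `case _ =>` alternative would make the match total and avoid the exception.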

Pattern matching using case classes

Case classes in Scala are classes with some special features, the most important one being that they can be decomposed through pattern matching: we can match objects against patterns that represent the internal structure of the object. In the example below we match a Shape object against different subclasses like Circle, Square or Rectangle and, at the same time, we decompose the object into its constituent parameters and use them in the right-hand side expression.

abstract class Shape
case class Circle(radius: Int) extends Shape
case class Square(length: Int) extends Shape
case class Rectangle(length: Int, width: Int) extends Shape
def perimeter(shape: Shape): Double = shape match {
  case Circle(radius) => 2 * Math.PI * radius
  case Square(length) => 4 * length
  case Rectangle(length, width) => 2 * length + 2 * width
  case _ => 0.0
}
println(perimeter(Rectangle(10, 20)))

This looks like magic. The type of pattern used here is called a constructor pattern. For example, Circle(radius) means that every Circle object will be matched and the parameter used to construct the object is captured in a variable named radius. This variable is then used in the right-hand side expression to compute the perimeter. Compiling a case class generates a lot of code (getters, constructors, equals and hashCode, toString and more). Decompiled, it looks like this:

public class Circle.3 extends Shape.1 implements Product, Serializable {
  private final int radius;
  public int radius() { return this.radius; }
  ...
  public int hashCode() { ... }
  public String toString() { ... }
  public boolean equals(Object x$1) { ... }
  public Circle.3(int radius) {
    this.radius = radius;
    Product.$init$((Product)this);
  }
}
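
From the Scala side, these generated members can be used without writing any of them by hand. A brief sketch, reusing the `Circle` definition from above (the values `c1` and `c2` are just illustrative names):

```scala
abstract class Shape
case class Circle(radius: Int) extends Shape

val c1 = Circle(5) // no `new` needed: the companion object provides apply
val c2 = Circle(5)

println(c1.radius)                  // generated accessor, prints "5"
println(c1)                         // generated toString, prints "Circle(5)"
println(c1 == c2)                   // generated equals compares by value, prints "true"
println(c1.hashCode == c2.hashCode) // prints "true"
```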

And here is the actual matching logic:

private static final double perimeter$1(Shape.1 shape) {
  double d;
  Shape.1 var3_1 = shape;
  if (var3_1 instanceof Circle.3) {
    Circle.3 var4_2 = (Circle.3)var3_1;
    int radius = var4_2.radius();
    d = 6.283185307179586 * (double)radius;
  } else if (var3_1 instanceof Square.3) {
    Square.3 var6_5 = (Square.3)var3_1;
    int length = var6_5.length();
    d = 4 * length;
  } else if (var3_1 instanceof Rectangle.3) {
    Rectangle.3 var8_7 = (Rectangle.3)var3_1;
    int length = var8_7.length();
    int width = var8_7.width();
    d = 2 * length + 2 * width;
  } else {
    d = 0.0;
  }
  return d;
}

So, nothing fancy. The case class is just a plain Java class and the matching logic uses a chain of nested if-else statements and the instanceof operator. But look how beautiful, succinct and natural the Scala code is. In the above code we can see that no MatchError is thrown. This is because of the last case, which uses the wildcard pattern _. The wildcard pattern has other uses as well: for example, Rectangle(length, _) will match all rectangles, but only the length will be available to the right-hand side expression.
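
A short sketch of the wildcard used inside a constructor pattern; the `describeLength` function is a hypothetical example and not part of the original code:

```scala
abstract class Shape
case class Circle(radius: Int) extends Shape
case class Rectangle(length: Int, width: Int) extends Shape

// Hypothetical example: only the first constructor parameter is captured.
def describeLength(shape: Shape): String = shape match {
  case Rectangle(length, _) => s"A rectangle of length $length" // width is ignored
  case _                    => "Not a rectangle"
}

println(describeLength(Rectangle(10, 20))) // prints "A rectangle of length 10"
println(describeLength(Circle(3)))         // prints "Not a rectangle"
```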

Conclusion

We’ve only covered the basics of pattern matching in this article. In the next articles we will look at extractor patterns, XML patterns, matching on arrays and lists, matching on regular expressions and more. We will continue to look under the hood by decompiling the code since, for those of us coming from an imperative programming world, this might make things easier to grasp.

2 Comments

  1. Uwe

    Perhaps I don’t understand your idea of the wildcard in “Pattern matching using case classes”. I get the error
    Error:(25, 21) not found: value Triangle
    println(perimeter(Triangle(10, 20, 25)))
    if I use an undefined shape, ie.
    // Triangle(leg1, leg2, angle)
    println(perimeter(Triangle(10, 20, 25)))
    Your case _ => 0.0 should give for an undefined shape the result 0.0?

  2. Marius Ropotica

    Hey,
    The wildcard pattern matches everything else, so if none of the patterns above it match, the expression associated with the wildcard pattern is evaluated. For example, if you define the Triangle shape like this:
    case class Triangle(a: Int, b: Int, c: Int) extends Shape
    and you don’t include a constructor pattern for Triangle in the match block (you leave the code exactly as in the example), then the match block will return 0.0 (the wildcard pattern is matched).
    But, if you include the Triangle also in the match block, like this:

    def perimeter(shape: Shape): Double = shape match {
    case Circle(radius) => 2 * Math.PI * radius
    case Square(length) => 4 * length
    case Rectangle(length, width) => 2 * length + 2 * width
    case Triangle(a, b, c) => a + b + c
    case _ => 0.0
    }
    println(perimeter(Triangle(3, 4, 5)))

    then 12.0 is printed (or 12 when the code runs on Scala.js, which ScalaFiddle uses).
    You can play more here with the code: https://scalafiddle.io/sf/gTBMrHl/0
