Pattern matching is one of the most used Scala features. Open a random Scala file and chances are good you will find a couple of match blocks. From a Java developer's perspective, pattern matching may look like a switch statement, but it is much more powerful than that. Syntactically, a Scala match block is made up of:
- a sequence of different alternatives, each starting with the case keyword
- followed by a pattern,
- then the arrow sign =>
- and finally, one or more expressions.
But there are so many things you can do with it: matching against different data types, matching using case classes (objects that can be decomposed), matching against patterns with alternatives or wildcards and much more. In this article series I will explore the different features of pattern matching in Scala with examples and additional explanations when I feel that they are needed.
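To make the syntax described above concrete, here is a minimal sketch. The object MatchAnatomy and the helper describe are hypothetical names used only for illustration; they do not appear elsewhere in this series.

```scala
object MatchAnatomy {
  // Hypothetical helper: each alternative is `case`, a pattern, `=>`, then expressions.
  def describe(n: Int): String = n match {
    case 0 => "zero"
    case 1 =>
      // A case body may hold several expressions;
      // the value of the last one is the value of the whole match.
      val word = "one"
      word.toUpperCase
    case _ => "many" // wildcard: matches anything else
  }

  def main(args: Array[String]): Unit = {
    println(describe(0)) // zero
    println(describe(1)) // ONE
    println(describe(7)) // many
  }
}
```

Note that the match block is an expression: its result is returned directly from describe, with no return statement and no break.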
Literal pattern matching
This is the closest form to a Java switch statement. In the example below the pattern is made up of String literals. For every case you can specify several alternatives separated by |. When the first pattern matches, its right-hand side expression is evaluated and the match block returns immediately. Pattern matching in Scala uses a first-match policy (in contrast to the fall-through behavior of Java’s switch).
def matchMonth(month: String) = month match {
  case "March" | "April" | "May" => "It's spring"
  case "June" | "July" | "August" => "It's summer"
  case "September" | "October" | "November" => "It's autumn"
  case "December" | "January" | "February" => "It's winter"
}

print(matchMonth("November")) // will print "It's autumn"
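The first-match policy is easiest to see when two cases overlap. The following is a small sketch, not part of the original example: the object FirstMatch and the function classify are made up for illustration, and "May" is deliberately listed in two cases.

```scala
object FirstMatch {
  // Hypothetical example: "May" appears in two cases on purpose.
  def classify(month: String): String = month match {
    case "April" | "May" => "spring"
    case "May" | "June"  => "early summer" // "May" never gets here
    case _               => "unknown"
  }

  def main(args: Array[String]): Unit = {
    println(classify("May"))  // spring: the first matching case wins
    println(classify("June")) // early summer
    println(classify("July")) // unknown
  }
}
```

Only the right-hand side of the first matching case is evaluated; there is no fall-through into the following cases and no need for a break.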
Decompiling the bytecode generated by the Scala compiler, we can see how literal pattern matching is implemented: as a series of nested if-else statements. I used the CFR decompiler here. We will look at the decompiled code for some of the examples to get a better view of what is going on under the hood.
private static final String matchMonth$1(String month) {
    String string;
    String string2 = month;
    boolean bl = "March".equals(string2) ? true : ("April".equals(string2) ? true : "May".equals(string2));
    if (bl) {
        string = "It's spring";
    } else {
        boolean bl2 = "June".equals(string2) ? true : ("July".equals(string2) ? true : "August".equals(string2));
        if (bl2) {
            string = "It's summer";
        } else {
            ...
        }
    }
    return string;
}
Literal pattern matching against different data types
We can mix literals of different types, as the example below shows. Moreover, we can match against whole data types, like Int and String (_: Int means any integer, _: String means any string and _ is the wildcard pattern).
def matchMonth(month: Any) = month match {
  case 3 | 4 | 5 | "March" | "April" | "May" => "It's spring!"
  case 6 | 7 | 8 | "June" | "July" | "August" => "It's summer!"
  case 9 | 10 | 11 | "September" | "October" | "November" => "It's autumn!"
  case 12 | 1 | 2 | "December" | "January" | "February" => "It's winter!"
  case _: Int => "Invalid int month"
  case _: String => "Invalid string month"
}
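To see which case fires for which kind of input, here is a reduced sketch of the same idea. The object TypedPatterns and the function describe are hypothetical names; unlike the example above, this version ends with a wildcard case so that every input matches something.

```scala
object TypedPatterns {
  // Hypothetical, shortened variant of the example above.
  def describe(value: Any): String = value match {
    case 3 | "March" => "literal March"          // literals of two different types
    case _: Int      => "some other Int"         // typed pattern: any integer
    case _: String   => "some other String"      // typed pattern: any string
    case _           => "neither Int nor String" // wildcard
  }

  def main(args: Array[String]): Unit = {
    println(describe(3))       // literal March
    println(describe("March")) // literal March
    println(describe(13))      // some other Int
    println(describe("Foo"))   // some other String
    println(describe(List(1))) // neither Int nor String
  }
}
```

The order matters here as well: if the typed patterns came before the literal ones, the literal cases would never be reached for Int or String values.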
There is something that we’ve ignored until now: what happens if none of the patterns match? Let’s decompile the code again and look at the last statement: it throws a MatchError exception. We can also see that it uses the instanceof operator for matching against data types, and that int literals are first boxed to Integer before a special Scala equals logic is applied.
String string;
Object object = month;
boolean bl = BoxesRunTime.equals(BoxesRunTime.boxToInteger((int) 3), object) ? true
    : (BoxesRunTime.equals(BoxesRunTime.boxToInteger((int) 4), object) ? true
    : (BoxesRunTime.equals(BoxesRunTime.boxToInteger((int) 5), object) ? true
    : ("March".equals(object) ? true
    : ("April".equals(object) ? true : "May".equals(object)))));
if (bl) {
    string = "It's spring";
...
if (object instanceof Integer) {
    string = "Invalid int month";
} else if (object instanceof String) {
    string = "Invalid string month";
} else {
    throw new MatchError(object);
}
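The MatchError can be observed directly from Scala. The sketch below is only an illustration: the object MatchErrorDemo and the function season are hypothetical, and wrapping the call in scala.util.Try is just one convenient way to capture the exception without crashing the program.

```scala
import scala.util.{Failure, Success, Try}

object MatchErrorDemo {
  // Hypothetical helper with no catch-all case, like the example above.
  def season(month: Any): String = month match {
    case 3 | "March" => "spring"
    case _: Int      => "invalid int"
    case _: String   => "invalid string"
  }

  def main(args: Array[String]): Unit = {
    println(season(3)) // spring

    // A Boolean is neither one of the literals, nor an Int, nor a String,
    // so no pattern matches and a scala.MatchError is thrown.
    Try(season(true)) match {
      case Failure(e: MatchError) => println(s"MatchError caught: ${e.getMessage}")
      case Failure(other)         => println(s"unexpected failure: $other")
      case Success(value)         => println(s"matched: $value")
    }
  }
}
```

Adding a final case _ => ... to the match block is the usual way to avoid the exception.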
Pattern matching using case classes
Case classes in Scala are classes with some special features, the most important being that they can be decomposed through pattern matching: we can match objects against patterns that represent the internal structure of the object. In the example below we match a Shape object against different subclasses like Circle, Square or Rectangle and, at the same time, we decompose the object into its constituent parameters and use them in the right-hand side expression.
abstract class Shape
case class Circle(radius: Int) extends Shape
case class Square(length: Int) extends Shape
case class Rectangle(length: Int, width: Int) extends Shape

def perimeter(shape: Shape): Double = shape match {
  case Circle(radius) => 2 * Math.PI * radius
  case Square(length) => 4 * length
  case Rectangle(length, width) => 2 * length + 2 * width
  case _ => 0.0
}

println(perimeter(Rectangle(10, 20)))
This looks like magic. The type of pattern used here is called a constructor pattern. For example, Circle(radius) means that all Circle objects will be matched and the parameter used to construct the object is captured in a variable named radius. This variable is used in the right-hand side expression to compute the perimeter. Decompiling a case class generates a lot of code (getters, constructors, equals and hashCode, toString and more). It looks like this:
public class Circle.3 extends Shape.1 implements Product, Serializable {
    private final int radius;

    public int radius() {
        return this.radius;
    }
    ...
    public int hashCode() { ... }
    public String toString() { ... }
    public boolean equals(Object x$1) { ... }

    public Circle.3(int radius) {
        this.radius = radius;
        Product.$init$((Product)this);
    }
}
And here is the actual matching logic:
private static final double perimeter$1(Shape.1 shape) {
    double d;
    Shape.1 var3_1 = shape;
    if (var3_1 instanceof Circle.3) {
        Circle.3 var4_2 = (Circle.3)var3_1;
        int radius = var4_2.radius();
        d = 6.283185307179586 * (double)radius;
    } else if (var3_1 instanceof Square.3) {
        Square.3 var6_5 = (Square.3)var3_1;
        int length = var6_5.length();
        d = 4 * length;
    } else if (var3_1 instanceof Rectangle.3) {
        Rectangle.3 var8_7 = (Rectangle.3)var3_1;
        int length = var8_7.length();
        int width = var8_7.width();
        d = 2 * length + 2 * width;
    } else {
        d = 0.0;
    }
    return d;
}
So, nothing fancy. The case class is just a plain Java class and the matching logic uses a bunch of nested if-else statements and the instanceof operator. But look how beautiful, succinct and natural the Scala code looks. In the above code we can see that no MatchError is thrown. This is because of the last case clause, which uses the wildcard pattern _. The wildcard pattern has other uses as well: for example, Rectangle(length, _) will match all rectangles, but only the length will be available to the right-hand side expression.
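Here is a short sketch of the wildcard used inside a constructor pattern. The object WildcardDemo and the function describe are hypothetical names, and the shape hierarchy is repeated in reduced form only to keep the snippet self-contained.

```scala
object WildcardDemo {
  abstract class Shape
  case class Circle(radius: Int) extends Shape
  case class Square(length: Int) extends Shape
  case class Rectangle(length: Int, width: Int) extends Shape

  // Hypothetical helper: `_` ignores the parts we do not need.
  def describe(shape: Shape): String = shape match {
    case Rectangle(length, _) => s"rectangle of length $length" // width is ignored
    case Circle(_)            => "some circle"                  // radius is ignored
    case _                    => "unknown shape"                // everything else
  }

  def main(args: Array[String]): Unit = {
    println(describe(Rectangle(10, 20))) // rectangle of length 10
    println(describe(Circle(5)))         // some circle
    println(describe(Square(3)))         // unknown shape
  }
}
```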
Conclusion
We’ve only covered the basics of pattern matching in this article. In the next articles we will look at extractor patterns, XML patterns, matching on arrays and lists, matching on regular expressions and more. We will continue to look under the hood by decompiling the code because, for those of us coming from an imperative programming world, it might make things easier to grasp.
Perhaps I don’t understand your idea of the wildcard in “Pattern matching using case classes”. I get the error
Error:(25, 21) not found: value Triangle
println(perimeter(Triangle(10, 20, 25)))
if I use an undefined shape, i.e.
// Triangle(leg1, leg2, angle)
println(perimeter(Triangle(10, 20, 25)))
Shouldn’t your case _ => 0.0 give the result 0.0 for an undefined shape?
Hey,
The wildcard pattern matches everything else, so if none of the patterns above match, the expression associated with the wildcard pattern is evaluated. For example, if you define the Triangle shape like this:
case class Triangle(a: Int, b: Int, c: Int) extends Shape
and you don’t include a constructor pattern for Triangle in the match block (you leave the code exactly as in the example), then the match block will return 0.0 (the wildcard pattern is matched).
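Put together, that first scenario can be sketched as follows. The wrapper object TriangleFallback is a hypothetical name added only to make the snippet self-contained; the classes and the perimeter function are the ones from the article plus the Triangle definition above.

```scala
object TriangleFallback {
  abstract class Shape
  case class Circle(radius: Int) extends Shape
  case class Square(length: Int) extends Shape
  case class Rectangle(length: Int, width: Int) extends Shape
  case class Triangle(a: Int, b: Int, c: Int) extends Shape

  // Same match block as in the article: there is no case for Triangle.
  def perimeter(shape: Shape): Double = shape match {
    case Circle(radius) => 2 * Math.PI * radius
    case Square(length) => 4 * length
    case Rectangle(length, width) => 2 * length + 2 * width
    case _ => 0.0
  }

  def main(args: Array[String]): Unit = {
    // Triangle is a defined Shape, so this compiles; it falls through to the wildcard.
    println(perimeter(Triangle(10, 20, 25))) // 0.0 on the JVM
    println(perimeter(Rectangle(10, 20)))    // 60.0 on the JVM
  }
}
```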
But if you also include Triangle in the match block, like this:
def perimeter(shape: Shape): Double = shape match {
  case Circle(radius) => 2 * Math.PI * radius
  case Square(length) => 4 * length
  case Rectangle(length, width) => 2 * length + 2 * width
  case Triangle(a, b, c) => a + b + c
  case _ => 0.0
}

println(perimeter(Triangle(3, 4, 5)))
then 12 is printed (12.0 when run on the JVM, since perimeter returns a Double).
You can play more here with the code: https://scalafiddle.io/sf/gTBMrHl/0