Weblogs Forum - Pattern matching in Scala for expressions

Articles |
News |
Weblogs |
Books |
Forums

Artima Forums | Articles | Weblogs | Java Answers | News

Sponsored Link •

Weblogs Forum
Pattern matching in Scala for expressions

6 replies on 1 page. Most recent reply: Feb 15, 2010 3:01 AM by Benjamin James

Welcome Guest
Sign In

Back to Topic List

Reply to this Topic

Search Forum

Threaded View


Previous Topic		Next Topic

Flat View: This topic has 6 replies on 1 page

Ian Robertson

Posts: 68
Nickname: ianr
Registered: Apr, 2007

Pattern matching in Scala for expressions (View in Weblogs)

Posted: Jan 31, 2010 11:53 AM

Summary
Scala pattern matchers with side effects can have unexpected results in for expressions.

Scala has taken the traditional for-construct and put it on steroids. One of the interesting features is the ability to use pattern matching as part of a for expression. For example, the following:

case class Person(firstName:String, lastName: String); val people = List( Person("Jane", "Smith"), Person("John", "Doe"), Person("Jane", "Eyre")); for (Person("Jane", last) <- people) yield "Ms. " + last;

will (effectively) iterate over all people with a first name of Jane, bind last to each person's last name in turn, and then generate a list of the results of the yield block, thus returning

List("Ms. Smith", "Ms. Eyre")

In this example, we're using a case class to provide the pattern matching for us, but we can easily role our own. For example, consider the following matcher which will return the first element of an iterator, if there is one:

object First { def unapply[A](iter:Iterator[A]): Option[(A)] = { if (iter.hasNext) Some(iter.next) else None; } }

We can try to get the first element of some iterators:

val iters = List(Iterator.from(1), Iterator.from(3)); for(First(x) <- iters) yield x;



However, this does not yield List(1,3) as one might
expect.  Instead, it yields List(2, 4)!


So what is going on here?  According to Programming in Scala,
a pattern match in a for loop gets converted into a filter followed by
a map.  Specifically,


for (pat <- expr₁) yield expr₂


is translated by the compiler to


expr₁ filter {

  case pat => true;

  case _ => false;

} map {

  case pat => expr₂;

}



In other words, first the input expression is filtered for those
elements which match the pattern, and then in a separate pass, the
pattern is applied for the purpose of
evaluating expr₂.  Unfortunately, there are a couple
of issues with this.  First, if the pattern match is computationally
expensive, that work will be doubled for every match.  Second, and
more concerning, if the pattern match is not side effect free, then
unexpected results like the one above can happen.  In the example
above, the pattern match, by calling next on an iterator,
changes the state of the iterator, so that a different result comes
out in the second pass.

Interestingly, Programming Scala states that "it's
guarannteed that a pattern-matching generator will never throw
a MatchError".  In fact, this is not true.  For example,
executing


for (First(x) <- List(Iterator.single(1))) yield x;


will try to access a single-element iterator twice, resulting in:


scala.MatchError: empty iterator

    at $anonfun$2.apply(<console>:6)

    at $anonfun$2.apply(<console>:6)

    at scala.List.map(List.scala:812)




One way to view this situation is that Scala's for
construct is inherently functional, and should not be mixed with
non-functional constructs such as iterators.  However, it turns out
that there is a completely functional fix to this problem.  Simply
change the compile-time pattern-match translation to something like:



expr₁ map {

  case pat => Some(expr₂);

  case _ => None;

} filter {

  case Some(_) => true;

  case None => false;

} map {

  case Some(x) => x;

}


or, more succinctly:


expr₁ map {

  pat.unapply(_);

} filter {

  _.isDefined;

} map {

  _.get;

}


Of course, there is a cost to this approach: by introducing a third
pass, extra an extra list of Option objects is created.
Another fix would be to add new methods filterMap
and filterForEach which cut down the work to a single
pass.  While less elegant, it would clearly provide a performance
boost, and it would be largely hidden from view.  In the mean time,
one should use pattern matching in for expressions with care.

Jorge Ortiz

Posts: 2
Nickname: jorgeortiz
Registered: Dec, 2007