The Artima Developer Community
Sponsored Link

Articles Forum
What's New in Scala 2.8: Collections API

16 replies on 2 pages. Most recent reply: Sep 1, 2010 7:33 AM by michael lawley

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 16 replies on 2 pages [ 1 2 | » ]
Bill Venners

Posts: 2284
Nickname: bv
Registered: Jan, 2002

What's New in Scala 2.8: Collections API Posted: Aug 23, 2010 10:00 PM
Reply to this message Reply
Advertisement
In the first of a series of articles on the latest Scala release, Scala 2.8, Martin Odersky discusses the new collections API.

http://www.artima.com/scalazine/articles/scala_collections.html

What do you think of Scala's new collections API?


Achilleas Margaritis

Posts: 674
Nickname: achilleas
Registered: Feb, 2005

Re: What's New in Scala 2.8: Collections API Posted: Aug 25, 2010 7:12 AM
Reply to this message Reply
From a quick first look I had, it seems the new collections API offers a lot of collection types, I'd dare say almost all widely known collection types.

Without being a Scala user and only knowing the very basics of the language, I'd like to ask:

1) why the emphasis on immutable data structures? what are the benefits? aren't those slower than the mutable types?

2) when a collection's type is a primitive, does autoboxing happen? for example, an Array[int] is actually an Array[Integer]?

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: What's New in Scala 2.8: Collections API Posted: Aug 26, 2010 7:38 AM
Reply to this message Reply
> From a quick first look I had, it seems the new
> collections API offers a lot of collection types, I'd dare
> say almost all widely known collection types.
>
> Without being a Scala user and only knowing the very
> basics of the language, I'd like to ask:
>
> 1) why the emphasis on immutable data structures? what are
> the benefits? aren't those slower than the mutable types?

I assume you mean slower for writes. Theoretically there is an opportunity for the compiler to give a performance boost on reads for immutable collections.

As someone who is happy to see the emphasis on immutability, the benefits are safer and simpler code. I need not worry about whether successive reads will produce different results or whether the collection will change while I am iterating over it. This is especially the case when dealing with concurrency but not exclusively.

> 2) when a collection's type is a primitive, does
> autoboxing happen? for example, an Array[int] is actually
> an Array[Integer]?

Unlike Java, there is are no primitives in Scala but as I understand it, compiled Scala uses Java primitives. I've not taken the time to understand how this is done but it's pretty sexy.

Mark Thornton

Posts: 275
Nickname: mthornton
Registered: Oct, 2005

Re: What's New in Scala 2.8: Collections API Posted: Aug 26, 2010 12:35 PM
Reply to this message Reply
> 1) why the emphasis on immutable data structures? what are
> the benefits? aren't those slower than the mutable types?
You don't need to make defensive copies when passing them to methods (outside your control). This can save quite a lot of time in some circumstances.

Achilleas Margaritis

Posts: 674
Nickname: achilleas
Registered: Feb, 2005

Re: What's New in Scala 2.8: Collections API Posted: Aug 27, 2010 5:01 AM
Reply to this message Reply
> Unlike Java, there is are no primitives in Scala but as I
> understand it, compiled Scala uses Java primitives. I've
> not taken the time to understand how this is done but it's
> pretty sexy.

Perhaps it sees that an object is an 'Integer' and JITs the relevant code to use 'int'.

Daniel Gronau

Posts: 3
Nickname: 56653
Registered: Jul, 2008

Re: What's New in Scala 2.8: Collections API Posted: Aug 27, 2010 5:41 AM
Reply to this message Reply
I found the new collections easy to use. There are a lot of new, interesting functions, and the resulting types were almost always "as expected".

The only function I couldn't find was the cartesian product of two collections, which would be nice to have.

Bill Pyne

Posts: 165
Nickname: billpyne
Registered: Jan, 2007

Re: What's New in Scala 2.8: Collections API Posted: Aug 27, 2010 6:32 AM
Reply to this message Reply
> 1) why the emphasis on immutable data structures?

Parallel collections should be easier to implement with immutable structures.

http://www.scala-lang.org/node/7285

Achilleas Margaritis

Posts: 674
Nickname: achilleas
Registered: Feb, 2005

Re: What's New in Scala 2.8: Collections API Posted: Aug 27, 2010 8:31 AM
Reply to this message Reply
> > 1) why the emphasis on immutable data structures?
>
> Parallel collections should be easier to implement with
> immutable structures.
>
> http://www.scala-lang.org/node/7285

Is the parallelization automatic?

Bill Pyne

Posts: 165
Nickname: billpyne
Registered: Jan, 2007

Re: What's New in Scala 2.8: Collections API Posted: Aug 27, 2010 8:59 AM
Reply to this message Reply
> Is the parallelization automatic?

From the bits and pieces I read on the internet, the answer seems like a Yes. The following link has a paragraph near the bottom in which Martin Odersky describes use of "partition" across sequential collections and goes on to say, "Furthermore, the partition operation is quite fast, and will get even faster on parallel collections on multi-cores."

http://www.scala-lang.org/node/7364

sean mcdirmid

Posts: 4
Nickname: seanm
Registered: Jul, 2009

Re: What's New in Scala 2.8: Collections API Posted: Aug 27, 2010 7:43 PM
Reply to this message Reply
Immutable collections are usually not justified based on performance; they tend to have worse performance than the equivalent mutable collections. Use them for their functional elegance if that floats your boat, but don't use them if you think they are going to be faster.

Performance increases through parallelization and multi cores has yet to be seen. Immutability does confer some theoretical parallelization benefit, but in practice many other factors are at play it would be very difficult to turn that into an actual performance benefit.

Martin Odersky

Posts: 84
Nickname: modersky
Registered: Sep, 2003

Re: What's New in Scala 2.8: Collections API Posted: Aug 28, 2010 4:36 AM
Reply to this message Reply
> From a quick first look I had, it seems the new
> collections API offers a lot of collection types, I'd dare
> say almost all widely known collection types.
>
> Without being a Scala user and only knowing the very
> basics of the language, I'd like to ask:
>
> 1) why the emphasis on immutable data structures? what are
> the benefits? aren't those slower than the mutable types?
>
The library is symmetric wrt mutability. Every major type has mutable as well as immutable implementations. Immutable ones are available directly whereas mutable ones need an explicit qualification or import.

The reason for preferring immutable is that immutable is safer. So in the spirit of avoiding premature optimizations
we made immutable the default.

Immutable collections do have persistent implementations which keep the cost overhead fairly low, and which even give an advantage over mutable for small collections (an immutable set or map of up to 4 elements takes only a single object; compare to a mutable map backed by a hashtable).

> 2) when a collection's type is a primitive, does
> autoboxing happen? for example, an Array[int] is actually
> an Array[Integer]?

For general collections, yes. But for arrays, no. Scala arrays are represented exactly the same as Java arrays. There's a section on arrays in the collections document.

The next versions of Scala will specialize further collection types iso that primitive types need not be boxed. That's an ongoing process which was started with the availability of @specialized annotation in Scala 2.8.

Martin Odersky

Posts: 84
Nickname: modersky
Registered: Sep, 2003

Re: What's New in Scala 2.8: Collections API Posted: Aug 28, 2010 4:40 AM
Reply to this message Reply
> Is the parallelization automatic?

Yes. The following paper describes the upcoming parallel collections. More work is needed to validate performance, but the benchmark results contained in the paper are pretty promising so far.

http://infoscience.epfl.ch/record/150220

Martin Odersky

Posts: 84
Nickname: modersky
Registered: Sep, 2003

Re: What's New in Scala 2.8: Collections API Posted: Aug 28, 2010 4:45 AM
Reply to this message Reply
> The only function I couldn't find was the cartesian
> product of two collections, which would be nice to have.

Cartesian product can be expressed by flatMap and map, but the easiest way is with a for expression:


for (x <- 1 to 10; y <- 1 to 10) yield (x, y)


The Scala compiler will translate this to


(1 to 10).flatMap(x => (1 to 10).map(y => (x, y)))

Morgan Conrad

Posts: 307
Nickname: miata71
Registered: Mar, 2006

Re: What's New in Scala 2.8: Collections API Posted: Aug 31, 2010 9:25 AM
Reply to this message Reply
> Immutable collections are usually not justified based on
> performance; they tend to have worse performance than the
> equivalent mutable collections.

Sean - can you cite a source for this (or is this just personal experience)?

What you say seems plausible, but more data would be interesting. In my experience, I've never bothered timing collection performance because a few extra milliseconds spent on an insert pales in comparison to the seconds spent on analysis algorithms, and we have achieved major benefits from parallelization.

Mark Thornton

Posts: 275
Nickname: mthornton
Registered: Oct, 2005

Re: What's New in Scala 2.8: Collections API Posted: Aug 31, 2010 11:54 AM
Reply to this message Reply
I have timed immmutable/persistent collections against regular ones and found the immutable collections significantly slower for basic operations. However this isn't representative of how you would normally use the immutable collections. In particular it doesn't account for time saved by not having to copy a collection. You can thus only assess overall performance in the context of a specific real application.

In my work I often make a complex set of changes to a set of data and then assess whether the result is acceptable (feasible). If it isn't feasible, then the changes have to be reversed. If the data set as a whole is immutable, then that undo is achieved by simply keeping a reference to its previous state. With mutable data structures you either make a complete copy or actually perform the undo operations, in either case at significant cost in CPU and/or code complexity.

Flat View: This topic has 16 replies on 2 pages [ 1  2 | » ]
Topic: What's New in Scala 2.8: Chained Package Clauses Previous Topic   Next Topic Topic: Skinning Components in Flex 4

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use