The Artima Developer Community
Sponsored Link

Programming in Scala Forum
scala sort performance?

4 replies on 1 page. Most recent reply: Aug 29, 2008 3:25 PM by j herber

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 4 replies on 1 page
Feng Tian

Posts: 1
Nickname: ft
Registered: Aug, 2008

scala sort performance? Posted: Aug 27, 2008 6:48 AM
Reply to this message Reply
Advertisement
Hi,

I started playing with the language, and the first try does not go well.

(1 to 3000000).toList.sort(_>_)

takes 100% CPU and does not return. I have to kill it after 4 min. Sorting 3 million integer should take no time. For example, on python

l = range(3000000, 0, -1)
l.sort()

returns immediately.

Why scala sort is so slow? Is it the sort algorithm or the language?

Thanks,


Mac OSX 10.5
Welcome to Scala version 2.7.1.final (Java HotSpot(TM) Client VM, Java 1.5.0_13).


Jeff Heon

Posts: 40
Nickname: jfheon
Registered: Feb, 2005

Re: scala sort performance? Posted: Aug 27, 2008 9:03 AM
Reply to this message Reply
Maybe it's the generating of the list that takes along time?

Try doing it in two steps like you did it in Python.

Feng Tian

Posts: 2
Nickname: fengttt
Registered: Aug, 2008

Re: scala sort performance? Posted: Aug 27, 2008 12:41 PM
Reply to this message Reply
Hi,

Thanks for reply.

val l = (1 to 2000000).toList

takes about 15 seconds or so. This is very slow, but not as bad as sort.

Another problem is that the memory consumption is out of control. top shows java is using 1G virtual memory with 400M resident. 2 Million Int should take 8M. Since it is 2M objects in JVM, let's give it a factor of 10 :-) I cannot understand why this thing goes above 100M. My JAVA_OPTS='-Xmx512M -Xms16M -Xss16M'. Issue the statement twice will out of memory. Ouch!

scala> bash-3.2$ scala
Welcome to Scala version 2.7.1.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_06).
Type in expressions to have them evaluated.
Type :help for more information.

scala> val l = (1 to 2000000).toList
l: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, ...
scala> val l = (1 to 2000000).toList
java.lang.OutOfMemoryError: Java heap space
at scala.StringBuilder$.scala$StringBuilder$$copyOf(StringBuilder.scala:868)
at scala.StringBuilder.expandCapacity(StringBuilder.scala:113)
at scala.StringBuilder.append(StringBuilder.scala:247)
at scala.StringBuilder.append(StringBuilder.scala:235)
at RequestResult$.<init>(<console>:4)
at RequestResult$.<clinit>(<console>)
at Request...
scala>

Feng Tian

Posts: 2
Nickname: fengttt
Registered: Aug, 2008

Re: scala sort performance? Posted: Aug 27, 2008 12:49 PM
Reply to this message Reply
BTW, Linux + JVM 1.6 is a better combo than OSX + JVM 1.5. I was able to sort 3 million integers in about 1.5 minutes. I assume this is because JVM 1.6 is better.

j herber

Posts: 6
Nickname: 53144
Registered: Jan, 2008

Re: scala sort performance? Posted: Aug 29, 2008 3:25 PM
Reply to this message Reply
Your code looks harmless, but under the hood it is generating a Range object, then converting that to a List, then performing a sort, then creating a String representation of the result, then printing a truncated version of the result to the Console.

Your example highlighted a missing function on the Collection API - sort. But, since it is not there, we can add it with one slick line of implicit magic.

implicit def ArraySort(a:Array[Int]) = new { def sort() = { scala.util.Sorting.quickSort(a); a } }

Now you compare apples with apples:


scala> def timesort()={ val s=System.currentTimeMillis; Array.range(1,3000000).sort; val e=System.currentTimeMillis; e-s}
timesort: ()Long

scala> timesort
res61: Long = 1823


To summarize, when run from the Scala Console, big Arrays and Lists attempt to create string representations of results even though they are merely showing a truncated version in the console. This is why even my newfangled Array#sort will run out of memory, when you run directly in the Console as Array.range(1,3000000).sort.

Flat View: This topic has 4 replies on 1 page
Topic: Payment without credit card Previous Topic   Next Topic Topic: Last example in chapter 3 issues..

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use