Archive for the ‘scala’ Category

QOTD - Twitter Learning Scala

Tuesday, July 22nd, 2008

Several of us engineers at Twitter, Inc. are learning Scala as the language in which to develop new components for our system.

From Graceless Failures: Hello World

Programming in Scala - The Book

Wednesday, December 12th, 2007

Artima has just released a pre-release copy of Programming in Scala. Much like the Pragmatic Programmers handle their book publishing, you can buy the pre-release copy as a PDF now, and Artima will provide free updates as the book progresses. You can choose to purchase just the PDF, or the PDF/Printed bundle.

Scala is a really fun programming language, combining object oriented and functional aspects into one dynamic and powerful package. Best of all, Scala runs on the JVM, so it can take advantage of the entire Java ecosystem.

If you’re interested in functional programming (and you should be) but don’t want to completely abandon your investment in OOP or in Java, then you need to give Scala a look.

One of my favorite features of Scala is the Actors support. You can use Actors to build Erlang-style concurrent, scalable systems on the JVM. Now that’s hot.

Functional Programming Meets Web Application Development

Friday, November 9th, 2007

If you’re checking out functional programming, but are firmly stuck in object oriented programming, then give Scala a try. Scala merges aspects of OOP and FP together into a cohesive and flexible language.

If you’re a web application developer interested in functional programming, then you’ll want to check out David Pollak’s presentation on lift to the Bay Area Functional Programmers.

lift is a web framework written in Scala, using both functional programming and Scala’s Actors, which enable heavily concurrent applications via lightweight (threadless) processes.

I Second That Emotion

Monday, September 24th, 2007

So Tim Bray finds out that Erlang IO is slow. I can attest to this, as my recent work on reading large files in Erlang has shown that IO and string manipulation are much slower than I would have wanted.

Yes, like Bray, my file reading is single threaded (although, what I do with the line is very multi-threaded) so I suppose using a single thread for Erlang isn’t very Erlang-like in the first place.

In the meantime, I’m porting my OLAP cube generator to Scala. The assumption (and shortly, hopefully proof) is that the JVM can do file IO much better than Erlang, yet I can still take advantage of Scala’s Actors to retain my concurrency.

Update: OK, some numbers and code. This is a benchmark for Erlang and Scala to read in a file line by line.

First, the Erlang code:


	process_file2(Filename) ->
		{ok, File} = file:open(Filename, read),
		process_lines2(File).

	process_lines2(File) ->
		case io:get_line(File, '') of
			eof -> file:close(File);
			_ -> process_lines2(File)
		end.

Now the Scala code:


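import java.io.{BufferedReader, FileReader}

object LineReader {

  // Applies f to each line until readLine() returns null (end of file).
  def foreachline(in: BufferedReader, f: String => Unit): Unit = {
    val line = in.readLine()
    if (line != null) {
      f(line)
      foreachline(in, f)
    }
  }

  // Opens the file, feeds every line to f, and closes the reader even if f throws.
  def forLines(filename: String, f: String => Unit): Unit = {
    val in = new BufferedReader(new FileReader(filename))
    try foreachline(in, f)
    finally in.close()
  }

}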

OK, so these aren’t exactly the same. The Scala example is dispatching to a function, so Scala is even at a disadvantage.
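For anyone who wants to try the JVM side on their own file, here is a minimal sketch of a timed line count. It is not the harness used for the numbers in this post; the names `LineCountBench`, `countLines`, and `timed` are made up for illustration, and it uses a plain loop rather than the recursive helper above.

```scala
import java.io.{BufferedReader, FileReader}

object LineCountBench {
  // Reads the file line by line and returns how many lines were seen.
  def countLines(filename: String): Long = {
    val in = new BufferedReader(new FileReader(filename))
    try {
      var count = 0L
      while (in.readLine() != null) count += 1
      count
    } finally in.close()
  }

  // Returns (line count, elapsed wall-clock milliseconds) for one pass.
  def timed(filename: String): (Long, Long) = {
    val start = System.currentTimeMillis()
    val lines = countLines(filename)
    (lines, System.currentTimeMillis() - start)
  }
}
```

Calling `LineCountBench.timed(path)` once per run and recording the second element gives a number comparable in spirit to the timings in this post, though a single pass like this does not account for JIT warm-up or disk caching.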

The timings, three runs each, on my MacBook Pro 2.2 GHz Intel Core 2 Duo. Erlang is the BEAM emulator 5.5.5 and Scala is 2.6 running on JDK 1.5 on Mac OS X. Erlang code was compiled with HiPE.

I am reading in a 1,028,071,833-byte file with 10,037,355 lines.

Code      Run 1         Run 2         Run 3
Erlang    205.830 sec   208.999 sec   207.454 sec
Java      36.094 sec    39.917 sec    34.337 sec

Performance with Scala Arrays and Lists

Friday, June 22nd, 2007

As I continue to tinker with Scala, I was wondering about the performance differences between an Array and a List. This post will detail what I’ve found, but as always YMMV and I could be doing it all wrong. If there’s a better (in this case, better == faster) way to do this in Scala, please let me know.

My application performs a lot of collection iteration as it combines the values of two collections into a new collection by addition. For instance, I need to combine [1,2] and [3,4] into [4,6]. I wanted to find out if the collections should be an Array or List.

Intuition tells me that the Array will perform better, but this is Scala, and Lists reign supreme. So we’ll go head to head.

For each test, I wanted to write a function that combined the two collections using tail recursion.

Test One - Two Lists Into a Third

First up, I am adding two lists together while forming a third. One problem here is, due to the way the algorithm is structured, the resulting list is built backwards. So there’s a call to reverse at the end. (Question: How to rewrite this using normal List methods such as :: without having to call reverse at the end?)


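  def add(x: List[Long], y: List[Long],
             agg: List[Long]): List[Long] = (x, y) match {
    // Both lists still have elements: sum the heads, recurse on the tails.
    case (x1 :: xs, y1 :: ys) => add(xs, ys, x1 + y1 :: agg)
    // Either list is exhausted: agg was built backwards, so reverse it.
    case _ => agg.reverse
  }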

To call it:

add(List(1,2), List(3,4), Nil)
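One possible answer to the question above is to skip the hand-written recursion and let the standard List methods do the pairing. This is only a sketch, and it was not part of the timings below; `ZipAdd` and `addZip` are illustrative names.

```scala
object ZipAdd {
  // Pairs up elements with zip, then sums each pair with map.
  // No accumulator and no reverse are needed; zip stops at the
  // shorter list, so extra elements in the longer one are dropped.
  def addZip(x: List[Long], y: List[Long]): List[Long] =
    x.zip(y).map { case (a, b) => a + b }
}
```

To call it:

ZipAdd.addZip(List(1L, 2L), List(3L, 4L))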

Test Two - Two Arrays Into a List

Next up, I add two Arrays into a List. The guess here is that accessing the arrays by index will help speed it up.


  def add2(x: Array[Long], y: Array[Long], agg: List[Long],
               counter: Int): List[Long] = {
    if (counter == 0) agg
    else add2(x, y, x(counter-1) + y(counter-1) :: agg, counter-1)
  }

To call it:

add2(Array(1,2), Array(3,4), Nil, 2)

Test Three - Two Arrays Into a Third Array

This should be the fastest.


  def add3(x: Array[Long], y: Array[Long], agg: Array[Long],
               i: Int): Array[Long] = {
    if (i == x.length) agg
    else {
      agg(i) = x(i) + y(i)
      add3(x, y, agg, i+1)
    }
  }

To call it:

add3(Array(1,2), Array(3,4), new Array[Long](2), 0)

Methodology

I ran each function 1 million times and captured the times with System.currentTimeMillis. I ran the entire test suite five times to generate an average. I am running Scala 2.5.1 on Java 1.6 on Windows XP. I have a Pentium 4 2.8GHz with 2GB RAM.
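The timing approach described above can be sketched roughly as follows. This is a reconstruction under assumptions, not the original harness: the names `Bench`, `timeMillis`, and `average` are hypothetical, and it does nothing about JIT warm-up or garbage collection pauses.

```scala
object Bench {
  // Runs `body` `iterations` times and returns elapsed wall-clock milliseconds.
  def timeMillis(iterations: Int)(body: => Unit): Long = {
    val start = System.currentTimeMillis()
    var i = 0
    while (i < iterations) {
      body
      i += 1
    }
    System.currentTimeMillis() - start
  }

  // Repeats the timed loop `runs` times and averages the results.
  def average(runs: Int, iterations: Int)(body: => Unit): Double = {
    val times = (1 to runs).map(_ => timeMillis(iterations)(body))
    times.sum.toDouble / runs
  }
}
```

With this in place, something like `Bench.average(5, 1000000) { add(xs, ys, Nil) }` would mirror the five runs of one million calls, where `xs` and `ys` are whatever test lists you choose.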

Results

The results are in, and sure enough, on average, the third option (pure Arrays) is the fastest.

* Test 1 - 1172 ms
* Test 2 - 781 ms
* Test 3 - 687 ms

So, for my purposes, using Arrays results in faster execution. However, if you are looking to do traditional functional programming, you should write your methods to create zero side effects. Using Arrays like this seems anti-functional programming.
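If the concern is the mutation of a caller-supplied array, one middle ground is to keep Arrays but build the result in a fresh array. The sketch below assumes both inputs have the same length and was not benchmarked for this post; `PureArrayAdd` is an illustrative name.

```scala
object PureArrayAdd {
  // Builds a new result array element by element; neither input is
  // modified. Assumes x and y have the same length.
  def add(x: Array[Long], y: Array[Long]): Array[Long] =
    Array.tabulate(x.length)(i => x(i) + y(i))
}
```

From the caller's point of view this has no side effects, even though it still returns a mutable Array.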

Converting Array to List in Scala

Wednesday, June 20th, 2007

Now, this has to have a built-in somewhere in Scala, because it just seems too common. So, how to convert an Array to a List in Scala?

Why do I need this? I needed to drop to Java for some functionality, which in this case returns an Array. I wanted to get that Array into a List to practice my functional programming skillz.

**Update**: I figured out how to convert Arrays to Lists the Scala way. Turns out it’s a piece of cake.

val myList = List.fromArray(Array("one", "two", "three"))

or

val myList = Array("one", "two", "three").elements.toList

The call to elements returns an Iterator, and from there you can convert to a List via toList. Nice.
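Both calls above date from the Scala version current when this was written. On newer Scala releases (2.8 onward, as far as I know) `List.fromArray` and `elements` were deprecated, and the same conversion is usually spelled with `toList` or `iterator`. A minimal sketch, with `ArrayToListDemo` as an illustrative name:

```scala
object ArrayToListDemo {
  // Direct conversion: Array gains toList through the standard collection ops.
  val viaToList: List[String] = Array("one", "two", "three").toList

  // Same idea as the elements version: get an Iterator, then build a List.
  val viaIterator: List[String] = Array("one", "two", "three").iterator.toList
}
```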

Because my first version wasn’t actually tail recursive, what follows is a truly tail-recursive solution, as I would write it by hand. The built-in mechanism above is much better, though.


object ArrayUtil {
  def toList[a](array: Array[a]): List[a] = {
    def convert(arr: Array[a], aggregator: List[a]): List[a] = {
      if (arr == null || arr.length == 0) aggregator
      else convert(arr.slice(0, arr.length-1), arr(arr.length-1) :: aggregator)
    }
    convert(array, Nil)
  }
}

The above code is interesting because it demonstrates a nested function. The convert function is nested inside toList. Scala encourages the decomposition of your problem into smaller and smaller functions.

*What follows is my original attempt.* Left here for a historical, “what not to do” perspective.

Here’s my implementation of it, but if you know of a built-in function that already does this, please let me know.


object ArrayUtil {
  def toList[a](array: Array[a]): List[a] = {
    if (array == null || array.length == 0) Nil
    else if (array.length == 1) List(array(0))
    else array(0) :: toList(array.slice(1, array.length))
  }
}

To quickly explain this, an object in Scala is a singleton instance of its class. The method toList is parameterized with type a. This is similar to generics in Java. Lastly, the :: operator (pronounced cons in Scala) creates a new List from a single item (the head, on the left) and another List (the tail, on the right). Oh, and Nil represents an empty List.
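A tiny sketch of :: and Nil on their own may make the description above more concrete. `ConsDemo` and its members are illustrative names only.

```scala
object ConsDemo {
  // Nil is the empty list; :: prepends one element to an existing list.
  val tail: List[Int] = 2 :: 3 :: Nil
  val full: List[Int] = 1 :: tail

  // :: is right-associative, so 1 :: 2 :: 3 :: Nil reads as
  // 1 :: (2 :: (3 :: Nil)) and equals List(1, 2, 3).
  val same: Boolean = full == List(1, 2, 3)
}
```

Here `ConsDemo.full.head` is the item that was on the left of ::, and `ConsDemo.full.tail` is the list that was on the right.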

That’s a Lot of Actors

Tuesday, June 19th, 2007

As I continue to explore Scala, I wondered just how many (react based) actors I could create in a single JVM. The answer, apparently, is a lot.

Before I canceled it, the count was up to 13,500,000 actors. This is on an old Centrino laptop running the Sun 1.6 JVM. I did have to turn up the memory limit a bit, but I never saw memory go above 20MB. Also, I wasn’t doing anything inside the Actors.

Still, that’s enough for me to not have to worry about it.

Scala: It’s as if Java and Erlang had a baby. Fun stuff.

links for 2007-06-20

Tuesday, June 19th, 2007