Tuesday, January 29, 2008

Erlang fast :)

I want to write about this right now cause it gets kind of lost on me, so I want to be able to refer back to it. Why is Erlang so fast? Imagine you write a Java program that spawns two threads, and you run said program on a multi-core machine. Pretend both threads have a reference to the same Integer (of the proper Object sort), and at every moment of each thread's existence they are going to increase the value of that Integer by one over and over. Each time each thread attempts to increase the value of the Integer it must perform a check/lock/update/unlock to make sure it can access and update the memory that the Integer refers to. That check/lock/update/unlock slows your program down. On a single core system, the check/lock/update/unlock isn't necessary since only one thread is running at a time, but on multi-core machines threads run concurrently, so the check is necessary. In Erlang you can't update variables. Variables can be given a value once and only once, and since you can't change the value, that check isn't necessary, and that's why Erlang is faster.

I guess this begs the question: how would you write the same program in Erlang? Well in my Programming Erlang book they are very careful to explain that Erlang doesn't have threads, it has processes, and the difference is that threads share information (the Integer in the example given), and processes don't (well, you can pass processes the same variable, but they can't update it, so I guess that's the same thing). So, I guess you'd go about solving the problem in an entirely different manner. I am new to Erlang, so I don't feel comfortable giving a definitive answer, but it seems to me, that you'd simply have to solve problems with a different approach than you would in Java... not drastically different, but different, nonetheless... maybe, one process spawns two processes that return the message "1" every moment which is added to the parent processes' message queue (all processes in Erlang have message queues), which it iterates through to accumulate values. That sounds good to me :) Maybe I'll write both programs this weekend and see which one can get to Integer.MAX_VALUE the fastest.

2 comments:

Thomas Lackner said...

From my experience, when you have two processes that need to participate in updating the same data, you generally either:

1) Maintain the state of the data in one process, and send messages to that process to update it. So you'd start one process that contains the current value, with a loop like:

fun loop(N) ->
receive
{update, New} ->
loop(New);
_ ->
% else do nothing
loop(N)
end.

2) Have the data "shared" between processs. Remember, there is no mutable state, so you have to pass it back and forth between the processes to complete the task..

fun loop(N, OtherPid) ->
receive
{update, N} ->
OtherPid !! {update, N+1}
end.

What's better in this case? I don't know. In real life, it depends where the bulk of the processing is. If you're factoring a number, for instance, you would probably want each process to do work over a given number range and report the results back to a centralized process for tallying. However, if you're implementing a cooperative image four-step transform, you'd probably want the processes to be signalling back and forth to each other as they produce and consume data.

This task of counting to INT_MAX is so trivial that I don't think it's interesting to consider the speed of it: after all, any intelligent optimizing JVM will simply factor out the intermittent thread activity.

Dustin Ted Whitney said...

hmm... I wish I had noticed this comment when it was left.

Thanks Thomas. Anyway, yes I implemented the program using the first method you described. It was somewhat uninteresting, as you stated, but the exercise was to solidify the concept in my head, and it's quite solid now. Thanks for your input!