Introduction to Distributed Erlang Programming

September 24, 2008

Author: jart — Originally Posted 674 Days Ago Article Tags « erlang tutorial »

This tutorial is a "dive-in off the deep end" introduction to the best features offered by Erlang, a programming language developed at Ericsson. I love Erlang because its built-in support for distributed concurrency, time constraints, and changing production code (without service interruption) is simply unmatched; however, Erlang isn't a very expressive language and leaves much to be desired in terms of library support.

Even if you don't need to write highly concurrent, low-latency services, I still highly recommend experimenting with Erlang, if for no reason than seeing how it handles threading. Multi-threading is one of, if not the the most difficult problems programmers face. Bugs inherent in traditional approaches to concurrency, such as race conditions and deadlocks are almost always difficult to detect/reproduce, and can even make you reconsider changing careers.

Erlang takes a much more natural approach to solving the issue of concurrency. Rather than having threads share the same memory, and locking access to specific variables, all concurrency is done by passing messages between threads containing immutable variables. While this approach may sound less efficient, it pays off in the long run. Not only will you be happier, but you can have greater assurance that your application will perform consistently as it grows in usage--2 + 2 will equal 4, instead of 3.5.

Before we get started, you need to install Erlang on your system:

Debian/Ubuntu:
sudo apt-get install erlang-base-hipe erlang-manpages

MacPorts:
sudo port install erlang

Windows:
See http://www.erlang.org/download.html

A Pythagorean Triple are any three integers where a2 + b2 = c2. The simplest way to find triples in Erlang is to use list comprehension. What is the point of Pythagorean Triples? I have absolutely no idea. Anyway, open up a command prompt, run 'erl'. You should see some information about Erlang and a "1>" prompt. This is an interactive Erlang interpreter. You can type code into this shell and Erlang will tell you the result. For example, if you type "5." (without quotes) and press enter, Erlang will evaluate your expression and output "5". Now copy the following code and paste into the interpreter:

%% make N a placeholder for 20
N = 20.
%% create a list of all possible positive integer combinations up to
%% N, and only include combos where X*X + Y*Y is equal to Z*Z
Triples = [{X, Y, Z} || X <- lists:seq(1, N),
                        Y <- lists:seq(1, N),
                        Z <- lists:seq(1, N),
                        X*X + Y*Y =:= Z*Z].

If everything worked correctly, Erlang will output the output of the list comprehension, which was assigned to the Triples variable, as a list of 3-item tuples: [ {3,4,5}, {4,3,5}, ... ]

The syntax above is beautifully concise; however, it is a bit different from what a C or Java programmer would consider normal. The way the above code works, is Erlang will take three lists (1..10) and create and even bigger list containing every possible combination of those three lists. The size of this list at the beginning is 1000 (10*10*10 =:= 1000) items. The list will then be filtered by the expression at the end (X*X + Y*Y =:= Z*Z) to remove any XYZ combinations that do not meet the criteria for being pythagorean triples. Finally, if an iteration passes the test, it will be combined into a tuple ({X, Y, Z}) and added to the final list. The final list is then assigned to the Triples variable.

List comprehension in it's most basic form is just a fancy way to take a list, and perform and operation on each item, as well as filter out certain items. It is sort of like a map() and filter() function put together. The syntax of list comprehension is as follows:

%% 1-dimensional list comprehension
[ *expression-to-be-evalutated-during-loop* || *variable* <- *list* ]

%% 2d
[ *expression* || *variable1* <- *list1*, *variable2* <- *list2* ]

%% 3d
[ *expression* || *variable1* <- *list1*, *variable2* <- *list2*, *variable3* <- *list3* ]

%% 1-dimensional list comprehension with filter
[ *expression* || *variable* <- *list*, *remove-any-items-not-matching-this-expression* ]

To help us understand list comprehensions further, we can have our way with the Erlang shell.

erl Interpreter Session
1> MyList = lists:seq(1,10).
[1,2,3,4,5,6,7,8,9,10]
2> [X + 1 || X <- MyList].
[2,3,4,5,6,7,8,9,10,11]
3> [X * 2 || X <- MyList].
[2,4,6,8,10,12,14,16,18,20]
4> [X + 1 || X <- MyList, X >= 5].
[6,7,8,9,10,11]
5> [{X,Y} || X <- lists:seq(1,3), Y <- lists:seq(1,3)].
[{1,1},{1,2},{1,3},{2,1},{2,2},{2,3},{3,1},{3,2},{3,3}]

Now that we have our code working on the command line, let's make a module. Create the file "triples.erl" in your favorite text editor and enter the following code:

File: triples.erl
%% module name must be same as filename
-module(triples).
%% when exporting, you must specify the number of arguments.
%% Functions with a different number of args are considered
%% different functions.
-export([triples/1]).

%% functions take the form of:
%%
%% func_name(arg1, arg2) ->
%%     do_something(),
%%     do_more().
%%
%% the last line of a function always implicitly acts as the
%% return value of the function
triples(N) ->
    [{X, Y, Z} || X <- lists:seq(1, N),
                  Y <- lists:seq(1, N),
                  Z <- lists:seq(1, N),
                  X*X + Y*Y =:= Z*Z].

Now go back to the erl interpreter on the command line and type:

1> c(triples).
{ok,triples}
2> triples:triples(10).
[{3,4,5},{4,3,5},{6,8,10},{8,6,10}]

Here we are compiling the triples module, and then we call the triples function, in the triples module. If you want to get more triples, try entering a higher number as the argument to triples(). If Erlang takes too long to process your request, type Ctrl-C and type abort to kill the interpreter.

Great, we now have a functioning triples calculator. Now let's take the next step and make it run on multiple threads. To do this, we first need to break down the equation into chunks that we can assign to each thread to calculate. In this example, we're going to create a new thread for each X. So if N is 10, then 10 threads will be spawned, each of which will calculate all possible combinations of Y and Z and then report back to the main process, which will then print them to the screen.

To do this, we need to introduce a new concept called "tail recursion". Erlang does not support looping constructs like "while" and "for", instead Erlang programmers will process lists by recursively calling a function, processing the first item of the list at a time. For example:

%% the semi-colon tells erlang there are more definitions for this
%% function to come, in this case there are two.  Erlang will invoke
%% different versions of the function based on the arguments passed
%% when the function is called.  This is called matching.

%% when the first argument is an empty list, this version of hello() is called
hello([]) -> ok; %% 'ok' is an atom, see notes below

%% when the first arg is a list with at least one item, this version is called.
%% the first item of the list is assigned to head, and the rest of the list is
%% assigned to tail
hello([Head|Tail]) ->
    io:format("Hello ~p!!!!~n", [Head]),
    hello(Tail).

Save the above function to triples.erl, and add hello/1 to your export statement. Now open up a shell again:

1> c(triples).
{ok,triples}
2> triples:hello(["kitty", "world"]).
Hello kitty!!!!
Hello world!!!!
ok

Notes

  • Symbols starting with an uppercase letter are variables
  • Symbols starting with a lowercase letter are "atoms" or arbitrary identifiers. Atoms are very similar to defines and enums in C
  • Punctuation in Erlang may seem very confusing at first because there are no braces. I find it helps to think of things the same way as in the English language: periods separate statements, semi-colons separate related but independent clauses, and commas separate terms or expressions. Also keep in mind for future examples that like in Pascal, there is no statement terminator before the 'end' keyword.

In the above example, we are passing a list of strings to hello(). We specified two alternative definitions of our function in the module. One which matches an empty list, and the second which matches a list with at least one element. When the latter function is matched, Erlang will take the first item from the list and assign it to the Head variable. It will then take the rest of the list, and assign it to Tail. (This type of pattern matching is a fundamental concept in Erlang, but is out of the scope of this tutorial. I like to think of pattern matching as a hybrid between an assert statement and regular expressions.)

Our function will then process the "head" or first item of the list. In the above example, we use a printf()-like function to output the list item. We then recursively call hello() again, passing the "tail" of the list (also known as a list of all items excluding the head.)

You may also be wondering how tail recursion won't blow the program's stack by recursing over and over again. This is because Erlang is smart enough to know that when a function calls itself at the end of the function code, that it can be optimized as a jump back to the beginning. Please keep in mind that if you put your tail call inside an expression, the function will recurse normally and consume stack memory. (See "Bad Integer Accumulator" in Exploring Erlang.)

Now let's write a function that will calculate and send each calculated triple as a message to the master process:

Excerpt: triples_threaded.erl
-module(triples_threaded).
-export([triples/1]). % function will be shown below

send_triples([]) -> ok;
send_triples([Triple|Tail]) ->
    %% send Triple as a message to the master process
    master ! Triple,
    send_triples(Tail).

calc_triples(N, X) ->
    io:format("Process ~p started~n", [X]),
    send_triples([{X, Y, Z} || Y <- lists:seq(1, N),
                               Z <- lists:seq(1, N),
                               X*X + Y*Y =:= Z*Z]),
    %% tell master process we're done sending messages
    master ! {done, X}.

In the above example, calc_triples/2 will calculate all possible triples for a given X, up to N. Once a list of triples has been created, it will pass this list to send_triples/1, a tail-recursive function. send_triples/1 will take each triple and send it to the master process. Finally, a "done" atom will be sent to the master process to let it know that we have completed our triples calculations.

If we try to run our function, Erlang will complain because we have not yet registered a process named 'master'.

% name the current process ID "master" for easy reference
register(master, self()).

If we run our function again, it will not fail, however our messages will fall on deaf ears as the master process is not listening for any messages. The messages are there, but there are accumulating in a "mailbox". Call flush() to clear these messages so they will not affect any future code we run in our Erlang session. Now let's write a series of functions to launch the threads and listen for messages:

Excerpt: triples_threaded.erl
spawn_threads(N, 0) -> ok;
spawn_threads(N, X) ->
    %% fun creates an anonymous or lambda function
    spawn(fun() -> calc_triples(N, X) end),
    spawn_threads(N, X - 1).

%% print triples as we receive them and return when last process
%% has terminated
wait_for_triples(0) -> ok;
wait_for_triples(Workers) ->
    %% each receive clause will pattern match messages the same way
    %% function arguments are pattern matched
    receive
        {done, X} -> io:format("Process ~p complete~n", [X]),
                     wait_for_triples(Workers - 1);
        {X, Y, Z} -> io:format("~p, ~p, ~p~n", [X, Y, Z]),
                     wait_for_triples(Workers)
    end.

triples(N) when N > 0 ->
    %% save the process id of our main thread to an easy to remember
    %% name that can be accessed by all processes
    spawn_threads(N, N),
    wait_for_triples(N).

In the above example, spawn_threads loops backwards to create X number of threads. You may also have noticed that the spawn() function which creates a new thread takes a lambda function as an argument. Lambda functions are essentially inline, anonymous, on-the-fly functions. For example:

1> Hello = fun(Str) -> io:format("hello ~s!~n", [Str]) end.
#Fun<erl_eval.6.13229925>
2> Hello("kitty").
hello kitty!
ok
3> Hello2 = fun(Str) -> io:format("hello ~s!~n", [Str]), io:format("blah~n", []) end.
#Fun<erl_eval.6.13229926>
4> Hello2("chococat").
hello chococat!
blah
ok

Lambda functions are especially useful when spawning threads because the code contained within the fun() definition is not executed until it is called by Erlang in the newly spawned process environment. Now assemble the previous two code examples into a file named triples_threaded.erl and open up a new Erlang interpreter:

1> c(triples_threaded).
{ok,triples_threaded}
2> register(master, self()).
true
3> triples_threaded:triples(5).
Process 5 started
Process 4 started
Process 3 started
Process 2 started
Process 1 started
Process 5 complete
4, 3, 5
Process 4 complete
3, 4, 5
Process 3 complete
Process 2 complete
Process 1 complete
ok

You have now successfully broken the calculation down into 10 independent tasks. If you take out the superfluous print statements, you can also try to set the N argument to an even higher number to find more triples. If you have a multi-core CPU, you can speed up computations by enabling Erlang's SMP support:

erl -smp enable

If you try to calculate triples where the N argument is larger than 200, you will notice that Erlang is using significant CPU resources from all the processors on your computer. As you increase N, you will start to see exponential slow-downs in speed. To speed things up even further, we can change our code to run on many computers simultaneously.

One of Erlang's greatest strengths is that it is just as easy to spawn a thread on a separate computer as it is to spawn a thread locally. To run distributed Erlang code, you need to setup an Erlang shell on each computer. Each shell must have a name, and a cookie. The cookie is a password used to authenticate hosts for security reasons.

The simplest way to demonstrate this concept is to run two instances of Erlang on the same machine, as some readers may not have access to multiple computers. Open up two terminal windows and enter in the following commands:

# in first terminal window
$ erl -sname bob@localhost -setcookie some_password
Eshell Vx.x (abort with ^G)
(bob@localhost)1>

# In second terminal window
$ erl -sname alice@localhost -setcookie some_password
Eshell Vx.x (abort with ^G)
(alice@localhost)1> spawn(bob@localhost, fun() -> io:format("hello kitty!~n") end).
<5055.44.0>
hello kitty!
(alice@localhost)2>

In the above example, Alice spawned a lambda function containing code to print "hello kitty!" to Bob. Although this code was executed on Bob's node, Alice sees the output. This is not an accident as Erlang is designed to send all terminal I/O back to the executing host. If this wasn't the case it could be quite a nuisance to have to run back and forth between computers to examine error messages and other program output that was not explicitly sent back to the main node.

To make our triples calculator be able to perform computations across multiple nodes, we'll need to modify our code to accept a list of nodes, and take turns spawning each process on a different node. We also must pass the master PID as an argument to spawned threads, as other nodes may not have access to registered process names.

File: triples_dist.erl
-module(triples_dist).
-export([triples/2]).

send_triples(Master, []) -> ok;
send_triples(Master, [Triple|Tail]) ->
    Master ! Triple,
    send_triples(Master, Tail).

calc_triples(Master, N, X) ->
    send_triples(Master, [{X, Y, Z} || Y <- lists:seq(1, N),
                                       Z <- lists:seq(1, N),
                                       X*X + Y*Y =:= Z*Z]),
    Master ! {done, X}.

spawn_threads(N, 0, Nodes) -> ok;
spawn_threads(N, X, [Node|Nodes]) ->
    %% if self() is placed inside the fun(), it will evaluate
    %% to the spawned PID, rather than the master PID
    Master = self(),
    spawn(Node, fun() -> calc_triples(Master, N, X) end),
    %% ++ is the list concatenation operator
    spawn_threads(N, X - 1, Nodes ++ [Node]).

wait_for_triples(0) -> ok;
wait_for_triples(Workers) ->
    receive
        {done, X} -> wait_for_triples(Workers - 1);
        {X, Y, Z} -> io:format("~p, ~p, ~p~n", [X, Y, Z]),
                     wait_for_triples(Workers)
    end.

triples(N, Nodes) when N > 0 ->
    spawn_threads(N, N, Nodes),
    wait_for_triples(N).

In the module above, you'll notice that triples() now accepts a list of nodes. When that list is passed to spawn_threads(), it will take the first node from the list and spawn the process to that node. It will then call itself again, adding the node we just used to the end of the list using the list concatenation operator. This way, we can specify as many nodes as we like, and the computing load will be distributed evenly across all of them. Now let's take our distributed calculator for a spin:

# in first terminal window
$ erl -sname bob@localhost -setcookie secret
Eshell Vx.x (abort with ^G)
(bob@localhost)1> c(triples_dist).

# in second terminal window
$ erl -sname alice@localhost -setcookie secret
Eshell Vx.x (abort with ^G)
(alice@localhost)1> c(triples_dist).
(alice@localhost)2> triples_dist:triples(100, [bob@localhost, alice@localhost]).
...lots of triples...
ok
3>

In the above example please note that it is necessary to compile your code on both hosts. If you are using multiple computers, it is important that you copy your modules verbatim to every node involved in processing.

But what if this isn't fast enough? Well, one option Erlang has available is to use the native code compiler: HiPE. To enable HiPE with level o3 optimization, use the compilation function below and you'll see a significant increase in performance. Please note that this may not work on certain platforms, such as Macintosh.

c(triples_dist, [native, {hipe, [verbose, o3]}]).

So now you've built a distributed Erlang program with just 30 lines of code. If you'd like to do more with your calculator, I recommend the following exercises:

  • Enhance wait_for_triples() to disregard any triple combinations that have already been displayed.
  • Check out triples_dist2.erl below which includes a triples calculator that does not use list comprehension. This version uses less memory and reports triples to the master process in real time.
  • Write a program in Erlang that does something more useful than calculating triples.

Download Examples

[txt] triples.erl
[txt] triples_threaded.erl
[txt] triples_dist.erl
[txt] triples_dist2.erl

Comments

No comments found.

Post Comment