I am a programmer and architect (the kind that writes code) with a focus on testing and open source; I maintain the PHPUnit_Selenium project. I believe programming is one of the hardest and most beautiful jobs in the world.

Erlang: linking processes

11.14.2012

"Let it fail" is one of Erlang's design techniques: since processes are cheap and readily available, we can create one for every potentially failing operation and let it crash when it encounters harsh conditions.

Thanks to this technique, the code for the happy path and the code for managing errors can usually be separated into different processes, much like we separate them with try/catch constructs in imperative languages.
Of course, this assumes that when a process fails, some other code exists that will take care of it. Erlang provides some primitives that help in catching failures in the right place, logging them, and isolating them from the rest of the system.

Link sets

One of the possibilities for handling failure in Erlang is process linking, a bidirectional relationship declared between processes.

Basically, instead of spawning a new process and letting it follow its own road:

spawn(my_module, function_name, [Arg]).

You can spawn the same process while atomically linking it to the current one:

spawn_link(my_module, function_name, [Arg]).

When two processes are linked, the exit signal of one of them propagates to the other. If the exit reason is anything other than normal, the linked process terminates too, unless it traps exits. This makes cleanup easy: when processes only make sense running in cooperation, the crash of one takes the others down with it, so that eventually someone else can respawn all of them on a clean slate.
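The difference between a normal and an abnormal exit can be checked with a small experiment. This is only a sketch: the module name link_demo and the helper survives/1 are hypothetical names chosen for illustration, and the 100 ms pause is an assumption about how long the signal needs to arrive.

```erlang
-module(link_demo).
-export([run/0]).

%% Spawns an unlinked watcher that starts a linked worker exiting with
%% Reason, then reports whether the watcher is still alive afterwards.
survives(Reason) ->
    Watcher = spawn(fun() ->
        spawn_link(fun() -> exit(Reason) end),
        timer:sleep(infinity)
    end),
    timer:sleep(100),              % assumed to be enough for the signal to arrive
    Alive = is_process_alive(Watcher),
    exit(Watcher, kill),           % clean up the watcher if it is still running
    Alive.

%% First element: worker exits normally. Second: worker exits abnormally.
run() ->
    {survives(normal), survives(crashed)}.
```

Calling link_demo:run() should return {true, false}: the watcher ignores the normal exit of its worker, but is taken down by the abnormal one.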

Exit events are transitive: if a process B fails because of the linked A, the processes linked to B will receive the event in turn, and so on, in a chain of failures:

A -> B -> C1
  `-> C2
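A chain like this one can be reproduced in a few lines. The following is a minimal sketch with a single C process; the module name chain_demo and the message tags b and c are made up for the example, and none of the processes traps exits.

```erlang
-module(chain_demo).
-export([run/0]).

%% Builds the chain A -> B -> C with spawn_link, terminates A abnormally
%% and reports which of the three processes are still alive.
run() ->
    Self = self(),
    A = spawn(fun() ->
        PidB = spawn_link(fun() ->
            PidC = spawn_link(fun() -> timer:sleep(infinity) end),
            Self ! {c, PidC},
            timer:sleep(infinity)
        end),
        Self ! {b, PidB},
        timer:sleep(infinity)
    end),
    B = receive {b, Pid1} -> Pid1 end,
    C = receive {c, Pid2} -> Pid2 end,
    exit(A, crashed),              % abnormal exit signal sent to A only
    timer:sleep(100),              % assumed to be enough for the chain to unwind
    [is_process_alive(P) || P <- [A, B, C]].
```

The expected result of chain_demo:run() is [false, false, false]: only A was asked to exit, but the failure travelled along the links to B and then to C.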

Of course, you're not forced at all to link all of your processes.

The interesting thing, however, is that a process can trap exits, transforming the exit event coming from a linked process into a message that can be received and handled. This is particularly useful when building servers with a child process for each client:

  1. The server starts to trap exits.
  2. The server spawns new linked child processes when receiving a request.
  3. If the server crashes, all child processes are linked to it and will be terminated.
  4. If a child crashes, the server receives a message to handle it gracefully; the server doesn't exit.
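Before the full server, here is the trapping mechanism in isolation. It is a sketch under assumptions: trap_demo and the exit reason boom are illustrative names, and the calling process temporarily traps exits itself and restores its previous setting at the end.

```erlang
-module(trap_demo).
-export([run/0]).

%% Traps exits in the calling process, starts a linked child that exits
%% abnormally, and returns what arrives in the mailbox.
run() ->
    Old = process_flag(trap_exit, true),   % process_flag/2 returns the previous value
    Pid = spawn_link(fun() -> exit(boom) end),
    Result = receive
        {'EXIT', Pid, Reason} -> {child_exited, Reason}
    after 1000 ->
        timeout
    end,
    process_flag(trap_exit, Old),          % restore the caller's original setting
    Result.
```

trap_demo:run() should return {child_exited, boom}, and the caller keeps running: the exit signal has become an ordinary message.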

Enough with the theory of process linking! Let's write some code.

Sample code

Let's define what we want from our two routines, parent (the server) and child (the processes that deal with single requests).

In the success case, we tell the server to divide its constant (42) by a number that we choose. Then, we want to receive a result tuple containing the right number.

happy_path_test() ->
  Server = spawn(fail_07, parent, []),
  Server ! {divide, 2, self()},
  Result = receive
    {result, Value} -> Value
  end,
  ?assertEqual(21, Result).

The failure case tries to divide by zero. After waiting two seconds with no response, we conclude that there has been an error; however, the server should still be alive, as only the child process has crashed.

divide_by_zero_test() ->
  Server = spawn(fail_07, parent, []),
  Server ! {divide, 0, self()},
  Status = receive
  {result, _} -> ok
  after 2000 -> error
  end,
  ?assertEqual(error, Status),
  Server ! {ping, self()},
  receive
  {pong, Server} -> ok
  end.

So how do we build the parent? First, we declare we want to trap exits, and then we enter the main loop.

parent() ->
  process_flag(trap_exit, true),
  parent_loop().

In the loop, we receive possible messages from clients, or the 'EXIT' tuple.

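parent_loop() ->
  receive
    % Matches every exit reason, including normal, so that the 'EXIT'
    % messages of successful children do not pile up in the mailbox.
    {'EXIT', _Pid, _Reason} -> ok;
    {divide, Divisor, AnswerTo} -> spawn_link(fail_07, child, [Divisor, AnswerTo]);
    {ping, AnswerTo} -> AnswerTo ! {pong, self()}
  end,
  parent_loop().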

So we can log an error (not shown) when we receive 'EXIT', spawn a new linked child when we receive divide, and answer a ping immediately.

The child simulates division as a costly operation. Note that it has no error handling: an exception raised during execution reaches the parent process as the exit reason contained in the 'EXIT' tuple.

child(Divisor, AnswerTo) ->
  timer:sleep(1000),
  Result = 42 div Divisor,  % integer division
  AnswerTo ! {result, Result}.
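To see what the exit reason looks like in both cases, the division can be run in a linked process while trapping exits. This is a sketch, not part of the article's fail_07 module: exit_reason_demo and exit_reason/1 are hypothetical names.

```erlang
-module(exit_reason_demo).
-export([exit_reason/1]).

%% Runs the division in a linked child and returns a summary of the
%% exit reason delivered to the (temporarily trapping) caller.
exit_reason(Divisor) ->
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(fun() -> 42 div Divisor end),
    Summary = receive
        {'EXIT', Pid, normal} -> normal;                 % child finished without errors
        {'EXIT', Pid, {Error, _Stacktrace}} -> Error;    % runtime error plus stack trace
        {'EXIT', Pid, Other} -> Other
    after 1000 ->
        timeout
    end,
    process_flag(trap_exit, Old),                        % restore the caller's setting
    Summary.
```

exit_reason_demo:exit_reason(2) should return normal, while exit_reason_demo:exit_reason(0) should return badarith, since the reason of an uncaught runtime error is a tuple of the error and its stack trace.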

Conclusions

Linking processes is one of the ways to deal with failures, with richer possibilities than letting children fail in their own right. The bidirectional relationship ties parent and child processes together, while some of them can decide to trap exits and survive the crashes of their neighbors.
Next time we will see another way of catching failures, with Erlang's monitoring primitives.

The code contained in this article is available on GitHub.

Published at DZone with permission of Giorgio Sironi, author and DZone MVB.
