I am a programmer and architect (the kind that writes code) with a focus on testing and open source; I maintain the PHPUnit_Selenium project. I believe programming is one of the hardest and most beautiful jobs in the world. Giorgio is a DZone MVB and is not an employee of DZone and has posted 637 posts at DZone.

Parallel PHPUnit

02.04.2013

PHPUnit is the standard testing framework for PHP code: it is available through PEAR or Composer, follows xUnit conventions for tests, and provides many features, from grouping to code coverage to logging of results. There's even an extension (which I maintain) for running browser-based Selenium tests.

Parallelism

What PHPUnit lacks is parallelism: tests are run one after the other, usually in the same process. When more resources are available, such as a multicore CPU, some of the computational power goes unused: the PHPUnit process may reach 100% utilization of one core while the other cores sit idle. This is not surprising.

PHP does not have multithreading capabilities, but it can start new processes at the OS level. So many developers have come up with the same idea: start multiple PHPUnit processes, each working on a different subset of the tests, and aggregate the results. In theory this gives an N-times speedup on N cores, for example going from 10 minutes on a single core to 2'30'' on a quad-core CPU.
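The idea can be sketched in plain PHP. This is only an illustration, not how any existing tool works: partitionTests() and runInParallel() are hypothetical helper names, the round-robin split is an arbitrary choice, and the commands passed in are assumed to be complete shell command lines (one per subset of tests).

```php
<?php
// Hypothetical helper: deal the test files out round-robin to $workers buckets.
function partitionTests(array $testFiles, int $workers): array
{
    $buckets = array_fill(0, $workers, []);
    foreach (array_values($testFiles) as $i => $file) {
        $buckets[$i % $workers][] = $file;
    }
    return array_values(array_filter($buckets)); // drop buckets left empty
}

// Hypothetical helper: start every command at once, then wait for all of them.
// Returns, for each command, its exit code and the file holding its output.
function runInParallel(array $commands): array
{
    $running = [];
    foreach ($commands as $i => $command) {
        // Output goes to a temp file, so no pipe can fill up and block a child.
        $log = tempnam(sys_get_temp_dir(), 'worker');
        $spec = [1 => ['file', $log, 'w'], 2 => ['file', $log, 'a']];
        $process = proc_open($command, $spec, $pipes);
        if ($process === false) {
            throw new RuntimeException("Cannot start: $command");
        }
        $running[$i] = [$process, $log];
    }
    $results = [];
    foreach ($running as $i => list($process, $log)) {
        // proc_close() blocks until that child exits and returns its exit code.
        $results[$i] = ['exit' => proc_close($process), 'log' => $log];
    }
    return $results;
}

$buckets = partitionTests(['ATest.php', 'BTest.php', 'CTest.php'], 2);
```

Each bucket would then become one command line for runInParallel(); aggregating the results of the children is a separate problem.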

Caveats

Of course, the cost of coordinating different processes is always present, so we will never reach the theoretical speedup. I report some measurements later in this article.
The most important constraints come from the design of our test suites. I can think of only two categories of tests that are easily parallelizable:

  • unit tests, which only use memory and CPU as resources and not disk or other external infrastructure.
  • Selenium tests, which run against a live HTTP server that must be able to serve multiple requests without race conditions if your application is going to work.

By design, these two kinds of tests can always run in parallel. However, other intensive and long-running tests such as end-to-end tests and integration ones usually conflict with each other:

public function setUp()
{
  // Every test starts by emptying the shared users table.
  $this->pdo = new PDO(...);
  $this->pdo->query('DELETE FROM users');
}

public function testUsersCanBeAddedWithAllDetails()
{
  $this->request->post('/users', ...);
  // Expects exactly one user: fails if another process touched the table meanwhile.
  $this->assertEquals(1, $this->request->get('/users'));
}

public function testUsersCanBeDeletedByAnAdmin()
{
  $this->insertAnUser();
  $this->assertEquals(1, $this->request->get('/users'));
  $this->request->delete('/users', ...);
  // Expects an empty table: fails if a parallel test has just added a user.
  $this->assertEquals(0, $this->request->get('/users'));
}

These API-based tests will never run in parallel (on the same machine) when written this way, because of the race condition on the users table: one process can empty or populate the table between another test's request and its assertion. If you have a slow suite that you want to speed up, chances are that it contains many end-to-end tests like these. Some of these tests can be isolated with RDBMS transactions, but it's difficult for black-box tests to intervene on the transaction isolation inside the application.
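Transaction-based isolation is easy only when the test and the code under test share one connection. A minimal sketch of that case, assuming an in-memory SQLite database as a stand-in for the real RDBMS (the users table layout and the countUsers() helper are made up for the example):

```php
<?php
// Assumption: SQLite in memory stands in for the real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT)');

// Hypothetical helper for the example.
function countUsers(PDO $pdo): int
{
    return (int) $pdo->query('SELECT COUNT(*) FROM users')->fetchColumn();
}

// What setUp() would do: open a transaction instead of deleting rows.
$pdo->beginTransaction();
$pdo->exec("INSERT INTO users (name) VALUES ('giorgio')");
$insideTest = countUsers($pdo);   // the test sees its own row

// What tearDown() would do: discard everything the test wrote.
$pdo->rollBack();
$afterTest = countUsers($pdo);    // the table is back to its initial state
```

A black-box test that talks to the application over HTTP has no handle on the application's connection, which is why this trick does not carry over to the tests above.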

The tools

Parallel execution has been planned for PHPUnit since 2007, but it has never shipped in the package, and pull requests for the feature have never been accepted. So we have to resort to external tools.
Probably the most complete tool working on top of PHPUnit is Paratest, which has two peculiarities:

  • It uses reflection to compose a list of all of your tests instead of grepping *Test.php files.
  • It reads PHPUnit's JUnit-format logs to aggregate results from different tests, which makes it harder to break than tools that parse the output of the command itself.
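To illustrate why structured logs are convenient to aggregate, here is a small sketch that sums the totals of several JUnit-style XML logs. It is not Paratest's actual code: aggregateJUnitLogs() is a hypothetical name, and the sketch assumes each log has a <testsuites> root whose direct <testsuite> children carry tests, assertions, failures and errors attributes.

```php
<?php
// Hypothetical helper: add up the totals found in several JUnit-style logs.
function aggregateJUnitLogs(array $xmlLogs): array
{
    $totals = ['tests' => 0, 'assertions' => 0, 'failures' => 0, 'errors' => 0];
    foreach ($xmlLogs as $xml) {
        $root = simplexml_load_string($xml);
        // Only direct children: a parent suite already includes its nested suites.
        foreach ($root->testsuite as $suite) {
            foreach (array_keys($totals) as $key) {
                $totals[$key] += (int) $suite[$key]; // missing attribute counts as 0
            }
        }
    }
    return $totals;
}

$logA = '<testsuites><testsuite name="A" tests="5" assertions="7" failures="1" errors="0"/></testsuites>';
$logB = '<testsuites><testsuite name="B" tests="5" assertions="3" failures="0" errors="2"/></testsuites>';
$totals = aggregateJUnitLogs([$logA, $logB]);
```

Reading attributes from XML keeps working when the console output changes its wording or layout, which is the robustness advantage mentioned in the list above.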

Its only limitation is that it imposes some stronger constraints on your tests: for example, they have to follow the PSR-0 convention. However, it delegates much to PHPUnit and lets you use many of the same command line switches, such as --configuration and --bootstrap.
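As a reminder of what PSR-0 requires, this sketch maps a class name to the relative file path the convention expects. The function name psr0Path() and the class names are invented for the example, and the sketch always uses "/" as directory separator to keep the output predictable.

```php
<?php
// Hypothetical helper: relative path where PSR-0 expects to find a class.
function psr0Path(string $className): string
{
    $className = ltrim($className, '\\');
    $path = '';
    $lastSeparator = strrpos($className, '\\');
    if ($lastSeparator !== false) {
        // Namespace separators become directory separators...
        $namespace = substr($className, 0, $lastSeparator);
        $className = substr($className, $lastSeparator + 1);
        $path = str_replace('\\', '/', $namespace) . '/';
    }
    // ...and so do underscores, but only in the class name itself.
    return $path . str_replace('_', '/', $className) . '.php';
}

$namespaced = psr0Path('Acme\\Tests\\UserTest');
$underscored = psr0Path('Acme_Tests_UserTest');
```

Both class names resolve to Acme/Tests/UserTest.php, so a test class has to live in the file that its name implies.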

Experiments

To experiment with Paratest, I created a simulated unit test suite that only works with the CPU. It contains 10 tests of this form:

public function testExample()
{
    for ($i = 0; $i < 1024*1024; $i++) { $this->assertTrue(true); }
}

I then ran this suite on a dual-core CPU, on a physical (not virtual) home machine, in three configurations:

  1. Vanilla PHPUnit, serial execution.
  2. Paratest, single-process execution (to find out whether it has a high overhead).
  3. Paratest with 2 parallel processes.

These are the results:

[21:13:25][giorgio@Desmond:~/paratestexample]$ ./compare.sh
PHPUnit 3.7.13-5-g6937c46 by Sebastian Bergmann.

..........

Time: 03:04, Memory: 3.25Mb

OK (10 tests, 20971520 assertions)

Running phpunit in 1 process with /home/giorgio/paratestexample/vendor/bin/phpunit

..........

Time: 03:01, Memory: 3.75Mb

OK (10 tests, 20971520 assertions)

Running phpunit in 2 processes with /home/giorgio/paratestexample/vendor/bin/phpunit

..........

Time: 02:15, Memory: 3.75Mb

OK (10 tests, 20971520 assertions)

The difference is a decrease of about 25% in total time (from roughly 3 minutes to 2'15''), which is well short of the theoretical 50% on two cores but really worth investigating further.
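The arithmetic behind that figure can be checked with the times printed in the output above, here taking the vanilla PHPUnit run as the baseline (toSeconds() is a helper written for this example):

```php
<?php
// Convert PHPUnit's "mm:ss" time display to seconds.
function toSeconds(string $time): int
{
    list($minutes, $seconds) = explode(':', $time);
    return 60 * (int) $minutes + (int) $seconds;
}

$serial   = toSeconds('03:04'); // vanilla PHPUnit
$parallel = toSeconds('02:15'); // Paratest, 2 processes

$reduction = 1 - $parallel / $serial; // fraction of wall-clock time saved
$speedup   = $serial / $parallel;     // 2.0 would be the ideal on two cores

printf("%d s -> %d s: %.1f%% less time, %.2fx speedup\n",
    $serial, $parallel, 100 * $reduction, $speedup);
```

Against the single-process Paratest run (03:01) the saving is slightly smaller, which is why the figure is best read as "about a quarter".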

Conclusions

I'm going to experiment more with Paratest to see if it's also possible to speed up batteries of end-to-end tests, for example by making different processes use different databases or by offloading the PHPUnit commands to different machines.
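One way the "different databases" idea could look, as a sketch under stated assumptions: the launcher is assumed to export a WORKER_ID environment variable (a made-up name) before starting each process, and databaseNameFor() is a hypothetical helper that the test bootstrap would call.

```php
<?php
// Hypothetical helper: choose a database per worker process.
// $workerId is whatever getenv() returned: a string, or false when unset.
function databaseNameFor(string $baseName, $workerId): string
{
    if ($workerId === false || $workerId === '') {
        return $baseName;                     // serial run: keep the usual database
    }
    return $baseName . '_' . (int) $workerId; // e.g. app_test_2 for worker 2
}

// Assumption: WORKER_ID is set by the launcher, not by PHPUnit itself.
$databaseName = databaseNameFor('app_test', getenv('WORKER_ID'));
```

Each process would then connect to its own database, so the DELETE in one setUp() could not interfere with the assertions of another process.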

Published at DZone with permission of Giorgio Sironi, author and DZone MVB.

Comments

László Jánszky replied on Mon, 2014/04/07 - 11:36am

"However, other intensive and long-running tests such as end-to-end tests and integration ones usually conflict with each other"

This happens because integration tests and end-to-end tests use the same resources: files, database, etc. The same sharing can cause concurrency issues in PHP: for example, locking two files in reverse order in different controller actions causes a deadlock in a web application under heavy load. What do you think: is there a way to test for those concurrency issues in your integration tests?

(To do that, I guess you have to run parallel tasks from your test cases with exact timing and gather information about them from Xdebug. But maybe there is a better way; I am not an expert on this topic...)

Giorgio Sironi replied on Mon, 2014/04/07 - 2:51pm in response to: László Jánszky

I tend to say you can't test your way to thread safety: it's not an efficient means of discovering design problems unless it's your application's primary focus (for example, you develop a database).

I give different resources to each process to avoid these concurrency issues (databases 1..N, one for each PHPUnit process running simultaneously) or exercise different aggregates (a new customer for each test) to avoid interactions. Tests must be independent to be reliable.

László Jánszky replied on Mon, 2014/04/07 - 3:54pm in response to: Giorgio Sironi

I am not sure. I'll try to write a parallel task runner and write at least some checks for my application. Do you have another solution for discovering concurrency issues such as resource locking, transaction merging, etc. in your PHP code? Maybe it is possible to pair concurrent tasks randomly and automatically; I will think about it...

Yes, with multiple databases you can run integration tests in parallel. You can trust a test runner with that...
