I am a programmer and architect (the kind that writes code) with a focus on testing and open source; I maintain the PHPUnit_Selenium project. I believe programming is one of the hardest and most beautiful jobs in the world. Giorgio is a DZone MVB and is not an employee of DZone and has posted 638 posts at DZone. You can read more from them at their website. View Full User Profile

Practical PHP Refactoring: Separate Query from Modifier

11.14.2011
| 4156 views |
  • submit to reddit
In the (recurring) scenario of today, a method modifies the state of an object and returns something at the same time. In this case, the method is an hybrid between:
  1. a query, which allows to retrieve part of the state of an object (we're not talking about SQL here);
  2. a modifier, which may change the observable state.

The problem usually manifests with queries that have side-effects, and less often with commands that return something. The side effects in the former case are unclear and hidden in a method that someone may call a different number of times, or in many different points, without knowing the state is being modified.

Observable what?

The observable state of an object is the state that can be accessed from its client code, or that in some way affects what messages exit the application (files which are written, or rows which are inserted in the database).

Observable state does not comprehend lazy loaded properties: it is not an infraction to implement methods with lazy loading, as from the outside you can call the query methods (getters) as many times as you want; at the same time, you'll still get the same functional behavior, which will just be a bit slower at the first call.

The same goes for any kind of cache: their internal state is not observable, so if they modify it at the first call it's no problem from the client's point of view. It's natural for them to assign $this fields and return something in the same method.

A first rule of thumb to identify infraction to CQS is looking at methods that:

  • modify the state assigning something to $this or forwarding calls to modify state of collaborators;
  • and have a @return type different from void.

Then you should question the visibility of changes from the outside.

The CQS principle

The Command Query Separation principle was formalized by Bertrand Meyer, author of the Open/Closed Principle and of the Eiffel language. This principle enables you to duplicate calls to a getter method without worrying, or move the calls to other objects knowing that they can't do any harm.

The principle is reflected elsewhere: the HTTP protocol separates GET as a safe method from POST which can modify the state of the server. So proxies can cache responses to GET requests and always forward POST instead; and crawlers can follow links without filling your application database of garbage as a side-effect.

The same advantage is present in client code that calls one of these methods. You don't want to wonder if your getXXX() will modify something; especially while debugging or writing a test, you assume that you can call it as many times as you want, inspecting the state without affecting a bug's presence.

A famous exception to the principle (cited by Greg Young in one of his talk on CQRS) is that of queues and their dequeue() method. However it's just an exception, not the rule.

Steps

  1. Create a public Query method which returns what the original method was returning.
  2. Modify the original method: it should now delegate to the query, which is idempotent.
  3. Check the test suite.
  4. Replace calls to the original method by separating them in calls to the Command method and the new Query method.
  5. Eliminate the return values from the original method, which is now called but without assigning its result (as void).
I find these steps, enumerated by Fowler, incomplete in the case we want to extract the Command first. This second case is more appropriate if we want to eliminate side-effects from a Query.

Example

This domain object represents an User with a calculated field containing its full name. The situation is that __toString() is called somewhere in the application; if the call is removed, the incomplete object will be available for getXXX() calls and a NULL could be printed instead of the name.

<?php
class SeparateQueryFromModifier extends PHPUnit_Framework_TestCase
{
    public function testTheDomainObjectProvidesACalculatedField()
    {
        $user = new User('Giorgio', 'Sironi');
        $this->assertEquals('User: Giorgio Sironi', $user->__toString());
    }
}

/**
 * @Entity
 */
class User
{
    /**
     * @Column
     */
    private $firstName;
    /**
     * @Column
     */
    private $lastName;
    /**
     * @Column
     */
    private $fullName;

    public function __construct($first, $last)
    {
        $this->firstName = $first;
        $this->lastName = $last;
    }

    public function getFirstName() { return $this->firstName; }
    public function getLastName() { return $this->lastName; }
    public function getFullName() { return $this->fullName; }

    public function __toString()
    {
        $this->fullName = $this->firstName . ' ' . $this->lastName;
        return 'User: ' . $this->fullName;
    }
}

Let's write a test that exposes the CQS problem while being agnostic with respect to how the value of the field is calculated.

<?php
class SeparateQueryFromModifier extends PHPUnit_Framework_TestCase
{
    public function testTheDomainObjectProvidesACalculatedField()
    {
        $user = new User('Giorgio', 'Sironi');
        $this->assertEquals('User: Giorgio Sironi', $user->__toString());
    }

    public function testTheDomainObjectQueriesShouldNotModifyTheObservableStateOfTheObjectItself()
    {
        $user = new User('Giorgio', 'Sironi');
        $oldFullName = $user->getFullName();
        $user->__toString();
        $this->assertEquals($oldFullName, $user->getFullName());
    }
}

We extract a Command, which delegates the return part to the Query method (which has been cleaned up.) This time I would prefer to extract the command instead of extracting the Query because I want to preserve the __toString() name for it.

Thus, we should also modify calls now.

<?php
class SeparateQueryFromModifier extends PHPUnit_Framework_TestCase
{
    public function testTheDomainObjectProvidesACalculatedField()
    {
        $user = new User('Giorgio', 'Sironi');
        $user->completeFields();
        $this->assertEquals('User: Giorgio Sironi', $user->__toString());
    }

    public function testTheDomainObjectQueriesShouldNotModifyTheObservableStateOfTheObjectItself()
    {
        $user = new User('Giorgio', 'Sironi');
        $user->completeFields();
        $oldFullName = $user->getFullName();
        $user->__toString();
        $this->assertEquals($oldFullName, $user->getFullName());
    }
}

/**
 * @Entity
 */
class User
{
    /**
     * @Column
     */
    private $firstName;
    /**
     * @Column
     */
    private $lastName;
    /**
     * @Column
     */
    private $fullName;

    public function __construct($first, $last)
    {
        $this->firstName = $first;
        $this->lastName = $last;
    }

    public function getFirstName() { return $this->firstName; }
    public function getLastName() { return $this->lastName; }
    public function getFullName() { return $this->fullName; }

    public function completeFields()
    {
        $this->fullName = $this->firstName . ' ' . $this->lastName;
        return $this->__toString();
    }

    public function __toString()
    {
        return 'User: ' . $this->fullName;
    }
}

Now we totally separate them, transforming the command into a method which does not returns anything.

<?php
class SeparateQueryFromModifier extends PHPUnit_Framework_TestCase
{
    public function testTheDomainObjectProvidesACalculatedField()
    {
        $user = new User('Giorgio', 'Sironi');
        $user->completeFields();
        $this->assertEquals('User: Giorgio Sironi', $user->__toString());
    }

    public function testTheDomainObjectQueriesShouldNotModifyTheObservableStateOfTheObjectItself()
    {
        $user = new User('Giorgio', 'Sironi');
        $user->completeFields();
        $oldFullName = $user->getFullName();
        $user->__toString();
        $this->assertEquals($oldFullName, $user->getFullName());
    }
}

/**
 * @Entity
 */
class User
{
    /**
     * @Column
     */
    private $firstName;
    /**
     * @Column
     */
    private $lastName;
    /**
     * @Column
     */
    private $fullName;

    public function __construct($first, $last)
    {
        $this->firstName = $first;
        $this->lastName = $last;
    }

    public function getFirstName() { return $this->firstName; }
    public function getLastName() { return $this->lastName; }
    public function getFullName() { return $this->fullName; }

    public function completeFields()
    {
        $this->fullName = $this->firstName . ' ' . $this->lastName;
    }

    public function __toString()
    {
        return 'User: ' . $this->fullName;
    }
}

Finally, since the Command is always called after the construction, I can't avoid to move the call inside the class.

<?php
class SeparateQueryFromModifier extends PHPUnit_Framework_TestCase
{
    public function testTheDomainObjectProvidesACalculatedField()
    {
        $user = new User('Giorgio', 'Sironi');
        $this->assertEquals('User: Giorgio Sironi', $user->__toString());
    }

    public function testTheDomainObjectQueriesShouldNotModifyTheObservableStateOfTheObjectItself()
    {
        $user = new User('Giorgio', 'Sironi');
        $oldFullName = $user->getFullName();
        $user->__toString();
        $this->assertEquals($oldFullName, $user->getFullName());
    }
}

/**
 * @Entity
 */
class User
{
    /**
     * @Column
     */
    private $firstName;
    /**
     * @Column
     */
    private $lastName;
    /**
     * @Column
     */
    private $fullName;

    public function __construct($first, $last)
    {
        $this->firstName = $first;
        $this->lastName = $last;
        $this->completeFields();
    }

    public function getFirstName() { return $this->firstName; }
    public function getLastName() { return $this->lastName; }
    public function getFullName() { return $this->fullName; }

    private function completeFields()
    {
        $this->fullName = $this->firstName . ' ' . $this->lastName;
    }

    public function __toString()
    {
        return 'User: ' . $this->fullName;
    }
}
Published at DZone with permission of Giorgio Sironi, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)