I am a programmer and architect (the kind that writes code) with a focus on testing and open source; I maintain the PHPUnit_Selenium project. I believe programming is one of the hardest and most beautiful jobs in the world. Giorgio is a DZone MVB and is not an employee of DZone and has posted 637 posts at DZone.

Practical PHP Refactoring: Extract Class

07.20.2011

Sometimes there is too much logic to deal with in a single class. You have tried extracting methods, but there are so many of them that the design is still hard to understand.

The next step in the refactoring quest is Extract Class: the creation of a new class whose objects will be referenced from the original class. Fields and methods may be moved into the new class, so that the original gets smaller and more manageable.

Why does class inflation happen?

This refactoring is always caused by classes growing in responsibilities. My personal hypothesis is that we as developers are biased towards the add field and add method operations, preferring them to add class. Usually, creating a new class also means adding an entire new file (hopefully) and more design considerations, like its namespace and name. The mental cost for the developer is higher, but the results are often better than those of smaller extractions: classes can be reused independently, while extracted methods remain clustered together in their original class.

Moreover, our libraries (e.g. ORMs such as Doctrine 2) reinforce this bias by making it significantly more difficult to extract a Value Object (should you serialize it? write a custom DBAL type?) or even another Entity (should you link to it with a @OneToOne? @OneToMany? Which cascade options will work? Which constraints does the relationship have?).

By the way, this solution is a manifestation of composition over inheritance in the refactoring realm (there are also other options where the class is made smaller by introducing superclasses instead of an unrelated type).

What are the signs it's time to extract a class?

You may encounter a subset of methods and fields that cluster together: for example, they share a prefix, or they have a temporal coupling that makes them change together, faster or slower than the other fields. They may also be of similar types (scalars, or a common superclass).

Another way to target higher cohesion is simply to check which fields are used together by each method in the class.

Fowler's suggestion is to try removing each field (conceptually), and think about which other fields become useless. Repeat this and you'll find the subsets of fields to extract if they exist.
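The prefix hint mentioned above can be checked mechanically. What follows is only a minimal sketch under stated assumptions: the Customer class and the groupFieldsByPrefix() helper are invented for illustration and do not come from the article or from any library. A shared prefix is just a hint, so the result is a list of candidates to review, not a verdict.

```php
<?php
// Hypothetical class, invented for illustration: three fields share
// the 'billing' prefix and are candidates for extraction.
class Customer
{
    private $name;
    private $billingStreet;
    private $billingCity;
    private $billingZip;
    private $email;
}

/**
 * Hypothetical helper: groups the property names of a class by their
 * leading lowercase prefix (the part before the first uppercase letter)
 * and returns only the prefixes shared by more than one field.
 *
 * @param string $className
 * @return array prefix => list of field names
 */
function groupFieldsByPrefix($className)
{
    $groups = array();
    $reflection = new ReflectionClass($className);
    foreach ($reflection->getProperties() as $property) {
        $name = $property->getName();
        // 'billingStreet' yields 'billing'; 'name' yields 'name'
        if (!preg_match('/^[a-z]+/', $name, $matches)) {
            continue; // skip names that do not start with a lowercase letter
        }
        $groups[$matches[0]][] = $name;
    }
    // A prefix used by a single field tells us nothing: drop it.
    return array_filter($groups, function ($fields) {
        return count($fields) > 1;
    });
}

print_r(groupFieldsByPrefix('Customer'));
```

On this class the only surviving group is 'billing', which suggests that a separate address-like class may be hiding inside Customer.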

Steps

  1. Divide responsibilities between the source class and the extracted class: each field and method should be assigned to one of the two targets. This is true for both public and private members, but the latter may have to become public in order to be visible from the source class.
  2. Create a new class, and check the names of both the source and the new one. For the extracted class, you decide the name on the fly; for the source class, you have to change the old name if it is no longer applicable (and later, also the names of the references to its objects across the system). It may be the case that the extracted class steals the name of the source one.
  3. Place a field in the source class referencing an object (or more) of the extracted class. The field may be initialized in the constructor, and also injected with a constructor parameter if the change is not too invasive.
  4. Move Field iteratively from the source to the extracted class.
  5. Move Method iteratively. If you bundle steps 4 and 5 you'll be faster, but the point is that you should be able to go in smaller steps when necessary. Likewise, TDD is taught with baby steps because it gives you the ability to take them when required: everyone is capable of cutting out a giant piece of code and fiddling with it for hours until it works again, but that involves some rewriting, not only refactoring.
  6. Reduce the interfaces that each class exposes. Often the extracted one only needs public methods for what the original class uses, while the original class maintains its old protocol to avoid ripple effects towards the rest of the object graph.

In fact, you are not even required to expose the extracted class. In many cases you don't have to: you will still be able to use the class independently by creating other objects of it, but there is often no need to expose the particular object composed by the source class (and the Law of Demeter says you shouldn't).

Tests should be executable after each movement of fields or methods.
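As a rough illustration of where steps 3 to 6 lead, here is a minimal sketch loosely modelled on the classic telephone number illustration of this refactoring. The Person and TelephoneNumber classes, and all their method names, are assumptions made for this sketch; they are not part of the article's own example.

```php
<?php
// Extracted class (hypothetical): owns the two fields that clustered
// together, plus the method that used them.
class TelephoneNumber
{
    private $areaCode;
    private $number;

    public function __construct($areaCode, $number)
    {
        $this->areaCode = $areaCode;
        $this->number = $number;
    }

    public function __toString()
    {
        return '(' . $this->areaCode . ') ' . $this->number;
    }
}

// Source class (hypothetical): smaller than before, same public protocol.
class Person
{
    private $name;
    private $officeTelephone;

    public function __construct($name, $officeAreaCode, $officeNumber)
    {
        $this->name = $name;
        // Step 3: a field referencing an object of the extracted class,
        // initialized in the constructor so that clients do not change.
        $this->officeTelephone = new TelephoneNumber($officeAreaCode, $officeNumber);
    }

    public function getName()
    {
        return $this->name;
    }

    // Steps 4-6: fields and logic now live in TelephoneNumber; this method
    // keeps its old signature and delegates, without exposing the object.
    public function getOfficeTelephone()
    {
        return (string) $this->officeTelephone;
    }
}

$person = new Person('Alice', '02', '5551234');
echo $person->getOfficeTelephone(), "\n"; // prints "(02) 5551234"
```

Clients of Person never see the TelephoneNumber instance, so the extraction causes no ripple effects, yet TelephoneNumber can be instantiated and tested on its own.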

Example

In the example, we start from an initial state where formatting and HTML logic are crammed into the same class:

<?php
class ExtractClassTest extends PHPUnit_Framework_TestCase
{
    public function testDisplaysMoneyInAHumanFormat()
    {
        // using strings for representation to avoid loss of precision
        $moneyAmount = new MoneyAmount('10000');
        $this->assertEquals('<span class="money">10,000.00</span>', $moneyAmount->toHtml());
    }
}

class MoneyAmount
{
    private $amount;

    /**
     * @param string $amount  digits only, kept as a string to avoid loss of precision
     */
    public function __construct($amount)
    {
        $this->amount = $amount;
    }

    public function toHtml()
    {
        $amount = $this->amount;
        $formatted = '';
        while (strlen($amount) > 3) {
            $cut = strlen($amount) % 3;
            $cut = $cut == 0 ? 3 : $cut;
            $formatted .= substr($amount, 0, $cut) . ',';
            $amount = substr($amount, $cut);
        }
        $formatted .= $amount . '.00';
        $html = "<span class=\"money\">$formatted</span>";
        return $html;
    }
}
We end with two separate classes: one modelling the logical amount and its formatting, the other taking care of printing the HTML tags:
<?php
class ExtractClassTest extends PHPUnit_Framework_TestCase
{
    public function testDisplaysMoneyInAHumanFormat()
    {
        // using strings for representation to avoid loss of precision
        $moneySpan = new MoneySpan(new MoneyAmount('10000'));
        $this->assertEquals('<span class="money">10,000.00</span>', $moneySpan->toHtml());
    }
}

class MoneySpan
{
    private $amount;

    /**
     * @param MoneyAmount $amount
     */
    public function __construct(MoneyAmount $amount)
    {
        $this->amount = $amount;
    }

    public function toHtml()
    {
        $html = '<span class="money">' . $this->amount->format() . '</span>';
        return $html;
    }
}

class MoneyAmount
{
    private $amount;

    public function __construct($amount)
    {
        $this->amount = $amount;
    }

    public function format()
    {
        $amount = $this->amount;
        $formatted = '';
        while (strlen($amount) > 3) {
            $cut = strlen($amount) % 3;
            $cut = $cut == 0 ? 3 : $cut;
            $formatted .= substr($amount, 0, $cut) . ',';
            $amount = substr($amount, $cut);
        }
        return $formatted . $amount . '.00';
    }
}
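One payoff of the extraction is that MoneyAmount can now be reused by other clients. The sketch below is only an assumption about how that might look: MoneyTableCell is a hypothetical second client that does not appear in the original article, and MoneyAmount is repeated here just so that the snippet runs on its own.

```php
<?php
// Copy of the extracted class from the example above, repeated so that
// this sketch is self-contained.
class MoneyAmount
{
    private $amount;

    public function __construct($amount)
    {
        $this->amount = $amount;
    }

    public function format()
    {
        $amount = $this->amount;
        $formatted = '';
        // Peel off leading groups of digits until three or fewer remain.
        while (strlen($amount) > 3) {
            $cut = strlen($amount) % 3;
            $cut = $cut == 0 ? 3 : $cut;
            $formatted .= substr($amount, 0, $cut) . ',';
            $amount = substr($amount, $cut);
        }
        return $formatted . $amount . '.00';
    }
}

// Hypothetical second client: renders the same amount inside a table cell.
class MoneyTableCell
{
    private $amount;

    public function __construct(MoneyAmount $amount)
    {
        $this->amount = $amount;
    }

    public function toHtml()
    {
        return '<td class="money">' . $this->amount->format() . '</td>';
    }
}

$cell = new MoneyTableCell(new MoneyAmount('1234567'));
echo $cell->toHtml(), "\n"; // prints <td class="money">1,234,567.00</td>
```

Neither client knows how the thousands separators are computed, so the formatting rule can change in one place only.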


You can see the four intermediate steps in the GitHub history of the file.
Published at DZone with permission of Giorgio Sironi, author and DZone MVB.
