I am a programmer and architect (the kind that writes code) with a focus on testing and open source; I maintain the PHPUnit_Selenium project. I believe programming is one of the hardest and most beautiful jobs in the world. Giorgio is a DZone MVB and is not an employee of DZone.

Practical PHP Patterns: Repository

07.12.2010

A Repository is a higher-level abstraction placed between client code and a Data Mapper's infrastructure. The Repository provides a collection (or array) domain abstraction over the crude, generic Data Mapper API. It also introduces a decoupling point between the Domain Model and the infrastructure, with an interface that can be crafted in the domain layer instead of being imposed, for example, by the generic ORM. Your domain objects depend on an EntityManager instance? The Repository pattern is the cure.

Intent

A Data Mapper, when built in a generic way, is an abstraction over persistent storage (usually a relational database). It presents a retrieval interface based on queries or finders, plus the ability to add objects of certain classes, called entities.

What if we wanted, without distorting the mechanics of this interface, to provide a richer abstraction, more in line with the illusion of an in-memory object graph and with a non-invasive persistence layer? Some kind of abstraction has to be put in between the Data Mapper and the domain layer to make the latter capable of using the former without depending on it.

A Repository has an interface analogous to that of a collection which holds in memory all the instances of a certain entity class, and it is the only component that client code in the domain layer sees. There are several differences between using a Repository and using a Data Mapper directly.

  • API: a Repository can compose a Data Mapper to accomplish its job, but its span of operations is more limited: it presents a narrow interface, described in the next points. In this case, many of the Data Mapper's facilities, like Query Objects, are hidden and used only inside the Repository implementations (while Lazy Loading Proxies can usually leave the Repository because of their transparency). With a persistence layer that consists of Repositories, you can even change the underlying Data Mapper without the rest of the code noticing.
  • Specificity: a Repository class is specific to a particular entity and has its own API, which is typically domain-oriented. Data Mappers tend toward a generic implementation, which provides a sink where you store or retrieve any kind of object. That said, in Domain-Driven Design a Repository is needed by definition only for entities that are Aggregate Roots. Nevertheless, in many Domain Models for data-driven applications, all entities are Aggregate Roots.
  • Decoupling: the Domain Model code shouldn't depend on the rich interface of the Data Mapper, but only on the narrow one of the Repository (which also means it refers to only one entity).
  • Semantics: the Data Mapper is the abstraction of a storage; the Repository, of a collection of objects.

In practice, when it composes a Data Mapper, a Repository condenses the various queries related to a particular entity, providing an interface derived from the requirements of the Domain Model.

The typical methods of a repository are derived from collection-like interfaces:

  • Retrieval: only the retrieval operations actually needed, which yields a small segregated interface that is easy to mock in order to test, in isolation, the domain code that depends on it. Repositories are the death of overly generic findBy() methods and of arbitrary queries (again, to simplify the client code and encapsulate complex queries in the Repository itself).
  • Addition: add(EntityClass $object), if the client code is allowed to add objects.
  • Deletion: delete(EntityClass $object), if the client code is allowed to delete objects.

Note that update operations are out of the scope of a Repository: since you have a (virtual) in-memory collection, there is no point in adding methods to persist it. The changes will be stored along with the additions and deletions when the corresponding Unit of Work is committed. In a PHP application, this usually means at the end of the script, after all the domain code has been executed.
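
The collection-like interface described above could be sketched as follows. This is only an illustration under stated assumptions: the User entity, the UserRepository interface and the InMemoryUserRepository adapter are hypothetical names invented for this example, and the in-memory array stands in for whatever a real Data Mapper would do.

```php
<?php

// Hypothetical domain entity, used only for illustration.
final class User
{
    private $name;
    private $status;

    public function __construct(string $name, string $status)
    {
        $this->name = $name;
        $this->status = $status;
    }

    public function getName(): string { return $this->name; }
    public function isAdmin(): bool { return $this->status === 'admin'; }
    public function promoteToAdmin(): void { $this->status = 'admin'; }
}

// Narrow, collection-like interface that would live in the domain layer.
// Note the absence of any update() or save() method.
interface UserRepository
{
    public function add(User $user): void;
    public function delete(User $user): void;
    /** @return User[] */
    public function getAllAdminUsers(): array;
}

// In-memory adapter (an assumption of this sketch): a plain array plays
// the role of the storage hidden behind the interface.
final class InMemoryUserRepository implements UserRepository
{
    /** @var User[] indexed by spl_object_hash() */
    private $users = [];

    public function add(User $user): void
    {
        $this->users[spl_object_hash($user)] = $user;
    }

    public function delete(User $user): void
    {
        unset($this->users[spl_object_hash($user)]);
    }

    public function getAllAdminUsers(): array
    {
        return array_values(array_filter(
            $this->users,
            function (User $user) { return $user->isAdmin(); }
        ));
    }
}

$users = new InMemoryUserRepository();
$alice = new User('alice', 'admin');
$bob = new User('bob', 'member');
$users->add($alice);
$users->add($bob);

// Modifying an object already in the collection is enough:
// no repository method is called to "update" it.
$bob->promoteToAdmin();

$adminNames = array_map(
    function (User $user) { return $user->getName(); },
    $users->getAllAdminUsers()
);
```

With a Data Mapper behind the same interface, the change to $bob would be picked up when the Unit of Work is committed; here the array simply holds the same object reference.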

The fewer methods you put in a Repository, the lower the chances that client code will be able to perform an unauthorized operation (like storing objects when they should only be persisted in response to particular events). Moreover, you can easily produce a mock for the Repository automatically by defining expectations only for the methods of interest, insulating only the domain class which accesses it and testing in real isolation.
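
To make the isolation argument concrete, here is a minimal sketch that uses a hand-written test double instead of one generated by a mocking library. Comment, CommentRepository and CommentingService are hypothetical names made up for this example; the point is only that a one-method interface leaves the domain class nothing else to call and makes the double trivial to write.

```php
<?php

// Hypothetical entity for this sketch.
final class Comment
{
    private $text;

    public function __construct(string $text)
    {
        $this->text = $text;
    }

    public function getText(): string
    {
        return $this->text;
    }
}

// The narrow interface: client code may add comments and nothing else.
interface CommentRepository
{
    public function add(Comment $comment): void;
}

// Domain class under test; it only knows the interface above.
final class CommentingService
{
    private $comments;

    public function __construct(CommentRepository $comments)
    {
        $this->comments = $comments;
    }

    // Adds a comment unless the text is blank; returns whether it was accepted.
    public function post(string $text): bool
    {
        $text = trim($text);
        if ($text === '') {
            return false;
        }
        $this->comments->add(new Comment($text));
        return true;
    }
}

// Hand-written double that records what the domain class asked it to add.
$spy = new class implements CommentRepository {
    /** @var Comment[] */
    public $added = [];

    public function add(Comment $comment): void
    {
        $this->added[] = $comment;
    }
};

$service = new CommentingService($spy);
$accepted = $service->post('  Nice article!  ');
$rejected = $service->post('   ');
```

No database, Data Mapper or configuration is involved: the test exercises only CommentingService and checks what reached the Repository.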

Testing

A Repository implementation is a great example of an adapter for a Domain Model. The Repository interface would reside in the Domain Model itself, while its different implementations are only adapters that insulate different kinds of storage (a real database, or an in-memory one for testing).

A common solution is to test with the same Repository used in production, but substituting the configuration of a generic, well-tested, outsourced Data Mapper with an in-memory option (using an SQLite database instead of a MySQL one, for example), which provides greater speed and makes management operations like resetting the state easier. The outsourced Data Mapper is usually frozen, so it won't introduce failures in the unit tests of the Domain Model code; at the same time, the real Repository is used because you would end up reimplementing it if you substituted a Fake implementation.

Examples

Doctrine 2, the open source, real PHP ORM, offers two choices for using Repositories in your application:

  • Integrated repositories, which are injected into the Entity Manager but can only extend the basic interface by adding methods; they provide neither a segregated interface nor the injection of custom collaborators. It is a quick approach that does not achieve all the advantages of a pure version of this pattern, but it is a good way to begin encapsulating queries behind an interface oriented to the Domain Model.
  • External repositories, which you manage on your own and which compose the EntityManager instance. If you need anything else as a collaborator (a Mailer to send mail when comments are added?), you're free to add it to the signature of your constructor: the Repository is a Plain Old PHP Object.

We will see an example of the first technique, which is short enough to fit in an online article like this.

The Doctrine 2 manual shows how easy defining a custom Repository class is:

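<?php

namespace MyDomain\Model;

use Doctrine\ORM\EntityRepository;

/**
 * @Entity(repositoryClass="MyDomain\Model\UserRepository")
 */
class User
{
    // ...
}

class UserRepository extends EntityRepository
{
    /**
     * Domain-specific finder: encapsulates the DQL query for administrators.
     */
    public function getAllAdminUsers()
    {
        // Bind the value as a parameter rather than embedding a quoted literal in the DQL.
        return $this->_em->createQuery('SELECT u FROM MyDomain\Model\User u WHERE u.status = :status')
            ->setParameter('status', 'admin')
            ->getResult();
    }
}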

When a custom class is not defined, Doctrine 2 instantiates a generic EntityRepository for the entity class. It is perfectly functional, but it won't have any domain-specific methods.

You can then obtain the Repository, whose lifecycle and injection are managed for you:

<?php

// $em instanceof EntityManager
$admins = $em->getRepository('MyDomain\Model\User')->getAllAdminUsers();
Published at DZone with permission of Giorgio Sironi, author and DZone MVB.
