Practical PHP Patterns: Identity Map
The Identity Map pattern is a Map implementation related to the use of a Data Mapper. A map in the computer science sense is also called a dictionary or an associative array; although PHP's associative arrays are very powerful, this kind of Map can be implemented as an object to present a specific interface to client code.
The purpose of an Identity Map in the Data Mapper context is to keep a list of references to all the in-memory domain objects that have been reconstituted by the Data Mapper's internal mechanism, or that are otherwise managed by the Data Mapper itself (for example, because they have been scheduled for persistence).
The Identity Map solves the problem of loading the same object multiple times, which leads to performance issues and to inconsistencies such as two different objects with different states (but with the same identity, since for example they have an equal user id) that both have to be stored in the back end. Ideally, the Identity Map holds a reference to every single object of the domain that contains state (and is thus managed by the Data Mapper instead of being created by infrastructure or domain factories); in practice, it is an array of references to the loaded objects.
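As a minimal sketch of the idea, the map can be written as a small object that holds one slot per (class name, identifier) pair. The `User` and `IdentityMap` classes and their method names are hypothetical and exist only for this illustration; they are not taken from any library.

```php
<?php

// Hypothetical domain class, used only for illustration.
class User
{
    public $id;
    public $name;

    public function __construct(int $id, string $name)
    {
        $this->id = $id;
        $this->name = $name;
    }
}

// A minimal Identity Map: one slot per (class name, identifier) pair.
class IdentityMap
{
    /** @var array nested as [class name][identifier] => object */
    private $objects = [];

    public function has(string $className, string $id): bool
    {
        return isset($this->objects[$className][$id]);
    }

    public function get(string $className, string $id): ?object
    {
        return $this->objects[$className][$id] ?? null;
    }

    // Returns false when an object with the same identity is already registered.
    public function add(object $object, string $id): bool
    {
        $className = get_class($object);
        if ($this->has($className, $id)) {
            return false;
        }
        $this->objects[$className][$id] = $object;
        return true;
    }
}

$map = new IdentityMap();
$first = new User(42, 'alice');
$duplicate = new User(42, 'bob'); // same identity, different state

var_dump($map->add($first, '42'));                 // bool(true)
var_dump($map->add($duplicate, '42'));             // bool(false)
var_dump($map->get(User::class, '42') === $first); // bool(true)
```

The second `add()` call is rejected, so only one in-memory object can represent user 42 at any time, which is the guarantee the pattern is meant to provide.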
PHP implementation
In PHP, the Identity Map is not unique throughout the whole application: it is an object whose scope is limited to a single HTTP request, so different requests have different Identity Maps, which can become inconsistent with each other. This limited scope, which is part of the nature of PHP and its scalable architecture, requires careful handling of objects that have been detached from the Data Mapper. In general, you can serialize domain objects or store them in a cache to boost performance or simplify business logic. However, to subsequently persist such an object you are obliged to reattach it to the Data Mapper with a special method (in Doctrine 2, EntityManager::merge()), so that it is reinserted in the Identity Map instead of being considered new or being duplicated. Remember that duplication here is more an issue of consistency than of performance: a Data Mapper that accepts two different objects pointing to the same place in the data store is not reliable.
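The following sketch shows why reattachment is needed: a serialization round trip always yields a new instance, which the map does not know about. The `mergeIntoMap()` function is a hypothetical stand-in for a merge operation, written for this example only; it does not reproduce what Doctrine 2 does internally, and the plain array used as the map is likewise an assumption.

```php
<?php

// Hypothetical domain class, used only for illustration.
class User
{
    public $id;
    public $name;

    public function __construct(int $id, string $name)
    {
        $this->id = $id;
        $this->name = $name;
    }
}

// Hypothetical stand-in for a merge operation (not Doctrine's implementation):
// returns the instance that the map considers managed for this identity.
function mergeIntoMap(array &$identityMap, User $detached): User
{
    $className = get_class($detached);
    $id = (string) $detached->id;

    if (isset($identityMap[$className][$id])) {
        // An instance with this identity is already managed: copy the state
        // onto it rather than tracking a second object.
        $managed = $identityMap[$className][$id];
        $managed->name = $detached->name;
        return $managed;
    }

    // Nothing is managed yet for this identity: the detached object becomes managed.
    $identityMap[$className][$id] = $detached;
    return $detached;
}

$managed = new User(7, 'alice');
$identityMap = ['User' => ['7' => $managed]];

// Simulates a round trip through a cache or a session.
$detached = unserialize(serialize($managed));
$detached->name = 'alice-renamed';

var_dump($detached === $managed); // bool(false): unserialize() built a new instance

$result = mergeIntoMap($identityMap, $detached);

var_dump($result === $managed);        // bool(true)
var_dump($managed->name);              // "alice-renamed"
var_dump(count($identityMap['User'])); // int(1)
```

After the merge there is still a single managed object for user 7, and it carries the state of the detached copy.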
In fact, an Identity Map is a fundamental part of a non-naive Data Mapper: before recreating an entity, the mapper looks for it in the Identity Map to check whether it is already available. Only if the object is not there does the Data Mapper create a new one and insert it in the Map for later reuse. Thus the Identity Map bridges the gap between storage and memory, keeping track of which parts of the object graph have been brought into memory and which are still on disk or on external database machines, since the technology forces us to reconstitute only a very small part of the application state as objects in order to work on them. This approach is particularly suited to PHP's shared-nothing mentality: languages like Java and C# allow other solutions, such as keeping the whole object graph in memory (some gigabytes) and dealing with persistence by taking a periodic snapshot of the graph, which is then frozen and stored on slower-but-larger memories like disks or SSDs.
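The lookup-before-load sequence described above can be sketched as follows. `Article` and `ArticleMapper` are hypothetical names, and a plain array stands in for the database; the `$loads` counter is there only to make the number of storage hits visible.

```php
<?php

// Hypothetical domain class, used only for illustration.
class Article
{
    public $id;
    public $title;

    public function __construct(int $id, string $title)
    {
        $this->id = $id;
        $this->title = $title;
    }
}

// Hypothetical Data Mapper backed by a plain array that stands in for the database.
class ArticleMapper
{
    private $rows;
    private $identityMap = [];
    public $loads = 0; // how many times the "storage" was actually read

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function find(int $id): ?Article
    {
        // 1. Ask the Identity Map first.
        if (isset($this->identityMap[$id])) {
            return $this->identityMap[$id];
        }

        // 2. Fall back to storage only when the object is not in memory yet.
        if (!isset($this->rows[$id])) {
            return null;
        }
        $this->loads++;
        $article = new Article($id, $this->rows[$id]['title']);

        // 3. Register the new object for later reuse.
        $this->identityMap[$id] = $article;

        return $article;
    }
}

$mapper = new ArticleMapper([1 => ['title' => 'Identity Map']]);
$a = $mapper->find(1);
$b = $mapper->find(1);

var_dump($a === $b);      // bool(true): both variables point to the same object
var_dump($mapper->loads); // int(1): storage was read only once
```

Because both calls return the same instance, a change made through `$a` is immediately visible through `$b`, and there is never a second copy whose state could diverge.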
In Doctrine 2
From a technical point of view, the Identity Map is an object or an associative array, with a single instance that exists for the entire request. This data structure is held by the Entity Manager (the Facade of the Data Mapper), by a Unit of Work, or even by some internal class of the Data Mapper. Even when an object is reconstituted as part of a query and not requested by its primary key, the loader class has to extract a unique identifier for the domain object and ask the Identity Map whether it is already present. In the sample code at the end of this article, Doctrine 2's choice has been to keep the Identity Map as a private property (a multidimensional associative array) of the Unit of Work, which exposes a set of public methods that act as the only Facade to the Map for the internal code.
The typical key used for indexing is a combination of the class name of the domain object and its unique identifier (usually the primary key used in storage, reduced to a single serialized value if it consists of multiple fields). This indexing scheme is generic enough to deal with most use cases, even with inheritance strategies. A supplemental index is based on the result of the spl_object_hash() function, which returns an identifier that is unique among the objects currently in memory; this index is used to quickly check whether an object coming from anywhere is in the Identity Map, without extracting its identifier and class name.
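The two indexes can be illustrated with a short sketch. The `OrderLine` class and the `register()` and `isManaged()` functions are hypothetical; the sketch only assumes that a composite identifier is flattened with `implode()`, as the Doctrine 2 listing in this article does, and it ignores the root-class lookup used for inheritance hierarchies.

```php
<?php

// Hypothetical entity with a two-field identifier, used only for illustration.
class OrderLine
{
    public $orderId;
    public $lineNumber;

    public function __construct(int $orderId, int $lineNumber)
    {
        $this->orderId = $orderId;
        $this->lineNumber = $lineNumber;
    }
}

$identityMap = [];       // [class name][id hash] => object
$entityIdentifiers = []; // [spl_object_hash] => identifier fields

function register(array &$identityMap, array &$entityIdentifiers, object $entity, array $identifier): void
{
    // A composite identifier is reduced to a single string key.
    $idHash = implode(' ', $identifier);
    $identityMap[get_class($entity)][$idHash] = $entity;

    // Second index, keyed by the object itself rather than by its identity.
    $entityIdentifiers[spl_object_hash($entity)] = $identifier;
}

function isManaged(array $entityIdentifiers, object $entity): bool
{
    // No need to extract the class name or identifier fields from the entity.
    return isset($entityIdentifiers[spl_object_hash($entity)]);
}

$line = new OrderLine(10, 3);
register($identityMap, $entityIdentifiers, $line, ['orderId' => 10, 'lineNumber' => 3]);

var_dump(array_keys($identityMap['OrderLine']));               // one key: "10 3"
var_dump(isManaged($entityIdentifiers, $line));                // bool(true)
var_dump(isManaged($entityIdentifiers, new OrderLine(10, 3))); // bool(false)
```

The last call returns false because the second index recognizes instances, not identities: a freshly built object with the same key fields is a different object in memory.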
The sample code is taken from the Unit of Work of Doctrine 2. I cut everything that does not involve its internal Identity Map, since the Unit of Work has already been described in its own article.
<?php
namespace Doctrine\ORM;

use Exception,
    Doctrine\Common\Collections\ArrayCollection,
    Doctrine\Common\Collections\Collection,
    Doctrine\Common\NotifyPropertyChanged,
    Doctrine\Common\PropertyChangedListener,
    Doctrine\ORM\Event\LifecycleEventArgs,
    Doctrine\ORM\Proxy\Proxy;

/**
 * The UnitOfWork is responsible for tracking changes to objects during an
 * "object-level" transaction and for writing out changes to the database
 * in the correct order.
 *
 * @since 2.0
 * @author Benjamin Eberlei <kontakt@beberlei.de>
 * @author Guilherme Blanco <guilhermeblanco@hotmail.com>
 * @author Jonathan Wage <jonwage@gmail.com>
 * @author Roman Borschel <roman@code-factory.org>
 * @internal This class contains highly performance-sensitive code.
 */
class UnitOfWork implements PropertyChangedListener
{
    //...

    /**
     * The identity map that holds references to all managed entities that have
     * an identity. The entities are grouped by their class name.
     * Since all classes in a hierarchy must share the same identifier set,
     * we always take the root class name of the hierarchy.
     *
     * @var array
     */
    private $_identityMap = array();

    /**
     * Map of all identifiers of managed entities.
     * Keys are object ids (spl_object_hash).
     *
     * @var array
     */
    private $_entityIdentifiers = array();

    /**
     * INTERNAL:
     * Registers an entity in the identity map.
     * Note that entities in a hierarchy are registered with the class name of
     * the root entity.
     *
     * @ignore
     * @param object $entity The entity to register.
     * @return boolean TRUE if the registration was successful, FALSE if the identity of
     *                 the entity in question is already managed.
     */
    public function addToIdentityMap($entity)
    {
        $classMetadata = $this->_em->getClassMetadata(get_class($entity));
        $idHash = implode(' ', $this->_entityIdentifiers[spl_object_hash($entity)]);
        if ($idHash === '') {
            throw new \InvalidArgumentException("The given entity has no identity.");
        }
        $className = $classMetadata->rootEntityName;
        if (isset($this->_identityMap[$className][$idHash])) {
            return false;
        }
        $this->_identityMap[$className][$idHash] = $entity;
        if ($entity instanceof NotifyPropertyChanged) {
            $entity->addPropertyChangedListener($this);
        }
        return true;
    }

    /**
     * INTERNAL:
     * Removes an entity from the identity map. This effectively detaches the
     * entity from the persistence management of Doctrine.
     *
     * @ignore
     * @param object $entity
     * @return boolean
     */
    public function removeFromIdentityMap($entity)
    {
        $oid = spl_object_hash($entity);
        $classMetadata = $this->_em->getClassMetadata(get_class($entity));
        $idHash = implode(' ', $this->_entityIdentifiers[$oid]);
        if ($idHash === '') {
            throw new \InvalidArgumentException("The given entity has no identity.");
        }
        $className = $classMetadata->rootEntityName;
        if (isset($this->_identityMap[$className][$idHash])) {
            unset($this->_identityMap[$className][$idHash]);
            $this->_entityStates[$oid] = self::STATE_DETACHED;
            return true;
        }
        return false;
    }

    /**
     * Checks whether an entity is registered in the identity map of this UnitOfWork.
     *
     * @param object $entity
     * @return boolean
     */
    public function isInIdentityMap($entity)
    {
        $oid = spl_object_hash($entity);
        if ( ! isset($this->_entityIdentifiers[$oid])) {
            return false;
        }
        $classMetadata = $this->_em->getClassMetadata(get_class($entity));
        $idHash = implode(' ', $this->_entityIdentifiers[$oid]);
        if ($idHash === '') {
            return false;
        }
        return isset($this->_identityMap[$classMetadata->rootEntityName][$idHash]);
    }
}