15 March 2022
Active Record vs Data Mapper patterns in PHP – a practical comparison
Active Record and Data Mapper are among the most popular patterns for mapping application objects to their database representations. Which one to choose depends on the size and complexity of your application, the quality you are aiming for relative to development time and cost, and many other factors. I’m going to go over them today.
Power outage, memory exhaustion, process shutdown, potential time and cost savings as a result of avoiding repeated SQL code – there are many reasons that can lead you to looking for ways to store your objects in a relational database. I think you already know what’s coming…
That’s right – you’re going to need Object-Relational Mapping (ORM) for that! And to access your data in the database, you will most likely use one of the popular patterns, in particular Active Record and Data Mapper.
Web applications are typically CRUD- or domain-based. Most of them are a mix of these two. Which of these two patterns will be better if you need to choose just one? I will try to answer this and other related questions in this article.
Let’s start with a quick reminder of what ORM is all about.
ORM – when to use it?
When you program in the object-oriented paradigm and keep your data in a relational database, you need to map objects to their database representations. That’s what Object-Relational Mapping is.
An ORM is a data mapping layer that sits between the application and the database and handles the communication between them. The solution has just as many loyal followers as vocal critics. Choosing to use it (or not) should ultimately depend on what the drivers of the application are. Primary drivers include:
- performance,
- scalability,
- maintainability,
- productivity.
Using an ORM well requires a lot of knowledge, but even if you choose not to use it, you’re going to have to provide another solution to the same problems. When making your decision, consider the time and effort it takes to provide an alternative.
In my opinion, ORMs are amazing tools. They boost your productivity a lot because you don’t have to write complicated mapping mechanisms yourself, and the code is easier to maintain. An ORM works best when you have separate read and write models.
The write model works well with ORM, the read model only does so in some cases.
The Query Builder, which is one of the core components of the ORM, is very helpful when it comes to creating optimized and easy-to-read queries. That’s why it can be put to good use in the read model. At the same time it’s vital to remember that the read model should not use domain objects. These are part of the write model. Using them for reads forces you to add more getters and hurts performance through unnecessary object hydration.
Using ORM entities in the read model also makes it possible to modify the state of an aggregate in queries, which violates the Command-Query Separation (CQS) principle. It says that asking a question should not change the answer.
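To make the read/write split more concrete, here is a minimal sketch in plain PHP. The Order and OrderSummary classes and the findOrderSummaries() function are names I made up for illustration, and the rows are hard-coded where a real application would get them from a query builder.

```php
<?php
declare(strict_types=1);

// Write model: a domain object that guards its own state.
final class Order
{
    private int $id;
    private string $status = 'new';

    public function __construct(int $id)
    {
        $this->id = $id;
    }

    public function id(): int
    {
        return $this->id;
    }

    // A command: it changes state and returns nothing.
    public function ship(): void
    {
        if ($this->status !== 'new') {
            throw new DomainException('Only new orders can be shipped.');
        }
        $this->status = 'shipped';
    }
}

// Read model: a view object with no behavior and no setters.
final class OrderSummary
{
    private int $id;
    private string $status;

    public function __construct(int $id, string $status)
    {
        $this->id = $id;
        $this->status = $status;
    }

    public function id(): int
    {
        return $this->id;
    }

    public function status(): string
    {
        return $this->status;
    }
}

// Hypothetical query function: maps raw rows straight to view objects,
// so no domain object is hydrated and no aggregate can be modified.
function findOrderSummaries(array $rows): array
{
    return array_map(
        static fn (array $row): OrderSummary => new OrderSummary((int) $row['id'], (string) $row['status']),
        $rows
    );
}

$summaries = findOrderSummaries([
    ['id' => 1, 'status' => 'new'],
    ['id' => 2, 'status' => 'shipped'],
]);
echo $summaries[1]->status(), PHP_EOL; // prints "shipped"
```

The query side never touches Order, so a query cannot accidentally call ship() and change the answer it is about to return.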
As I said before, there are two main patterns for implementing an Object-Relational Mapper.
Active Record pattern
In Active Record, a model is just a class that represents a table or view from a database. This mechanism is really simple and easy to understand. Every instance of the class represents one row of the table.
Easy solutions usually have a lot of limitations. AR fits the simplest models, i.e. those that do not contain too much business logic. It is an ideal tool for simple CRUD applications. Things get complicated when more complex models come into play. In that case, it is much more difficult to develop such an application.
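A rough sketch of the idea, without any framework, could look like this. The Article class is hypothetical and a static array stands in for the database table, so treat it as an illustration of the pattern rather than how any particular ORM implements it.

```php
<?php
declare(strict_types=1);

// Illustrative Active Record: the class maps to one "table" and each
// instance to one row. A static array stands in for the real database.
final class Article
{
    /** @var array<int, array{title: string, published: bool}> fake "articles" table */
    private static array $table = [];
    private static int $nextId = 1;

    public ?int $id = null;
    public string $title = '';
    public bool $published = false;

    // Persistence lives on the model itself: insert or update this row.
    public function save(): void
    {
        if ($this->id === null) {
            $this->id = self::$nextId++;
        }
        self::$table[$this->id] = [
            'title' => $this->title,
            'published' => $this->published,
        ];
    }

    // Finder: load one row and turn it into an instance.
    public static function find(int $id): ?self
    {
        if (!isset(self::$table[$id])) {
            return null;
        }
        $article = new self();
        $article->id = $id;
        $article->title = self::$table[$id]['title'];
        $article->published = self::$table[$id]['published'];

        return $article;
    }
}

$article = new Article();
$article->title = 'Active Record in a nutshell';
$article->save();

$loaded = Article::find($article->id);
echo $loaded->title, PHP_EOL; // prints "Active Record in a nutshell"
```

Note how the data, the finder and the save logic all live in one class. That is what makes the pattern so quick to use, and also what ties the model to the database.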
Data Mapper pattern
In Data Mapper, a model is just a model. It is a class that represents business logic and incorporates both behavior and data. It does not know how or where it will be persisted, because a separate layer of code stands between the in-memory objects and the database.
That separate layer is responsible for transferring data between the code and the database. It should also take care of keeping them isolated from each other. As a result, Data Mapper provides flexibility and, more importantly, independence.
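Here is a minimal sketch of that separation in plain PHP. Product and ProductMapper are hypothetical names, and the mapper keeps rows in an array where a real one would talk to a database. It is also simplified in one respect: it rebuilds objects through the constructor, whereas full-blown mappers usually hydrate private fields directly.

```php
<?php
declare(strict_types=1);

// Plain domain object: no base class, no knowledge of storage.
final class Product
{
    private string $sku;
    private int $priceInCents;

    public function __construct(string $sku, int $priceInCents)
    {
        if ($priceInCents < 0) {
            throw new InvalidArgumentException('Price cannot be negative.');
        }
        $this->sku = $sku;
        $this->priceInCents = $priceInCents;
    }

    public function applyDiscount(int $percent): void
    {
        if ($percent < 0 || $percent > 100) {
            throw new InvalidArgumentException('Discount must be between 0 and 100.');
        }
        $this->priceInCents = intdiv($this->priceInCents * (100 - $percent), 100);
    }

    public function sku(): string
    {
        return $this->sku;
    }

    public function priceInCents(): int
    {
        return $this->priceInCents;
    }
}

// The mapper is the only place that knows how a Product becomes a row
// and how a row becomes a Product again.
final class ProductMapper
{
    /** @var array<string, array{sku: string, price: int}> fake "products" table */
    private array $rows = [];

    public function save(Product $product): void
    {
        $this->rows[$product->sku()] = [
            'sku' => $product->sku(),
            'price' => $product->priceInCents(),
        ];
    }

    public function find(string $sku): ?Product
    {
        $row = $this->rows[$sku] ?? null;

        return $row === null ? null : new Product($row['sku'], $row['price']);
    }
}

$mapper = new ProductMapper();
$product = new Product('SKU-1', 1000);
$product->applyDiscount(10);
$mapper->save($product);

echo $mapper->find('SKU-1')->priceInCents(), PHP_EOL; // prints "900"
```

Swapping the array for SQL, or for a document store, would only change ProductMapper. Product would stay exactly as it is.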
Consequences of using Active Record pattern
Using any of these patterns has some consequences.
Stepping into the procedural territory
Active Record follows the Database-First approach. You create a database and then model it in the code. This tends to steer you towards the Anemic Domain Model. Just to be clear, that is not an advantage. It means that entities do not contain logic. You put the entire logic into services (not into controllers, I hope). Ultimately, the model is just a set of data bags that are controlled by other classes. What’s wrong with that? It is getting dangerously close to a procedural style, rather than an object-oriented one.
No encapsulation
Additionally, the model cannot guarantee that it is in a correct state, because any code can use its setters to put it into an invalid one. There is no encapsulation, which is one of the foundations of the OOP paradigm. I’m not trying to say that getters or setters are necessarily wrong, but they should represent behavior that results from business rules.
No SRP
Your code will depend heavily on the database. It also breaks the Single Responsibility Principle (SRP). How? Besides representing data, the model contains database interaction logic. It is not the best design, but it does speed up development.
Testability issues
The last consequence that comes to my mind is poor testability. An Active Record model is the hardest kind to test because it is coupled to the database, but testing it is not required in all cases.
Do you test simple objects like DTOs? That’s what I thought. You don’t need to do it here either. However, it does not mean that you should skip testing altogether. The logic is elsewhere – in services. Consequently, you are going to have more integration tests (Test Diamond).
Consequences of using Data Mapper pattern
Layer separation
Data Mapper follows the Code-First approach. You can postpone the decision about saving data until later and focus on the code. The model does not know anything about other layers; it is pure. A well-designed model has no database coupling. Persistence is another layer and the domain should not depend on it.
In my opinion, separating these layers is the main advantage of Data Mapper. On the other hand, no separation is the main disadvantage of Active Record.
I do not want to say that every application has a rich domain model that should only have behaviors. However, I do believe that Data Mapper is also good for simple models. You can design it the same way as in Active Record but you gain layer separation. Of course, not without any costs – such an approach is more complex and has a higher barrier to entry.
Improved testability
Using Data Mapper does not guarantee a well-designed model. In fact, I saw quite a few bad models that employed Data Mapper (some of them were mine!). The important thing here is that it does not interfere with the creation of fine models with encapsulation. Thanks to this, you gain better testability with a predominance of unit tests (Test Pyramid).
The most popular implementations in PHP
People from the PHP ecosystem probably know both Eloquent and Doctrine. If you are not familiar with them, all you need to know for now is that they are ORMs. Eloquent is based on the Active Record pattern and Doctrine is based on the Data Mapper pattern.
I believe that the domain model should be pure, with no external dependencies: a model that does not care how it will be persisted. Thanks to this, it can work regardless of whether it is stored in an RDBMS, a NoSQL store, or anywhere else. Which pattern will allow this? Active Record is out of the question, but Data Mapper might just be the solution.
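As an illustration, a pure Employee model could look roughly like the sketch below. The fields and the rules inside dismiss() are my assumptions for the sake of the example. The point is only that the class extends nothing and imports nothing beyond PHP's built-in classes.

```php
<?php
declare(strict_types=1);

// A pure domain model: no base class, no ORM imports, no persistence code.
final class Employee
{
    private string $name;
    private ?DateTimeImmutable $dismissedAt = null;
    private ?string $dismissalReason = null;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    // Behavior instead of setters: the business rules are enforced here,
    // so the object cannot end up half-dismissed.
    public function dismiss(string $reason, DateTimeImmutable $when): void
    {
        if ($this->dismissedAt !== null) {
            throw new DomainException('Employee has already been dismissed.');
        }
        if (trim($reason) === '') {
            throw new InvalidArgumentException('A dismissal reason is required.');
        }
        $this->dismissedAt = $when;
        $this->dismissalReason = $reason;
    }

    public function isDismissed(): bool
    {
        return $this->dismissedAt !== null;
    }

    public function name(): string
    {
        return $this->name;
    }
}

$employee = new Employee('Jane');
$employee->dismiss('Redundancy', new DateTimeImmutable('2022-03-15'));
var_dump($employee->isDismissed()); // prints "bool(true)"
```

Because nothing here depends on a database, the class can be unit tested with nothing more than `new Employee(...)`.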
Domain model with Eloquent
If you use Eloquent, you immediately have a dependency on Illuminate\Database\Eloquent\Model. And, of course, a hidden database coupling. You do not have explicitly declared fields, although you can document them with PHPDoc annotations. What would it look like in Laravel? It would use a simple Employee model and a service class (e.g. Employer) with the dismiss() method, which sets appropriate values.
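The shape of such code can be sketched as follows. To keep the example runnable without Laravel, the Model class below is a tiny hand-written stand-in for Illuminate\Database\Eloquent\Model that only imitates magic attribute access and save(). The attribute names dismissed_at and dismissal_reason are assumptions of mine, not something taken from a real schema.

```php
<?php
declare(strict_types=1);

// Stand-in for Illuminate\Database\Eloquent\Model (NOT the real class).
// It imitates only magic attribute access and save(), using an array
// instead of a database.
abstract class Model
{
    /** @var array<string, mixed> */
    protected array $attributes = [];

    /** @var array<int, array<string, mixed>> fake storage */
    public static array $saved = [];

    public function __get(string $key)
    {
        return $this->attributes[$key] ?? null;
    }

    public function __set(string $key, $value): void
    {
        $this->attributes[$key] = $value;
    }

    public function save(): bool
    {
        self::$saved[] = $this->attributes;

        return true;
    }
}

/**
 * The fields exist only as PHPDoc hints; nothing is declared on the class.
 *
 * @property string $name
 * @property string|null $dismissed_at
 * @property string|null $dismissal_reason
 */
final class Employee extends Model
{
}

// The logic lives in a service, while the model is just a data bag.
final class Employer
{
    public function dismiss(Employee $employee, string $reason): void
    {
        $employee->dismissed_at = (new DateTimeImmutable())->format('Y-m-d H:i:s');
        $employee->dismissal_reason = $reason;
        $employee->save();
    }
}

$employee = new Employee();
$employee->name = 'Jane';
(new Employer())->dismiss($employee, 'Redundancy');

echo $employee->dismissal_reason, PHP_EOL; // prints "Redundancy"
```

Nothing stops other code from assigning dismissed_at directly and skipping the reason, which is exactly the encapsulation problem described earlier.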
Domain model with Doctrine
If you use Doctrine, you can use exactly such a pure model. I love it. To achieve that, you have to use XML mapping defined in external files. What would it look like in Symfony? You call the dismiss() method on the Employee instance and then save it using the EntityManager or an EmployeeRepository. That model is clean. However, there is one external dependency in Doctrine that cannot be avoided: Doctrine\Common\Collections\ArrayCollection.
I hope it will be fixed in the future. For now, I accept it, because it is a separate package that can be easily swapped out later as long as it is well encapsulated. Client code should not use it, nor should it know that the domain model uses it.
The repository pattern
One option to make Active Record more elegant is the Repository Pattern, which follows the Dependency Inversion Principle (DIP). It allows you to separate layers. In the future, it will be easier to replace Eloquent with another tool.
Admittedly, that will not be a common operation, but sometimes you do want to move your data to another storage, e.g. NoSQL. Regardless, the Repository Pattern allows you to write in-memory fake implementations for easier testing without a database connection.
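A minimal sketch of such a repository with an in-memory fake might look like this. All the names (EmployeeRepository, InMemoryEmployeeRepository, DismissEmployee) are hypothetical. In production, a class backed by Eloquent or Doctrine would implement the same interface.

```php
<?php
declare(strict_types=1);

final class Employee
{
    private int $id;
    private bool $dismissed = false;

    public function __construct(int $id)
    {
        $this->id = $id;
    }

    public function id(): int
    {
        return $this->id;
    }

    public function dismiss(): void
    {
        $this->dismissed = true;
    }

    public function isDismissed(): bool
    {
        return $this->dismissed;
    }
}

// The abstraction that application code depends on (DIP).
interface EmployeeRepository
{
    public function get(int $id): Employee;

    public function save(Employee $employee): void;
}

// In-memory fake: good enough for tests, no database connection needed.
final class InMemoryEmployeeRepository implements EmployeeRepository
{
    /** @var array<int, Employee> */
    private array $employees = [];

    public function get(int $id): Employee
    {
        if (!isset($this->employees[$id])) {
            throw new OutOfBoundsException("Employee {$id} not found.");
        }

        return $this->employees[$id];
    }

    public function save(Employee $employee): void
    {
        $this->employees[$employee->id()] = $employee;
    }
}

// Application service that knows only the interface, not the storage.
final class DismissEmployee
{
    private EmployeeRepository $employees;

    public function __construct(EmployeeRepository $employees)
    {
        $this->employees = $employees;
    }

    public function __invoke(int $employeeId): void
    {
        $employee = $this->employees->get($employeeId);
        $employee->dismiss();
        $this->employees->save($employee);
    }
}

$repository = new InMemoryEmployeeRepository();
$repository->save(new Employee(42));
(new DismissEmployee($repository))(42);

var_dump($repository->get(42)->isDismissed()); // prints "bool(true)"
```

DismissEmployee never learns which implementation it was given, so the same service can be exercised in a test with the fake and run in production against the real storage.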
I know that it looks like another unnecessary layer, but it is not. The repository helps you avoid the duplication of queries. The cool thing is, when you need to change some fields, it’ll be easier to find them in one place.
Model independent of ORM
One of the possibilities I have seen is creating different models for domain and database – regardless of whether the implementation used is Doctrine (Data Mapper) or Eloquent (Active Record).
But… it adds yet another layer, and an unnecessary one in my opinion. What I also do not like in this solution is the transformation from one object to another. All of the fields in the domain model need to have getters. You should also create an object through its constructor only when it is first initialized, and that is hard to achieve with this approach. It makes more sense for Eloquent; Doctrine lets you achieve the same separation without separate models.
Active Record vs Data Mapper – what is the right choice?
And that’s it for the comparison. To sum it up:
- The Data Mapper pattern allows you to create a simple CRUD app with no problems, but it can hold its own in more complex apps too.
- At first glance, Data Mapper seems to be a better solution. Or to be more precise, one of higher quality. On the other hand, quality is not always the most important thing. I know, it sounds counter-intuitive. Quality is tied to both time of delivery and cost. You can always do better but you need to define an acceptable level of quality with regards to time and cost and strive for it.
- In defense of Active Record, I have seen many projects that were supposed to be simple apps with a straightforward business domain, but the way they were implemented made them hard to understand. Data Mapper makes it easier to adhere to good programming practices, clean code, and architecture in a web application, but it is also more complex. If you can’t use it properly, the Active Record implementation might turn out to be a better choice for simple projects.
- Solving complex problems is important and rewarding, but you should also strive to limit the creation of new ones. That is why I like modularity: you can use various solutions in different modules without creating an unnecessary mess. Furthermore, it gets simpler to find a solution for a given problem.
Programmers like to pretend that they only work on complex problems. The truth is that a big part of their job is to solve typical challenges such as creating simple CRUD applications. That is why I do not like to say that something is inherently a bad solution (an antipattern). Nothing is always good or bad – a proper context is needed. A programmer’s job is to simplify solutions to make code more readable and maintainable for other programmers.
Honestly speaking, I almost always use Data Mapper, which is suitable for simple and complex applications. However, if I were to use Active Record, it would be for an app with relatively simple logic.