The Software House team was chosen to create one online platform to rule all current and future marketplaces in Pet Media Group’s ecosystem.
One of the biggest challenges we faced was designing a data migration process that would move millions of records without losing data integrity or taking any of the marketplaces offline. Check out what solutions we tried, what failed, and why we eventually settled on ETL.
Partnership goal:
→ To build an interconnected ecosystem of websites and applications that would unify all already acquired and future brands under one roof
Pet Media Group
Pet Media Group started as an initiative of two friends who wanted to help animals find homes. It gradually grew into an international network of marketplaces that help users find and buy pets and other animals.
The company raised over $12 million in investment, and today it is on a path to becoming one of the largest companies of its kind in Europe.
INDUSTRY
Marketplaces
COUNTRY
Europe (Multiple markets)
SERVICE
Data migration, Platform modernization
Challenge
PMG operated multiple separate platforms across different European markets. Each marketplace had its own codebase, database structure, and maintenance requirements, creating significant operational complexity.
The company planned more acquisitions but faced a technical problem: each platform would have to be managed separately, which was unacceptable for effective operation of the system as a whole. Maintaining all of them simultaneously would be nearly impossible, or at least extremely expensive.
Their team sought a unified solution that could:
-> Handle millions of records from multiple databases
-> Maintain zero downtime during migration
-> Preserve data integrity across all platforms
-> Support different marketplace features and regional requirements
-> Enable rapid integration of future acquisitions
The clients wanted to integrate all platforms under one roof and carry out this process without shutting down any of the acquired marketplaces.
This posed a significant challenge in terms of data integrity.
Solution
In this article, we’ll take a closer look at one element of the project – the data migration process.
Our goal was to design and implement a comprehensive data migration strategy that addressed the complex technical challenges of unifying multiple marketplace platforms while maintaining zero downtime.
Together with the client's engineers, we explored various options and chose the best-fit solution.
Process
For the PMG data migration project, we established the following path:
1. Analyzing the input and output diagrams. We look for differences and mark places for discussion, e.g. a potential lack of some data.
2. Determining how to fill in these blanks. What information do we provide there, and from what source?
3. Preparing a migrator prototype. We use a small sample of data to test-run the migration at a certain scale. This helps with estimating the time needed for the full migration.
4. Analyzing the test results. This determines how the data migration will be carried out and what will be needed from the main application (e.g. temporary maintenance mode).
5. Performing a dry migration. This verifies data integrity and is crucial for confirming that all data migrates correctly.
6. Running the actual data migration process on the real platform.
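The prototype step above boils down to timing a small sample and extrapolating. A minimal sketch, assuming a hypothetical `migrate_batch` callable standing in for the real migrator:

```python
import time

def estimate_full_migration(migrate_batch, sample_records, total_count):
    """Run the migrator on a small, representative sample and
    extrapolate how long the full data set would take."""
    start = time.perf_counter()
    migrate_batch(sample_records)
    elapsed = time.perf_counter() - start
    per_record = elapsed / len(sample_records)
    return per_record * total_count  # estimated seconds for the full migration

# Usage: a stand-in migrator over 100 sample records, extrapolated to 1M rows.
estimate = estimate_full_migration(lambda batch: None, list(range(100)), 1_000_000)
```

The estimate is only as good as how representative the sample is, which is why the analysis of the test results is a separate step.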
First data migration plan – failed idea
Initially, we considered a differential migration approach: create database backups on separate servers, perform initial data migration, then run differential migrations (DIFF) to sync remaining data just before go-live.
However, this approach ran into critical problems: without an ideal source data structure, we could not reliably map migrated data back to source data, nor resume a migration from the point where it left off. Both issues made differential migrations heavily dependent on source data quality.
ETL approach
After evaluating the limitations of differential migration, we implemented a robust ETL (Extract, Transform, Load) process that could handle the complexity of multiple data sources while maintaining system availability and ensuring complete data integrity.
Analysis Phase
A typical ETL implementation requires a thorough analysis phase before execution. We conducted a detailed examination of existing data structures across all Pet Media Group platforms, identifying:
- Different data schemas and formats across marketplaces
- Dependencies between data entities
- External service integrations that needed to be preserved
- Regional customizations and business rules
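The first point of that analysis, comparing schemas across marketplaces, can be sketched as a simple set difference. The field names below are illustrative, not PMG's actual schemas:

```python
def diff_schemas(source_fields, target_fields):
    """Flag fields that need a mapping decision before migration."""
    missing_in_source = target_fields - source_fields    # blanks to fill during transform
    dropped_from_target = source_fields - target_fields  # needs a decision: map or drop
    return missing_in_source, dropped_from_target

# Illustrative schemas for one source marketplace and the unified target.
source = {"id", "email", "region"}
target = {"id", "email", "region", "marketplace_id"}
missing, dropped = diff_schemas(source, target)
# every field in `missing` needs a fill-in rule agreed with the client
```

Each field in `missing_in_source` is exactly the kind of "place for discussion" the process section mentions.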
Strategy & Implementation
To ensure that all data would be migrated correctly with no loss of data quality, we established a data migration strategy for PMG that can run in several different modes.
Currently, the “normal” mode is active. Here is how it works:
1. Downloading data from the Source Database (the database of a new marketplace being added to the ecosystem).
2. Mapping the data to the structure expected by the PMG Server.
3. Saving the mapped data in the Migrator Database.
4. Uploading the mapped data to the PMG Server.
5. The PMG Server checks whether the data has the appropriate structure:
→ YES: the data is saved in the Target Database.
→ NO: the Migrator Server records in the Migrator Database that the rejected data failed to transfer correctly.
6. End.
The data is migrated in the correct order defined by the developers, based on the initial analysis of the data structures in the source and target databases.
For example, we usually migrate users before their listings, because the user is the “parent” record of the ad and therefore needs to exist first.
If a record has not been migrated correctly, you can browse the Migrator DB to see what it looked like originally and what error occurred during its migration.
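The "normal" mode steps can be sketched as a single loop. Everything here is a hypothetical in-memory stand-in for the real services (the transform, the Migrator Database, and the PMG Server validation), not the production implementation:

```python
def map_to_pmg_structure(record):
    # Illustrative transform: rename source fields to the target schema.
    return {"user_email": record["email"], "source_id": record["id"]}

class MigratorDB:
    """In-memory stand-in for the Migrator Database."""
    def __init__(self):
        self.staged, self.failed = [], []
    def save(self, original, mapped):
        self.staged.append((original, mapped))   # keep the original for auditing
    def mark_failed(self, original, error):
        self.failed.append((original, error))    # browsable after the run

class PMGServer:
    """Stand-in that accepts records only if the structure is valid."""
    def __init__(self):
        self.target_db = []
    def upload(self, mapped):
        if "user_email" not in mapped:
            return False, "missing user_email"
        self.target_db.append(mapped)            # saved in the Target Database
        return True, None

def run_normal_mode(records, migrator_db, pmg_server):
    """One pass of 'normal' mode: extract, map, stage, load, record failures."""
    for record in records:
        mapped = map_to_pmg_structure(record)
        migrator_db.save(record, mapped)
        ok, error = pmg_server.upload(mapped)    # server-side structure check
        if not ok:
            migrator_db.mark_failed(record, error)
```

Keeping both the original and the mapped record in the Migrator DB is what makes the failure-inspection step above possible.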
Schema of the data migration process
Additional modes
We’ve mentioned before that the currently active mode of data migration is called “Normal”. However, there are also four additional modes prepared specifically to satisfy the business requirement that a new marketplace must remain functional during migration:
“Diff” – migration of data that has changed during the process,
“LocalDiff” – migration of data that has changed only locally,
“RetryFailed” – re-migrating data that previously failed,
“FindMissing” – entering missing data.
Problem resolution during migration
We've learnt that ETL can come with a few struggles. Luckily, we've managed to overcome them.
-> Problem #1 – the old marketplace keeps adding data
A couple of solutions are available here. You can use a system that synchronizes data in real time (whatever appears on the old marketplace is immediately transferred to the new platform), or you can switch the old marketplace into MT (maintenance) mode for a certain period and run the migration then.
In our case, we examined two options: AWS DMS (Database Migration Service) and the MT mode. DMS synchronizes in real time, which is a major benefit, but in our case some data had to be supplemented by querying external services, so we went in the direction of MT.
Whichever option you choose here, more problems arise.
-> Problem #2 – data is incomplete
A typical ETL should be preceded by a thorough analysis phase.
Learn from our mistakes – we could have avoided a lot of issues if we had spent more time on the initial schema analysis. In the end, it made a real difference.
You compare the input schema (old system) with the target schema (new system) and figure out what to convert. If you can map everything 1:1, you're all set. Unfortunately, it very often turns out that a field in the new schema doesn't exist in the old one, and you need to establish rules that decide what should be filled in and how.
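Those fill-in rules can be made explicit as a per-field mapping. A minimal sketch with illustrative field names and defaults (not PMG's actual rules):

```python
# Rules for fields that exist in the new schema but not the old one.
FILL_RULES = {
    "marketplace_id": lambda old: "legacy",                      # constant default
    "display_name": lambda old: old.get("username", "unknown"),  # derived value
}

def transform(old_record, target_fields):
    """Map an old record onto the target schema, applying fill-in rules."""
    new_record = {}
    for field in target_fields:
        if field in old_record:
            new_record[field] = old_record[field]          # 1:1 mapping
        elif field in FILL_RULES:
            new_record[field] = FILL_RULES[field](old_record)
        else:
            # Fail loudly: an unmapped field means the analysis missed something.
            raise KeyError(f"no rule for target field {field!r}")
    return new_record
```

Raising on an unmapped field, rather than silently defaulting, turns gaps in the schema analysis into errors you catch during the dry migration instead of in production.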
-> Problem #3 – you may need to restart the migration
Data migration must be:
Idempotent – if you upload the same set of information to the system twice, it will not be duplicated, and there will be no error saying that such information already exists. Simply nothing will happen (and nothing will crash), and the system will remain coherent.
Possible to resume from a specific moment – resuming an import from a given point in time is crucial in modern systems. Imagine running a 5-day operation where an error occurs on a single record on day 4, and because of that you have to start all over and wait another 5 days. This really made our blood boil, because we underestimated how much data there was in the old services.
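Both properties can be sketched together: upserting by record id gives idempotency, and a checkpoint of the last processed id gives resumability. The in-memory `target` dict and `checkpoint` dict are hypothetical stand-ins for the real database and checkpoint store:

```python
def migrate_resumable(records, target, checkpoint):
    """Idempotent, resumable loader.

    `target` is keyed by record id, so re-running never duplicates data
    (upsert semantics). `checkpoint` remembers the last processed id, so
    a restart skips work that is already done. `records` must arrive in
    a stable order for the checkpoint to be meaningful.
    """
    start_after = checkpoint.get("last_id")
    skipping = start_after is not None
    for record in records:
        if skipping:
            if record["id"] == start_after:
                skipping = False     # resume with the *next* record
            continue
        target[record["id"]] = record            # upsert, not insert
        checkpoint["last_id"] = record["id"]     # persist progress
```

With this shape, the day-4 failure scenario above costs you a restart from the checkpoint, not another full 5-day run.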
Outcome
The data migration process contributed to the overall platform launch and had a direct impact on the company’s revenues growing by over 400%.
→ 400% revenue growth following platform unification
→ Zero downtime during migration of millions of records
→ Unified data schema enables rapid marketplace integration
