In our previous articles, we explained why we had to move 800,000 databases from one datacentre to another, 300 kilometres away. So here we are… My team and I did it! It was a real brain-burner, so I hope our story will help you tackle more of the huge technical projects we love to play with.
- To reduce latency, a database must be migrated at the same time as the website using it.
- Because databases are spread across all available MySQL servers, the granularity of a migration has to be the database, not the MySQL instance. In other words, we cannot migrate a whole MySQL server. We must only move a part of it.
- As the link between a website and its database is not necessarily referenced, a website in Gravelines should be able to contact a database in Paris (for example), and vice-versa.
- To contact its database, a website uses a host name, a user name and a password. We want the migration to be transparent, so nobody should have to change any of those elements to contact their new database.
- The database platforms change between Paris and Gravelines, as shown below.
To sum up, this is what we have before a web cluster migration:
And this is what we want after the migration:
A few more things…
- Obviously, we had to keep in mind one of the most important things when dealing with databases: consistency. For each database, we had to define a consistency point. Before this point on the timeline, reads and writes were made in Paris. After it, reads and writes were made in Gravelines.
- We believe in transparency and reversibility. These are both key parts of our SMART cloud. That’s why we wanted to give you access to this consistency point, as a dump on your OVHcloud Control Panel. For every database migrated, we chose to give you access to the dump for one month.
- Migrating 800K databases in about 60 nights meant we had to be very fast and scalable. Our record was on the 1st of July 2019, when we successfully migrated 13,502 databases in 1 hour, 13 minutes and 31 seconds.
- If you’re used to being on duty, you know that your attention and efficiency are lower during the night. Repeating the migration process about 60 times in one year would amplify that, so we wanted everything to be as automated and as simple as possible. As we will see later, to launch the database migrations, we just had to run one command on a single host. More on that later.
Now you know the rules, it’s time to start the game!
The first level is always an easy one, where you discover how the game works through a kind of tutorial! So, let’s begin with a small database migration. This is how we do it:
1. At the source (Paris)
- Set read-only mode. We absolutely need to avoid writes during the migration, to avoid the famous split-brain. The easiest way to do this is to put the database in read-only mode. In most cases, websites only need to read their databases, but some also need writes, and those will therefore be broken. That’s not a problem, because the website is currently being migrated and is closed. We still block write access, in case the database is used by another host that isn’t affected by the nighttime migration.
- Dump the database and put the dump somewhere. We chose to store the dumps on OVHcloud’s Public Cloud Storage (PCS), as we already use this solution to store 36 million dumps per month. Adding 800,000 dumps in one year is no problem for this awesome platform!
2. At the destination (Gravelines)
- Retrieve the dump and import it.
- Create the user and permissions, with write access.
3. Switch to the new database
- At this point, the website is still calling the database in Paris. So that the website (whether it’s hosted in Paris or Gravelines) can contact the new database, we update the DNS so the host name points to the Gravelines MySQL instance rather than the Paris one.
- Read access to the Paris database is also removed.
- Finally, we’ll update our information system, so you can retrieve the dump from PCS via your Control Panel. This update also allows us to redirect all actions available from the Control Panel (e.g. change a password, create a dump…) to the new database at Gravelines.
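The three steps above can be sketched end-to-end as a toy simulation. Everything below (the dict-based "servers", storage and DNS) is just an illustrative stand-in for the real MySQL instances, Public Cloud Storage and DNS records:

```python
# Toy walkthrough of the single-database migration described above.
# Dicts stand in for the real MySQL instances, PCS and DNS.

paris = {"mydb": {"data": {"t1": [1, 2, 3]}, "read_only": False, "readable": True}}
gravelines = {}
storage = {}                       # stand-in for Public Cloud Storage (PCS)
dns = {"mydb.mysql.db": "paris"}   # host name used by the website (made up)

# 1. At the source: set read-only mode, then dump the database to storage.
paris["mydb"]["read_only"] = True
storage["mydb.dump"] = dict(paris["mydb"]["data"])   # the consistency point

# 2. At the destination: retrieve the dump, import it, grant write access.
gravelines["mydb"] = {"data": dict(storage["mydb.dump"]),
                      "read_only": False, "readable": True}

# 3. Switch: point the DNS record to Gravelines, remove read access in Paris.
dns["mydb.mysql.db"] = "gravelines"
paris["mydb"]["readable"] = False

assert dns["mydb.mysql.db"] == "gravelines"
assert gravelines["mydb"]["data"] == {"t1": [1, 2, 3]}
```

The dump taken while the source is read-only is exactly the consistency point mentioned earlier: everything written before it lives in Paris, everything after it lives in Gravelines.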
Level 2: “Decentralised state machine”
To deliver the proof of concept for the migration, we first ran all those steps manually and sequentially. The natural way to automate this is to write a script to do the same thing, but faster. That’s a centralised method, but such methods experience bottlenecks sooner or later, and imply a single point of failure.
To prevent this, and fulfil our scalability requirements, we have to be decentralised. We must imagine the migration of a single database as a state machine. Here is a simplified version of the state graph of a database migration, as described above:
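As an illustration, that simplified state graph can be encoded as a transition table. The status names are the ones used later in this article; the intermediate and terminal states ('Importing', 'Import failed', 'Done') are assumptions added for symmetry, and the real graph has more states:

```python
# Simplified, partly assumed state graph of one database migration.
TRANSITIONS = {
    "Waiting":       {"To dump"},
    "To dump":       {"Dumping"},
    "Dumping":       {"To import", "Dump failed"},
    "To import":     {"Importing"},
    "Importing":     {"DNS to update", "Import failed"},  # assumed by symmetry
    "DNS to update": {"Done"},                            # hypothetical terminal state
}

def advance(state: str, new_state: str) -> str:
    """Move a migration to a new state, refusing illegal jumps."""
    if new_state not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition {state!r} -> {new_state!r}")
    return new_state
```

Because every host only ever moves a row along these edges, no host needs to know what the others are doing.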
Using this state machine, we can have the three big steps performed by different machines to parallelise the workload:
- The source
- The destination
- The one updating the DNS
These three hosts can perform their tasks in an independent and decentralised way. All they have to do is watch the state graph to see if they have something to do, and if so, update it and perform the tasks.
The brain of the migration: CloudDB
We love the ‘eat your own food’ concept! That’s the best quality control ever, and your feedback is our first source of feature requests. So it’s no surprise that we used our own CloudDB product to store the state graphs of the database migrations.
Technically, a state graph is a row in a table. The simplified structure of this table looks like this:
```sql
database_name VARCHAR(255) PRIMARY KEY,
source        VARCHAR(255),
destination   VARCHAR(255),
status        VARCHAR(255) NOT NULL DEFAULT 'Waiting',
dump_url      TEXT
```
Except for the dump_url, all the fields are filled before the migration begins. In other words, we know where the databases are, and where they will be.
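Here is a runnable sketch of that simplified table, using an in-memory SQLite database as a stand-in for CloudDB (the host names are made up):

```python
import sqlite3

# The simplified CloudDB table, with SQLite standing in for the real thing.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE migration (
        database_name TEXT PRIMARY KEY,
        source        TEXT,
        destination   TEXT,
        status        TEXT NOT NULL DEFAULT 'Waiting',
        dump_url      TEXT
    )
""")

# Before the migration begins, everything but dump_url is known.
db.execute(
    "INSERT INTO migration (database_name, source, destination) VALUES (?, ?, ?)",
    ("mydb", "mysql-p19.example", "mysql-gra.example"),  # hypothetical hosts
)
row = db.execute("SELECT status, dump_url FROM migration").fetchone()
print(row)  # ('Waiting', None)
```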
We’ve beaten all the challenges of this level. Now it’s time to beat the final monster!
Level 3: Migrate 800K databases
Now that we know how to migrate a single database in a decentralised way, let’s fill the CloudDB with all the databases we want to migrate! This is what the migration now looks like:
About once a minute*, each of the 780 database servers asks the CloudDB whether it has something to dump. The source and status columns of the table are used to retrieve this information:

```sql
SELECT … WHERE source = me AND status = 'To dump';
```
If so, they execute their tasks and update the CloudDB about what they’re doing. When they’re done, they pass the baton for this migration to Gravelines:

```sql
UPDATE … SET status = 'To import' WHERE database_name = '…';
```
At the same time, 300 kilometres away, hundreds of database servers are also asking the CloudDB whether they have something to import. As in Paris, they query the CloudDB about once a minute*. The destination and status columns of the table are used to retrieve this information:

```sql
SELECT … WHERE destination = me AND status = 'To import';
```
If so, they execute their tasks and update the CloudDB about what they’re doing. When they’re done, they pass the baton to a third robot, which will change the DNS records for this database migration:

```sql
UPDATE … SET status = 'DNS to update' WHERE database_name = '…';
```
(*) To avoid flooding the CloudDB, we use a random frequency to query the database with the state graphs. This way, the connections are globally distributed over time.
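Put together, each "dump robot" is essentially a small polling loop. The sketch below uses SQLite as a stand-in for CloudDB; the host name is hypothetical, and the real robots obviously do far more than a placeholder comment:

```python
import random
import sqlite3
import time

ME = "mysql-p19-042.example"   # hypothetical host name

def poll_once(db: sqlite3.Connection, me: str) -> list:
    """Claim and process every database this host must dump."""
    names = [r[0] for r in db.execute(
        "SELECT database_name FROM migration "
        "WHERE source = ? AND status = 'To dump'", (me,))]
    for name in names:
        # ... dump the database and upload it to PCS here ...
        db.execute(
            "UPDATE migration SET status = 'To import' WHERE database_name = ?",
            (name,))
    return names

def robot_loop(db: sqlite3.Connection, me: str, rounds: int = 3) -> None:
    """Poll about once a minute, with jitter to spread load on CloudDB."""
    for _ in range(rounds):
        poll_once(db, me)
        time.sleep(random.uniform(30, 90))
```

The import and DNS robots follow the same pattern, simply matching on `destination`/`'To import'` and `'DNS to update'` instead.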
The robot responsible for the DNS update is the third player in the migration process, and works in the same way as the dump and import robots described above.
Not that easy…
Of course, the actual game was more complex. This was a simplified version of the migration, with some steps missing or not detailed enough, such as:
- Preventing writes to the source database
- Updating the IS (among others) so you can see the dump in the Control Panel
- Setting the password on the destination (the same as the one on the source) without knowing it
- And many others
But now that you’ve got the concept for the main steps, you can imagine how we handled the others.
Cheat code: Iterate!
Do you know the law of truly large numbers? It states that with a large enough number of samples, any unlikely thing is likely to be observed.
This is one of the first lessons you learn when you host 1.2 million databases. Every day, we face the many improbable things that can happen with databases, so we knew that, despite the tests we conducted, we would encounter difficulties, weird cases, and improbable bottlenecks.
But there’s a cheat code to beat that boss: iterate!
- Begin the migration
- Face a problem
- Fix it definitively (not just for the particular case that failed, but also for all similar cases across the whole platform)
- Then try again, faster!
This method is possible thanks to two things:
- The magic command
- The big red button
The magic command
As mentioned above, to launch the database migrations, we had to run one command on a single host:

```shell
migrate-p19 --max-procs 400
```

This magic command has one parameter: the number of parallel migrations we want to run. We used 400 for this parameter.
This means that 400 databases are being dumped or imported at the same time – no more, no less.
The migrate-p19 command is a scheduler. It updates the CloudDB every 10 seconds, so we always have those 400 migrations performed in parallel:
```sql
SELECT COUNT(database_name) … WHERE status IN ('To dump', 'Dumping', 'Dump failed', 'To import', …);
-- e.g. returns 42

UPDATE … SET status = 'To dump' WHERE status = 'Waiting' LIMIT (400 - 42);
```
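That scheduling logic can be sketched as follows, again with SQLite standing in for CloudDB (note that SQLite lacks MySQL's `UPDATE … LIMIT`, hence the subquery; the in-flight state list is simplified):

```python
import sqlite3

# Simplified set of states that occupy one of the max_procs slots.
IN_FLIGHT = ("To dump", "Dumping", "Dump failed", "To import", "Importing")

def schedule_tick(db: sqlite3.Connection, max_procs: int = 400) -> int:
    """Promote just enough 'Waiting' rows to keep max_procs migrations running."""
    (active,) = db.execute(
        "SELECT COUNT(database_name) FROM migration WHERE status IN (?, ?, ?, ?, ?)",
        IN_FLIGHT).fetchone()
    to_start = max(0, max_procs - active)
    db.execute(
        "UPDATE migration SET status = 'To dump' WHERE database_name IN "
        "(SELECT database_name FROM migration WHERE status = 'Waiting' LIMIT ?)",
        (to_start,))
    return to_start
```

Run every 10 seconds, this keeps the pipeline saturated without ever exceeding the chosen parallelism.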
Pause the game: The big red button
In every machine, it is mandatory to have a big red button to press when something’s wrong. To interrupt the migration, for one reason or another, we just have to kill the migrate-p19 script. When we do this, the ongoing migrations complete by themselves, and no new ones are launched.
The interruption can also be automatic. If something’s really bad, the scheduler can see that there are too many operations in an error state, and then decide to stop the migration.
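Both red buttons, manual and automatic, boil down to the same behaviour: stop launching new migrations and let the in-flight ones finish. A minimal sketch, with simplified error-state names and a made-up threshold:

```python
# Simplified error states and threshold; the real values are not public.
ERROR_STATES = ("Dump failed", "Import failed")

def should_stop(statuses: list, max_errors: int = 50) -> bool:
    """Automatic red button: too many migrations in an error state."""
    return sum(statuses.count(s) for s in ERROR_STATES) > max_errors

def run_scheduler(pending: list, stop_requested) -> list:
    """Launch pending migrations until the red button is pressed."""
    launched = []
    for name in pending:
        if stop_requested():
            break                  # no new migrations are launched...
        launched.append(name)      # ...but already-launched ones run to completion
    return launched
```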
Our database migration adventure will continue on this blog. Indeed, the world is not perfect, and we’ve had to handle some interesting cases…
To be continued…
Want to be part of the team bringing this migration to life, and keeping this huge platform in an operational state? We are hiring an open-source SRE and automation addict in France and Canada!