How to 100% recover a table that was deleted by accident in your database?

Jinpeng Zhang
5 min readOct 10, 2022

--

Ooops

It would be a disaster to database administrators if they truncated/dropped an important table by accident that shouldn’t be deleted. If they can’t get this table back, they may lose their jobs. So, many database administrators usually live under pressure, more or less. They will very carefully audit those operations that will apply to the online production database, and auditing manually is a time and spirit wasting work.

MySQL

Let’s come back to the issue this article title mentioned. If you are using MySQL as your production database, when a wrong table deletion has happened, I pray you have regular full backup for your database or those specified tables, and the MySQL binlog was opened in the meantime. If you just have regular full backup for this table, like do full backup per day, and the MySQL binlog wasn’t open, you have lost the incremental update for this table, you only can recover to a stale status for this table from the latest full backup.

Full backup essentially is a snapshot of your database at some past point in time. And, MySQL binlog is a series of log files that contain incremental operations that have happened to your MySQL, and these operations are recorded in sequential order.

MySQL regular full backup and incremental binlog

Restore the last full backup

As an example, suppose that around 15:50:00 10/09/2022 (let’s call it T in the left part of this article) we deleted a table 'game'.'user' by accident. In order to get table 'game'.'user' back, we should restore the last full backup created before time T. You can make it by executing the following statement. BTW, the full backup should contain the last binlog position (let's assume it is 1000 in log file bin.008964), which will be used as the start position when we restore the incremental data from binlog.

mysql -u [user] -p [database_name] < last_backup.sql

Restore the incremental data from binlog

After you restored the full backup of your database, the next step is to apply these incremental updates that happend before you executed the table deletion statement to your database. Before doing this, you should find the precise binlog event position corresponding to the point in time up to which you want to restore your database. In our example, we can find the binlog event position by checking the log contents around T using the mysqlbinlog utility. Use the --start-datetime and --stop-datetime options to specify a short time period around T, and then look for the event in the output.

$> mysqlbinlog --start-datetime="2022-10-09 15:45:00" \
--stop-datetime="2022-10-09 15:55:00" \
--verbose /var/lib/mysql/bin.008946 | grep -C 15 "DROP TABLE"
...
# at 3980
SET TIMESTAMPE=...
SET @@session...
SET @@seesion...
...
DROP TABLE `game`.`user` /* generated by server */
/*!*/;
# at 4000
...

From the output of mysqlbinlog, the DROP TABLE `game`.`user` statement can be found in the segment of binary log between the line #at 3980 and #at 4000, which means the statement takes place after log position 3980.

The next step is to apply the events in the binlog file, starting with the log position we found in the last full backup and ending at the position we found by mysqlbinlog which is 3980 in our example.

$> mysqlbinlog --start-position=1000 --stop-position=3980 \
/var/lib/mysql/bin.008946 | mysql -u root -p

At this point, you should get table `game`.`user` back.

TiDB

Flashback a table in TiDB

As we can see, the recovery procedure in MySQL is a bit complicated since you should have good knowledge about how MySQL binlog works, and also should carefully handle the log position in MySQL binlog.

When it comes to TiDB, you can use the FLASHBACK statement to recover a table that was deleted by accident. In the above example, if you deleted the `game`.`user` table, you can get this table back in few seconds by executing the following statement:

tidb> FLASHBACK TABLE `game`.`user`

That is all, no more extra operations. The full syntax of FLASHBACKis as follows.

tidb> FLASHBACK TABLE table_name [TO other_table_name]

Besides recovering a deleted table, you also can use this statement to recover a truncated table, for example:

tidb> TRUNCATE TABLE t
tidb> FLASHBACK TABLE t to t1

Let’s do it now:

Flashback a dropped table in 0.1 second

How does it work in TiDB?

As we can see, the flashback procedure is pretty fast, just use 0.1 second to flashback a table with more than 2 millions rows. How TiDB made it? Briefly(logically) speaking, in TiDB, a table contains two parts of data, data and metadata. The data part contains all the data that the user has stored in the database. The metadata part contains table id, table schema information. (see how TiDB mapping table data to key value for more information about TiDB’s data composition, since TiDB uses TiKV — a distributed KV database as its storage layer)

When deleting a table in TiDB, TiDB just makes the metadata part of this table invisible. And the real data deleting procedure is taken by the GC worker. By default, the GC worker only triggers the deletion of data that has been deleted for more than 10 minutes. So all the data of this table still exists in TiDB by that time you have executed the DROP TABLE statement. When you execute the FLASHBACK statement, TiDB just makes this table's metadata visible again. See flashback implementation for more implementation details at physical level.

What’s more?

As you can see, TiDB’s GC worker will trigger the data deletion procedure after the table has been deleted for more than 10 minutes. How about if I want to get back a table that has been deleted more than 10 minutes? Actually, you can’t use FLASHBACK in this situation.

But there are 2 work arounds to achieve such approach in TiDB. One is to set the tidb_gc_life_time (How to set tidb_gc_life_time) larger than 10 minutes, like 1 hour. By doing this, you can use FLASHBACK to get any table that has been deleted for less than 1 hour back. But changing tidb_gc_life_time to a large value may cause some performance regression issues in some specified scenarios, like high concurrency UPDATE scenario. Another work around is using TiDB's PiTR to recover your database to a point-in-time just before you executed the table deletion operation. The drawback of this work around is you need to use a new TiDB cluster to restore your database.

--

--

Jinpeng Zhang
Jinpeng Zhang

Written by Jinpeng Zhang

Built distributed SQL database TiDB from scratch with other colleagues, focus on building large scale distributed system and high performance engineering team.

No responses yet