Secrets Behind TiDB Serverless Architecture
Try TiDB Cloud Serverless For Free: https://tidbcloud.com
TiDB
TiDB is a distributed SQL database that is compatible with MySQL. You can think of TiDB as a MySQL database without size limits, with other promising features such as online schema change and a distributed execution framework. TiDB was inspired by Google's Spanner and F1 NewSQL databases, and it has a disaggregated compute and storage architecture, so users can scale compute resources and storage resources independently.
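Since TiDB speaks the MySQL wire protocol, any MySQL driver can talk to it unchanged. Here is a minimal Go sketch using the standard database/sql package with the go-sql-driver/mysql driver (the host and credentials are placeholders; TiDB listens on port 4000 by default):

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql" // TiDB speaks the MySQL wire protocol
)

func main() {
	// Placeholder DSN: any MySQL-compatible driver can connect to TiDB.
	db, err := sql.Open("mysql", "root:password@tcp(tidb-host:4000)/test")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	var version string
	if err := db.QueryRow("SELECT VERSION()").Scan(&version); err != nil {
		log.Fatal(err)
	}
	fmt.Println("connected to:", version) // reports a MySQL-style version string with a TiDB suffix
}
```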
Over-provisioning Challenge
After offering the TiDB database service to thousands of customers across different industries, we observed that most applications' workloads have periodic QPS peaks. The common solution to handle these peaks is to over-provision the database. I would estimate that over 95% of database instances in the world are over-provisioned.
Interference Between Different Tasks
We have a customer who maintains more than 700 MySQL instances for their 400 internal applications. They ended up with such a large number of MySQL instances because they preferred a dedicated database instance per application for the sake of isolation. We have seen similar situations with many other customers.
But this customer suffered a lot maintaining such a large number of MySQL instances, with daily operations like backup, upgrade, schema change, configuration management, data integration, etc. They then came to TiDB and consolidated these 400 applications into a few TiDB clusters, thanks to TiDB's good horizontal scalability.
After consolidating multiple applications into one TiDB cluster, the maintenance cost dropped significantly. But a new challenge emerged: the noisy neighbor problem.
Multiple background jobs, such as data compaction, backup, statistics collection (analyze), and data ingestion, run simultaneously while TiDB handles online traffic. To mitigate the interference of these background jobs with online traffic, we put a lot of effort into throttling and smoothing their resource usage; https://dataturbo.medium.com/how-tikv-mitigate-background-compaction-jobs-interference-2e6e5edd2082 is one example.
Meanwhile, there can also be interference between different applications, since they share the same TiDB cluster. To eliminate it, we delivered the resource control feature, which isolates the resource usage of different applications.
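As a rough illustration of how resource control is used, the sketch below binds two hypothetical applications to their own resource groups with RU (request unit) quotas. The SQL follows TiDB's resource control feature, but treat the exact statements as illustrative and check the documentation for your TiDB version; the user names are made up:

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

// isolateApps gives each application its own resource group so one
// workload cannot starve another. RU_PER_SEC caps request units per
// second; the statements are a sketch based on TiDB's resource control
// feature, not a definitive reference.
func isolateApps(db *sql.DB) error {
	stmts := []string{
		"CREATE RESOURCE GROUP IF NOT EXISTS app_orders RU_PER_SEC = 2000",
		"CREATE RESOURCE GROUP IF NOT EXISTS app_reports RU_PER_SEC = 500",
		// Hypothetical users owned by the two applications.
		"ALTER USER 'orders_svc'@'%' RESOURCE GROUP app_orders",
		"ALTER USER 'reports_svc'@'%' RESOURCE GROUP app_reports",
	}
	for _, s := range stmts {
		if _, err := db.Exec(s); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	db, err := sql.Open("mysql", "root:password@tcp(tidb-host:4000)/test")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	if err := isolateApps(db); err != nil {
		log.Fatal(err)
	}
}
```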
TiDB Serverless
We wanted to provide a scalable, affordable database service for every individual developer as well as every enterprise customer, and at the same time we wanted the new architecture to resolve the above challenges, such as over-provisioning and performance isolation, more thoroughly. So we started the journey of developing our TiDB Serverless product. Based on the observations and experience described above, we set two principles for the architecture of TiDB Serverless:
A fine-grained architecture that auto-scales on demand.
- Cost-effective, with a pay-as-you-go pricing model, so each individual developer can have their own database service at minimal cost.
- Able to auto-scale to over 1M QPS within a few minutes, so it can absorb sudden QPS spikes without compromising service quality.
Native multi-tenancy support, where each tenant has an experience similar to using a dedicated database instance.
- Data of different tenants should be physically isolated, so each tenant can bring their own encryption key to secure their data.
- No interference between online traffic and background tasks, nor between different tenants.
The above diagram shows an overview of the TiDB Serverless architecture. It is mainly comprised of two parts: the Control Plane and the Data Plane. The Control Plane includes management and operational functionality such as monitoring, auditing, metering, and backup management. The Data Plane is where users' data is placed and where users' traffic is handled. In the next few sections I will introduce the detailed design of the Data Plane from three aspects: multi-tenancy support, the elastic compute pool, and the S3-based shared storage engine.
TiDB Serverless: Multi-tenancy
Data of different tenants is physically isolated, which means each tenant has their own data files (LSM-tree SST files) in the storage layer.
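One way to picture this isolation (a hypothetical layout, not the actual on-disk format of TiDB Serverless): every SST file lives under a tenant-scoped prefix and is sealed with that tenant's own key, so no physical file ever mixes data from two tenants.

```go
package main

import "fmt"

// TenantStorage is a hypothetical view of per-tenant file layout:
// each tenant owns a disjoint set of SST files under its own prefix,
// sealed with the tenant's own encryption key (bring-your-own-key).
type TenantStorage struct {
	TenantID string
	KeyARN   string // e.g. a tenant-supplied KMS key (placeholder)
}

// SSTPath places an SST file under a tenant-scoped prefix, so data of
// different tenants never shares a physical file.
func (t TenantStorage) SSTPath(fileID uint64) string {
	return fmt.Sprintf("tenants/%s/sst/%06d.sst", t.TenantID, fileID)
}

func main() {
	a := TenantStorage{TenantID: "tenant-a", KeyARN: "arn:aws:kms:example-key-a"}
	b := TenantStorage{TenantID: "tenant-b", KeyARN: "arn:aws:kms:example-key-b"}
	fmt.Println(a.SSTPath(1)) // tenants/tenant-a/sst/000001.sst
	fmt.Println(b.SSTPath(1)) // tenants/tenant-b/sst/000001.sst
}
```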
As the metadata center of a TiDB cluster, PD (Placement Driver) provides multiple functionalities: metadata management, the timestamp service (TiDB is a transactional SQL database), load statistics, dynamic scheduling, etc. Like tenant data, metadata is also isolated per tenant. Meanwhile, we evolved the centralized PD into disaggregated PD services to achieve better scalability and better isolation between the individual PD services.
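To make the timestamp service concrete, here is a toy sketch in the spirit of PD's TSO: a strictly increasing hybrid timestamp built from a physical millisecond part and an 18-bit logical counter, matching the layout TiDB uses. The real PD batches requests and persists a high-water mark so restarts never reissue a timestamp; this sketch skips all of that.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tso is a toy timestamp oracle: it hands out strictly increasing
// hybrid timestamps (physical milliseconds plus a logical counter).
type tso struct {
	mu       sync.Mutex
	physical int64 // milliseconds
	logical  int64
}

func (t *tso) Next() uint64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	now := time.Now().UnixMilli()
	if now > t.physical {
		t.physical, t.logical = now, 0
	} else {
		t.logical++ // same millisecond: bump the logical part
	}
	// 18-bit logical suffix, matching TiDB's TSO layout.
	return uint64(t.physical)<<18 | uint64(t.logical)
}

func main() {
	var o tso
	fmt.Println(o.Next(), o.Next()) // strictly increasing
}
```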
TiDB Serverless: Elastic Compute Pool
We introduced a SQL Gateway (TiProxy) layer between applications and TiDB instances. User applications connect to this SQL Gateway layer, and TiProxy then routes their traffic to the correct backend TiDB instances for each tenant. In this way, TiDB Serverless only needs to expose the gateway layer, which allows stricter security policies to be applied.
The second advantage of adding this SQL Gateway layer is that when a tenant has had no traffic for a while, that tenant's backend TiDB instance can be reclaimed and reused by other tenants. When new traffic from the tenant arrives again, a hot standby TiDB instance is allocated from the elastic compute pool to serve it. Because the TiProxy layer holds the applications' connections and stores their session contexts, this is transparent to the application layer.
The third advantage of this SQL Gateway layer is that upgrades of the backend TiDB cluster are also transparent to the application layer.
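A rough sketch of the gateway's core job, with hypothetical types: identify the tenant at handshake time (for example from the TLS SNI or a username prefix), find or wake a backend TiDB instance for that tenant, and proxy bytes in both directions. TiProxy's real implementation additionally speaks the full MySQL protocol and migrates session state between backends.

```go
package main

import (
	"io"
	"log"
	"net"
	"sync"
)

// router maps tenants to live backend TiDB instances (hypothetical).
type router struct {
	mu       sync.Mutex
	backends map[string]string // tenant -> backend TiDB address
}

// backendFor returns a running TiDB instance for the tenant, simulating
// a cold start by handing out a placeholder standby address.
func (r *router) backendFor(tenant string) string {
	r.mu.Lock()
	defer r.mu.Unlock()
	addr, ok := r.backends[tenant]
	if !ok {
		// In the real system a hot standby is taken from the elastic
		// compute pool; here we use a placeholder address.
		addr = "tidb-standby-0:4000"
		r.backends[tenant] = addr
	}
	return addr
}

// serve proxies one client connection to the tenant's backend. Because
// the gateway owns the client connection, the backend can change
// underneath without the application noticing.
func (r *router) serve(client net.Conn, tenant string) {
	defer client.Close()
	backend, err := net.Dial("tcp", r.backendFor(tenant))
	if err != nil {
		log.Print(err)
		return
	}
	defer backend.Close()
	go io.Copy(backend, client) // client -> TiDB
	io.Copy(client, backend)    // TiDB -> client
}

func main() {
	r := &router{backends: map[string]string{}}
	ln, err := net.Listen("tcp", ":6000") // gateway port (placeholder)
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		// The tenant would be derived from the handshake; fixed here.
		go r.serve(conn, "tenant-a")
	}
}
```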
The elastic compute pool is not only used by the TiDB layer; it is also shared by offline background jobs such as data compaction and statistics collection. We also support AWS spot instances in this elastic compute pool to reduce cost further.
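Conceptually, the pool can be modeled as a shared set of warm instances that tenant-facing TiDB servers and interruption-tolerant background jobs both draw from. Below is a deliberately simplified sketch of that idea, not the production scheduler; the addresses and policy are made up.

```go
package main

import "fmt"

// instance is a warm compute unit; Spot marks cheaper, preemptible
// capacity that suits interruption-tolerant background work.
type instance struct {
	Addr string
	Spot bool
}

// pool hands out warm instances to online tenants and offline jobs
// alike. A real scheduler would weigh spot vs. on-demand capacity and
// job priorities; this toy version just recycles whatever is free.
type pool struct{ free chan instance }

func (p *pool) acquire() instance  { return <-p.free }
func (p *pool) release(i instance) { p.free <- i }

func main() {
	p := &pool{free: make(chan instance, 2)}
	p.release(instance{Addr: "10.0.0.1:4000", Spot: true})
	p.release(instance{Addr: "10.0.0.2:4000", Spot: false})

	i := p.acquire() // a compaction job borrows a node
	fmt.Println("compaction runs on", i.Addr)
	p.release(i) // returned to the pool for the next tenant or job
}
```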
TiDB Serverless: S3-Based Shared Storage Engine
The last part of the TiDB Serverless architecture I want to share here is the S3-based shared storage engine. We chose S3 as the backend storage layer because of its unlimited volume, high durability, high I/O bandwidth, and low cost. The majority of data files are stored in S3. The incremental WAL is stored on cloud disks (EBS volumes) and replicated through Raft to achieve higher durability and availability. SST files are also cached on multiple TiKV instances' local NVMe disks to achieve single-digit millisecond read latency.
Because the I/O (both read and write) to S3 is not on the performance-critical path, using S3 as the shared storage layer does not compromise latency.
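The read path can be sketched as a small cache hierarchy. In the sketch below, ObjectStore is a hypothetical stand-in for the S3 client (the real engine uses the AWS SDK and manages caching at a finer granularity inside TiKV): reads are served from the local NVMe copy when possible and fall back to S3 only on a miss.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ObjectStore is a hypothetical stand-in for an S3 client.
type ObjectStore interface {
	Get(key string) ([]byte, error)
}

// sstCache serves SST reads from a local NVMe directory and falls back
// to S3 on a miss. Because the hot working set stays on NVMe, reads see
// single-digit millisecond latency and S3 stays off the critical path.
type sstCache struct {
	dir   string
	store ObjectStore
}

func (c *sstCache) Read(key string) ([]byte, error) {
	local := filepath.Join(c.dir, filepath.Base(key))
	if data, err := os.ReadFile(local); err == nil {
		return data, nil // NVMe cache hit: the common case
	}
	data, err := c.store.Get(key) // rare miss: fetch the SST from S3
	if err != nil {
		return nil, err
	}
	if err := os.WriteFile(local, data, 0o644); err != nil {
		return nil, err
	}
	return data, nil
}

// fakeStore is an in-memory stand-in for S3, for demonstration only.
type fakeStore map[string][]byte

func (f fakeStore) Get(key string) ([]byte, error) { return f[key], nil }

func main() {
	dir, _ := os.MkdirTemp("", "sst-cache")
	c := &sstCache{dir: dir, store: fakeStore{"tenants/a/sst/000001.sst": []byte("sst-bytes")}}
	data, _ := c.Read("tenants/a/sst/000001.sst") // miss -> S3; next read hits NVMe
	fmt.Println(len(data), "bytes")
}
```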
With this S3-based shared storage engine, if one TiKV node in the storage layer is lost, we can replace it by copying SST files from S3 to a newly allocated TiKV node.
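Recovery therefore reduces to re-downloading files rather than re-replicating data peer to peer. A minimal sketch of that idea, again with a hypothetical stand-in for the S3 client (the WAL tail, replicated via Raft on EBS, would still need replaying, which is omitted here):

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
	"strings"
)

// Lister is a hypothetical stand-in for the S3 calls a rebuild needs.
type Lister interface {
	List(prefix string) ([]string, error)
	Get(key string) ([]byte, error)
}

// rebuildNode restores a replacement TiKV node by copying every SST
// under the lost node's prefix from S3 onto the new node's local disk.
func rebuildNode(s3 Lister, prefix, localDir string) error {
	keys, err := s3.List(prefix)
	if err != nil {
		return err
	}
	for _, k := range keys {
		data, err := s3.Get(k)
		if err != nil {
			return err
		}
		dst := filepath.Join(localDir, filepath.Base(k))
		if err := os.WriteFile(dst, data, 0o644); err != nil {
			return err
		}
	}
	fmt.Printf("restored %d SST files under %s\n", len(keys), localDir)
	return nil
}

// fakeS3 is an in-memory stand-in for S3, for demonstration only.
type fakeS3 map[string][]byte

func (f fakeS3) List(prefix string) ([]string, error) {
	var keys []string
	for k := range f {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	return keys, nil
}

func (f fakeS3) Get(key string) ([]byte, error) { return f[key], nil }

func main() {
	dir, _ := os.MkdirTemp("", "tikv-rebuild")
	s3 := fakeS3{"tenants/a/sst/000001.sst": []byte("sst-bytes")}
	if err := rebuildNode(s3, "tenants/a/", dir); err != nil {
		log.Fatal(err)
	}
}
```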