A New Evolution of the Database Sharding Architecture

Written By notebooktabletphone

Link to the original (Posted date: 2022/01/12)

Mobile phones and the Internet have become everyday necessities, and it is not unusual for a website or business service to receive billions of requests or more in a week.

Sales days such as Black Friday in North America and Double Eleven in Asia (also known as "Singles' Day") are good examples of how the traditional retail industry has adapted to the digital world. These companies have had to address new needs and problems to achieve their business goals.

Every company faces the same question: we need to boost digital sales this Black Friday, but if the campaign succeeds and an incredible amount of traffic hits the database cluster, will our database handle it?

There are many database solutions to choose from for different business cases. The options range from NoSQL products (MongoDB, Cassandra, Amazon DynamoDB, and so on) to NewSQL products (such as Amazon Aurora and CockroachDB, which have attracted attention recently).

In addition to these excellent solutions, some industries also consider transparent sharding on top of an existing database cluster.

According to the DB-Engines database trend ranking, conventional relational databases still hold a large share even as new database products come to market.

Given the new problems databases face, is there an efficient and cost-effective way to strengthen these older databases with new, practical ideas? Transparent database sharding is one of the best answers to this question.

DB-Engines database popularity ranking

One of the best techniques for this problem is to split the data into separate rows and columns. The small tables produced by splitting a large database table in this way are called "shards". The original table is divided into either vertical shards or horizontal shards. To label these tables, 'VS1' is often used for vertical shards and 'HS1' for horizontal shards. The number denotes the first table or schema, followed by 2, 3, and so on. Taken together, these subsets of data make up the table's original schema.
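The two kinds of split described above can be illustrated with a minimal Python sketch. It is not taken from any product: the sample rows are hypothetical, the column names are borrowed from the `t_order` table used later in this article, and the modulo rule for the horizontal split is only an assumption chosen for simplicity.

```python
# Hypothetical sample rows; the column names mirror the t_order table used later.
rows = [
    {"order_id": 1, "user_id": 10, "status": "paid"},
    {"order_id": 2, "user_id": 11, "status": "new"},
    {"order_id": 3, "user_id": 10, "status": "shipped"},
    {"order_id": 4, "user_id": 12, "status": "paid"},
]


def horizontal_shards(rows, key, count):
    """Split by rows: each shard keeps every column but only some of the rows.

    Assumption: rows are assigned by `key % count`.
    """
    shards = [[] for _ in range(count)]
    for row in rows:
        shards[row[key] % count].append(row)
    return shards


def vertical_shards(rows, key, column_groups):
    """Split by columns: each shard keeps every row but only the key plus one column group."""
    return [
        [{key: row[key], **{col: row[col] for col in group}} for row in rows]
        for group in column_groups
    ]


# Horizontal shards: even order_ids land in hs1, odd ones in hs2.
hs1, hs2 = horizontal_shards(rows, "order_id", 2)

# Vertical shards: vs1 holds (order_id, user_id), vs2 holds (order_id, status).
vs1, vs2 = vertical_shards(rows, "order_id", [["user_id"], ["status"]])
```

Joining `vs1` and `vs2` on `order_id`, or concatenating `hs1` and `hs2`, gives back the original table, which is the sense in which the shards together form the original schema.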

So what is the difference between sharding and partitioning? Both break a large dataset into smaller subsets, but the major difference is that sharding implies the data is distributed across multiple computers, whether the split is horizontal or vertical. Partitioning, on the other hand, keeps the data within a single database, sometimes called a database instance, even though the data is broken into subsets in various ways.

By dividing the data into many parts and storing them on many machines, sharding offers several advantages.

That said, the sharding architecture is not perfect. It also has several disadvantages.

Sharding: from one to multiple shards

In both life and technology, there is rarely a one-size-fits-all solution. You need to analyze your needs and scenarios thoroughly to understand them fully. Only then can you select the best solution.

In general, the advantages of the sharding architecture prevail, so many of the leading products in the database industry are based on it. Citus and Vitess each have their own definitions, but both are built on a database sharding architecture.

Citus manages a coordinator (proxy) cluster to distribute a PostgreSQL cluster. Vitess shards MySQL in a similar way. Both specialize in providing low-cost, efficient distributed solutions built on today's mainstream relational databases. In fact, the sharding architecture also underlies most NoSQL and NewSQL products, but sharding in NoSQL and NewSQL is a topic for another article. This article focuses on sharding in relational databases, a field that has seen a number of innovations on top of traditional sharding technology.

Sharding was born from the need to distribute databases. In recent years, new database-related problems have been on the rise, such as privacy protection, SQL auditing, multi-tenancy, and distributed authentication.

These reflect new real-world demands on databases. How to deal with them is a question no type of database can escape. Can a database sharding solution handle these issues? To do so, sharding itself needs to evolve. What the next evolution of the database sharding architecture looks like is the subject of this article.

My answer is Database Plus, a concept that serves as a guideline for building a distributed database system that goes beyond sharding on top of a DBMS.

Database Plus is designed to build a standard layer and ecosystem on top of existing, diverse databases and to provide a standardized, unified specification for database use. It offers functions to upper-level applications while minimizing the impact of fragmentation among the underlying databases, so that it can address the various issues businesses face. As a result, applications only need to talk to a standardized service rather than to services that differ for each database.

This idea comes from the Apache ShardingSphere PMC (Project Management Committee), which implemented the concept in architecture 5.0.0. It took about a year to release 5.0.0 GA.

In the 3.x and 4.x release stages, Apache ShardingSphere was defined as mere distributed database middleware (a sharding architecture) for solving sharding problems. Driven by new database challenges and by its community, however, it has grown into an innovative project with features such as data encryption, shadow databases, distributed authentication, and distributed governance. All of these changes go beyond the conventional sharding framework. Sharding is just one part of Database Plus.

Evolution of ShardingSphere Database Plus

The example of Apache ShardingSphere supports my claim that a plain old sharding architecture can be more than sharding. The kernel mechanism directs all traffic that passes through a proxy or driver. So once it can parse the SQL and knows the location of every database, it is not difficult to perform additional processing on that traffic.

So what does this processing mean for end users? Building on this kernel work, Apache ShardingSphere products can relieve users' database pain points.

Sharding, data encryption, shadow databases, distributed authentication, distributed governance, and so on all build on the steps described above. The architecture proposed by Apache ShardingSphere's Database Plus concept provides these extensions flexibly.

All functions are pluggable, and you can add them to or remove them from the distributed system at any time. Some people want to shard their database, while others may only want data encryption. As user needs evolve and diversify, Database Plus can respond to each demand precisely and flexibly by allowing complete customization and continuously accepting new plugins (functions).
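To make the pluggable idea concrete, here is a minimal Python sketch of a kernel that passes every statement through whichever plugins are currently enabled. It is a toy illustration, not ShardingSphere's implementation: the `Kernel` class, the two plugins, and the physical table name `t_order_0` are all hypothetical.

```python
from typing import Callable, Dict, List

# A plugin receives the SQL text and returns the (possibly rewritten) SQL text.
Plugin = Callable[[str], str]


class Kernel:
    """Toy kernel: every statement passes through the enabled plugins in order."""

    def __init__(self) -> None:
        # Dicts preserve insertion order, so plugins run in the order they were added.
        self._plugins: Dict[str, Plugin] = {}

    def add_plugin(self, name: str, plugin: Plugin) -> None:
        self._plugins[name] = plugin

    def remove_plugin(self, name: str) -> None:
        self._plugins.pop(name, None)

    def handle(self, sql: str) -> str:
        for plugin in self._plugins.values():
            sql = plugin(sql)
        return sql


audit_log: List[str] = []


def audit_plugin(sql: str) -> str:
    """Record the statement and pass it on unchanged."""
    audit_log.append(sql)
    return sql


def sharding_plugin(sql: str) -> str:
    """Toy rewrite: point the logical table at one hypothetical physical table."""
    return sql.replace("t_order", "t_order_0")


kernel = Kernel()
kernel.add_plugin("audit", audit_plugin)
kernel.add_plugin("sharding", sharding_plugin)
with_sharding = kernel.handle("SELECT * FROM t_order")

# Features can be switched off again without touching the others.
kernel.remove_plugin("sharding")
without_sharding = kernel.handle("SELECT * FROM t_order")
```

Because all traffic flows through `handle`, adding a new capability only means registering another function; the application keeps sending the same SQL.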

Architecture

ShardingSphere's architecture consists of the following four layers, as shown in Figure 1 below.

ShardingSphere four-layer architecture

Foundation Layer: provides various access endpoints, such as drivers and proxies, to respond flexibly to users' diverse needs.

Storage Layer: supports the various underlying databases and, at the same time, can be extended with more functions.

Function Layer: provides a variety of functional plugins that meet users' needs; a high degree of flexibility is achieved by selecting and combining plugins.

Solution Layer: provides end users with standard product solutions that are industry-oriented (finance, e-commerce, entertainment, and so on) or scenario-oriented (distributed database solutions, encrypted database solutions, database gateways, and so on).

ShardingSphere JDBC and ShardingSphere Proxy have been released as production-ready products after five years of development and testing. Many community users have shared their production use cases and operational figures.

The various ShardingSphere clients share the same core functions, which enables a hybrid deployment that serves both querying and management convenience, as shown in Figure 2 below.

ShardingSphere JDBC and Proxy hybrid deployment

The Apache ShardingSphere community proposes DistSQL (Distributed SQL) as the way to use and manage ShardingSphere's distributed functions.

SQL is the standard way to interact with a database, but because this distributed database system has many new functions, an SQL dialect was needed to configure and use them.

DistSQL is a set of SQL-like commands that let you create, alter, and drop distributed databases and tables, and configure encryption and decryption for databases. All of the functions described so far can be run through this distributed SQL. Here are some DistSQL snippets.

Example of DistSQL usage

Governance of a distributed database system is necessary to reduce the difficulty of managing a distributed cluster. In the ShardingSphere ecosystem, where computing and storage are separated, this feature has been greatly enhanced in the new version.

In addition, new functions will be released soon.

ShardingSphere distributed governance

Before deployment

We have covered many benefits so far, but we should also mention the limitations. Before adopting ShardingSphere, you need to take its constraints into account.

Illustration

In this chapter, I would like to introduce two examples, "how to build a distributed database" and "how to build an encrypted table", using DistSQL, a SQL dialect that ties together all elements of the ShardingSphere ecosystem.

This part shows how to build a distributed database using DistSQL. Users and applications connect to the proxy to access logical tables (distributed tables), which are sharded across multiple servers. You do not need to pay attention to the shards themselves; the application operates on and manages the logical table.
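The relationship between a logical table and its shards can be sketched in a few lines of Python. This is an in-memory stand-in rather than the Proxy itself: the data source names `ds_0` and `ds_1` come from the walkthrough below, while the `LogicalTable` class and the plain modulo routing on `order_id` are assumptions made for illustration.

```python
from typing import Dict, List

# Resource names as registered in the walkthrough below.
DATA_SOURCES = ["ds_0", "ds_1"]


def route(order_id: int) -> str:
    """Pick the data source for a row. Plain modulo is an assumption for this sketch."""
    return DATA_SOURCES[order_id % len(DATA_SOURCES)]


class LogicalTable:
    """The caller sees a single table; the rows actually live in per-source lists."""

    def __init__(self) -> None:
        self.storage: Dict[str, List[dict]] = {ds: [] for ds in DATA_SOURCES}

    def insert(self, row: dict) -> None:
        self.storage[route(row["order_id"])].append(row)

    def find(self, order_id: int) -> List[dict]:
        # A lookup by sharding key only needs to touch one shard.
        return [r for r in self.storage[route(order_id)] if r["order_id"] == order_id]

    def select_all(self) -> List[dict]:
        # A full scan fans out to every shard and merges the results.
        merged = [row for rows in self.storage.values() for row in rows]
        return sorted(merged, key=lambda r: r["order_id"])


t_order = LogicalTable()
for order_id in range(1, 5):
    t_order.insert({"order_id": order_id, "user_id": 100 + order_id, "status": "new"})
```

The application only ever calls `insert`, `find`, and `select_all` on `t_order`; which data source holds a given row is decided inside the routing layer.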

Prerequisites:

Process:

  1. Run the following command to log in to the Proxy CLI

mysql -h127.0.0.1 -uroot -P3307 -proot

  1. Register two MySQL databases using DistSQL

ADD RESOURCE ds_0 (HOST=127.0.0.1, PORT=3306, DB=demo_ds_0, USER=root, PASSWORD=root);

ADD RESOURCE ds_1 (HOST=127.0.0.1, PORT=3306, DB=demo_ds_1, USER=root, PASSWORD=root);

  1. Create a sharding rule with DistSQL

Worker-ID "= 123)))))

  1. Create a sharding table using the sharding rule defined above

CREATE TABLE `t_order` ( `order_id` int NOT NULL, `user_id` int NOT NULL, `status` varchar(45) DEFAULT NULL, PRIMARY KEY (`order_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

  1. Show the resources, sharding databases, and sharding tables

SHOW SCHEMA RESOURCES;

SHOW DATABASES;

SHOW TABLES;

  1. Show the sharding tables

SHOW TABLES;

The following is a table in MySQL.

The following is the table in the ShardingSphere Proxy.

  1. Drop the sharding table

DROP TABLE t_order;

This example shows how to create an encrypted table using DistSQL. Data encryption is a feature of the ShardingSphere Proxy, which encrypts and decrypts the data for you. No application code needs to change: the application simply sends plaintext to the Proxy, which encrypts it and sends the ciphertext to the database. Users can configure which columns of which tables are encrypted and with which algorithm.
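The rewrite that happens between the application and the database can be sketched as follows. This is a simplified Python illustration, not the Proxy's code: the column mapping (`user_id` to `user_plain` and `user_cipher`) is taken from the rule in this walkthrough, the helper names are hypothetical, and MD5 is used only because it is in the standard library and appears in the altered rule below. Note that MD5 is a one-way digest, so unlike AES it cannot be decrypted.

```python
import hashlib

# Logical column -> (plain column, cipher column), as in the rule in this walkthrough.
ENCRYPT_RULE = {"user_id": ("user_plain", "user_cipher")}


def md5_hex(value: str) -> str:
    """One-way digest used here as a stand-in for the configured algorithm."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def to_physical_row(logical_row: dict) -> dict:
    """Rewrite a row written against the logical table into what is actually stored."""
    physical = {}
    for column, value in logical_row.items():
        if column in ENCRYPT_RULE:
            plain_col, cipher_col = ENCRYPT_RULE[column]
            physical[plain_col] = value
            physical[cipher_col] = md5_hex(str(value))
        else:
            physical[column] = value
    return physical


# The application writes (order_id, user_id); the database stores three columns.
stored = to_physical_row({"order_id": 1, "user_id": "ABC"})
```

The application never sees `user_plain` or `user_cipher`; it keeps reading and writing `user_id`, and the mapping is applied in the layer that sits in front of the database.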

Prerequisites:

Process:

  1. Run the following command to log in to the Proxy CLI

mysql -h127.0.0.1 -uroot -P3307 -proot

  1. Add resources with DistSQL

ADD RESOURCE ds_0 (HOST=127.0.0.1, PORT=3306, DB=ds_0, USER=root, PASSWORD=root);

  1. Create the encryption rule

CREATE ENCRYPT RULE t_encrypt (COLUMNS((NAME=user_id, PLAIN=user_plain, CIPHER=user_cipher, TYPE(NAME=AES

SHOW ENCRYPT TABLE RULE t_encrypt;

  1. Create the encrypted table

CREATE TABLE `t_encrypt` ( `order_id` int NOT NULL, `user_plain` varchar(45) DEFAULT NULL, `user_cipher` varchar(45) DEFAULT NULL, PRIMARY KEY (`order_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

The execution result of MySQL is as follows

Insert data into this table

INSERT INTO `t_encrypt` VALUES (1, "ABC");

The contents of MySQL are as follows

  1. Alter the encryption rule

ALTER ENCRYPT RULE t_encrypt (COLUMNS((NAME=user_id, PLAIN=user_plain, CIPHER=user_cipher, TYPE(NAME=MD5))));

SHOW ENCRYPT RULES;

  1. Drop the encryption rule

DROP ENCRYPT RULE t_encrypt;

A distributed database system with sharding, encryption, and other additional functions is a practical and effective way to meet users' changing needs at low cost. Such a solution removes the concerns about instability and overwhelming workloads that come with adopting a completely new distributed database.

I may be biased as a member of the ShardingSphere PMC (Project Management Committee), but I chose to contribute to this open source project because it is a wonderful innovation with the potential to solve real-world database problems and operational scenarios.

In my professional career, I have worked at a company that manages and uses a huge amount of data in one of the most populous societies in the world. I fully understand the pain caused by data spikes and the gap between operational needs and the database solutions that are actually delivered.

I don't mean that Database Plus is the only way to solve the new problems of the cloud era, but I do recommend it as a practical and innovative solution.

Finally, a word about sharding. Sharding is one of many solutions to the new challenges brought by the Internet revolution. Some experts say that sharded database architectures are outdated, but the facts say otherwise.

It may not be flashy, and it may not be promoted as heavily as other solutions, but it is definitely an effective and practical one.

Sharding has recently evolved in ways that were unimaginable a short while ago, thanks to important and innovative new contributions. Perhaps that is why it is becoming increasingly popular among blockchain companies seeking scalability.
