Conducting load testing
YDB CLI has a built-in toolkit for performing load testing using standard benchmarks:
| Benchmark | Reference |
|---|---|
| TPC-C | tpcc |
| TPC-H | tpch |
| TPC-DS | tpcds |
| ClickBench | clickbench |
You can also run user-defined test scenarios with the ydb workload query command. Details are provided in the corresponding section.
All of these methods emulate a user workload under the given scenario. Each method is described in detail in its own section, referenced above.
All commands for working with benchmarks are organized into corresponding groups, and the database path is specified in the same way for all commands:
ydb workload tpcc --path path/in/database ...
ydb workload tpch --path path/in/database ...
ydb workload tpcds --path path/in/database ...
ydb workload clickbench --path path/in/database ...
ydb workload query --path path/in/database ...
Load testing consists of three stages: data preparation, testing, and cleanup.
Data preparation
It consists of two steps: initializing tables and filling them with data.
Initialization
Initialization is performed by the init command:
ydb workload tpcc --path tpcc/10wh init
ydb workload clickbench --path clickbench/hits init --store=row
ydb workload tpch --path tpch/s1 init --store=column
ydb workload tpcds --path tpcds/s1 init --store=external-s3
ydb workload query --path user/suite1 init --suite-path /home/user/user_suite
When creating tables for tpch, tpcds, and clickbench, you can configure how they are created:
- Select the type of tables to use: row, column, external, and so on (the --store parameter).
- Select the column types to use: some data types from the original benchmarks can be represented by more than one YDB data type. In such cases, you can select a specific one with the --string, --datetime, and --float-mode parameters.
You can also specify that tables should be deleted before creation if they already exist using the --clear parameter.
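As a minimal sketch of combining these options, the script below re-creates benchmark tables with a chosen store type and drops any existing tables first. It uses only the --store and --clear parameters described above. The YDB_CMD variable and the init_tables helper are hypothetical names introduced for illustration. By default the script only prints the command instead of executing it, and connection parameters are omitted, as in the other examples on this page.

```shell
#!/bin/sh
# Hypothetical override: set YDB_CMD=ydb to execute for real.
# By default the command is only printed (dry run).
YDB_CMD="${YDB_CMD:-echo ydb}"

# init_tables BENCHMARK PATH STORE
#   BENCHMARK: tpch, tpcds, or clickbench
#   PATH:      path in the database
#   STORE:     table type passed to --store (for example, row or column)
init_tables() {
    # --clear deletes existing tables before they are created again
    $YDB_CMD workload "$1" --path "$2" init --store="$3" --clear
}

init_tables tpch tpch/s1 column
```

Running the script with YDB_CMD=ydb in the environment would execute the command rather than print it.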
For more details, see the reference section for the relevant benchmark.
Loading data into the tables
The import command is used to load data into the benchmark tables. This command is specific to each benchmark, and its behavior depends on the subcommands. However, there are also parameters common to all benchmarks.
For more details, see the reference section for the relevant benchmark.
Examples:
ydb workload tpcc --path tpcc/10wh import
ydb workload clickbench --path clickbench/hits import files --input hits.csv.gz
ydb workload tpch --path tpch/s1 import generator --scale 1
ydb workload tpcds --path tpcds/s1 import generator --scale 1
ydb workload query --path user/suite1 import --suite-path /home/user/user_suite
Testing
Performance testing is performed with the run command. Its behavior is largely the same across benchmarks, though some differences exist.
Examples:
ydb workload tpcc --path tpcc/10wh run
ydb workload clickbench --path clickbench/hits run --include 1-5,8
ydb workload tpch --path tpch/s1 run --exclude 3,4 --iterations 3
ydb workload tpcds --path tpcds/s1 run --plan ~/query_plan --include 2 --iterations 5
ydb workload query --path user/suite1 run --plan ~/query_plan --include first_query_set.1.sql,second_query_set.2.sql --iterations 5
The run command for each benchmark has additional parameters that configure the types of reports generated, the collection of statistics, and other load testing outputs.
For more details, see the reference section for the relevant benchmark.
Cleanup
After all necessary testing has been completed, the benchmark's data can be removed from the database using the clean command:
ydb workload tpcc --path tpcc/10wh clean
ydb workload clickbench --path clickbench/hits clean
ydb workload tpch --path tpch/s1 clean
ydb workload tpcds --path tpcds/s1 clean
ydb workload query --path user/suite1 clean
For more details, see the reference section for the relevant benchmark.
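The three stages can be chained into a single script. The sketch below runs a full TPC-H cycle using only commands and parameters that appear in the examples on this page. The YDB_CMD and BENCH_PATH variables and the tpch and run_tpch_cycle helpers are hypothetical names introduced for illustration. By default the script prints the commands instead of executing them, and connection parameters are omitted.

```shell
#!/bin/sh
# Hypothetical override: set YDB_CMD=ydb to execute for real.
# By default each command is only printed (dry run).
YDB_CMD="${YDB_CMD:-echo ydb}"
BENCH_PATH="tpch/s1"   # path in the database, as in the examples above

# Run a tpch subcommand against BENCH_PATH.
tpch() {
    $YDB_CMD workload tpch --path "$BENCH_PATH" "$@"
}

# Data preparation (init + import), testing (run), cleanup (clean).
# Stages are chained with && so a failed stage stops the cycle.
run_tpch_cycle() {
    tpch init --store=column &&
    tpch import generator --scale 1 &&
    tpch run --iterations 3 &&
    tpch clean
}

run_tpch_cycle
```

Chaining with && means that a failed run leaves the tables in place, so they can be inspected or the run repeated before clean is called manually.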