Conducting load testing

YDB CLI has a built-in toolkit for performing load testing using several standard benchmarks:

Benchmark     Reference
TPC-C         tpcc
TPC-H         tpch
TPC-DS        tpcds
ClickBench    clickbench

All of these benchmarks work in a similar way. For a detailed description of each, see the corresponding reference listed in the table above. Commands are grouped by benchmark, and the database path is specified in the same way for all of them:

ydb workload tpcc --path path/in/database ...
ydb workload tpch --path path/in/database ...
ydb workload tpcds --path path/in/database ...
ydb workload clickbench --path path/in/database ...

Load testing can be divided into three stages:

  1. Data preparation
  2. Testing
  3. Cleanup
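
As a rough sketch, the three stages can be chained in a single shell function for TPC-H, reusing the example commands from this page. The function name tpch_lifecycle and the YDB variable are illustrative assumptions, not part of the YDB CLI. Setting YDB=echo prints the arguments of each command instead of executing it.

```shell
#!/bin/sh
# YDB names the CLI executable (assumption: "ydb" is on PATH).
# Set YDB=echo to preview the commands without touching a database.
YDB="${YDB:-ydb}"

# Hypothetical helper: runs all three stages for TPC-H and stops at the first failure.
tpch_lifecycle() {
    db_path="${1:-tpch/s1}"

    # Stage 1: data preparation (create the tables, then fill them)
    $YDB workload tpch --path "$db_path" init --store=column || return
    $YDB workload tpch --path "$db_path" import generator --scale 1 || return

    # Stage 2: testing
    $YDB workload tpch --path "$db_path" run --iterations 3 || return

    # Stage 3: cleanup
    $YDB workload tpch --path "$db_path" clean
}
```

Calling tpch_lifecycle tpch/s1 runs the stages in order. Each stage is described in the sections that follow.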

Data preparation

Data preparation consists of two steps: initializing the tables and filling them with data.

Initialization

Initialization is performed by the init command:

ydb workload tpcc --path tpcc/10wh init
ydb workload tpch --path tpch/s1 init --store=column
ydb workload tpcds --path tpcds/s1 init --store=column
ydb workload clickbench --path clickbench/hits init --store=column

At this stage, for tpch, tpcds, and clickbench, you can configure the tables to be created:

  • Select the type of tables to be used: row, column, external, etc. (parameter --store);
  • Select the types of columns to be used: some data types from the original benchmarks can be represented by multiple YDB data types. In such cases, you can select a specific one with the --string, --datetime, and --float-mode parameters.

With the --clear parameter, tables that already exist are deleted before the new ones are created.
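
For illustration, the following sketch combines --store and --clear to recreate the TPC-H tables from scratch. The function name tpch_reinit and the YDB variable are assumptions made for this example. No values are shown for --string, --datetime, or --float-mode because this page does not list them.

```shell
#!/bin/sh
# YDB names the CLI executable (assumption: "ydb" is on PATH).
YDB="${YDB:-ydb}"

# Hypothetical helper: recreates the TPC-H tables as column tables,
# deleting any tables that already exist at the given path.
tpch_reinit() {
    db_path="${1:-tpch/s1}"
    $YDB workload tpch --path "$db_path" init --store=column --clear
}
```

This can be convenient when repeating a test with a different table configuration at the same path.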

For more details, see the description of the init command in the reference for each benchmark.

Loading data into the tables

The import command is used to load data into the benchmark tables. This command is specific to each benchmark, and its behavior depends on the subcommands. However, there are also parameters common to all benchmarks.

For a detailed description, see the import command in the reference for each benchmark.

Examples:

ydb workload tpcc --path tpcc/10wh import
ydb workload tpch --path tpch/s1 import generator --scale 1
ydb workload tpcds --path tpcds/s1 import generator --scale 1
ydb workload clickbench --path clickbench/hits import files --input hits.csv.gz

Testing

Performance testing is carried out with the run command. Its behavior is mostly the same across benchmarks, though some differences exist.

Examples:

ydb workload tpcc --path tpcc/10wh run
ydb workload tpch --path tpch/s1 run --exclude 3,4 --iterations 3
ydb workload tpcds --path tpcds/s1 run --plan ~/query_plan --include 2 --iterations 5
ydb workload clickbench --path clickbench/hits run --include 1-5,8

The command allows you to select queries for execution, generate various types of reports, collect execution statistics, and more.

For a detailed description, see the run command in the reference for each benchmark.

Cleanup

After all necessary testing has been completed, the benchmark's data can be removed from the database using the clean command:

ydb workload tpcc --path tpcc/10wh clean
ydb workload tpch --path tpch/s1 clean
ydb workload tpcds --path tpcds/s1 clean
ydb workload clickbench --path clickbench/hits clean
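
When tests run unattended, it may be useful to clean up even if the run itself fails. The sketch below shows one way to do that. The function name run_then_clean and the YDB variable are assumptions for this example, and cleaning after a failed run is a design choice rather than something the CLI requires.

```shell
#!/bin/sh
# YDB names the CLI executable (assumption: "ydb" is on PATH).
YDB="${YDB:-ydb}"

# Hypothetical helper: runs a benchmark, then removes its data
# whether or not the run succeeded.
# Returns the exit status of the run, not of the cleanup.
run_then_clean() {
    bench="$1"
    db_path="$2"
    run_status=0
    $YDB workload "$bench" --path "$db_path" run || run_status=$?
    $YDB workload "$bench" --path "$db_path" clean
    return "$run_status"
}
```

For example, run_then_clean tpcc tpcc/10wh runs TPC-C and then removes its tables. Skip this pattern if the data needs to be inspected after a failure.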

For a detailed description, see the clean command in the reference for each benchmark.