Overloaded shards
Data shards serving row-oriented tables may become overloaded for the following reasons:
-
A table is created without the AUTO_PARTITIONING_BY_LOAD clause.
In this case, YDB does not split overloaded shards.
Data shards are single-threaded and process queries sequentially. Each data shard can accept up to 10,000 operations. Accepted queries wait for their turn to be executed. So the longer the queue, the higher the latency.
If a data shard already has 10000 operations in its queue, new queries will return an "overloaded" error. Retry such queries using a randomized exponential back-off strategy. For more information, see Overloaded errors.
-
A table was created with the AUTO_PARTITIONING_MAX_PARTITIONS_COUNT setting and has already reached its partition limit.
-
An inefficient primary key that causes an imbalance in the distribution of queries across shards. A typical example is ingestion with a monotonically increasing primary key, which may lead to the overloaded "last" partition. For example, this could occur with an autoincrementing primary key using the serial data type.
Diagnostics
-
Use the Embedded UI or Grafana to see if the YDB nodes are overloaded:
-
In the DB overview Grafana dashboard, analyze the Overloaded shard count chart.
The chart indicates whether the YDB cluster has overloaded shards, but it does not specify which table's shards are overloaded.
Tip
Use Grafana to set up alert notifications when YDB data shards get overloaded.
-
In the Embedded UI:
-
Go to the Databases tab and click on the database.
-
On the Navigation tab, ensure the required database is selected.
-
Open the Diagnostics tab.
-
Open the Top shards tab.
-
In the Immediate and Historical tabs, sort the shards by the CPUCores column and analyze the information.
Additionally, the information about overloaded shards is provided as a system table. For more information, see History of overloaded partitions.
-
-
-
To pinpoint the schema issue, use the Embedded UI or YDB CLI:
-
In the Embedded UI:
-
On the Databases tab, click on the database.
-
On the Navigation tab, select the required table.
-
Open the Diagnostics tab.
-
On the Describe tab, navigate to
root > PathDescription > Table > PartitionConfig > PartitioningPolicy
. -
Analyze the PartitioningPolicy values:
SizeToSplit
SplitByLoadSettings
MaxPartitionsCount
If the table does not have these options, see Recommendations for table configuration.
Note
You can also find this information on the Diagnostics > Info tab.
-
-
In the YDB CLI:
-
To retrieve information about the problematic table, run the following command:
ydb scheme describe <table_name>
-
In the command output, analyze the Auto partitioning settings:
Partitioning by size
Partitioning by load
Max partitions count
If the table does not have these options, see Recommendations for table configuration.
-
-
-
Analyze whether primary key values increment monotonically:
-
Check the data type of the primary key column.
Serial
data types are used for autoincrementing values. -
Check the application logic.
-
Calculate the difference between the minimum and maximum values of the primary key column. Then compare this value to the number of rows in a given table. If these values match, the primary key might be incrementing monotonically.
If primary key values do increase monotonically, see Recommendations for the imbalanced primary key.
-
Recommendations
For table configuration
Consider the following solutions to address shard overload:
-
If the problematic table is not partitioned by load, enable partitioning by load.
Tip
A table is not partitioned by load, if you see the
Partitioning by load: false
line on the Diagnostics > Info tab in the Embedded UI or theydb scheme describe
command output. -
If the table has reached the maximum number of partitions, increase the partition limit.
Tip
To determine the number of partitions in the table, see the
PartCount
value on the Diagnostics > Info tab in the Embedded UI.
Both operations can be performed by executing an ALTER TABLE ... SET
query.
For the imbalanced primary key
Consider modifying the primary key to distribute the load evenly across table partitions. You cannot change the primary key of an existing table. To do that, you will have to create a new table with the modified primary key and then migrate the data to the new table.
Note
Also, consider changing your application logic for generating primary key values for new rows. For example, use hashes of values instead of values themselves.