I/O bandwidth
A high rate of read and write operations can overwhelm the disk subsystem, leading to increased data access latencies. When the system cannot read or write data quickly enough, queries that rely on disk access will experience delays.
Diagnostics
- Open the Distributed Storage Overview dashboard in Grafana.
- On the DiskTimeAvailable and total Cost relation chart, see if the Total Cost spikes cross the DiskTimeAvailable level.
This chart shows the estimated total bandwidth capacity of the storage system in conventional units (green) and the total usage cost (blue). When the total usage cost exceeds the total bandwidth capacity, the YDB storage system becomes overloaded, leading to increased latencies.
- On the Total burst duration chart, check for any load spikes on the storage system. This chart displays microbursts of load on the storage system, measured in microseconds.
Note
This chart might reveal load microbursts that the average usage cost shown in the DiskTimeAvailable and total Cost relation chart does not capture.
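The comparison described above can also be sketched in code. The following is a minimal illustration, not part of YDB or Grafana: it assumes you have already exported the two series from the chart as plain lists sampled at the same timestamps. The function name `overload_intervals` and the sample values are hypothetical.

```python
from typing import List, Tuple


def overload_intervals(
    timestamps: List[int],
    total_cost: List[float],
    disk_time_available: List[float],
) -> List[Tuple[int, int]]:
    """Return (start, end) timestamp pairs during which the total
    usage cost exceeds the estimated bandwidth capacity."""
    intervals = []
    start = None  # timestamp where the current overload began
    last = None   # most recent overloaded timestamp
    for ts, cost, capacity in zip(timestamps, total_cost, disk_time_available):
        if cost > capacity:
            if start is None:
                start = ts
            last = ts
        elif start is not None:
            # Overload ended at the previous overloaded sample.
            intervals.append((start, last))
            start = None
    if start is not None:
        # The series ends while still overloaded.
        intervals.append((start, last))
    return intervals


# Hypothetical sample data: seconds since start, conventional cost units.
timestamps = [0, 15, 30, 45, 60, 75]
total_cost = [40, 55, 120, 130, 60, 110]
disk_time_available = [100, 100, 100, 100, 100, 100]

print(overload_intervals(timestamps, total_cost, disk_time_available))
# The average cost over the window stays below the capacity of 100,
# even though individual samples exceed it.
print(sum(total_cost) / len(total_cost) < 100)
```

In this made-up sample, the averaged cost stays under the capacity line while individual samples cross it, which is the same effect that makes short bursts easy to miss on an averaged chart.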
Recommendations
Add more storage groups to the database.
In cases of high microburst rates, balancing the load across storage groups might help.