TLI logging

Transaction lock invalidation (TLI) logging lets you identify which query had its locks broken (the victim) and which query broke them (the breaker).

Enabling logging

To get detailed logs about lock conflicts, set the INFO level (numeric value 6) for the TLI component.

Add the following to the cluster configuration:

log_config:
  entry:
    - component: "TLI"
      level: 6  # INFO

The log_config parameter supports dynamic updates without node restarts.

Once enabled, the server writes logs on every lock conflict, and VictimQuerySpanId begins appearing in SDK error messages.

Note

A single TLI event produces roughly 10-20 KB of log output as a baseline. The actual volume depends on the text size of the breaker and victim queries and of the other queries in their transactions, because all of this data is written to the log.
If conflicts are infrequent, the logging overhead is negligible. At high TLI rates, monitor the total log volume.

To exclude tables with expected conflicts from TLI diagnostics, use the tli_config.ignored_table_regexes parameter.
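To gauge the log volume mentioned in the note above before enabling logging on a busy cluster, the arithmetic can be sketched as follows. This is a rough estimate only: the function name and the conflict rate of 5 per second are hypothetical, and the 10-20 KB bounds are the baseline figures from the note, so long query texts would push the real volume higher.

```python
def estimate_daily_log_volume_mb(conflicts_per_second: float, kb_per_event: float) -> float:
    """Return the approximate TLI log volume per day in megabytes."""
    seconds_per_day = 24 * 60 * 60
    return conflicts_per_second * kb_per_event * seconds_per_day / 1024


# Hypothetical load: 5 lock conflicts per second.
# 10 KB and 20 KB are the baseline per-event bounds from the note.
low = estimate_daily_log_volume_mb(5, 10)
high = estimate_daily_log_volume_mb(5, 20)
print(f"Estimated TLI log volume: {low:.0f} to {high:.0f} MB per day")
```

If the estimate is uncomfortably large, excluding tables with expected conflicts is one way to reduce it.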

Log structure

When a TLI occurs, the server writes four log entries: one from each component (DataShard and SessionActor) for each side of the conflict (breaker and victim).

Example scenario:

Two sessions are established: victim and breaker.

Time  QuerySpanId       Role     Query
T1    1111111111111111  Victim   SELECT * FROM Orders WHERE OrderId = 42 (acquires lock)
T2    2222222222222222  Breaker  UPDATE Orders SET Status = 'done' WHERE OrderId = 42 (breaks lock)
T3    3333333333333333  Victim   UPDATE Orders SET Amount = 100 WHERE OrderId = 42 (commit fails)

Breaker log (DataShard):

Component: DataShard, TabletId: <tablet-id>,
BreakerQuerySpanId: 2222222222222222, VictimQuerySpanIds: [1111111111111111],
Message: Write transaction broke other locks

Breaker log (SessionActor):

Component: SessionActor, Message: Query had broken other locks,
BreakerQuerySpanId: 2222222222222222,
BreakerQueryText: UPDATE Orders SET Status = 'done' WHERE OrderId = 42,
BreakerQueryTexts: [QuerySpanId=2222222222222222 QueryText=UPDATE Orders SET Status = 'done' WHERE OrderId = 42]

Victim log (DataShard):

Component: DataShard, TabletId: <tablet-id>,
VictimQuerySpanId: 1111111111111111, CurrentQuerySpanId: 3333333333333333,
Message: Write transaction was a victim of broken locks

Victim log (SessionActor):

Component: SessionActor, Message: Query was a victim of broken locks,
VictimQuerySpanId: 1111111111111111, CurrentQuerySpanId: 3333333333333333,
VictimQueryText: SELECT * FROM Orders WHERE OrderId = 42,
VictimQueryTexts: [QuerySpanId=1111111111111111 QueryText=SELECT * FROM Orders WHERE OrderId = 42 |
                   QuerySpanId=3333333333333333 QueryText=UPDATE Orders SET Amount = 100 WHERE OrderId = 42]

Log fields

Field               Description                                                                    Where it appears
VictimQuerySpanId   Identifier of the query whose locks were broken                                Victim logs
BreakerQuerySpanId  Identifier of the query that broke the locks                                   Breaker logs
CurrentQuerySpanId  Identifier of the query at the time of the error (may differ from the victim)  Victim logs
VictimQuerySpanIds  Array of identifiers of all victim queries                                     Breaker DataShard logs
VictimQueryText     SQL of the victim query that acquired the locks                                Victim SessionActor logs
BreakerQueryText    SQL of the breaker query                                                       Breaker SessionActor logs
VictimQueryTexts    All queries of the victim transaction                                          Victim SessionActor logs
BreakerQueryTexts   All queries of the breaker transaction                                         Breaker SessionActor logs
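The VictimQueryTexts and BreakerQueryTexts fields pack several queries into one bracketed value. The following is a minimal sketch of splitting such a value into (QuerySpanId, query text) pairs. It assumes the layout shown in the example entries above (items separated by " | ", each starting with QuerySpanId=... QueryText=...); the function name is hypothetical, and the real log format may differ in details such as escaping.

```python
import re

# One item looks like: QuerySpanId=<digits> QueryText=<sql>
# An item ends either at " | QuerySpanId=" (next item) or at the closing "]".
ITEM_RE = re.compile(
    r"QuerySpanId=(\d+)\s+QueryText=(.*?)(?=\s*\|\s*QuerySpanId=|\s*\]\s*$)",
    re.DOTALL,
)


def parse_query_texts(value: str) -> list[tuple[str, str]]:
    """Split a *QueryTexts field value into (span id, query text) pairs."""
    # Whitespace inside each query is collapsed, because the value may be
    # wrapped across several log lines.
    return [(span_id, " ".join(text.split())) for span_id, text in ITEM_RE.findall(value)]


# Value copied from the victim SessionActor entry in the example scenario.
victim_query_texts = (
    "[QuerySpanId=1111111111111111 QueryText=SELECT * FROM Orders WHERE OrderId = 42 |\n"
    "                   QuerySpanId=3333333333333333 QueryText=UPDATE Orders SET Amount = 100 WHERE OrderId = 42]"
)

for span_id, text in parse_query_texts(victim_query_texts):
    print(span_id, text)
```

Span identifiers are kept as strings so that they can be compared directly with the value from the SDK error message.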

Log analysis

Using the VictimQuerySpanId from the SDK error message, you can find all related events:

  1. Find the victim query: search for VictimQuerySpanId: <value from the error> in the logs — this shows which SELECT acquired the broken locks, with the query text in the VictimQueryText field.

  2. Find the breaker query: the breaker DataShard log's VictimQuerySpanIds field contains the same value. From this log, take BreakerQuerySpanId and find the breaker's SessionActor log — it contains BreakerQueryText with the full query text.

  3. Get full transaction context: VictimQueryTexts and BreakerQueryTexts contain all queries of the respective transactions in execution order.
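The manual lookup described in the steps above can be sketched in a few lines of Python. This is an illustration under assumptions, not the implementation of any YDB tool: the helper names are hypothetical, each log entry is assumed to be available as one string, and the sample entries are abridged copies of the ones in the example scenario.

```python
import re
from typing import Optional

# Abridged sample entries from the example scenario, one string per entry.
ENTRIES = [
    "Component: DataShard, TabletId: <tablet-id>, "
    "BreakerQuerySpanId: 2222222222222222, VictimQuerySpanIds: [1111111111111111], "
    "Message: Write transaction broke other locks",
    "Component: SessionActor, Message: Query had broken other locks, "
    "BreakerQuerySpanId: 2222222222222222, "
    "BreakerQueryText: UPDATE Orders SET Status = 'done' WHERE OrderId = 42",
    "Component: DataShard, TabletId: <tablet-id>, "
    "VictimQuerySpanId: 1111111111111111, CurrentQuerySpanId: 3333333333333333, "
    "Message: Write transaction was a victim of broken locks",
    "Component: SessionActor, Message: Query was a victim of broken locks, "
    "VictimQuerySpanId: 1111111111111111, CurrentQuerySpanId: 3333333333333333, "
    "VictimQueryText: SELECT * FROM Orders WHERE OrderId = 42",
]


def find_entry(entries: list[str], component: str, field: str, span_id: str) -> Optional[str]:
    """Return the first entry from `component` where `field` equals `span_id`."""
    pattern = re.compile(rf"\b{re.escape(field)}: {re.escape(span_id)}\b")
    for entry in entries:
        if f"Component: {component}" in entry and pattern.search(entry):
            return entry
    return None


def find_breaker_span_id(entries: list[str], victim_span_id: str) -> Optional[str]:
    """Find the breaker entry that lists the victim and return its BreakerQuerySpanId."""
    for entry in entries:
        victims = re.search(r"VictimQuerySpanIds: \[([^\]]*)\]", entry)
        if victims and victim_span_id in re.findall(r"\d+", victims.group(1)):
            breaker = re.search(r"BreakerQuerySpanId: (\d+)", entry)
            if breaker:
                return breaker.group(1)
    return None


victim_id = "1111111111111111"  # value taken from the SDK error message

# Step 1: the victim SessionActor entry carries VictimQueryText.
victim_entry = find_entry(ENTRIES, "SessionActor", "VictimQuerySpanId", victim_id)

# Step 2: the breaker DataShard entry lists the victim; follow its BreakerQuerySpanId.
breaker_id = find_breaker_span_id(ENTRIES, victim_id)
breaker_entry = (
    find_entry(ENTRIES, "SessionActor", "BreakerQuerySpanId", breaker_id)
    if breaker_id
    else None
)

print(victim_entry)
print(breaker_entry)
```

Step 3 then amounts to reading VictimQueryTexts and BreakerQueryTexts from the two SessionActor entries found here.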

find_tli_chain utility

For automatic TLI log analysis, the YDB repository includes the find_tli_chain.py utility.

This utility takes the following parameters:

  • VictimQuerySpanId - identifier of the victim query (the query whose locks were broken), taken from the SDK error message;
  • LogFile - name of the log file to analyze.

The utility assumes that TLI logging has already been enabled in the configuration and that the log files have been collected from the database nodes and merged into a single file.

python3 find_tli_chain.py <VictimQuerySpanId> <LogFile>

Example:

python3 find_tli_chain.py 1111111111111111 ydb.log
================================================
  TLI Chain
================================================

VictimQuerySpanId: 1111111111111111

VictimQueryText: SELECT * FROM Orders WHERE OrderId = 42

BreakerQuerySpanId: 2222222222222222

BreakerQueryText: UPDATE Orders SET Status = 'done' WHERE OrderId = 42

================================================
  VictimTx
================================================

SELECT * FROM Orders WHERE OrderId = 42

UPDATE Orders SET Amount = 100 WHERE OrderId = 42

================================================
  BreakerTx
================================================

UPDATE Orders SET Status = 'done' WHERE OrderId = 42

The utility outputs:

  • TLI Chain: VictimQuerySpanId, VictimQueryText, BreakerQuerySpanId, BreakerQueryText;
  • VictimTx: all queries of the victim transaction in execution order;
  • BreakerTx: all queries of the breaker transaction.

Additional parameters:

Parameter       Description
--window-sec N  Time window for searching around the event (default: 10 s)
--no-color      Disable colored output

Note

The utility correctly handles unsorted and mixed log files: filtering is based on timestamp values, not line positions.
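If the utility's report needs to be processed further, for example to attach it to a ticket, its output can be split into the three sections. The sketch below assumes the layout shown in the example above (a section title framed by two lines of "=" characters); the function names are hypothetical, and a query that spans several lines would appear as several list items.

```python
# Sample text copied from the example output of the utility.
SAMPLE_OUTPUT = """\
================================================
  TLI Chain
================================================

VictimQuerySpanId: 1111111111111111

VictimQueryText: SELECT * FROM Orders WHERE OrderId = 42

BreakerQuerySpanId: 2222222222222222

BreakerQueryText: UPDATE Orders SET Status = 'done' WHERE OrderId = 42

================================================
  VictimTx
================================================

SELECT * FROM Orders WHERE OrderId = 42

UPDATE Orders SET Amount = 100 WHERE OrderId = 42

================================================
  BreakerTx
================================================

UPDATE Orders SET Status = 'done' WHERE OrderId = 42
"""


def is_rule(line: str) -> bool:
    """True for a separator line made only of '=' characters."""
    return len(line) >= 3 and set(line) == {"="}


def parse_sections(output: str) -> dict[str, list[str]]:
    """Map each section title to the list of its non-empty lines."""
    lines = [line.strip() for line in output.splitlines()]
    sections: dict[str, list[str]] = {}
    current = None
    i = 0
    while i < len(lines):
        # A title is the single line framed by two separator lines.
        if is_rule(lines[i]) and i + 2 < len(lines) and is_rule(lines[i + 2]):
            current = lines[i + 1]
            sections[current] = []
            i += 3
            continue
        if current is not None and lines[i]:
            sections[current].append(lines[i])
        i += 1
    return sections


sections = parse_sections(SAMPLE_OUTPUT)
print(sections["VictimTx"])
```

When capturing real output, for instance with subprocess.run(..., capture_output=True, text=True), passing --no-color presumably keeps color codes out of the captured text.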