Decommissioning a cluster part

Decommissioning is the procedure for moving a VDisk from a PDisk that needs to be decommissioned.

Data is moved to PDisk clusters where there is enough free space to create new slots. Moving is performed only when there is a possibility of moving with at least partial preservation of the failure model. It may happen that the system can't strictly follow the failure model during decommissioning. However, it does its best to ensure fault-tolerance as fully as under normal operation. For example, when encoding mirror-3-dc, a situation may arise when a group is located in four rather than three data centers.

Decommissioning is done asynchronously, meaning that data is not moved immediately while handling the command. If the failure model allows, SelfHeal moves slots one by one in the background to completely release the specified PDisks.

Managing decommissioning

To manage the state of decommissioning, set the DecommitStatus parameter for the appropriate PDisk. The parameter can take the following values:

  • DECOMMIT_NONE: The disk is not being decommissioned and is running normally, according to its condition.
  • DECOMMIT_PENDING: Disk decommissioning is scheduled. Data is not transferred from the disk. However, slots for new groups won't be created and the slots of the previously created groups won't be moved.
  • DECOMMIT_IMMINENT: Disk decommissioning is required. Data is transferred in the background to disks that have the DECOMMIT_NONE status and satisfy the failure model.

The DECOMMIT_PENDING and DECOMMIT_IMMINENT values shouldn't be removed under normal decommissioning, since the equipment is removed from the cluster by running the DefineBox command.

To cancel decommissioning, just change the disk status to DECOMMIT_NONE. In this case, BS_CONTROLLER won't take any additional actions: the previously moved VDisks remain where they are. To return them, you can use commands to move slots point by point, depending on the specific situation.

By managing the DECOMMIT_PENDING and DECOMMIT_IMMINENT states, you can perform cluster decommissioning in parts.

For example, you need to move equipment from data center-1 (DC-1) to data center-2 (DC-2):

  1. The DC-2 hosts buffer equipment to transfer the first chunk of data to.

  2. Switch the status of all DC-1 disks to DECOMMIT_PENDING so that no data can be moved inside the DC-1.

  3. Switch the status of all DC-1 disks to DECOMMIT_IMMINENT on the equipment that is equivalent to the buffer one.

    Wait until all the disks in the DECOMMIT_IMMINENT status are released.

  4. Move the released equipment from the DC-1 to the DC-2 and switch the status of its disks to DECOMMIT_NONE.

Repeat the above steps for the next set of equipment in the DC-1 until all the equipment is moved.

To set the desired state of disk decommissioning, use the YDB DSTool utility. The command below sets the DECOMMIT_IMMINENT status for the disk with the ID 1000 on the node with the ID 1:

ydb-dstool -e  <bs_endpoint> pdisk set --decommit-status DECOMMIT_IMMINENT --pdisk-ids "[1:1000]"