Replacing Node FQDN
This procedure describes how to replace the FQDN (Fully Qualified Domain Name) of a YDB cluster node without downtime.
Prerequisites
Note
A YDB cluster is fault-tolerant. Temporary node shutdown does not lead to cluster unavailability. For more details, see YDB Cluster Topology.
Warning
Errors in the metadata subsystem configuration (including the domains_config section) or an incorrect sequence of changes can make the YDB cluster unavailable.
Procedure Overview
The FQDN replacement process involves:
- Preparation: Verify cluster health and prepare a new node configuration
- Node shutdown: Gracefully stop the node to be replaced
- Configuration update: Update the cluster configuration with a new FQDN
- Node restart: Start the node with the new FQDN
- Verification: Confirm successful FQDN change
Step-by-Step Instructions
Step 1: Verify Cluster Health
Before starting the replacement, ensure the cluster is healthy:
ydb monitoring healthcheck
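In a scripted run, the health check can act as a gate that stops the procedure early. The sketch below assumes the command prints a line of the form `Healthcheck status: GOOD` when the cluster is healthy; the function name `require_healthy` is hypothetical, and the exact output may differ between CLI versions.

```shell
# require_healthy: read healthcheck output on stdin and succeed only
# if it reports GOOD. The matched string is an assumption about the
# CLI's text output; adjust it to what your version prints.
require_healthy() {
  _hc_output="$(cat)"
  if printf '%s\n' "$_hc_output" | grep -q 'Healthcheck status: GOOD'; then
    return 0
  fi
  printf 'cluster is not healthy, aborting:\n%s\n' "$_hc_output" >&2
  return 1
}

# Usage (not executed here):
#   ydb monitoring healthcheck | require_healthy || exit 1
```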
Step 2: Prepare New Node Configuration
- Update DNS records so that the new FQDN resolves to the node's existing IP address
- Reissue TLS certificates if the old FQDN appears in the certificate (Subject Alternative Name or Common Name) and peers verify hostnames
- Prepare updated configuration files with the new FQDN
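The DNS item in the list above can be checked before the node is stopped. This is a minimal sketch, assuming a Linux host with `getent` available; the function names and the old hostname in the usage line are hypothetical.

```shell
# resolve: print the sorted, de-duplicated addresses a name resolves to.
# Prints nothing if the name does not resolve.
resolve() {
  getent hosts "$1" 2>/dev/null | awk '{print $1}' | sort -u
}

# same_address OLD_FQDN NEW_FQDN
# Succeeds only if both names resolve, and to the same address set.
same_address() {
  _old_ips="$(resolve "$1")"
  _new_ips="$(resolve "$2")"
  [ -n "$_old_ips" ] && [ "$_old_ips" = "$_new_ips" ]
}

# Usage (not executed here; old-hostname.example.com is a placeholder):
#   same_address old-hostname.example.com new-hostname.example.com \
#     || echo "new FQDN does not resolve to the node's current address" >&2
```

Running the check on several cluster nodes, not only on the node being renamed, also catches stale resolver caches.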
Step 3: Stop the Target Node
Gracefully stop the node that needs FQDN replacement:
# For systemd-managed nodes
sudo systemctl stop ydbd-storage
# For manually started nodes
kill -TERM <ydbd_pid>
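For a manually started node, `kill -TERM` returns immediately, while the process may still be shutting down. The following sketch polls until the process is gone or a timeout expires; `wait_for_exit` is a hypothetical helper, and the 60-second default is an arbitrary assumption.

```shell
# wait_for_exit PID [TIMEOUT_SECONDS]
# Polls once per second with `kill -0`, which sends no signal and only
# tests whether the process exists. Returns 0 once the process is gone,
# 1 if it is still present when the timeout expires.
# Run as the process owner or root; otherwise `kill -0` fails with a
# permission error and the process is wrongly treated as gone.
wait_for_exit() {
  _wfe_pid="$1"
  _wfe_left="${2:-60}"
  while kill -0 "$_wfe_pid" 2>/dev/null; do
    if [ "$_wfe_left" -le 0 ]; then
      return 1
    fi
    sleep 1
    _wfe_left=$((_wfe_left - 1))
  done
  return 0
}

# Usage (not executed here):
#   kill -TERM <ydbd_pid> && wait_for_exit <ydbd_pid> 120
```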
Step 4: Update Cluster Configuration
Update the cluster configuration to reflect the new FQDN:
# Example configuration update
hosts:
- host: new-hostname.example.com # Updated FQDN
  host_config_id: 1
  port: 19001
  location:
    unit: "1"
    data_center: "DC1"
    rack: "1"
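Since an unintended configuration change can affect the whole cluster, it may help to confirm that the edited file differs from the current one only in lines that mention the old or new FQDN. The sketch below uses plain `diff` and `grep`; the function name and file names are hypothetical.

```shell
# unexpected_changes OLD_FILE NEW_FILE OLD_FQDN NEW_FQDN
# Prints every changed line that mentions neither FQDN.
# Empty output means only host-name lines were modified.
unexpected_changes() {
  diff "$1" "$2" | grep '^[<>]' | grep -v -F -e "$3" -e "$4"
}

# Usage (not executed here; file and host names are placeholders):
#   extra="$(unexpected_changes current-config.yaml updated-config.yaml \
#            old-hostname.example.com new-hostname.example.com)"
#   [ -z "$extra" ] || { echo "unexpected edits:"; echo "$extra"; } >&2
```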
Step 5: Apply Configuration Changes
Apply the updated configuration to the cluster:
ydb admin config replace --config-file updated-config.yaml
Step 6: Start Node with New FQDN
Start the node using the new FQDN:
# Update hostname if necessary
sudo hostnamectl set-hostname new-hostname.example.com
# Start the node
sudo systemctl start ydbd-storage
Step 7: Verify the Change
Confirm the FQDN change was successful:
# Check node status
ydb monitoring healthcheck
# Verify node registration
ydb admin config fetch | grep new-hostname
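A plain `grep` for the new name also matches when the old name is a substring of it, and it does not show whether the old entry is gone. The following sketch compares whole `host:` values instead. It assumes the fetched configuration lists hosts as unquoted `host: <fqdn>` entries, as in the example in Step 4; both function names are hypothetical.

```shell
# list_hosts: print the value of every `host:` key in YAML read from stdin.
# Assumes unquoted values; keys such as host_config_id are not matched.
list_hosts() {
  sed -n 's/^[[:space:]-]*host:[[:space:]]*\([^[:space:]#]*\).*$/\1/p'
}

# fqdn_replaced OLD_FQDN NEW_FQDN
# Reads a config on stdin. Succeeds only if NEW_FQDN is listed as a host
# and OLD_FQDN is not (exact, whole-value comparison).
fqdn_replaced() {
  _fr_hosts="$(list_hosts)"
  printf '%s\n' "$_fr_hosts" | grep -q -F -x "$2" || return 1
  ! printf '%s\n' "$_fr_hosts" | grep -q -F -x "$1"
}

# Usage (not executed here; the old name is a placeholder):
#   ydb admin config fetch | fqdn_replaced old-hostname.example.com new-hostname.example.com
```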
Troubleshooting
Common Issues
- DNS resolution problems: Ensure the new FQDN resolves to the node's IP address from the other cluster nodes
- Certificate validation errors: Reissue the certificate if it does not list the new FQDN and peers verify hostnames
- Node registration failures: Check network connectivity and firewall rules
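For certificate validation errors, one quick check is whether the node certificate lists the new FQDN among its DNS names. The sketch below parses the text output of `openssl x509`; the function name and certificate path are hypothetical, and wildcard entries such as `*.example.com` are not expanded.

```shell
# san_lists_host FQDN
# Reads `openssl x509 -noout -text` output on stdin and succeeds if the
# certificate contains an exact `DNS:<FQDN>` entry.
san_lists_host() {
  tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -q -F -x "DNS:$1"
}

# Usage (not executed here; the certificate path is a placeholder):
#   openssl x509 -in /path/to/node.crt -noout -text \
#     | san_lists_host new-hostname.example.com \
#     || echo "certificate does not list the new FQDN" >&2
```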
Recovery Procedures
If the FQDN replacement fails:
- Revert DNS changes to the original FQDN
- Restore the original configuration
- Restart the node with the original settings
- Investigate and resolve the underlying issue
Best Practices
- Test in staging: Always test FQDN replacement in a non-production environment first
- Backup configurations: Keep backups of working configurations before making changes
- Monitor during change: Watch cluster health metrics during the replacement process
- Document changes: Maintain records of FQDN changes for future reference
- Coordinate with the team: Ensure all team members are aware of the planned change