How to change the AS cluster CIDR at installation time?
Issue Description:
How to change the Automation Suite Embedded cluster CIDR at installation time
Background:
In the Automation Suite installer, the cluster CIDRs are set to 10.42.0.0/16 and 10.43.0.0/16. If DNS or SQL services in the environment use IPs within these ranges, the cluster will not be able to connect to them.
Currently, changes made to the CIDRs at install time are not persisted during upgrades. As such, before an upgrade the installer needs to be updated again as described in this document. As of now, we are planning to make the CIDR change persistent in 23.10.8 and 24.10.2. Overall, it is rarely required to change the CIDR.
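Whether an environment address conflicts with one of the default /16 ranges can be checked mechanically. A minimal sketch, assuming a hypothetical helper name `in_slash16` (this is not part of the installer):

```shell
#!/usr/bin/env bash
# Hypothetical helper: check whether an IP falls inside an A.B.0.0/16 CIDR.
# For a /16, this reduces to comparing the first two octets.
in_slash16() {
  local ip="$1" cidr="$2"
  local prefix="${cidr%.0.0/16}"   # e.g. "10.42" from "10.42.0.0/16"
  [[ "$ip" == "$prefix".* ]]
}

# Example: a SQL server at 10.42.5.20 conflicts with the default cluster CIDR.
if in_slash16 "10.42.5.20" "10.42.0.0/16"; then
  echo "conflict"
fi
# prints "conflict"
```

Run the check against the DNS and SQL server IPs used by the environment before installing.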
Changing the CIDR:
- At installation time:
- If an installation has taken place, uninstall the cluster per: https://docs.uipath.com/automation-suite/automation-suite/2023.10/installation-guide/how-to-uninstall-the-cluster
- If the CIDR has already been changed and this is an upgrade, skip this step.
- As long as the service installer has not been executed or completed, no database rollback should be needed.
- Choose the new CIDRs to be used. The subnets must remain /16. The new CIDRs would replace the default 10.42.0.0/16 and 10.43.0.0/16.
- If the CIDR has already been updated, the same CIDR must be used.
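Since the subnets must remain /16, a quick format check can catch typos before running the sed step below. This is a hypothetical helper (`is_valid_slash16` is an assumed name, not part of the installer):

```shell
#!/usr/bin/env bash
# Hypothetical validator: accepts only CIDRs of the form A.B.0.0/16
# with A and B in 0-255, matching the "must remain /16" requirement.
is_valid_slash16() {
  local cidr="$1"
  [[ "$cidr" =~ ^([0-9]{1,3})\.([0-9]{1,3})\.0\.0/16$ ]] || return 1
  # Force base-10 so octets with leading zeros do not parse as octal.
  (( 10#${BASH_REMATCH[1]} <= 255 && 10#${BASH_REMATCH[2]} <= 255 ))
}

is_valid_slash16 "10.92.0.0/16" && echo "valid"
# prints "valid"
```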
- From the installation directory, run the following commands:

sed -i 's/10.42.0.0/XX.XX.0.0/' Infra_Installer/infra-installer.sh
sed -i 's/10.43.0.0/YY.YY.0.0/' Infra_Installer/infra-installer.sh
- Post this step, we can verify the changes:

grep 'XX.XX.0.0' Infra_Installer/infra-installer.sh
grep 'YY.YY.0.0' Infra_Installer/infra-installer.sh

- Example output is below. In this case the CIDRs were changed to 10.92.0.0/16 and 10.93.0.0/16:

[root@autosuite installer]# grep '10.92.0.0' Infra_Installer/infra-installer.sh
cluster_cidrs_array+=("10.92.0.0/16")
[root@autosuite installer]# grep '10.93.0.0' Infra_Installer/infra-installer.sh
service_cidrs_array+=("10.93.0.0/16")
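The sed and grep steps above can be wrapped into a single helper that also keeps a backup copy of the installer script. A sketch, assuming a hypothetical function name `update_cidrs` (not part of the product):

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the sed edits above.
# Usage: update_cidrs <installer-script> <new-cluster-base> <new-service-base>
update_cidrs() {
  local file="$1" cluster="$2" service="$3"
  # -i.bak keeps a backup, so a typo in the new CIDR is easy to roll back.
  sed -i.bak \
    -e "s/10\.42\.0\.0/${cluster}/" \
    -e "s/10\.43\.0\.0/${service}/" \
    "$file"
  # Fail if either substitution did not land.
  grep -q "$cluster" "$file" && grep -q "$service" "$file"
}

# Example, using the 10.92/10.93 ranges from the output above:
#   update_cidrs Infra_Installer/infra-installer.sh 10.92.0.0 10.93.0.0
```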
- After this, re-install the cluster (or upgrade, if the CIDR had already been changed).
- If the cluster has already been installed and it is not feasible to re-install, we can reset one of the master nodes and rejoin the cluster. However, we highly recommend taking a backup and contacting our support team before doing this.
- The reference guide for resetting a cluster can be found here: https://docs.rke2.io/datastore/backup_restore
- First, stop all of the nodes in the cluster (run the following on each node):

/opt/node-drain.sh
rke2-kill.sh
- On the master nodes, modify /etc/rancher/rke2/config.yaml with the new CIDRs
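For reference, the CIDRs in /etc/rancher/rke2/config.yaml are set through RKE2's standard `cluster-cidr` and `service-cidr` keys. A sketch using the 10.92/10.93 example ranges from above (the file generated by Automation Suite will contain additional fields; only these two lines change):

```yaml
# /etc/rancher/rke2/config.yaml (excerpt) -- example values only;
# substitute the CIDRs chosen for your environment.
cluster-cidr: "10.92.0.0/16"
service-cidr: "10.93.0.0/16"
```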
- Make sure that the etcd database directory is mounted (in some versions, rke2-kill.sh unmounts it):

mkdir -p /var/lib/rancher/rke2/server/db
systemctl restart var-lib-rancher-rke2-server-db.mount
- Run the cluster reset:

rke2 server --cluster-reset
- Once it is complete, restart the master node:

systemctl restart rke2-server
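The service can take a while to come back after the restart. A generic polling loop makes the wait explicit; `wait_for` is a hypothetical helper, not shipped with the product:

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a command until it succeeds or a timeout
# (in seconds) expires. Returns non-zero on timeout.
wait_for() {
  local timeout="$1"; shift
  local elapsed=0
  until "$@"; do
    (( elapsed >= timeout )) && return 1
    sleep 2
    (( elapsed += 2 ))
  done
}

# Example: wait up to 5 minutes for rke2-server to report active.
#   wait_for 300 systemctl is-active --quiet rke2-server
```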
- Rejoin the other server nodes.
- Do these steps one server at a time.
- Clean up etcd and restart the server:

mkdir -p /var/lib/rancher/rke2/server/db
systemctl restart var-lib-rancher-rke2-server-db.mount
rm -rf /var/lib/rancher/rke2/server/db/*
systemctl restart rke2-server
- Check that the node was added after the server has started:

kubectl get nodes
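Since the steps are done one server at a time, it helps to confirm every node reports Ready before moving to the next one. A sketch that parses the output of `kubectl get nodes --no-headers`; `all_nodes_ready` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical check: read `kubectl get nodes --no-headers` lines from
# stdin (NAME STATUS ROLES AGE VERSION) and fail if any node is not Ready.
all_nodes_ready() {
  local status ok=0
  while read -r _name status _rest; do
    [[ "$status" == "Ready" ]] || ok=1
  done
  return $ok
}

# Example:
#   kubectl get nodes --no-headers | all_nodes_ready && echo "all Ready"
```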
- On the remaining agent nodes, restart the agent service:

systemctl restart rke2-agent