How to debug connectivity issues in Automation Suite

Issue description:

How to debug connectivity issues in Automation Suite

Resolution:

When facing connectivity issues in UiPath Automation Suite, it's essential to check the network status between services, nodes, and external connections. Below are several key debugging steps and commands to help you identify and resolve connectivity problems.


1. Check URL Connectivity

Use nc (Netcat) to check if a service is reachable on a specific port:

if nc -z -v -w5 loki 3100 &>/dev/null; then echo "connected"; else echo "not able to connect"; fi

This command checks whether the loki service is reachable on port 3100.
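When several services need checking at once, the same nc test can be wrapped in a small helper. The following is a minimal sketch, not part of Automation Suite: the function names probe and check_endpoints are hypothetical, and any endpoint other than loki:3100 is a placeholder.

```shell
# Probe a list of host:port endpoints and print one status line for each.
# probe and check_endpoints are hypothetical helper names.

probe() {
  # $1 = host, $2 = port; succeeds when a TCP connection opens within 5 seconds
  nc -z -w5 "$1" "$2" >/dev/null 2>&1
}

check_endpoints() {
  for endpoint in "$@"; do
    host=${endpoint%:*}    # text before the last colon
    port=${endpoint##*:}   # text after the last colon
    if probe "$host" "$port"; then
      echo "$endpoint connected"
    else
      echo "$endpoint not able to connect"
    fi
  done
}
```

For example, check_endpoints loki:3100 SOME_HOST:SOME_PORT prints one line per endpoint, which makes it easier to see which service is unreachable.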


2. Check if IP and Port Combo is Reachable

Use telnet to verify whether an IP address and port can be reached:

telnet 127.0.0.1 8080

This command attempts to connect to IP 127.0.0.1 on port 8080.


3. Port Forwarding a Service

You can port-forward services for troubleshooting by redirecting a local port to a service’s port inside the Kubernetes cluster:

kubectl -n rabbitmq port-forward service/rabbitmq 8800:15672

In the example above, port 8800 on your local machine forwards to port 15672 in the rabbitmq service.

For another service, like rook-ceph:

kubectl -n rook-ceph port-forward service/rook-ceph-mgr-dashboard 8800:8443
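Once a port-forward is running, the forwarded local port can be probed with curl. The sketch below combines both steps under some assumptions: forward_and_probe is a hypothetical helper, PF_WAIT is an assumed wait time for the tunnel to come up, and the scheme defaults to https, which may not match every service.

```shell
# Start a temporary port-forward, probe it once with curl, then stop it.
# forward_and_probe and PF_WAIT are illustrative names, not UiPath tooling.

forward_and_probe() {
  namespace=$1
  service=$2
  local_port=$3
  remote_port=$4
  scheme=${5:-https}   # pass "http" as fifth argument for plain-HTTP services

  kubectl -n "$namespace" port-forward "service/$service" \
    "$local_port:$remote_port" >/dev/null 2>&1 &
  pf_pid=$!

  sleep "${PF_WAIT:-2}"   # give kubectl time to open the local listener

  # -k skips certificate validation; only the HTTP status code is printed
  status=0
  curl -k -s -o /dev/null -m 5 -w '%{http_code}\n' \
    "$scheme://localhost:$local_port/" || status=$?

  # Stop the port-forward again and reap the background job
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$status"
}
```

For example, forward_and_probe rook-ceph rook-ceph-mgr-dashboard 8800 8443 prints the status code returned through the tunnel. A code of 000 means curl could not get an HTTP response at all.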

4. Check if Load Balancer is Resolvable via Host

To test if the Load Balancer is properly resolving via the host:

curl -m 5 -v -k -i --resolve LB_URL:443:IP_OF_HOST https://LB_URL

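Replace LB_URL with the load balancer hostname and IP_OF_HOST with the IP of the host. The --resolve option makes curl skip DNS and send the request for LB_URL directly to that IP, so a successful response confirms that the host serves the load balancer URL.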


5. Check in a Loop

To check connectivity repeatedly (e.g., while troubleshooting a service under load), you can use a loop:

for i in {1..100}; do curl -m 5 -v -k -i --resolve LB_URL:443:IP_OF_HOST https://LB_URL; done

This will run the check 100 times to monitor connectivity stability over time.
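The verbose output of 100 curl calls can be hard to read, so a variant that only counts successes and failures may help. This is a sketch under assumptions: probe_lb and run_checks are hypothetical names, LB_URL and IP_OF_HOST are expected as shell variables, and PAUSE is an assumed delay in seconds between attempts.

```shell
# Repeat the load balancer check and summarize the results.
# probe_lb and run_checks are hypothetical helper names.

probe_lb() {
  # Succeeds when curl receives any HTTP response within 5 seconds
  curl -m 5 -s -k -o /dev/null \
    --resolve "$LB_URL:443:$IP_OF_HOST" "https://$LB_URL"
}

run_checks() {
  attempts=$1
  ok=0
  failed=0
  i=1
  while [ "$i" -le "$attempts" ]; do
    if probe_lb; then
      ok=$((ok + 1))
    else
      failed=$((failed + 1))
      echo "$(date '+%H:%M:%S') attempt $i failed"   # timestamp each failure
    fi
    i=$((i + 1))
    sleep "${PAUSE:-1}"
  done
  echo "ok=$ok failed=$failed"
}
```

After setting LB_URL and IP_OF_HOST, run_checks 100 prints a timestamped line for each failed attempt and a final summary line. Because curl is called without -f, an HTTP error status still counts as a successful connection.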


6. Check Connectivity Between Nodes / Overlay Test

For advanced networking checks between nodes (e.g., in Rancher environments), you can run an overlay network test to confirm the communication paths, as described in the Rancher networking troubleshooting guide:

https://ranchermanager.docs.rancher.com/v2.5/troubleshooting/other-troubleshooting-tips/networking

Alternatively, in an air-gapped environment where access to external resources is limited, use the Diagnostics Tool.


7. Resolve a URL via DNS Server

To ensure DNS resolution is functioning correctly, you can check your DNS configurations:

cat /etc/resolv.conf

For specific DNS resolution using a particular DNS server:

nslookup github.com FIRST_DNS_SERVER_IP

Replace FIRST_DNS_SERVER_IP with the IP address of the DNS server you want to query.
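To query every configured DNS server rather than only the first, the nameserver entries can be read from /etc/resolv.conf and tested one by one. The sketch below uses hypothetical helper names (list_nameservers, check_all_nameservers) and assumes that nslookup exits with a non-zero status when a lookup fails, which may vary between implementations.

```shell
# Query each nameserver listed in resolv.conf for one hostname.
# list_nameservers and check_all_nameservers are hypothetical helper names.

list_nameservers() {
  # $1 = resolv.conf path (defaults to /etc/resolv.conf)
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

check_all_nameservers() {
  # $1 = hostname to resolve, $2 = optional resolv.conf path
  name=$1
  for server in $(list_nameservers "${2:-/etc/resolv.conf}"); do
    if nslookup "$name" "$server" >/dev/null 2>&1; then
      echo "$server resolved $name"
    else
      echo "$server FAILED to resolve $name"
    fi
  done
}
```

For example, check_all_nameservers github.com prints one result line per configured nameserver, which shows whether only one of the servers is failing.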

8. Bypass the Nameserver via /etc/hosts

If the DNS servers (nameservers) might not be working, bypass them for testing by mapping the URL's hostname to its IP address in the /etc/hosts file.

Example:
1.2.3.4 automationsuite.mytest.com
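To confirm that the entry is in place, note that nslookup queries DNS servers directly and does not read /etc/hosts, so it is not a suitable check here. The sketch below reads the hosts file itself; hosts_lookup is a hypothetical helper name, and the hostname is the example value from above.

```shell
# Print the IP address(es) that a hosts file maps to a given hostname.
# hosts_lookup is a hypothetical helper name.

hosts_lookup() {
  # $1 = hostname, $2 = hosts file (defaults to /etc/hosts)
  awk -v name="$1" '
    { sub(/#.*/, "") }                                       # ignore comments
    { for (i = 2; i <= NF; i++) if ($i == name) print $1 }   # match any alias
  ' "${2:-/etc/hosts}"
}
```

For example, hosts_lookup automationsuite.mytest.com prints 1.2.3.4 once the example entry has been added. On most Linux systems, getent hosts automationsuite.mytest.com shows what the system resolver returns, which normally includes /etc/hosts entries.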


9. Automation Suite Products Fail with "Unable to Establish Connection Because an Error Was Encountered During Handshakes Before Login"

Example Error:

[Error: [Microsoft][ODBC Driver 17 for SQL Server]Client unable to establish connection because an error was encountered during handshakes before login. Common causes include client attempting to connect to an unsupported version of SQL Server, server too busy to accept new connections or a resource limitation (memory or maximum allowed connections) on the server.] {
sqlstate: '08001',
code: 26,
severity: 0,
serverName: '',
procName: '',
lineNumber: 0
},

Common Causes and Troubleshooting Steps:

  1. Improper CA Certificate or Connection String Configuration:

    • The connection string has TrustServerCertificate=False, but the proper CA certificate has not been imported.
    • Solution: Set TrustServerCertificate=True in the connection string in cluster_config.json, or keep TrustServerCertificate=False and make sure that the correct CA certificate (or the SQL Server certificate) is imported into the Automation Suite trust store.
  2. SQL Server is Not Running or Unreachable:

    • The target SQL Server may not be running or is unreachable.
    • Solution: Verify that the SQL Server is available by pinging it or by connecting to the SQL port with telnet from the Automation Suite server. Alternatively, ask your internal SQL/infrastructure team to validate.
  3. Istio Interception and Connection Reset: