HAA - conflicting status reports

Hi,
We have a fresh installation of the HAA module and would like some help in understanding what we are seeing.

Through the web UI we can see that all is OK:

However, when we check from the command line using rladmin status nodes, we see:

where one of the nodes is marked as UNREACHABLE.
What could be causing this?
We did have issues with opening ports between our datacentres, but we are now confident that all documented ports are open across the datacentres.
The unreachable node is in the other datacentre. Do we need to open yet another port to finalise this installation? If so, which one?

Any help or an explanation would be much appreciated.
Thanks

Hey @JasioB

Are you talking about a multi-node installation here?

Thanks
#nK

Hi Nithinkrishna!

In our installation we have a cluster with 3 nodes, each on a separate server. Two of the servers are in one data centre (DC1), the final server is in a different data centre (DC2).
The above results are from DC1 where two servers are located.
If I try the same from DC2 then I get 2 nodes with a status of UNREACHABLE.
So it is almost 100% certain that it is a firewall issue - but I am also convinced that all the documented ports have been opened - hence my question.

Cheers

Yes, most probably it is a port issue.

Yes - but which one?
Anyone have any ideas?

Cheers

I was just wondering whether the ports are actually open.

Are they? If so, please check whether a firewall or proxy is blocking the traffic. Your IT team should also do a root cause analysis (RCA).

Hi Nithinkrishna,
The ports are definitely open. I stopped Redis across the cluster and tested using “nc -l -p 1968” on the destination server and “nc -4zvw5 dest-servername 1968” on the source server. That worked for all the main ports I tested, namely:
53
1968
3333
3334
3335
5353
8002
8004
8006
8070
8071
8080
8443
8444
9080
9081
9443
10050
20000
36379
36380
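
For anyone repeating this test, the same TCP check can be scripted rather than run port by port. Below is a minimal Python sketch, not an official Redis tool: the function names tcp_port_open and closed_ports are made up for illustration, and the port list is simply the one above. Like the nc test, it only proves that a TCP connection can be established to a port that has a listener; it says nothing about ICMP or UDP traffic.

```python
import socket

# Ports copied from the list above.
PORTS = [53, 1968, 3333, 3334, 3335, 5353, 8002, 8004, 8006, 8070, 8071,
         8080, 8443, 8444, 9080, 9081, 9443, 10050, 20000, 36379, 36380]


def tcp_port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def closed_ports(host, ports=PORTS, timeout=5.0):
    """Return the ports on host that did not accept a TCP connection."""
    return [p for p in ports if not tcp_port_open(host, p, timeout)]
```

Calling closed_ports("dest-servername") from the source server returns the ports that were refused or timed out; an empty list means every port accepted a connection.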

ALSO: When I shut down one node, it is reported as DOWN, not UNREACHABLE. So when it's not up it's reported as DOWN, and when it is up it's reported as UNREACHABLE.

Looks to me like a bug or perhaps a cluster configuration problem and not a firewall issue.

Anyone have a different theory?

We have a solution!

The user gkorland on the Redis Community Forum suggested that the servers need to be able to ping each other (see "Rladmin: node status is UNREACHABLE - port issue?" in the Redis administration category of the Redis Community Forum).
When cross-datacentre ICMP traffic was enabled, the issue was resolved and all nodes are now reporting a status of “OK”.
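
A quick way to confirm this from each node is to ping every other node in the cluster. The following is a minimal Python sketch under a few assumptions: it shells out to the Linux iputils ping (-c for the packet count, -W for the reply timeout in seconds), the helper names are hypothetical, and the host names are placeholders. It only checks ICMP echo; it does not query rladmin.

```python
import subprocess


def ping_command(host, count=1, timeout_s=2):
    """Build a Linux iputils ping command for a single host."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def can_ping(host, run=subprocess.run):
    """Return True if ping exits with status 0 (an echo reply arrived)."""
    result = run(ping_command(host),
                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def unreachable_peers(peers, run=subprocess.run):
    """Return the hosts in peers that did not answer a ping."""
    return [host for host in peers if not can_ping(host, run)]
```

For example, unreachable_peers(["node1", "node2", "node3"]) returns the hosts that did not answer. It is worth running from both data centres, since ICMP may be blocked in only one direction.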

Thanks for your interest and support.
