HAA - conflicting status reports

JasioB · July 1, 2022, 2:15pm

Hi,
We have a fresh installation of the HAA module and would like some help in understanding what we are seeing.

Through the web UI we can see that all is OK.

However, when we check from the command line using rladmin status nodes, one of the nodes is marked as UNREACHABLE.
What could be causing this?
We did have issues opening ports between our datacentres, but we are now confident that all documented ports are open across them.
The unreachable node is in the other datacentre. Do we need to open yet another port to finalise this installation? If so, which one?

Any help or an explanation would be much appreciated.
Thanks

Nithinkrishna · July 2, 2022, 3:58pm

Hey @JasioB

Are you talking about a multi-node installation here?

Thanks
#nK

JasioB · July 4, 2022, 8:11am

Hi Nithinkrishna!

In our installation we have a cluster with 3 nodes, each on a separate server. Two of the servers are in one data centre (DC1), the final server is in a different data centre (DC2).
The above results are from DC1 where two servers are located.
If I try the same from DC2 then I get 2 nodes with a status of UNREACHABLE.
So it is almost 100% certain that it is a firewall issue - but I am also convinced that all the documented ports have been opened - hence my question.

Cheers

Nithinkrishna · July 4, 2022, 8:52am

Yes, most probably it is a port issue.

JasioB · July 4, 2022, 10:02am

Yes - but which one?
Anyone have any ideas?

Cheers

Nithinkrishna · July 10, 2022, 3:21am

I was just wondering whether the ports are actually open.

Are they? If so, please check whether a firewall or proxy is blocking the traffic. Your IT team should also do a root cause analysis (RCA).

JasioB · July 11, 2022, 11:35am

Hi Nithinkrishna,
The ports are definitely open. I stopped Redis across the cluster and tested using “nc -l -p 1968” on the destination server and “nc -4zvw5 dest-servername 1968” on the source server. That worked for all the main ports I tested, namely:
53
1968
3333
3334
3335
5353
8002
8004
8006
8070
8071
8080
8443
8444
9080
9081
9443
10050
20000
36379
36380
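
The manual nc test above can be scripted. What follows is a minimal Python sketch, not an official tool: it assumes a listener (for example nc -l -p PORT) is already running on the destination for each port, and the helper name check_ports is made up for illustration. It only exercises TCP, so it says nothing about UDP or ICMP traffic, and unlike nc -4 it does not force IPv4.

```python
import socket

# Ports copied from the list above.
PORTS = [53, 1968, 3333, 3334, 3335, 5353, 8002, 8004, 8006, 8070, 8071,
         8080, 8443, 8444, 9080, 9081, 9443, 10050, 20000, 36379, 36380]


def check_ports(host, ports, timeout=5.0):
    """Try a TCP connection to each port and return {port: True/False}.

    True means the connection was accepted; False means it was refused,
    timed out, or the host name could not be resolved.
    """
    results = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

Calling check_ports("dest-servername", PORTS) from the source server (dest-servername being the same placeholder as in the nc command) returns a dictionary that shows at a glance which ports failed.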

ALSO: When I shut down one node, it is reported as DOWN, not UNREACHABLE. So when it's not up it's reported as DOWN, and when it is up it's reported as UNREACHABLE.

Looks to me like a bug or perhaps a cluster configuration problem and not a firewall issue.

Anyone have a different theory?

JasioB · July 20, 2022, 8:02am

We have a solution!

The user gkorland on the Redis Community Forum suggested that the servers need to be able to ping each other (see the topic "Rladmin: node status is UNREACHABLE - port issue?" in the Redis administration category of the Redis Community Forum).
Once cross-datacentre ICMP traffic was enabled, the issue was resolved and all nodes are now reporting a status of “OK”.
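
For anyone wanting to verify ICMP reachability between nodes before or after a firewall change, here is a small Python sketch. It is only an illustration under stated assumptions: it shells out to the system ping command using the Linux (iputils) flags -c and -W, which differ on other operating systems, and the function names are hypothetical.

```python
import subprocess


def build_ping_command(host, count=1, timeout_s=2):
    """Build a Linux (iputils) ping command line; flags differ on other OSes."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def is_pingable(host, run=subprocess.run):
    """Return True if ping exits with status 0, i.e. a reply was received."""
    try:
        completed = run(
            build_ping_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # The ping binary is not installed or not on PATH.
        return False
    return completed.returncode == 0
```

Running is_pingable against every other node's address from each node, in both directions across the datacentres, shows whether ICMP echo is being dropped somewhere along the path.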

Thanks for your interest and support.

