We use Akka v2.4.10 and have 4 nodes in an Akka cluster (e.g. A, B, C, D), and we observe the cluster split scenario described below.
When the actor system on one node (i.e. D) is restarted, we receive an “Associate timed out after [15000 ms]” message when it tries to connect to the other nodes.
Once the actor system has fully started, cluster.state().members() contains only node D.
Is there a way to get node D back into the cluster (where A, B and C are present)?
The “Associate timed out after [15000 ms]” message is logged by akka.remote.ReliableDeliverySupervisor when the association with a node fails.
We guess this timeout value is an Akka default. Is there a setting to configure it manually (i.e. from the configuration file)?
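For reference, a minimal sketch of what such an override might look like in application.conf. This assumes the 15000 ms in the message corresponds to the classic remoting setting akka.remote.handshake-timeout (default 15 s); that mapping should be checked against the reference.conf of the Akka version in use, and the 30 s value is only an example.

```
akka.remote {
  # Assumption: this is the timeout reported as
  # "Associate timed out after [15000 ms]" (default: 15 s).
  # 30 s is an illustrative value, not a recommendation.
  handshake-timeout = 30 s
}
```

Raising this value only gives the association handshake more time to complete. It would not by itself bring a node back into a cluster it has already split from.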
Do you mean that you see this when D tries to join ABC? That shouldn’t happen. It is difficult to guess what could be causing it without inspecting the logs. Could it be that your network is not configured for peer-to-peer, so that connections can only be opened in one direction (e.g. because of firewalls)?
BTW you are using a very old version that is end-of-life.
In our case, when we start node D, it tries to join the cluster (ABC). But ABC remains unreachable because the network is unavailable, so D forms a separate cluster.
At this point, Cluster.state().members() on D contains only node D and none of the other nodes.
There are no firewalls at our end, and even when the network becomes stable later, D remains in a separate cluster.
To recover from this, we currently restart node D (its actor system) manually so that it joins the ABC cluster.
So the question is: is there any option in Akka to rejoin these cluster islands without restarting the nodes?
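As background to this question: as far as I know, an actor system can join a cluster only once, so a node that has already formed its own cluster has to be restarted before it can join another one. The more practical approach may therefore be to keep the island from forming in the first place. One common cause, offered here as an assumption about the setup rather than a diagnosis, is that D is the first entry in its own seed-nodes list, because only the first seed node is allowed to join itself when the other seed nodes do not respond. Below is a sketch of a seed-nodes list for D that avoids this; the system name ClusterSystem, the host names and the port are hypothetical placeholders.

```
akka.cluster {
  # Hypothetical addresses. The first entry is the only node that may
  # join itself and start a new cluster, so D is deliberately not listed first.
  seed-nodes = [
    "akka.tcp://ClusterSystem@host-a:2552",
    "akka.tcp://ClusterSystem@host-b:2552",
    "akka.tcp://ClusterSystem@host-c:2552"
  ]
}
```

With a list like this, D should keep retrying the join against A, B and C until one of them answers, instead of becoming a single-node cluster while the network is down.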