Akka Cluster - Concurrent Seed Start Up / Race Condition

I have the following cluster members:

  • m1 : seeds = [m1, down_member]
  • m2 : seeds = [m2, m1, down_member]

down_member is a member that just crashed.

If I start m1 and m2 at almost the same time, both members try to contact their seeds. Since no other seed is up, m1 eventually joins itself. In the meantime, m2 sends a Join request to m1, which replies with InitJoinNack because it hasn’t fully initialized yet. On receiving the InitJoinNack, m2 stops retrying to join m1 and eventually joins itself, creating a separate cluster.

I’d like to understand why m2 doesn’t retry after the InitJoinNack, and whether there’s any way I can force it to.

I’m not able to ensure the seeds are correct all the time because of the dynamic nature of the cluster (instances can go down or come up at any time, e.g. autoscaling up or down).

Thanks

If you’re using statically configured seeds, you should make sure every node has the same list; in particular, the first node in the list should be the same everywhere. Nodes first try to contact the seed nodes, and if that fails, only the first node in the list is allowed to start a new cluster. So if you have two different lists with different first nodes, as here (m1 comes first in m1’s list and m2 comes first in m2’s), you can end up with two clusters.
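
To make that concrete, here is a minimal sketch of what a consistent static list could look like in application.conf. The system name, host names and port are placeholders I made up, and the `akka://` prefix assumes Artery remoting (classic remoting would use `akka.tcp://`):

```hocon
# Identical on m1 AND m2: same entries, same order.
# "ClusterSystem", "m1-host", "m2-host" and port 25520 are hypothetical placeholders.
akka.cluster.seed-nodes = [
  "akka://ClusterSystem@m1-host:25520",  # first entry: the only node allowed to join itself
  "akka://ClusterSystem@m2-host:25520"
]
```

With this list on both nodes, m2 never starts a cluster on its own; it keeps trying the seeds until m1 is ready to accept the join.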

Unfortunately, the seeds are not static. The instances are EC2 machines that can go up and down at any time, so the seed list has to be dynamic.

In that case you will not be able to use a seed node list in configuration to form a cluster. One option for such scenarios is Akka Cluster Bootstrap.
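
As a rough sketch only, a Cluster Bootstrap setup on EC2 might look something like the following. It assumes the Akka Management cluster bootstrap module and the AWS API discovery module are on the classpath; the service name and the tag key value are placeholders, and the exact keys should be checked against the Akka Management documentation for your version:

```hocon
# Do NOT set akka.cluster.seed-nodes when using Cluster Bootstrap.
akka.discovery {
  # Assumption: instances are found through their EC2 tags.
  method = aws-api-ec2-tag-based
  aws-api-ec2-tag-based {
    tag-key = "service"   # instances tagged service=<service-name> are treated as contact points
  }
}

akka.management.cluster.bootstrap.contact-point-discovery {
  service-name = "my-cluster"      # hypothetical; must match the tag value on the instances
  required-contact-point-nr = 2    # minimum number of discovered nodes before a new cluster may form
}
```

Each node then has to start Akka Management and the bootstrap process at startup (in the Scala API, `AkkaManagement(system).start()` followed by `ClusterBootstrap(system).start()`). Bootstrap probes the discovered contact points and only lets one deterministically chosen node form a new cluster when no existing cluster is found, which is meant to avoid the kind of race described above.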