The ShardCoordinator was unable to get an initial state

Olafur · November 25, 2019, 5:26pm

Hi guys

I recently updated my Akka cluster to use a home-grown implementation of the KeepMajority split brain resolver. Since this update, I’m seeing an increasing number of “The ShardCoordinator was unable to get an initial state” errors in my logs.

Previously I had a specific shutdown hook:

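import akka.actor.CoordinatedShutdown
import akka.cluster.Cluster

val cluster = Cluster(actorSystem)
// Explicitly leave the cluster when the JVM shuts down
CoordinatedShutdown(actorSystem).addJvmShutdownHook {
  cluster.leave(cluster.selfAddress)
}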

But according to the docs (if I understood them correctly), I should not need to do this; the CoordinatedShutdown module should handle nodes leaving gracefully.

Is this because I am now using a custom downing-provider-class? Do I need to specifically handle members exiting in the downing provider? (Currently I am listening for MemberRemoved events, and on each one I schedule a quorum check after a 7-second stable-after period.)

Cheers,
Oli

chbatey · November 26, 2019, 7:29am

You’re right that you don’t need to call leave; CoordinatedShutdown should handle this.

A downing provider shouldn’t need to be involved in any graceful shutdown, only when nodes crash or if there are network partitions. Typically that means you’ll react to Unreachable rather than Member* events.
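To illustrate the point about reacting to reachability rather than membership events, here is a minimal sketch, assuming classic Akka Cluster (2.5 or later) on the classpath. The names `UnreachableWatcher`, `StableTick`, and `stableAfter` are made up for this example, and the majority check is deliberately simplistic: it only logs what it would decide and ignores tie-breaking, leader checks, and the other edge cases a real split brain resolver has to cover.

```scala
import akka.actor.{Actor, ActorLogging, Props, Timers}
import akka.cluster.{Cluster, Member, MemberStatus}
import akka.cluster.ClusterEvent.{
  InitialStateAsEvents, MemberRemoved, ReachabilityEvent, ReachableMember, UnreachableMember
}
import scala.concurrent.duration.FiniteDuration

// Hypothetical helper actor, not part of Akka.
object UnreachableWatcher {
  private case object StableTick
  def props(stableAfter: FiniteDuration): Props =
    Props(new UnreachableWatcher(stableAfter))
}

class UnreachableWatcher(stableAfter: FiniteDuration)
    extends Actor with ActorLogging with Timers {
  import UnreachableWatcher.StableTick

  private val cluster = Cluster(context.system)
  private var unreachable = Set.empty[Member]

  // Subscribe to reachability changes; MemberRemoved is only used to
  // forget members that have already been removed from the cluster.
  override def preStart(): Unit =
    cluster.subscribe(self, InitialStateAsEvents,
      classOf[ReachabilityEvent], classOf[MemberRemoved])

  override def postStop(): Unit = cluster.unsubscribe(self)

  def receive: Receive = {
    case UnreachableMember(member) =>
      unreachable += member
      // Re-using the same timer key restarts the stability window.
      timers.startSingleTimer(StableTick, StableTick, stableAfter)

    case ReachableMember(member) =>
      unreachable -= member
      if (unreachable.isEmpty) timers.cancel(StableTick)

    case MemberRemoved(member, _) =>
      unreachable -= member
      if (unreachable.isEmpty) timers.cancel(StableTick)

    case StableTick =>
      // Assumption: only members with status Up count towards the quorum.
      val up = cluster.state.members.filter(_.status == MemberStatus.Up)
      val unreachableUp = up.count(unreachable.contains)
      val reachableUp = up.size - unreachableUp
      if (reachableUp > unreachableUp)
        log.info("Majority side: would down {} unreachable member(s)", unreachableUp)
      else
        log.info("Not a strict majority ({} of {} reachable): would down self",
          reachableUp, up.size)
  }
}
```

A graceful leave never makes a member unreachable, so an actor like this stays idle during a normal CoordinatedShutdown and only wakes up on crashes or partitions.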

I’d recommend you use an existing downing provider, as there are many edge cases, e.g. https://doc.akka.io/docs/akka-enhancements/current/split-brain-resolver.html
