Akka-Cluster: "Failed to persist event" => "Retry request for shard"

Hi everyone,

We are using Akka Cluster Sharding with the following components:

akka: 2.5.31
scala: 2.11
akka-persistence-cassandra: 0.102

We hit a temporary persistence issue after a GC pause on one of the Cassandra instances:

Failed to persist event type [akka.cluster.sharding.Shard$EntityStopped] with sequence number [96688] for persistenceId [/sharding/REGIONShard/3684].

Since then, the nodes on which the error occurred have been logging this every few seconds:

<REGION>: Retry request for shard [3684] homes from coordinator at [Actor[akka://systemname/system/sharding/REGIONCoordinator/singleton/coordinator#-192485395]]. [232] buffered messages.

Two nodes were affected. On one node the log message went away after 14 hours. After restarting the other node (the one hosting the shard coordinator actor), the log messages disappeared there, too.

Expected: The affected Shard actors should be restarted after a while.

Actual: The Shard could not be reached anymore.
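
For context, here is a sketch of the sharding settings that seem relevant. It is not copied from our deployment. The persistenceId and the EntityStopped event above suggest remember-entities with the persistence state store, so those two values are inferred from the logs; the remaining values are the Akka 2.5 reference defaults as far as I know, assumed unchanged:

```
akka.cluster.sharding {
  # Inferred from the logs: EntityStopped is persisted under /sharding/<type>Shard/<id>
  state-store-mode = persistence
  remember-entities = on

  # Assumed defaults (not verified against our running config)
  retry-interval = 2 s          # how often the region re-requests shard homes
  shard-failure-backoff = 10 s  # delay before a failed Shard actor is restarted
  entity-restart-backoff = 10 s # delay before a remembered entity is restarted
  buffer-size = 100000          # max buffered messages per shard region
}
```

With shard-failure-backoff at 10 s I would have expected the Shard to come back shortly after Cassandra recovered, which is why the 14 hours of retries surprised us.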

I would be grateful for help.

Thanks in advance
Thomas