Hey guys! Hope you're all doing fine. We have a microservice using Akka Classic with persistent actors backed by Mongo (via reactivemongo).
Yesterday we saw
akka.persistence.RecoveryTimedOut: Recovery timed out, didn't get snapshot within 30000 milliseconds
during a deploy. We estimate that each node was trying to recover ~6k actors at that point.
I've been thinking about two parameters:
the number of concurrent recoveries per node
the connection pool to our Mongo datastore (this is controlled by nbChannelsPerNode)
Should those values be equal (or at least similar)? In our case we have them set to 250
concurrent recoveries over 70 connections per node.
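For reference, a rough sketch of where these two knobs would sit in config. This is not our actual file: it assumes the recovery limit is Akka's `akka.persistence.max-concurrent-recoveries`, and that the pool size is passed as a ReactiveMongo connection URI option. The `our-service.mongo.uri` key, the host and the database name are made-up placeholders, and depending on the ReactiveMongo version the option may be spelled `nbChannelsPerNode` or `rm.nbChannelsPerNode`.

```
# Sketch only; the values are the ones mentioned above.
akka.persistence {
  # Maximum number of persistent actors that can be recovering at the same time
  max-concurrent-recoveries = 250
}

# Hypothetical key; host and database are placeholders.
# Option name depends on the ReactiveMongo version (assumption).
our-service.mongo.uri = "mongodb://mongo-host:27017/our-db?rm.nbChannelsPerNode=70"
```

With these numbers, up to 250 recovering actors share 70 connections per node, so roughly 3 to 4 recoveries compete for each connection, and ~6k actors go through in about 24 batches of 250.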
PS: We're aware of this config, but 30 seconds is a lot of time for our SLA, so we need recovery to be much faster.