Remembering entities with cluster sharding

I would like to use the remember entities feature to automatically recover my persistent actors every time my application starts up (after it was stopped, crashed, or shut down for any other reason).

I am currently testing this locally, but it does not work as expected. I am not sure if I am doing something wrong or if I misunderstand the feature in general.

What I am testing:

  • run my app with sbt run
  • send one of my persistent entities a message and let it store some event
  • stop the app by stopping sbt
  • start the app again with sbt run

I expected the entity to be recovered when I restart the app, but this did not happen.
In the event journal table I can see that an event is stored for the entity, and I am using the following code on app startup:

val sharding = ClusterSharding(system)
val shardingSettings = ClusterShardingSettings(system).withRememberEntities(true)

val TypeKey: EntityTypeKey[S#Command] = typeKeyForSaga(sagaTypeKeyName)
sharding.init(
  Entity(TypeKey)(ctx => saga.behaviour(ctx.entityId).asInstanceOf[Behavior[S#Command]])
    .withSettings(shardingSettings)
)
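
(For completeness, the imports this snippet relies on; my own helpers like typeKeyForSaga, sagaTypeKeyName and saga are defined elsewhere.)

import akka.actor.typed.Behavior
import akka.cluster.sharding.typed.ClusterShardingSettings
import akka.cluster.sharding.typed.scaladsl.{ ClusterSharding, Entity, EntityTypeKey }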

Am I missing something, or is this not supposed to work like this?

I am using akka-persistence-typed and akka-cluster-sharding 2.6.4 as part of the newest Lagom framework version. So my app is a Lagom service, but in this case I created the entities “myself” instead of using the PersistentEntity that comes with Lagom.

Hi @leozilla

Yes, the entity should start again.
The storage used by default is Distributed Data, with LMDB as the durable store. If the same storage isn’t present when you start again, the entities won’t be restarted.

Is your sbt run using a dynamic port? If so, set akka.cluster.sharding.distributed-data.durable.lmdb.dir, as the default directory includes the remoting host and port.
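
For example, something along these lines (the directory is just an example; any fixed path works):

akka.cluster.sharding.distributed-data.durable.lmdb.dir = "/var/lib/my-service/ddata"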

Full details at:
https://doc.akka.io/docs/akka/current/typed/cluster-sharding.html#behavior-when-enabled


@chbatey thanks for the reply, but unfortunately I still haven’t managed to get the entity recovered after startup.

I now tried the following ddata config:

akka.cluster {
  sharding.state-store-mode = ddata
  distributed-data {
    durable.keys = ["*"] # not sure which key I need to configure here so I just used * here to get something working
    durable.lmdb.dir = "/home/david/tmp/lmdb"
  }
}

and I can see that an .mdb file is created in the configured directory, but I don’t know how to inspect it.
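
(Side note, and an assumption on my part: the LMDB command line tools mdb_stat and mdb_dump should be able to read such a directory, though the values are Akka-serialized blobs, so not very readable.)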

I expect my entity to continue its operation after startup, but I cannot see any logs indicating this. Here are the entity’s signal handlers:

receiveSignal {
  case (State.WaitingForTicketFilesBeeingGenerated(orderId, tickets), RecoveryCompleted) =>
    logger.debug("Saga recovery completed, continuing saga with requesting file generation")
    fetchTicketInfoAndRequestFileGeneration(orderId, tickets)
  case (_, RecoveryFailed(ex)) => logger.error("Saga recovery failed", ex)
  case (_, signal)             => logger.info(s"Signal $signal")
}
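
For context, this handler hangs off the EventSourcedBehavior roughly as in the sketch below; the Command/Event/State types and the handlers are dummies standing in for my actual saga types.

import akka.persistence.typed.{ PersistenceId, RecoveryCompleted, RecoveryFailed }
import akka.persistence.typed.scaladsl.{ Effect, EventSourcedBehavior }

// Dummy protocol standing in for the real saga types
sealed trait Command
sealed trait Event
final case class State(orderId: Option[String])

def behaviour(entityId: String) =
  EventSourcedBehavior[Command, Event, State](
    persistenceId = PersistenceId("GenerateTicketFilesSaga", entityId),
    emptyState = State(None),
    commandHandler = (_, _) => Effect.none,
    eventHandler = (state, _) => state
  ).receiveSignal {
    case (state, RecoveryCompleted) =>
      // with remember-entities this should fire again after a restart
      println(s"Recovery completed in state $state")
    case (_, RecoveryFailed(ex)) =>
      println(s"Recovery failed: $ex")
  }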

This is what I see in the Akka debug logs:

2020-06-29 13:50:19.086 [INFO ] LmdbDurableStore - Using durable data in LMDB directory [/home/david/tmp/lmdb]
2020-06-29 13:50:19.296 [INFO ] Cluster - Cluster Node [akka://ticket-application@127.0.0.1:34997] - Node [akka://ticket-application@127.0.0.1:34997] is JOINING itself (with roles [dc-default]) and forming new cluster
2020-06-29 13:50:19.298 [INFO ] Cluster - Cluster Node [akka://ticket-application@127.0.0.1:34997] - is the new leader among reachable nodes (more leaders may exist)
2020-06-29 13:50:19.317 [INFO ] Cluster - Cluster Node [akka://ticket-application@127.0.0.1:34997] - Leader is moving node [akka://ticket-application@127.0.0.1:34997] to [Up]
2020-06-29 13:50:19.464 [INFO ] TicketServiceConfig - Config loaded: TicketServiceConfig(AwsConfig(AwsS3Config(Some(http://127.0.0.1:4572),eu-central-1,AwsCredentialsConfig(AAAAARXFKRBPZBIIKUPA,*****),ew-tickets),AwsLambdaConfig(http://127.0.0.1:3100/ticket)))
2020-06-29 13:50:19.512 [DEBUG] LmdbDurableStore - Init of LMDB in directory [/home/david/tmp/lmdb] took [427 ms]
2020-06-29 13:50:19.610 [DEBUG] LmdbDurableStore - load all of [1] entries took [97 ms]
2020-06-29 13:50:19.613 [DEBUG] Replicator - Loading 1 entries from durable store took 537 ms, stashed 6
13:50:19.896 INFO eventworld.gateway.events.scanapp.EntryManagementServiceActor [{}] - Starting to consume from ticket status updates topic
2020-06-29 13:50:20.042 [INFO ] NewsletterServiceConfig - Config loaded: NewsletterServiceConfig(http://localhost:3000/newsletter/verify,*****,ewtest@eventworld.com,AwsConfig(AwsSesConfig(eu-west-1,AwsCredentialsConfig(AKIAUBCVZ55H3KD44P2C,*****))))
13:50:20.073 INFO play.api.Play [{}] - Application started (Dev) (no global state)
2020-06-29 13:50:20.252 [INFO ] ClusterSharding - Starting Shard Region [GenerateTicketFilesSaga]...
2020-06-29 13:50:20.322 [DEBUG] ShardRegion - Idle entities will not be passivated because 'rememberEntities' is enabled.
2020-06-29 13:50:20.325 [DEBUG] ShardRegion - GenerateTicketFilesSaga: Coordinator moved from [] to [akka://ticket-application@127.0.0.1:34997]
2020-06-29 13:50:20.354 [INFO ] ClusterSingletonManager - Singleton manager starting singleton actor [akka://ticket-application/system/sharding/GenerateTicketFilesSagaCoordinator/singleton]
2020-06-29 13:50:20.356 [DEBUG] TimerScheduler - Cancel timer [RegisterRetry] with generation [2]
2020-06-29 13:50:20.361 [INFO ] ClusterSingletonManager - ClusterSingletonManager state change [Start -> Oldest]
2020-06-29 13:50:20.371 [DEBUG] Replicator - Received Get for key [GenerateTicketFilesSagaCoordinatorState].
2020-06-29 13:50:20.372 [DEBUG] Replicator - Received Get for key [shard-GenerateTicketFilesSaga-all].
2020-06-29 13:50:20.379 [INFO ] DDataShardCoordinator - ShardCoordinator was moved to the active state State(Map())
2020-06-29 13:50:20.619 [DEBUG] DDataShardCoordinator - ShardRegion registered: [Actor[akka://ticket-application/system/sharding/GenerateTicketFilesSaga#-442245352]]
2020-06-29 13:50:20.620 [DEBUG] DDataShardCoordinator - Publishing new coordinator state [State(Map())]
2020-06-29 13:50:20.633 [DEBUG] Replicator - Received Update for key [GenerateTicketFilesSagaCoordinatorState].
2020-06-29 13:50:20.636 [DEBUG] DDataShardCoordinator - The coordinator state was successfully updated with ShardRegionRegistered(Actor[akka://ticket-application/system/sharding/GenerateTicketFilesSaga#-442245352])
2020-06-29 13:50:20.637 [DEBUG] TimerScheduler - Cancel timer [RegisterRetry] with generation [4]
2020-06-29 13:50:20.638 [DEBUG] DDataShardCoordinator - New coordinator state after [ShardRegionRegistered(Actor[akka://ticket-application/system/sharding/GenerateTicketFilesSaga#-442245352])]: [State(Map())]

After this I don’t see any further relevant logs about the GenerateTicketFilesSaga.

I think I am close to getting it working, but now I don’t know what else I can try…

I think you have used the wrong config path. It should be akka.cluster.sharding.distributed-data (confirm in the docs).

You should not need to configure durable.keys; that is automatic.

I have now managed to get it working by using the config shown below. It seems I did have to configure the keys; I guess this setting was disabled by Lagom.

akka {
  cluster.sharding {
    state-store-mode = ddata

    distributed-data.durable {
      keys = ["shard-*"]
      lmdb.dir = "/tmp/lmdb"
    }
  }
}
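
As far as I can tell, the "shard-*" pattern matches the Replicator keys that sharding uses for remembered entities, e.g. the shard-GenerateTicketFilesSaga-all key visible in the debug logs above.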

Btw: I also implemented a new DurableStore that uses JDBC.
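
In case it is useful to anyone, the rough shape of it is sketched below. This is only a sketch, not the actual implementation: the table and column names, the H2-style MERGE upsert and the "jdbc.url" config key are made up for illustration, and it assumes the akka.cluster.ddata.DurableStore protocol (LoadAll / LoadData / LoadAllCompleted / Store).

import java.sql.DriverManager

import akka.actor.Actor
import akka.cluster.ddata.DurableStore._
import akka.serialization.{ SerializationExtension, Serializers }
import com.typesafe.config.Config

class JdbcDurableStore(config: Config) extends Actor {
  private val serialization = SerializationExtension(context.system)
  private val serializer    = serialization.serializerFor(classOf[DurableDataEnvelope])
  private val conn          = DriverManager.getConnection(config.getString("jdbc.url"))

  override def postStop(): Unit = conn.close()

  def receive: Receive = {
    case LoadAll =>
      // hand everything stored in previous runs back to the Replicator
      val rs = conn.createStatement()
        .executeQuery("SELECT key_id, manifest, envelope FROM durable_data")
      var loaded = Map.empty[String, DurableDataEnvelope]
      while (rs.next()) {
        val envelope = serialization
          .deserialize(rs.getBytes("envelope"), serializer.identifier, rs.getString("manifest"))
          .get
          .asInstanceOf[DurableDataEnvelope]
        loaded += rs.getString("key_id") -> envelope
      }
      if (loaded.nonEmpty) sender() ! LoadData(loaded)
      sender() ! LoadAllCompleted

    case Store(key, data, reply) =>
      // upsert one key; acknowledge so the Replicator knows the write is durable
      val ps = conn.prepareStatement(
        "MERGE INTO durable_data (key_id, manifest, envelope) KEY (key_id) VALUES (?, ?, ?)")
      ps.setString(1, key)
      ps.setString(2, Serializers.manifestFor(serializer, data))
      ps.setBytes(3, serializer.toBinary(data))
      ps.executeUpdate()
      ps.close()
      reply.foreach(r => r.replyTo ! r.successMsg)
  }
}

The store actor is then plugged in via the durable store-actor-class setting (which by default points at the LMDB store), if I remember the setting name correctly.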