Inefficiencies in recovering with remember entities


(Justin Peel) #1

I run an Akka Cluster Sharding app with a large number of entities, using Distributed Data for storing the remembered entity ids. The recovery strategy is constant. I’ve noticed when rolling the app that a bunch of StartEntityAck messages go to dead letters. This happens because the acks don’t come back within 2 seconds, so the StartEntity messages are resent. Personally I don’t see the point of resending the StartEntity messages, but that is more a symptom that led me to the underlying inefficiency. This app has many messages being sent to the shards right after the cluster is joined.
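For context, the setup described above roughly corresponds to configuration along these lines (key names are from Akka’s classic Cluster Sharding reference config; the rate values are illustrative, not taken from my app):

```hocon
akka.cluster.sharding {
  remember-entities = on
  # remembered entity ids stored in Distributed Data rather than persistence
  state-store-mode = ddata
  # restart remembered entities at a constant rate instead of all at once
  entity-recovery-strategy = "constant"
  entity-recovery-constant-rate-strategy {
    frequency = 100 ms        # illustrative value
    number-of-entities = 5    # illustrative value
  }
}
```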

I looked in DDataShard and found something that I think is majorly slowing down the processing of messages by the shard actors. When a shard receives a message for an entity that hasn’t been started, the message is buffered and the entity id is added to the ddata. The shard then enters a state where it waits for acknowledgement that the ddata update has completed and stashes all incoming messages; when the acknowledgement arrives, everything is unstashed. The main inefficiency I see is that there is no check whether the entity id is already in the remembered entities before attempting to add it to the ddata. This means a lot of unnecessary stashing and unstashing, with delays in between waiting for ddata to be updated. With a large number of messages in the mailbox, this can be very inefficient.
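To make the proposed check concrete, here is a minimal sketch (plain Scala, not the actual DDataShard internals; `State`, `deliver`, and the update counter are hypothetical names for illustration). The idea is simply: consult the locally known remembered-entities set first, and only pay the ddata update plus stash/unstash round-trip for ids that are genuinely new.

```scala
// Hypothetical model of the membership check suggested above.
object ShardBufferSketch {
  // remembered: entity ids the shard already knows are in ddata
  // ddataUpdates: how many simulated ddata write round-trips were paid
  final case class State(remembered: Set[String], ddataUpdates: Int)

  // Handle a message for `entityId`, returning the new shard state.
  def deliver(state: State, entityId: String): State =
    if (state.remembered.contains(entityId))
      state // already remembered: no ddata write, no stashing needed
    else
      // genuinely new id: add to ddata (one round-trip with stashing)
      State(state.remembered + entityId, state.ddataUpdates + 1)

  def main(args: Array[String]): Unit = {
    // 1000 messages for the same not-yet-started entity:
    // with the check, only the first one costs a ddata update.
    val end = Seq.fill(1000)("entity-1").foldLeft(State(Set.empty, 0))(deliver)
    println(end.ddataUpdates) // prints 1
  }
}
```

In the real shard the check would be against the shard’s already-replicated remembered-entities set before initiating the `Replicator.Update`, so messages for known ids skip the waiting-for-update state entirely.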

On a separate note, I’m not sure that it is necessary to wait for the ddata to be updated before processing more messages in the shard. Is it necessary? Is it possible to at least have a config that allows the shards to not wait for the ddata to be updated?


(Patrik Nordwall) #2

In general the remember entities facility is rather heavy because it must keep track of each and every entity. I would explore other solutions before using it on a large number of entities that are frequently started, stopped or rebalanced.

That said, performance improvements are welcome. Please create an issue with a reproducer example that illustrates the problem, or a pull request with a reproducer test.