Akka Cluster: Decreasing system performance with many active actors

Hello everyone,

we are running an actor system (cluster sharding) with about 5 million actors per node. With a constant inbound message rate, we see the system slowing down as more and more actors are kept in memory.

few actors: 1 ms
5M actors: > 200 ms

Does the number of active but unused actors have a significant negative impact on latency?

System:

  • 5 node cluster
  • akka-cluster-sharding
  • akka-persistence-cassandra

For example, does the dispatcher need to check 5M mailboxes when only 1K mailboxes have messages?

Thanks in advance and best regards
Thomas S.

Do you still observe the increased latency after a period of stability (once the 5 million actors have been created, rebalanced, and the shard regions have all had a chance to cache the whereabouts of the entities’ shards)? I would need to check, but my gut feeling tells me this has more to do with the communication path than with individual actors replying to messages.

I’m wondering what you’re doing with 5 million actors per node :) At this scale it may make sense to have a few more nodes around.

One thing just crossed my mind: < 1 ms means you didn’t cross a network boundary, so the additional latency you see is possibly related to the actors no longer all being local but instead rebalanced onto other nodes.

Scratch that - I was thinking in μs.

To get a better sense of what is going on, could you share more details about the nodes, mainly how many CPU cores they have (or vCPUs, if this runs on virtualized hardware)?

The dispatcher only acts on actors that have messages; it does not go through all actor mailboxes.

The JVM heap is a shared resource though, so GC times could be one thing to look into.

12 CPUs and 64 GB RAM each. We already switched from G1 to Shenandoah because of stop-the-world GC pauses.

In our use case we are constantly spawning (empty recovery) and passivating about 20% of these actors (sharded entities). Cassandra read and write latencies look fine.
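
For context, passivation here follows the usual receive-timeout pattern; a minimal sketch (Akka classic Java API, timeout value and class name are illustrative, not our real code):

```java
// Minimal sketch of the receive-timeout passivation pattern for a sharded
// entity (Akka classic Java API). The timeout value is illustrative.
import akka.actor.AbstractActor;
import akka.actor.PoisonPill;
import akka.actor.ReceiveTimeout;
import akka.cluster.sharding.ShardRegion;

import java.time.Duration;

public class IdleEntity extends AbstractActor {

  @Override
  public void preStart() {
    // Any message resets this timer; ReceiveTimeout fires after 2 minutes of idleness.
    getContext().setReceiveTimeout(Duration.ofMinutes(2));
  }

  @Override
  public Receive createReceive() {
    return receiveBuilder()
        .match(ReceiveTimeout.class, timeout ->
            // Ask the parent shard to stop us; it buffers new messages while we stop.
            getContext().getParent().tell(
                new ShardRegion.Passivate(PoisonPill.getInstance()), getSelf()))
        .matchAny(msg -> {
          // ... normal message handling ...
        })
        .build();
  }
}
```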

Just for testing purposes we changed the AbstractPersistentActor to an AbstractActor and implemented recovery and persistence ourselves by loading and saving the state snapshot directly from Cassandra (blocking I/O *). A rough sketch follows below.

  • I know this is evil magic. It was just for testing … We won’t do it again : )
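
Roughly in this spirit (simplified sketch only; the table, column names and the opaque byte[] state are made up for this post, and the blocking calls run on the actor’s dispatcher, which is exactly the evil part):

```java
// Rough sketch of the test hack: a plain AbstractActor that loads and stores
// the whole state snapshot directly via the Cassandra driver, with blocking
// I/O (test-only). Table and column names are illustrative.
import akka.actor.AbstractActor;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import com.datastax.oss.driver.api.core.cql.Row;

import java.nio.ByteBuffer;

public class BlockingSnapshotEntity extends AbstractActor {

  private final CqlSession session;        // shared session, passed in via Props
  private final String entityId;
  private final PreparedStatement selectStmt;
  private final PreparedStatement insertStmt;
  private byte[] state = new byte[0];      // serialized state, opaque for this sketch

  public BlockingSnapshotEntity(CqlSession session, String entityId) {
    this.session = session;
    this.entityId = entityId;
    this.selectStmt = session.prepare(
        "SELECT state FROM entity_snapshots WHERE entity_id = ?");
    this.insertStmt = session.prepare(
        "INSERT INTO entity_snapshots (entity_id, state) VALUES (?, ?)");
  }

  @Override
  public void preStart() {
    // "Recovery": blocking read of the latest snapshot.
    Row row = session.execute(selectStmt.bind(entityId)).one();
    if (row != null) {
      ByteBuffer buf = row.getByteBuffer("state");
      state = new byte[buf.remaining()];
      buf.get(state);
    }
  }

  @Override
  public Receive createReceive() {
    return receiveBuilder()
        .matchAny(msg -> {
          // ... apply the message to the in-memory state ...
          // "Persist": blocking write of the full snapshot on every change.
          session.execute(insertStmt.bind(entityId, ByteBuffer.wrap(state)));
        })
        .build();
  }
}
```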

Test results:

  1. The throughput doubled, which confuses me, because in this case we save the entire state to Cassandra on every change instead of using AbstractPersistentActor.persist to save only the commands (CQRS), which should be much more efficient (see the sketch after this list). (We already checked the max-concurrent-recoveries setting.)

  2. The AbstractPersistentActor implementation itself seems to consume about 600 bytes more heap per actor instance (an AbstractActor instance is about 400 bytes).
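
For comparison, a minimal sketch of the persist()-based version the hack replaced; command and event types are placeholders, not our real model:

```java
// Standard event-sourced version: commands come in, only the resulting events
// are written to the journal, and state is rebuilt from events on recovery.
// Command/event types are placeholders (a real event needs a proper serializer).
import akka.persistence.AbstractPersistentActor;

import java.io.Serializable;

public class PersistentEntity extends AbstractPersistentActor {

  public static final class Increment {}                           // command (placeholder)
  public static final class Incremented implements Serializable {} // event (placeholder)

  private final String entityId;
  private long counter = 0;  // in-memory state

  public PersistentEntity(String entityId) {
    this.entityId = entityId;
  }

  @Override
  public String persistenceId() {
    return "entity-" + entityId;
  }

  @Override
  public Receive createReceiveRecover() {
    // Replayed from the journal on recovery.
    return receiveBuilder()
        .match(Incremented.class, evt -> counter += 1)
        .build();
  }

  @Override
  public Receive createReceive() {
    return receiveBuilder()
        .match(Increment.class, cmd ->
            // Only the event is persisted, not the whole state.
            persist(new Incremented(), evt -> counter += 1))
        .build();
  }
}
```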

Maybe the whole problem is caused by GC. Right now it looks like the AbstractPersistentActor implementation is heap-expensive but unfortunately not very fast…

Thanks for your support
Thomas S.

That the actors are persistent actors is relevant, because it means they also share the connection pool and the database itself as a resource. There is a limit on concurrent replays, configured through akka.persistence.max-concurrent-recoveries, that you could try tweaking. It is also possible to trim the Cassandra plugin’s replay work by turning off plugin features you do not use (deletions, for example).
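
For reference, those knobs would look roughly like this. The Cassandra plugin key names assume the 1.x plugin (akka.persistence.cassandra.* prefix); older 0.x versions use the cassandra-journal.* prefix, so please verify them against the reference.conf of the version you run:

```java
// Sketch of the tuning knobs mentioned above, expressed programmatically for
// illustration (normally they live in application.conf).
import com.typesafe.config.Config;
import com.typesafe.config.ConfigFactory;

public class RecoveryTuning {
  public static Config tuned() {
    return ConfigFactory.parseString(
            // Limit on simultaneous persistent-actor recoveries; raise carefully.
            "akka.persistence.max-concurrent-recoveries = 100\n"
            // Skip the extra deleted-to lookup on recovery if events are never deleted.
            + "akka.persistence.cassandra.journal.support-deletes = off\n"
            // Skip tag bookkeeping if events-by-tag queries are not used.
            + "akka.persistence.cassandra.events-by-tag.enabled = off\n")
        .withFallback(ConfigFactory.load());
  }
}
```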

As I mentioned, we checked max-concurrent-recoveries and increased it, which gave an improvement of about 5-8%.

akka.persistence.cassandra: Do you mean something like disabling events-by-tag and journal.support-deletes? This seems very promising. We’ll definitely try it.

Also we’ll try to reduce the heap size per node by using more nodes.

Yes, at least support-deletes leads to an extra query when a persistent actor recovers (since it needs to check to what offset, if any, the events were deleted). I’m not entirely sure events-by-tag causes any overhead unless you actually use tags.

After optimizing the heap size, fixing a message serialization size issue and moving to smaller nodes, the system now performs as expected.

Thanks for your help,
Thomas