Performance of recovery of persistent actors (cassandra backend)

struijenID · August 22, 2018, 3:18pm

Hi,

Stress tests I recently ran on the recovery of persistent actors show that recovering an actor with persistence ID “A” takes significantly longer if another instance of that actor, with persistence ID “B”, has also persisted some events. Furthermore, while the actor is alive, the more events have been persisted, the longer each individual persist call takes to complete.

I am using a relatively simple ReplayActor that increments a counter every time a command comes in and persists an event every time the counter changes. Source code is here: https://pastebin.com/VRLcD4ev. The test sends 10k UpdateCounterCommands, waits until the actor’s ReceiveTimeout makes the actor stop itself, and finally recreates an actor with the same persistence ID and sends it a new UpdateCounterCommand. Logs of my tests, showing the timings of persisting 500 events and the recovery of each actor, are here: https://pastebin.com/zy172Gy8.

Below is a quick rundown of the recovery times:

Actor 1: Replaying eventlog of 10000 messages took 24857ms PersistenceId: 2b922de1-d088-49b4-953f-1fc1b11195f8
Actor 2: Replaying eventlog of 10000 messages took 45666ms PersistenceId: 592119ae-3ae6-494b-898c-0e0ef868508b
Actor 3: Replaying eventlog of 10000 messages took 75792ms PersistenceId: 868d437b-6e9d-4e1a-8fe1-9bb8750dd714
Actor 4: Replaying eventlog of 10000 messages took 105311ms PersistenceId: 12aeeb05-7c87-4c27-9206-8f7081bc0e3c
Actor 5: Replaying eventlog of 10000 messages took 136015ms PersistenceId: 44cf1824-8a88-4f4d-b9a0-4b081f743264
Actor 6: Replaying eventlog of 10000 messages took 169328ms PersistenceId: b56624f9-0c17-444b-9ac6-51e302ae08a4
Actor 7: Replaying eventlog of 10000 messages took 198782ms PersistenceId: d4192d8e-2ee6-4e7e-9e6a-758f59340a8a
Actor 8: Replaying eventlog of 10000 messages took 245571ms PersistenceId: eb1c0a14-8594-4b36-8cb1-de28d827c077
Actor 9: Replaying eventlog of 10000 messages took 276393ms PersistenceId: 863c5ee2-80eb-4b70-8517-84ae9c40982e

Please note that each actor uses a different persistence ID.
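
To make the trend in these numbers easier to see, here is a minimal sketch in plain Java that computes how much longer each actor's recovery took than the previous one. The class and method names (`RecoveryGrowth`, `increments`, `averageIncrement`) are made up for illustration and are not part of the test code; the only inputs are the nine timings listed above.

```java
import java.util.Arrays;

public class RecoveryGrowth {
    // Recovery times in milliseconds, copied from the rundown above (Actor 1 to Actor 9).
    static final long[] RECOVERY_MS = {
        24857, 45666, 75792, 105311, 136015, 169328, 198782, 245571, 276393
    };

    // Difference between each actor's recovery time and the previous actor's.
    static long[] increments(long[] times) {
        long[] diffs = new long[times.length - 1];
        for (int i = 1; i < times.length; i++) {
            diffs[i - 1] = times[i] - times[i - 1];
        }
        return diffs;
    }

    // Mean of the per-actor increments, in milliseconds.
    static double averageIncrement(long[] times) {
        return Arrays.stream(increments(times)).average().orElse(0.0);
    }

    public static void main(String[] args) {
        System.out.println("Increments (ms): " + Arrays.toString(increments(RECOVERY_MS)));
        System.out.printf("Average increment: %.0f ms%n", averageIncrement(RECOVERY_MS));
    }
}
```

For these timings the increments come out between roughly 21 s and 47 s, with an average of about 31 s per additional actor. In other words, recovery time grows roughly linearly with the total number of events persisted so far in the run, not with the 10000 events belonging to the recovering actor alone.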

I have run these tests in Java with Akka Persistence Cassandra 0.60 and 0.89. Both show the same pattern, and recovery in v0.89 is even slower than in v0.60.

This came as a surprise to me. I was expecting the number of events persisted by an actor with one persistence ID to be irrelevant to the recovery speed of an actor with another persistence ID.

Were my expectations wrong? Is this intended behavior?

Thanks in advance!

struijenID · August 24, 2018, 2:13pm

It turns out that lidalia’s in-memory logger was on the classpath and was keeping all log messages in memory, which slowed everything down.

Replacing it with Logback solved the issue.
