Consistency between persistent actors

We’re looking for some advise ensuring atomic operation on more than one actor.

We’ve considered two-phase-commit and saga pattern.
However, both seem bad because persisting to DB could fail.
(we want that all or no journals will be persisted by a command)

We thought that transaction coordinator actor MUST append all journals at once to ensure strong consistency.

So, we are considering transformed two-phase-commit.

For example, let’s consider an account as an actor.
If we want to transfer 100$ from account A to account B,
the coordinator actor tells account A to decrease 100$.
the coordinator actor tells account B to increase 100$.
If both responded, the coordinator actor persists journals PersistentRepr(payload: Decreased(100$), persistenceId: "A"), PersistentRepr(payload: Increased(100$), persistenceId: "B") then send COMMITED and which events are applied to A and B.

It’s similar to two-phase-commit except that coordinator performs persisting events.

Is this design reasonable?

1 Like

Hi, I’m just a lay developer. Let me think it through with you.

So what you are saying is that you want to commit the two events atomically. Naturally, if you have multiple entities “committing”, then the action is no longer atomic by definition, since you split the action up into two or more parts. It follows that only one entity (actor) must be responsible to commit the event(s) for it to be atomic.

So I believe the logical conclusion is that you need a coordinator is correct, and also your order of actions, where

  1. first, the actors receive a command
  2. if the command can be applied, generate events, and send them to a coordinator. Do not process any more commands until the coordinator reports back
  3. The coordinator persists the events to the database, atomically
  4. The coordinator sends the events back to the actors, where they are applied to the transient state

Let me throw the ball back at you though: What are you going to do if one of the two actors does not respond to step 4)? What I’d suggest is to time out on the side of the actor, and if the actor does not hear back from the coordinator in a certain amount of time, it restarts itself. This way you bring it back into sync with the source of truth, the journal. Do you see any loophole in that approach?

The problem I see if you’d roll back in step 4) is that the actor might have received the message, but the acknowledgment to the coordinator might have been lost.

Hey there,

You are correct that you require a coordinator.

A PersistentEntity in Akka is an aggregate root in Domain-Driven Design terms. So, if you must exchange messages between aggregate roots, you do require a mediator.

I, personally, would go on about things slightly differently.

The coordinator’s sole responsibility is to ensure that money are withdrawn from one account and deposited to the other. Therefore it should only send (idempotent) commands to account A and B and keep track of the completion of those requests - think persisting the state and sending additional requests when things fail. The entities are responsible for maintaining their own state. In practice this would look something along the lines of (assuming account A has the funds):

  1. Coordinator sends Withdraw 100$ request to account A.
  2. Account A persists Withdraw Pending 100$ event.
  3. Account A informs the Coordinator the amount has been reserved.
  4. Coordinator sends Deposit 100$ to account B.
  5. Account B persists Deposit 100$ event.
  6. Account B informs the Coordinator it’s taken the funds.
  7. Coordinator sends Complete Withdraw request to account A.
  8. Account A writes Withdraw Completed 100$ event.

I’m leaving out any communication details, since they don’t matter that much. The important thing is to make the withdraw and deposit processes idempotent, and keep sending requests until the transfer operation is successful or considered failed.

The approach would work whether you have control over the actual accounts or you’re making a transfer between two banks.

Hope this helps.

1 Like

Hi, thanks for your advice.
Some problems can occur if I don’t use At-Least-Once delivery.
I think restarting actor can always fix because journal table is the source of truth.

Yes, I agree that what you said is the general solution…

But I don’t know how to make commands idempotent.
The simplest way I thought is put ID into request,
but must actor store HANDLED REQUEST IDs?? Its size can diverge.
If I make some expiring date because of memory, then its idempotence will be imperfect…

I will appreciate if you can tell me how I can make command idempotent.
Thanks!

You can give each command both an ID and an expiration timestamp. Then, in the receiving actor, you can store the history of handled requests that have not yet expired. If it receives a duplicate command later, it would check the expiration timestamp first and refuse the command if it is already expired. Then, the sender may choose to reissue the command with the same request ID and an updated expiration timestamp, if it doesn’t have a way to verify that the original command was ever processed. The full set of handled request IDs would need to be stored somewhere else that can handle the size of the complete history (such as a database) so that you can do asynchronous reconciliation later and reverse any duplicates.

1 Like