Various question about failures with akka persistance and cluster sharding

adelegue · July 12, 2018, 8:27am

Dear hakkers,

I was wondering how to handle failure using akka persistance and cluster sharding.

When the client send a command, the command hit a node then the command is routed to the good shard and then handled by the persitant actor.

From the client side, I want to have a response to tell the end user what append: command invalid, failure …

When something bad append in the cluster I think the caller will end with an AskTimeoutException. In that case how could I know if the command has been handled or not, to notify to the end user ?
For example, if the command is handled by the persistant actor but the node handling the sharding crash at the same moment, the caller will end with an AskTimeoutException but the command is handled.
If the persistant actor crash, the caller will end with an AskTimeoutException and the command is not handled.

What is the best way to handle those cases ?

Thanks,
Alex.

johanandren · July 13, 2018, 11:14am

This is one of the good old classic problems with distributed systems: You cannot know the difference between an ask that failed with timeout because the sent message got lost and was not received, the message was received and processed but caused the process to fail before a response was sent (potential partially applied) or the message was received but the response was lost.

To get that delivery guarantee you will have to do resending. There is a component for at least once delivery in Akka (https://doc.akka.io/docs/akka/current/persistence.html#at-least-once-delivery) but it will then require a roundtrip to persistence on both sides, pretty much the only way to guarantee 100% delivery in the face of nodes crashing etc.

However if you want something simpler with a bit less guarantee you can for example make the action idempotent, make sure that the same command several times only does the change once, and do resends on timeout. We are hoping to introduce something more lightweight to cover such requirements in Akka Typed (see this PR: https://github.com/akka/akka/pull/25099)

adelegue · July 13, 2018, 12:05pm

Ok, that what I thought.

Thanks for your answers!

Topic		Replies	Views
Akka-Cluster: "Failed to persist event" => "Retry request for shard" Akka akka-cluster	0	477	October 28, 2020
Having trouble running Akka proof-of-concept for Akka Event Sourcing Akka	1	239	May 4, 2022
Hello, do you know how to failure immediately when sending message to non-existent cluster sharding ? Is seems akka buffered messages and keep retrying Akka Cluster akka-typed , akka-cluster	2	429	November 4, 2022
Cluster Sharding with Persistent Entities Persistence / Event Sourcing	7	881	April 21, 2020
Akka-cluster-sharding HandOffStopper issue Akka Cluster	1	825	September 10, 2019

Various question about failures with akka persistance and cluster sharding

Related Topics