Reliable message delivery with Actors

Just wondering if I’m on the right track. Say we have a scenario in which a child Actor has the job of loading something for me, with a high chance of failure. A parent actor is the final recipient of the loaded resource.

I want a mechanism where the actors try again and again to load the resource until it succeeds, whether this takes milliseconds or hours.

I have experimented with several patterns. This is what I’ve come up with:

  • Parent actor wraps child actor in BackoffSupervisor
  • Parent asks child for resource (with timeout)
  • Child actor receives command to load the resource
  • If something goes wrong in the child, stop the child, triggering a restart
  • In the restart hook of the child, forward the failed command to itself so it tries again
  • If the parent actor does not receive a reply in time, it asks again
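
To make the interplay of the steps above easier to reason about, here is a minimal, library-free Python sketch that simulates them in discrete ticks. It is not Akka code: the function name `simulate`, its parameters, and the simplified doubling backoff are all assumptions made for illustration, and a real BackoffSupervisor may behave differently in detail.

```python
def simulate(fail_count, ask_timeout=3, min_backoff=1, max_backoff=8, max_ticks=1000):
    """Simulate the parent/child retry protocol in discrete ticks (hypothetical model)."""
    backoff = min_backoff
    restart_at = None   # tick at which the stopped child comes back; None while alive
    pending = False     # True when the child has a load command to process
    attempts = 0        # load attempts made by the child
    asks = 0            # asks sent by the parent
    dead_letters = 0    # asks sent while the child was stopped

    for t in range(max_ticks):
        # The backoff has elapsed: the child restarts and its restart hook
        # re-queues the failed command.
        if restart_at is not None and t >= restart_at:
            restart_at = None
            pending = True

        # The parent asks at tick 0 and again after every timeout.
        if t % ask_timeout == 0:
            asks += 1
            if restart_at is None:
                pending = True
            else:
                dead_letters += 1

        # The child processes the command if it is alive and has one.
        if pending and restart_at is None:
            pending = False
            attempts += 1
            if attempts > fail_count:
                return {"tick": t, "asks": asks,
                        "attempts": attempts, "dead_letters": dead_letters}
            # Failure: stop the child and schedule the restart with a doubled backoff.
            restart_at = t + backoff
            backoff = min(backoff * 2, max_backoff)

    raise RuntimeError("no success within max_ticks")


result = simulate(fail_count=3)
print(result)
# {'tick': 7, 'asks': 3, 'attempts': 4, 'dead_letters': 1}
```

In this run the child succeeds on its fourth attempt, and one of the parent's three asks arrives while the child is stopped, so it is counted as a dead letter.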

Is this a correct approach, or could it be done better? At first I attempted to put all the restart logic in the child actor, but this becomes problematic because the parent actor has no way of knowing whether the request reached the child: commands sent between a stop and the following restart go to Dead Letters. And the child has no way of knowing whether the result has been received by the parent. That’s why there is retry logic on both sides.

What feels a bit “messy” with this approach is that the parent timeout is not coordinated with the restart backoff. The parent just resends the command again and again at the same interval.
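
The mismatch can be put into rough numbers. The following Python sketch compares a doubling backoff schedule with a fixed resend interval. The helper names `backoff_delays` and `resends_per_window` and the example values are assumptions for illustration. The estimate ignores any random jitter and assumes each backoff window starts right after a resend.

```python
def backoff_delays(min_backoff, max_backoff, restarts):
    """Delays before each restart when the backoff doubles up to a cap."""
    delays = []
    delay = min_backoff
    for _ in range(restarts):
        delays.append(delay)
        delay = min(delay * 2, max_backoff)
    return delays


def resends_per_window(delays, ask_interval):
    """Approximate number of fixed-interval resends that fall inside each
    backoff window, i.e. while the child is stopped."""
    return [delay // ask_interval for delay in delays]


delays = backoff_delays(min_backoff=1, max_backoff=30, restarts=6)
print(delays)                                       # [1, 2, 4, 8, 16, 30]
print(resends_per_window(delays, ask_interval=5))   # [0, 0, 0, 1, 3, 6]
```

With these example values, the longer the child keeps failing, the more of the parent's resends land in a window where nobody is there to receive them.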

I was thinking of possibly reversing the relationship and having the parent acknowledge reception of the resource to the child. That way all the retry logic is in the child, which can do all the restarting and redelivery it wants until the parent acknowledges that it has received the resource. The parent would simply wait for the resource indefinitely without a timeout, i.e. the request would not use the ask pattern but be sent as a plain command.
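
The acknowledgement idea can be sketched as follows. This is a library-free Python model, not Akka code: the `Parent` class, its `lose_messages` and `lose_acks` parameters (which simulate lost deliveries and lost acknowledgements), and `deliver_until_acked` are all hypothetical names. One consequence the sketch makes visible is that redelivery until acknowledged can deliver the same resource more than once, so the receiving side has to tolerate duplicates.

```python
class Parent:
    """Hypothetical receiver that acknowledges each delivery by its id."""

    def __init__(self, lose_messages=0, lose_acks=0):
        self.lose_messages = lose_messages  # deliveries lost before arrival
        self.lose_acks = lose_acks          # acks lost on the way back
        self.resources = {}                 # delivery_id -> resource
        self.deliveries_seen = 0            # includes duplicates

    def receive(self, delivery_id, resource):
        if self.lose_messages > 0:
            self.lose_messages -= 1
            return None                     # message never arrived
        self.deliveries_seen += 1
        # Storing by delivery id makes duplicate deliveries harmless.
        self.resources.setdefault(delivery_id, resource)
        if self.lose_acks > 0:
            self.lose_acks -= 1
            return None                     # received, but the ack was lost
        return delivery_id                  # acknowledgement


def deliver_until_acked(parent, delivery_id, resource, max_attempts=50):
    """Child side: redeliver until the parent's ack comes back."""
    for attempt in range(1, max_attempts + 1):
        if parent.receive(delivery_id, resource) == delivery_id:
            return attempt
    raise RuntimeError("no acknowledgement within max_attempts")


parent = Parent(lose_messages=2, lose_acks=1)
attempts = deliver_until_acked(parent, delivery_id=1, resource="resource")
print(attempts, parent.deliveries_seen, parent.resources)
# 4 2 {1: 'resource'}
```

Here the child needs four attempts: two deliveries are lost, the third arrives but its acknowledgement is lost, and the fourth is a duplicate that the parent stores only once.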