Supervision: optimal use for retry logic and understanding logged error

akka
java

(daz) #1

Hi,

I am playing around with supervision in Java Akka 2.5.8 / 2.12.

I would like to have the ability to decorate any old actor’s behaviour with retry logic.

I have the following:

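// Imports inferred from the identifiers used below; Awaitility and JUnit 5 are assumptions.
import static akka.actor.SupervisorStrategy.escalate;
import static akka.actor.SupervisorStrategy.restart;
import static java.util.concurrent.TimeUnit.MILLISECONDS;
import static java.util.concurrent.TimeUnit.SECONDS;
import static org.awaitility.Awaitility.await;

import java.util.concurrent.TimeUnit;

import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.OneForOneStrategy;
import akka.actor.Props;
import akka.japi.pf.DeciderBuilder;
import akka.pattern.Backoff;
import akka.pattern.BackoffSupervisor;
import akka.testkit.javadsl.TestKit;
import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;
import scala.Option;
import scala.concurrent.duration.Duration;

public class TestSupervisionDocs {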
    static int counter = 0;
 
    private static ActorSystem system;
 
    @Test public void supervisedActorIsRestarted() {
        supervised().tell("foo", ActorRef.noSender());
 
        await().until(() -> counter == 2);
    }
 
    private ActorRef supervised() {
        return system.actorOf(supervised(SupervisedActor.props()), "child-supervisor");
    }
 
    private Props supervised(final Props supervised) {
        return BackoffSupervisor.props(Backoff.onStop(
                supervised, "child-actor",
                Duration.create(100, MILLISECONDS),
                Duration.create(1, SECONDS),
                0.2).withSupervisorStrategy(
                new OneForOneStrategy(10, Duration.create(1, TimeUnit.MINUTES),
                        DeciderBuilder
                                .match(RestartMe.class, e -> {
                                    System.out.println("RETRY caught------------------");
                                    return restart();
                                })
                                .matchAny(e -> escalate()).build())));
    }
 
    @BeforeAll
    static void beforeClass() {
        system = ActorSystem.apply("TestActorSystem");
    }
 
    @AfterAll
    static void afterAll() {
        TestKit.shutdownActorSystem(system);
    }
 
    static class SupervisedActor extends AbstractActor {
 
        private ActorRef originator;
 
        @Override public Receive createReceive() {
            return receiveBuilder()
                    .match(String.class, m -> {
                        originator = getSender();
                        System.out.println("received. Count=" + counter);
                        if (counter++ < 3) {
                            throw new RestartMe() ;
                        }
                    })
                    .matchAny(m -> System.out.println("matched any: " + m))
                    .build();
        }
 
        @Override public void preRestart(Throwable reason, Option<Object> message) {
            System.out.println("SupervisedActor restarting with message " + message.get() + " of type " + message.get().getClass().getSimpleName());
            getContext().getParent().tell(message.get(), originator);
        }
 
        @Override public void postRestart(Throwable reason) {
            System.out.println("SupervisedActor restarted! " + getSelf().path());
        }
 
        public static Props props() {
            return Props.create(SupervisedActor.class);
        }
    }
 
    private static class RestartMe extends RuntimeException {
    }
}
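
For reference, the three numeric arguments passed to Backoff.onStop above (100 ms minimum, 1 s maximum, random factor 0.2) describe a delay schedule. The following is a minimal sketch of such a schedule, assuming the delay doubles with each restart, is capped at the maximum, and is then stretched by a random jitter of up to the random factor. BackoffSchedule and its methods are hypothetical names for illustration, not part of Akka, and the exact formula Akka uses is not stated in this thread.

```java
import java.util.Random;

/** Hypothetical helper, not part of Akka: models a capped exponential backoff schedule. */
final class BackoffSchedule {
    private final long minMillis;
    private final long maxMillis;
    private final double randomFactor;

    BackoffSchedule(long minMillis, long maxMillis, double randomFactor) {
        this.minMillis = minMillis;
        this.maxMillis = maxMillis;
        this.randomFactor = randomFactor;
    }

    /** Delay before jitter: min * 2^restartCount, capped at max (assumed formula). */
    long baseDelayMillis(int restartCount) {
        double uncapped = minMillis * Math.pow(2, restartCount);
        return (long) Math.min((double) maxMillis, uncapped);
    }

    /** Base delay stretched by a random amount between 0% and randomFactor * 100%. */
    long jitteredDelayMillis(int restartCount, Random random) {
        double jitter = 1.0 + random.nextDouble() * randomFactor;
        return (long) (baseDelayMillis(restartCount) * jitter);
    }

    public static void main(String[] args) {
        // Same numbers as the Backoff.onStop call in the test above.
        BackoffSchedule schedule = new BackoffSchedule(100, 1000, 0.2);
        Random random = new Random();
        for (int restart = 0; restart < 6; restart++) {
            System.out.println("restart " + restart
                    + ": base=" + schedule.baseDelayMillis(restart) + "ms"
                    + ", with jitter=" + schedule.jitteredDelayMillis(restart, random) + "ms");
        }
    }
}
```

Under these assumptions the base delays are 100, 200, 400 and 800 ms, and every later restart waits the capped 1000 ms, plus up to 20% jitter each time.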

A number of questions:

  1. The exception thrown by the SupervisedActor is logged as an error:
11:43:00.542 [TestActorSystem-akka.actor.default-dispatcher-3] ERROR OneForOneStrategy - nullcom.tesco.payments.async.TestSupervisionDocs$RestartMe: null
	at com.tesco.payments.async.TestSupervisionDocs$PrinterActor.lambda$createReceive$0(TestSupervisionDocs.java:66)

It’s not clear to me that I’ve done things correctly - what’s the ‘null’ for?

Also, is there a way to not have this logged as an error?

  2. I had to use Backoff.onStop(…) rather than the seemingly more intuitive Backoff.onFailure(…). Are there any issues I should be aware of with this?

  3. In the SupervisedActor, I have overridden preRestart, which, on a restart, sends the parent (supervisor) the failed message again, with the original sender.

Sending to self rather than to the supervisor didn’t work, and I don’t understand why.

In any case is there a way to have the supervisor send the message again? Then the supervised actor would not need to be aware of its supervision nor know its parent is something special, nor contain part of the retry logic.

One way I can think of is to have the thrown exception contain the message, supervised actor and original sender, and then have the supervisor send it from the decider:

DeciderBuilder
        .match(RestartMe.class, e -> {
            System.out.println("RETRY caught------------------");
            e.self.tell(e.msg, e.sender);
            return restart();
        })

However, this may race with the restart of the supervised actor, and it still requires the SupervisedActor to have some involvement in the process (throwing an exception with information specific to retry). In any case, it seems a bit ugly.

  4. In the preRestart method, the message parameter is optional. In what situations might the message not be present?

Help / thoughts much appreciated!
daz


(Patrik Nordwall) #2

Backoff.onStop and Backoff.onFailure are for different purposes. Are you testing this by throwing an exception? Then onFailure should work. Perhaps it behaves differently if you throw from the constructor or preStart?

Sending to self doesn’t work because the previous self is actually stopped. Note that backoff supervision is implemented as a stop, a scheduled delay, and then a new start (actorOf) of the actor. Therefore the self actor reference is stale, and anything sent to it will go to deadLetters. This is different from ordinary restart supervision, where actor references are still valid after a restart. Therefore all messages must be sent to the parent BackoffSupervisor when using backoff supervision.
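
To make the stale-reference point concrete, here is a toy model in plain Java. It is not Akka code: StaleRefDemo, Child and Supervisor are hypothetical stand-ins, the backoff delay is left out, and it assumes the supervisor simply forwards each message to whichever child instance is current.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Akka code): a supervisor that replaces its child with a fresh instance. */
final class StaleRefDemo {

    /** Stands in for one incarnation of the child actor; once stopped it drops messages. */
    static final class Child {
        private boolean stopped = false;
        final List<String> received = new ArrayList<>();

        /** Returns true if delivered, false if the message would end up in dead letters. */
        boolean tell(String msg) {
            if (stopped) {
                return false;
            }
            received.add(msg);
            return true;
        }

        void stop() {
            stopped = true;
        }
    }

    /** Stands in for the BackoffSupervisor: a stable address in front of a replaceable child. */
    static final class Supervisor {
        private Child current = new Child();

        Child currentChild() {
            return current;
        }

        /** Stop the old child and start a brand-new one (the scheduled delay is omitted). */
        void restartChildWithBackoff() {
            current.stop();
            current = new Child();
        }

        /** Forward to whichever child instance is current. */
        boolean tell(String msg) {
            return current.tell(msg);
        }
    }

    public static void main(String[] args) {
        Supervisor supervisor = new Supervisor();
        Child oldSelf = supervisor.currentChild(); // what "self" was before the failure
        supervisor.restartChildWithBackoff();

        System.out.println("via old self: " + oldSelf.tell("foo"));      // prints "via old self: false"
        System.out.println("via supervisor: " + supervisor.tell("foo")); // prints "via supervisor: true"
    }
}
```

The reference captured before the restart points at a stopped instance, while the supervisor keeps a stable address in front of the new one, which is why the retry message has to go through the parent.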

Perhaps you could explain what you are trying to achieve, and we can give some advice on how to solve that, rather than just looking at the BackoffSupervisor solution without knowing the original problem.