Throwing ActorKilledException to stop singleton cluster actor


#1

I have a cluster singleton actor. When there is a fatal exception, is it appropriate to just throw

throw new ActorKilledException("Fatal problem in external data source (just for example)");

and let the user guardian shut the actor down?


(Konrad `ktoso` Malawski) #2

I would suggest logging the message at an appropriate level and calling ‘context.stop(self)’ instead.


#3

What's the difference between throwing the exception and calling

context.stop(self)

Is this method more graceful?


(Patrik Nordwall) #4

context.stop(self) would stop the singleton instance, and that is not an option, since the ActorSystem would continue running and other nodes would still believe that the singleton instance is located on that node (effectively sending messages to deadLetters).

Restarting the singleton instance by throwing an exception is fine, but perhaps you wanted the actor system to be shutdown? Then I’d explicitly do so with system.terminate() or CoordinatedShutdown(system).run().

Throwing ActorKilledException is not good either, because the default supervision strategy handles ActorKilledException by stopping the actor, which would stop the singleton instance.


#5

If I have an explicit reference to the system, like

ActorSystem system = ActorSystem.create("faultTolerance");
system.terminate();

everything seems OK. However, if I am in the actor itself (this one was declared as the cluster singleton):

class ExamplePersistentActor extends AbstractPersistentActor {
  // stuff

  @Override
  public Receive createReceive() {
    return receiveBuilder()
        .match(Integer.class, c -> {
          // Forcing system to die for tests
          Timeout timeout = new Timeout(Duration.create(30, "seconds"));
          Future<Terminated> future = getContext().system().terminate();
          String result = Await.result(future, timeout.duration()).toString();
        })
        .build();
  }
}

It never seems to terminate properly. Am I allowed to terminate the actor system from within a running actor?


(Patrik Nordwall) #6

Remove the Await. It blocks the actor and prevents the actor from being stopped, and thereby the system can't stop.
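The deadlock described here is not specific to Akka: it occurs whenever you block on a future from the very thread whose progress is needed to complete it. A minimal plain-JDK sketch of the pattern (no Akka involved; the single-threaded executor stands in for the actor's dispatcher, and the class and names are hypothetical):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BlockingDeadlockDemo {
  public static void main(String[] args) throws Exception {
    // A single-threaded executor stands in for the actor's dispatcher thread.
    ExecutorService dispatcher = Executors.newSingleThreadExecutor();
    CompletableFuture<String> terminated = new CompletableFuture<>();

    dispatcher.submit(() -> {
      // The task that would complete the future is queued on the SAME thread...
      CompletableFuture.runAsync(() -> terminated.complete("Terminated"), dispatcher);
      try {
        // ...so blocking here (like Await.result inside the actor) cannot succeed:
        terminated.get(1, TimeUnit.SECONDS);
        System.out.println("completed");
      } catch (TimeoutException e) {
        System.out.println("blocked until timeout");
      } catch (Exception e) {
        Thread.currentThread().interrupt();
      }
    });

    dispatcher.shutdown();
    dispatcher.awaitTermination(5, TimeUnit.SECONDS);
  }
}
```

The blocking get can only be released by a task sitting behind it in the same queue, so it always times out; removing the blocking call (as suggested above) lets the queued work run and the future complete.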


#7

Oh, gotcha, thank you. Once I invoke terminate(), is there a programmatic way to check if the actor system is dead?


(Konrad `ktoso` Malawski) #8

Oh I missed you want to terminate the system afterwards. Thanks for correcting me Patrik!

You should use the future that is completed once the system terminates to do anything “after system terminated”:

/**
   * Returns a CompletionStage which will be completed after the ActorSystem has been terminated
   * and termination hooks have been executed. If you registered any callback with
   * [[ActorSystem#registerOnTermination]], the returned CompletionStage from this method will not complete
   * until all the registered callbacks are finished. Be careful to not schedule any operations
   * on the `dispatcher` of this actor system as it will have been shut down before this
   * future completes.
   */
  def getWhenTerminated: CompletionStage[Terminated]

Note that the pasted code is the Java DSL, it returns CompletionStage.
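The pattern of chaining “after termination” work onto that stage can be sketched with the plain-JDK CompletionStage API (no Akka here; the CompletableFuture below merely stands in for what getWhenTerminated returns, and the class name is hypothetical):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

public class WhenTerminatedSketch {
  public static void main(String[] args) {
    // Stand-in for system.getWhenTerminated(): completed when shutdown finishes.
    CompletableFuture<String> whenTerminated = new CompletableFuture<>();

    // Register "after the system terminated" work without blocking any thread.
    CompletionStage<Void> done =
        whenTerminated.thenRun(() -> System.out.println("actor system terminated"));

    // Simulate the system finishing its shutdown.
    whenTerminated.complete("Terminated");
    done.toCompletableFuture().join();
  }
}
```

Because the callback is registered rather than awaited, nothing blocks, which avoids the deadlock from the earlier post.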


#9

Just wanted to thank everyone for their help. For anyone else who has this problem, this is my solution:

getContext().system().terminate()

is called in an actor if there is some fatal problem that requires the cluster to be shut down.

On creation of the actor system, I registered a callback that issues a log message (it will do more funky stuff later) to show that the actor system shut down:

CompletionStage<Terminated> stage = system.getWhenTerminated();

stage.thenRun(() -> log.info("actor system terminated"));