How to recreate actors when the node gets terminated


   =>childA1,  childA2, childA3
       =>grandchild1, grandchild2
  =>childB1, childB1
  =>childB1, childB1

I have 3 actor systems. Say I have a node N1: when I create a parent there and it in turn creates children, they are all created on the same node as the parent. This increases the load on N1, and I couldn’t distribute the load. This led me to use cluster sharding.

The parent actor is the starting trigger, and I want all the child actors to report back to their parent once they complete their execution so that I can proceed to the next chain.

I also want to create the actors above in a distributed fashion, so I have used cluster sharding, and it is working fine.

I have 3 nodes named N1, N2 and N3.

The parent is created on any one of the three nodes. It creates a shard actor on any one of the three nodes, and the shard actor creates the child actor.

Say my parent is created on N1 and it creates a shard actor on N2, which creates the child actor. If node N1 gets terminated due to a JVM crash, my child becomes an orphan and its reply message goes to dead letters.

So how do I recreate the parent actor in this case?

I am using cluster sharding to create actors in a distributed fashion. Is this the correct way to implement it, or should I go with something else?

Please guide me!!

Since you asked me on another thread to take a look at this …

I’m somewhat confused by what you are doing, and in particular, how you are trying to use sharding to solve it. Especially, with the “grandchildren”. So take this all with a grain of salt, as I know that I don’t fully understand what you are trying to do.

But my first instinct is that you would be better off with a cluster singleton for your parent (so that it can be automatically recreated if its current node goes away) and a cluster-aware router for the children (so that it can distribute work across the cluster).

Note that if you take this strategy you may not want to do a simple reply, as a reply would go to the original sender. You’d want to look up the singleton via proxy again when you are sending the results to it so that you’d have a connection to the new worker. There are also some edge cases you’d need to deal with when the singleton is moving. I believe that the singleton docs link to a distributed workers example that might be helpful to you. It looks dated though.

But it depends a bit on the workload, how you are kicking off the parent, and how the parent deals with the responses. For example, you might be better off just letting everything fail and when the parent restarts it can start a whole new set of children. But it depends on the workload and how the parent is getting kicked off.

Thanks @davidogren

My aim is to have entity actors recreated when the node gets terminated. I referred to the docs, which mention that by setting remember-entities=on in the cluster sharding settings, an entity actor can be recreated even though its node gets terminated.
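For reference, the setting mentioned above lives under `akka.cluster.sharding` in `application.conf`. The fragment below is a minimal sketch showing only that one key; any further settings (for example, which store backs the remembered entities) depend on the Akka version and are not shown here.

```hocon
akka.cluster.sharding {
  # Off by default. When on, a shard that is rebalanced to another node or
  # recovers after a crash restarts the entities that were running in it.
  remember-entities = on
}
```

Note that this only brings the entity back as a new actor. It does not redirect replies that were addressed to the old actor.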

I have tried, but I don’t know what I am missing.

Kindly have a look at my source code and let me know your thoughts.


I took a quick look at your repo, but it wasn’t clear to me either how you build/run it (there’s no build file) or what isn’t working as you expected. Ideally it would be great if you had a unit test that you think should pass but that fails. (Removing the Redis dependency would be great too.)

Mostly, I’m trying to figure out if you are having problems getting remember-entities working. (In which case, we can try to troubleshoot your config.) Or whether you are just surprised that actors are getting dead letters when responding to a shard that’s been rebalanced. Because that would be normal. As I mentioned in my other response, if you do a simple reply it isn’t going to go through the sharding system again. Even though a rebalanced entity may be the same logically it is a different actor.
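To make the last point concrete, here is a toy model in plain Java. It is not Akka code: `Incarnation`, `Registry`, and `ReplyDemo` are hypothetical names invented for this sketch. `Registry` stands in for whatever routes by logical ID (a shard region or a singleton proxy), and `Incarnation` stands in for a concrete actor reference that dies with its node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these classes are Akka APIs.

/** One running incarnation of a logical parent. It dies with its node. */
final class Incarnation {
    final String node;
    final List<String> received = new ArrayList<>();
    boolean alive = true;

    Incarnation(String node) {
        this.node = node;
    }

    /** Returns false when the message would end up in dead letters. */
    boolean tell(String message) {
        if (!alive) {
            return false;
        }
        received.add(message);
        return true;
    }
}

/** Stands in for a shard region or singleton proxy: routes by logical ID. */
final class Registry {
    private final Map<String, Incarnation> current = new HashMap<>();

    /** Starts (or restarts) the entity with this ID on the given node. */
    Incarnation start(String entityId, String node) {
        Incarnation incarnation = new Incarnation(node);
        current.put(entityId, incarnation);
        return incarnation;
    }

    /** Simulates a node crash: every incarnation on that node stops. */
    void crash(String node) {
        for (Incarnation incarnation : current.values()) {
            if (incarnation.node.equals(node)) {
                incarnation.alive = false;
            }
        }
    }

    /** Delivers to whichever incarnation currently owns the ID. */
    boolean tell(String entityId, String message) {
        Incarnation incarnation = current.get(entityId);
        return incarnation != null && incarnation.tell(message);
    }
}

final class ReplyDemo {
    public static void main(String[] args) {
        Registry registry = new Registry();
        Incarnation original = registry.start("parent-1", "N1");
        registry.crash("N1");                                   // N1 goes away
        Incarnation recreated = registry.start("parent-1", "N2");

        System.out.println(original.tell("done"));              // false: stale reference
        System.out.println(registry.tell("parent-1", "done"));  // true: routed by ID
        System.out.println(recreated.received);                 // [done]
    }
}
```

The child that kept the `original` reference corresponds to a simple reply to the sender. The child that goes through `registry.tell` corresponds to sending the result back through sharding (or the singleton proxy) using the parent's logical ID.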

Thanks for taking a look at my repo.

I will update it and let you know.