Node/pod on rolling restart does not buffer message to shard proxy


Messages for shards belonging to a shard region are buffered while the node/pod is restarting. I noticed, however, that if the node has a message for a shard proxy, i.e. for a role not hosted on that node, the message is not buffered. Since the node is restarting, the message to the shard proxy ends up in dead letters. Shouldn’t it be buffered like messages to the node's own shards?

A shard being rebalanced means that we know it can start in a new region, so messages to its entities can be forwarded to their new home once the ShardAllocationStrategy has decided where that is and the new region has started the shard again.
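As a rough illustration of that buffering, here is a toy model in plain Java. It is not Akka's implementation and uses no Akka classes; `HandoffBufferSketch` and all of its methods are hypothetical names made up for this sketch. It only shows the idea: while a shard has no known home, messages wait in a per-shard buffer, and they are flushed in arrival order once the shard has started somewhere.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Toy model only: plain Java, no Akka classes. All names are hypothetical.
class HandoffBufferSketch {
    // Shard id -> region currently hosting it; absent while being rebalanced.
    private final Map<String, String> shardHomes = new HashMap<>();
    // Shard id -> messages waiting for the shard to get a new home.
    private final Map<String, Queue<String>> buffers = new HashMap<>();
    // Record of "region <- message" deliveries, in delivery order.
    final List<String> delivered = new ArrayList<>();

    void send(String shardId, String message) {
        String home = shardHomes.get(shardId);
        if (home == null) {
            // No known home: keep the message until the shard is started again.
            buffers.computeIfAbsent(shardId, id -> new ArrayDeque<>()).add(message);
        } else {
            delivered.add(home + " <- " + message);
        }
    }

    // The shard is being handed off, so its location is unknown for now.
    void beginHandoff(String shardId) {
        shardHomes.remove(shardId);
    }

    // A new home was decided and the shard started there: flush in arrival order.
    void shardStarted(String shardId, String region) {
        shardHomes.put(shardId, region);
        Queue<String> pending = buffers.remove(shardId);
        if (pending != null) {
            for (String message : pending) {
                delivered.add(region + " <- " + message);
            }
        }
    }

    int buffered(String shardId) {
        Queue<String> pending = buffers.get(shardId);
        return pending == null ? 0 : pending.size();
    }
}
```

Buffering is only safe here because the destination is guaranteed to come back somewhere in the cluster; the sketch has nothing equivalent for a destination that simply disappears.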

A shard proxy, on the other hand, is only a way for a node that does not host shards to pass messages to the nodes that actually host them. The only messages that sharding itself sends to shard proxies are coordination messages about where shards are allocated. Responses from the sharded entity actors go directly to the requesting actor, not back through the sharding infrastructure.

If the node with the proxy that a request came from is restarted, the original actor that sent the request will no longer exist, so a response cannot reach it, even if the sharded actor were to buffer the message and retry.
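To make the reply path concrete, here is a small toy model in plain Java. Again, this is not Akka code: `ReplyPathSketch`, the address strings, and the `deadLetters` list are all hypothetical stand-ins. It assumes only what is described above, namely that the entity replies to the address of the original sender and that a restarted node brings up a new actor instance rather than the old one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: plain Java, no Akka classes. All names are hypothetical.
class ReplyPathSketch {
    // Live "actors" keyed by address; a removed address no longer exists.
    final Map<String, List<String>> mailboxes = new HashMap<>();
    // Messages whose target address did not exist at delivery time.
    final List<String> deadLetters = new ArrayList<>();

    void spawn(String address) {
        mailboxes.put(address, new ArrayList<>());
    }

    void stop(String address) {
        mailboxes.remove(address);
    }

    // Deliver straight to the target address, or record a dead letter if it is gone.
    void deliver(String to, String message) {
        List<String> mailbox = mailboxes.get(to);
        if (mailbox != null) {
            mailbox.add(message);
        } else {
            deadLetters.add(message + " -> " + to);
        }
    }

    // The entity replies to the original sender's address, not to any proxy.
    void entityHandles(String request, String replyTo) {
        deliver(replyTo, "reply:" + request);
    }
}
```

If `requester#1` is stopped and the restarted node spawns `requester#2`, a reply addressed to `requester#1` is recorded as a dead letter and never shows up in the new instance's mailbox, no matter how long the entity side holds on to it.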

Thank you for the reply. I don’t have the logs with me right now, but we get an error indicating that the shard proxy cannot be reached when a pod is sending to or coordinating with its shard proxy while restarting. I will post the logs if I encounter the error again.