Unexpected ShardAllocationStrategy behavior

Hi,

In our application we noticed that the ShardAllocationStrategy sometimes behaves in an unexpected manner. We're using Akka 2.5.27 (yes, we'll upgrade to 2.5.30).

As an example, suppose you have a cluster of 4 members, all of which participate in sharding. Sharding is configured to use the LeastShardAllocationStrategy. You then send 4 messages to the shard region ActorRef, each targeting a different shard. The behavior I would expect is that 4 new shards are created, one on each member.
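For context, a minimal sketch of that setup (the entity, message type, and extractor functions here are made up for illustration, and cluster joining plus running the region on every member is omitted):

import akka.actor.{ Actor, ActorSystem, Props }
import akka.cluster.sharding.{ ClusterSharding, ClusterShardingSettings, ShardRegion }

// Hypothetical entity and message, just to illustrate the scenario.
final case class Ping(entityId: String)

class Entity extends Actor {
  def receive: Receive = {
    case Ping(id) => // handle the message for entity `id`
  }
}

object ShardingExample {
  // One shard per entity id in this sketch, so 4 distinct ids target 4 distinct shards.
  val extractEntityId: ShardRegion.ExtractEntityId = {
    case msg @ Ping(id) => (id, msg)
  }
  val extractShardId: ShardRegion.ExtractShardId = {
    case Ping(id) => id
  }

  def main(args: Array[String]): Unit = {
    val system = ActorSystem("cluster")
    val region = ClusterSharding(system).start(
      typeName = "Entity",
      entityProps = Props[Entity],
      settings = ClusterShardingSettings(system),
      extractEntityId = extractEntityId,
      extractShardId = extractShardId)

    // 4 messages, 4 different shards; ideally one shard ends up on each member.
    (1 to 4).foreach(i => region ! Ping(i.toString))
  }
}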

However, that does not always happen in practice. Sometimes all shards are allocated on the first member in the cluster, and only after some time (seconds) does the shard reallocation policy distribute the shards evenly across the cluster. We want to avoid this behavior because shard reallocation is (somewhat) disruptive.

Further investigation shows that the allocateShard method can be called concurrently with other allocateShard invocations. When that happens, the currentShardAllocations parameter that is passed in is stale, because it does not reflect the allocations made by the other in-flight invocations.

def allocateShard(
    requester: ActorRef,
    shardId: ShardId,
    currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[ShardId]]): Future[ActorRef]

In the example, the allocateShard method would basically be called 4 times with a currentShardAllocations map in which every region has zero shards. The result is that the policy chooses the first member 4 times, since all members have the same number of shards.
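To make that concrete, here is a simplified stand-in for the least-shards pick (not the actual Akka source). With every region at zero shards, minBy is a tie and keeps resolving to the same entry, so 4 concurrent calls that all see the same stale map pick the same member:

import akka.actor.ActorRef
import scala.collection.immutable

// Simplified stand-in for the least-shards selection, not the actual Akka implementation.
def pickRegion(
    currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[String]]): ActorRef =
  currentShardAllocations.minBy { case (_, shards) => shards.size }._1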

My question is whether this is an undocumented feature or a bug.

Note: I can see two workarounds. First, if the allocateShard method returns an already completed future, the behavior does not seem to manifest itself. I'd prefer not to rely on this approach because it feels like depending on the implementation details of futures and thread scheduling, but it is good to be aware of if you want to reproduce the issue. Second (and this is what we'll be doing), we'll have the policy implemented by an actor that (1) remembers all allocations it made and (2) serializes all requests sent to it. Because the actor remembers what it did, it can correct the provided currentShardAllocations with allocations it previously made that are not (yet) part of the state.
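For what it's worth, here is a rough sketch of that second workaround. All names are made up, the rebalance side is delegated to an existing strategy, and error handling, pruning of the remembered allocations, and plugging the strategy into ClusterSharding.start are left out:

import akka.actor.{ Actor, ActorRef }
import akka.cluster.sharding.ShardCoordinator.ShardAllocationStrategy
import akka.cluster.sharding.ShardRegion.ShardId
import akka.pattern.ask
import akka.util.Timeout
import scala.collection.immutable
import scala.concurrent.Future
import scala.concurrent.duration._

// Hypothetical message sent to the allocator actor.
final case class Allocate(
    shardId: ShardId,
    currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[ShardId]])

// A single actor serializes all allocation decisions and remembers what it already
// handed out, so concurrent allocateShard calls can no longer race on a stale snapshot.
class AllocatorActor extends Actor {
  private var allocated: Map[ShardId, ActorRef] = Map.empty

  def receive: Receive = {
    case Allocate(shardId, current) =>
      // Merge in allocations we already made but which are not yet part of the
      // coordinator's state.
      val corrected = allocated.foldLeft(current) {
        case (acc, (id, region)) =>
          if (acc.exists { case (_, shards) => shards.contains(id) }) acc
          else acc.updated(region, acc.getOrElse(region, immutable.IndexedSeq.empty[ShardId]) :+ id)
      }
      // Pick the region with the fewest shards (same idea as LeastShardAllocationStrategy).
      val chosen = corrected.minBy { case (_, shards) => shards.size }._1
      allocated = allocated.updated(shardId, chosen)
      sender() ! chosen
  }
}

// Strategy that delegates allocation decisions to the single allocator actor.
class ActorBackedAllocationStrategy(allocator: ActorRef, fallback: ShardAllocationStrategy)
    extends ShardAllocationStrategy {

  private implicit val timeout: Timeout = 3.seconds

  override def allocateShard(
      requester: ActorRef,
      shardId: ShardId,
      currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[ShardId]]): Future[ActorRef] =
    (allocator ? Allocate(shardId, currentShardAllocations)).mapTo[ActorRef]

  // Rebalancing is left to an existing strategy in this sketch.
  override def rebalance(
      currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[ShardId]],
      rebalanceInProgress: Set[ShardId]): Future[Set[ShardId]] =
    fallback.rebalance(currentShardAllocations, rebalanceInProgress)
}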

thanks,

Bert

Hi @bert ,

Sounds like it would be nice to improve that, although I'm not sure we want to sequence all allocation requests by default. Thanks for asking here first. The next step would be for you to create an issue so we can start thinking about what solution would be possible.

Your workaround with a custom allocation strategy that uses an actor is a valid way of solving it in your application.

Thanks for the reply.

Issue created: https://github.com/akka/akka/issues/28798