Consistency of shard coordinator

How does the shard coordinator (in distributed data mode) guarantee that an entity runs on at most one node at a time, given that distributed data mode is only eventually consistent?
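
For reference, by “distributed data mode” I mean the coordinator state store selected with `akka.cluster.sharding.state-store-mode`. A minimal sketch of enabling it explicitly (Scala, assuming akka-cluster-sharding on the classpath; the keys are from the Akka reference configuration as I understand it):

```scala
import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

object StateStoreModeSketch extends App {
  // "ddata" keeps the shard coordinator's state in Distributed Data (the mode
  // this question is about); the alternative is "persistence".
  val config = ConfigFactory.parseString(
    """
    akka.actor.provider = cluster
    akka.remote.artery.canonical.port = 0
    akka.cluster.sharding.state-store-mode = ddata
    """).withFallback(ConfigFactory.load())

  val system = ActorSystem("sharding-sketch", config)
  println(system.settings.config.getString("akka.cluster.sharding.state-store-mode")) // ddata
  system.terminate()
}
```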

The documentation mentions that the coordinator runs with Write Majority/Read Majority consistency, but I can’t figure out how that helps in partial failure scenarios.
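
Here is roughly how I picture those consistency levels being used, as a Scala sketch against the classic Distributed Data API (the key, the map-shaped state, and the Allocate message are simplified stand-ins I made up for illustration; the real coordinator keeps a richer state value):

```scala
import scala.concurrent.duration._

import akka.actor.{ Actor, ActorLogging, Props }
import akka.cluster.ddata.{ DistributedData, LWWMap, LWWMapKey, SelfUniqueAddress }
import akka.cluster.ddata.Replicator._

object CoordinatorLikeActor {
  // Hypothetical message asking the "coordinator" to record an allocation.
  final case class Allocate(shardId: String, region: String)
  def props: Props = Props(new CoordinatorLikeActor)
}

// Toy stand-in for the coordinator's allocation state (shard id -> region name).
// Only the consistency levels matter here: ReadMajority when a newly started
// coordinator recovers its state, WriteMajority when it records an allocation.
class CoordinatorLikeActor extends Actor with ActorLogging {
  import CoordinatorLikeActor._

  private val replicator = DistributedData(context.system).replicator
  private implicit val node: SelfUniqueAddress =
    DistributedData(context.system).selfUniqueAddress

  private val AllocationsKey = LWWMapKey[String, String]("shard-allocations")
  private val timeout = 5.seconds

  override def preStart(): Unit =
    // A freshly started coordinator recovers the allocations with a majority read.
    replicator ! Get(AllocationsKey, ReadMajority(timeout))

  def receive: Receive = {
    case Allocate(shardId, region) =>
      // New allocations are written with a majority write.
      replicator ! Update(AllocationsKey, LWWMap.empty[String, String], WriteMajority(timeout))(
        _ :+ (shardId -> region))

    case g @ GetSuccess(AllocationsKey, _) =>
      log.info("Recovered allocations: {}", g.get(AllocationsKey).entries)
    case NotFound(AllocationsKey, _) =>
      log.info("No allocations stored yet")
    case GetFailure(AllocationsKey, _) =>
      log.warning("ReadMajority timed out before a majority answered")

    case UpdateSuccess(AllocationsKey, _) =>
      log.info("Allocation confirmed by a majority of nodes")
    case UpdateTimeout(AllocationsKey, _) =>
      // The write may still have reached *some* replicas even though a majority
      // never acknowledged it; that partial write is what the scenario below is about.
      log.warning("WriteMajority timed out; the outcome is unknown")
  }
}
```

With that model in mind, here’s one partial failure example: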

  1. Coordinator C1 starts to allocate a new shard S to shard region SR1, writes the mapping to a minority of nodes, and dies.
  2. Coordinator C2 starts up and performs a majority read; one of the contacted nodes has the mapping of S to SR1.
  3. C2 tells SR1 about the mapping for S. This starts S on that region.
  4. C2 dies.
  5. Coordinator C3 starts up and performs a majority read; none of the contacted nodes has the mapping of S to SR1, because it was only written to a minority of nodes (see the sketch after this list).
  6. C3 allocates S to a different shard region, SR2, so S is now running in two regions at the same time.

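To make step 5 concrete, here is a toy model (plain Scala, no Akka; the node count and which replicas got the write are made up) of why a majority read is guaranteed to observe a preceding majority write, but not a minority one:

```scala
object QuorumVisibilitySketch extends App {
  // Five hypothetical replicas of the coordinator state.
  val nodes: Set[Int] = (1 to 5).toSet
  val majority: Int = nodes.size / 2 + 1 // 3 of 5

  // C1 managed to write the "S -> SR1" mapping to only a minority before dying.
  val wroteMapping: Set[Int] = Set(1, 2)

  // A majority read merges the values from some majority-sized subset of replicas;
  // it observes the mapping iff at least one contacted replica stored it.
  def majorityReadSees(contacted: Set[Int]): Boolean =
    contacted.exists(wroteMapping.contains)

  // Any two majority subsets intersect, so a value confirmed by WriteMajority is
  // always observed by ReadMajority. A minority write has no such guarantee.
  val majoritySubsets = nodes.subsets(majority).toList
  val seen = majoritySubsets.count(s => majorityReadSees(s))
  val missed = majoritySubsets.size - seen

  println(s"majority reads that see the minority write:  $seen")   // 9
  println(s"majority reads that miss the minority write: $missed") // 1 (replicas {3,4,5})
  // C2 could be a reader in the first group (step 3), C3 one in the second (step 5).
}
```
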
What prevents this from happening? More generally, is there a place to read more about the algorithmic details of the shard coordinator in distributed data mode, beyond just reading the code?

This scenario can happen, and it’s a bug. See https://github.com/akka/akka/issues/28856#issuecomment-607750168 for details.