majority-min-cap=5 means that writes and reads go to more nodes in small clusters. For example, in a cluster of 5 nodes an ordinary majority write/read would use 3 nodes, but with majority-min-cap=5 it uses all 5. In a cluster that is not changing its size, a plain majority would be enough; the min cap is there to reduce the risk that writes are “lost” in small clusters whose membership changes between the write and the read.
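To make the arithmetic concrete, here is a minimal sketch of how I understand the node count to be derived. It is a plain Python model, not Akka code; the function name `required_nodes` and its parameters are made up for illustration.

```python
def required_nodes(cluster_size: int, min_cap: int = 0) -> int:
    """Illustrative model: how many nodes a majority write/read must reach."""
    # Ordinary majority: more than half of the nodes.
    majority = cluster_size // 2 + 1
    # Raise the count to min_cap when the majority is smaller,
    # but never ask for more nodes than the cluster has.
    return min(cluster_size, max(majority, min_cap))


for size in (3, 5, 9, 20):
    print(size, required_nodes(size), required_nodes(size, min_cap=5))
# 3 2 3
# 5 3 5
# 9 5 5
# 20 11 11
```

In this model the min cap only matters for small clusters; from 9 nodes upwards the ordinary majority is already 5 or more.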
For example, take 5 nodes, N1, N2, … N5. The write goes to N1, N2 and N3. Right after that, all 3 of those nodes are shut down before the information has been disseminated to N4 and N5. Maybe a few more nodes are joining as part of a rolling update. That could result in the write being “lost”. With majority-min-cap=5 the write would also update N4 and N5.
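The scenario can be sketched as a toy simulation. This is only a model of the worst case, assuming nothing is disseminated before the shutdown; `write_survives` and the joining nodes N6 and N7 are hypothetical names for illustration, not anything from Akka.

```python
def write_survives(nodes, written_to, shut_down, joined=()):
    """Return True if at least one node left in the cluster still holds the write."""
    # The write is acknowledged by the first `written_to` nodes.
    holders = set(nodes[:written_to])
    # Nodes that joined later start without the data.
    remaining = (set(nodes) - set(shut_down)) | set(joined)
    return bool(holders & remaining)


nodes = ["N1", "N2", "N3", "N4", "N5"]
stopped = ["N1", "N2", "N3"]

# Ordinary majority: the write reached only N1, N2 and N3.
print(write_survives(nodes, 3, stopped, joined=["N6", "N7"]))  # False

# Min cap of 5: the write also reached N4 and N5.
print(write_survives(nodes, 5, stopped, joined=["N6", "N7"]))  # True
```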
It’s rather unlikely to happen, because the write is disseminated to all nodes right afterwards, and it’s enough for the read to reach one node that has the latest information.
I think we chose a conservative default to be on the safe side. Reducing it to 3 would probably be enough if you have a good rolling update strategy in place, e.g. leaving the oldest node until last and not shutting down too many nodes at the same time.
What do you mean by stuck? Stuck as in it never recovers, or it just takes some extra time? It should never be completely stuck because of this. If that is what you see, you are missing something, such as a proper downing strategy, or there is a bug in Akka. If it’s the latter, it would be good if you could create an issue, share logs and assist in the investigation.