The shutdown of one node in the cluster results in the shutdown of the other node

Hi,
I am testing shard rebalancing in Akka Cluster Sharding. The specific setup is as follows:
Config:

akka.cluster {
    seed-nodes = [
        "akka://cxcount@127.0.0.1:2553",
        "akka://cxcount@127.0.0.1:2554"
    ]
}

Code:

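@Override
public CommandHandler<Command, Event, CountState> commandHandler() {
    return newCommandHandlerBuilder()
            .forAnyState()
            .onCommand(AddAmount.class, this::add)
            .onCommand(QueryAmount.class, this::query)
            .build();
}

private Effect<Event, CountState> add(CountState state, AddAmount cmd) {
    try {
        // Simulate slow message processing: blocks the entity for 10 seconds
        Thread.sleep(10000);
    } catch (InterruptedException e) {
        // Restore the interrupt flag instead of silently swallowing the exception
        Thread.currentThread().interrupt();
    }
    // Persist the event, then reply to the sender once it has been stored
    return Effect().persist(new DataAdd(cmd.data))
            .thenRun(update -> cmd.replyTo.tell(new DataAdd(cmd.data)));
}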

I use "try{…}catch" to simulate slow message processing and send multiple messages, all of which are handled on the "2553 server". Then I force-stopped the "2553 server". As a result, the "2554 server" also stopped automatically, and I don't know what I did wrong. I expected the "2554 server" to keep running and to process the messages that had just been sent to the "2553 server". The log output of the "2554 server" is as follows:

[2022-11-01 10:51:23,143] [WARN] [akka.stream.Materializer] [cxcount-akka.actor.default-dispatcher-3] [akka.stream.Log(akka://cxcount/system/Materializers/StreamSupervisor-1)] - [outbound connection to [akka://cxcount@127.0.0.1:2553], message stream] Upstream failed, cause: StreamTcpException: The connection closed with error: The remote host forced an existing connection to close.
[2022-11-01 10:51:23,144] [WARN] [akka.stream.Materializer] [cxcount-akka.actor.default-dispatcher-3] [akka.stream.Log(akka://cxcount/system/Materializers/StreamSupervisor-1)] - [outbound connection to [akka://cxcount@127.0.0.1:2553], control stream] Upstream failed, cause: StreamTcpException: The connection closed with error: The remote host forced an existing connection to close.
[2022-11-01 10:51:27,214] [WARN] [akka.cluster.Cluster] [cxcount-akka.actor.default-dispatcher-11] [Cluster(akka://cxcount)] - Cluster Node [akka://cxcount@127.0.0.1:2554] - Marking node as UNREACHABLE [Member(akka://cxcount@127.0.0.1:2553, Up)].
[2022-11-01 10:51:27,217] [INFO] [akka.cluster.sbr.SplitBrainResolver] [cxcount-akka.actor.default-dispatcher-3] [akka://cxcount/system/cluster/core/daemon/downingProvider] - This node is now the leader responsible for taking SBR decisions among the reachable nodes (more leaders may exist).
[2022-11-01 10:51:28,008] [INFO] [akka.cluster.Cluster] [cxcount-akka.actor.default-dispatcher-11] [Cluster(akka://cxcount)] - Cluster Node [akka://cxcount@127.0.0.1:2554] - is the new leader among reachable nodes (more leaders may exist)
[2022-11-01 10:51:30,656] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-11] [akka://cxcount/deadLetters] - Message [akka.cluster.ddata.Replicator$Internal$Status] from Actor[akka://cxcount/system/sharding/replicator#89667974] to Actor[akka://cxcount/deadLetters] was not delivered. [1] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:30,656] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-11] [akka://cxcount/deadLetters] - Message [akka.cluster.GossipStatus] from Actor[akka://cxcount/system/cluster/core/daemon#-835898627] to Actor[akka://cxcount/deadLetters] was not delivered. [2] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:30,656] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-11] [akka://cxcount/deadLetters] - Message [akka.cluster.ddata.Replicator$Internal$Status] from Actor[akka://cxcount/system/clusterReceptionist/replicator#353702940] to Actor[akka://cxcount/deadLetters] was not delivered. [3] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:30,656] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-11] [akka://cxcount/deadLetters] - Message [akka.cluster.GossipStatus] from Actor[akka://cxcount/system/cluster/core/daemon#-835898627] to Actor[akka://cxcount/deadLetters] was not delivered. [4] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:30,656] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-11] [akka://cxcount/deadLetters] - Message [akka.cluster.ddata.Replicator$Internal$Status] from Actor[akka://cxcount/system/sharding/replicator#89667974] to Actor[akka://cxcount/deadLetters] was not delivered. [5] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:30,656] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-11] [akka://cxcount/deadLetters] - Message [akka.cluster.ddata.Replicator$Internal$Status] from Actor[akka://cxcount/system/clusterReceptionist/replicator#353702940] to Actor[akka://cxcount/deadLetters] was not delivered. [6] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:30,656] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-11] [akka://cxcount/deadLetters] - Message [akka.cluster.ddata.Replicator$Internal$Status] from Actor[akka://cxcount/system/sharding/replicator#89667974] to Actor[akka://cxcount/deadLetters] was not delivered. [7] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:30,935] [WARN] [akka.stream.Materializer] [cxcount-akka.actor.default-dispatcher-14] [akka.stream.Log(akka://cxcount/system/Materializers/StreamSupervisor-1)] - [outbound connection to [akka://cxcount@127.0.0.1:2553], message stream] Upstream failed, cause: StreamTcpException: Tcp command [Connect(127.0.0.1:2553,None,List(),Some(5000 milliseconds),true)] failed because of java.net.ConnectException: Connection refused: no further information
[2022-11-01 10:51:38,594] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-11] [akka://cxcount/deadLetters] - Message [akka.cluster.ddata.Replicator$Internal$Status] from Actor[akka://cxcount/system/clusterReceptionist/replicator#353702940] to Actor[akka://cxcount/deadLetters] was not delivered. [8] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:38,594] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-11] [akka://cxcount/deadLetters] - Message [akka.cluster.ddata.Replicator$Internal$Status] from Actor[akka://cxcount/system/sharding/replicator#89667974] to Actor[akka://cxcount/deadLetters] was not delivered. [9] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:38,594] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-11] [akka://cxcount/deadLetters] - Message [akka.cluster.ddata.Replicator$Internal$Status] from Actor[akka://cxcount/system/clusterReceptionist/replicator#353702940] to Actor[akka://cxcount/deadLetters] was not delivered. [10] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:38,594] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-11] [akka://cxcount/deadLetters] - Message [akka.cluster.ddata.Replicator$Internal$Status] from Actor[akka://cxcount/system/sharding/replicator#89667974] to Actor[akka://cxcount/deadLetters] was not delivered. [11] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:38,994] [WARN] [akka.stream.Materializer] [cxcount-akka.actor.default-dispatcher-3] [akka.stream.Log(akka://cxcount/system/Materializers/StreamSupervisor-1)] - [outbound connection to [akka://cxcount@127.0.0.1:2553], message stream] Upstream failed, cause: StreamTcpException: Tcp command [Connect(127.0.0.1:2553,None,List(),Some(5000 milliseconds),true)] failed because of java.net.ConnectException: Connection refused: no further information
[2022-11-01 10:51:46,772] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-3] [akka://cxcount/deadLetters] - Message [akka.cluster.ddata.Replicator$Internal$Status] from Actor[akka://cxcount/system/clusterReceptionist/replicator#353702940] to Actor[akka://cxcount/deadLetters] was not delivered. [12] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:46,772] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-3] [akka://cxcount/deadLetters] - Message [akka.cluster.ddata.Replicator$Internal$Status] from Actor[akka://cxcount/system/sharding/replicator#89667974] to Actor[akka://cxcount/deadLetters] was not delivered. [13] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:46,772] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-3] [akka://cxcount/deadLetters] - Message [akka.cluster.ddata.Replicator$Internal$Status] from Actor[akka://cxcount/system/clusterReceptionist/replicator#353702940] to Actor[akka://cxcount/deadLetters] was not delivered. [14] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:46,772] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-3] [akka://cxcount/deadLetters] - Message [akka.cluster.ddata.Replicator$Internal$Status] from Actor[akka://cxcount/system/sharding/replicator#89667974] to Actor[akka://cxcount/deadLetters] was not delivered. [15] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:47,066] [WARN] [akka.stream.Materializer] [cxcount-akka.actor.default-dispatcher-11] [akka.stream.Log(akka://cxcount/system/Materializers/StreamSupervisor-1)] - [outbound connection to [akka://cxcount@127.0.0.1:2553], message stream] Upstream failed, cause: StreamTcpException: Tcp command [Connect(127.0.0.1:2553,None,List(),Some(5000 milliseconds),true)] failed because of java.net.ConnectException: Connection refused: no further information
[2022-11-01 10:51:47,216] [INFO] [akka.cluster.Cluster] [cxcount-akka.actor.default-dispatcher-11] [Cluster(akka://cxcount)] - Cluster Node [akka://cxcount@127.0.0.1:2554] - Leader can currently not perform its duties, reachability status: [akka://cxcount@127.0.0.1:2554 -> akka://cxcount@127.0.0.1:2553: Unreachable [Unreachable] (1)], member status: [akka://cxcount@127.0.0.1:2553 Up seen=false, akka://cxcount@127.0.0.1:2554 Up seen=true]
[2022-11-01 10:51:47,972] [WARN] [akka.cluster.sbr.SplitBrainResolver] [cxcount-akka.actor.default-dispatcher-14] [akka://cxcount/system/cluster/core/daemon/downingProvider] - SBR took decision DownReachable and is downing [akka://cxcount@127.0.0.1:2554] including myself,, [1] unreachable of [2] members, all members in DC [Member(akka://cxcount@127.0.0.1:2553, Up), Member(akka://cxcount@127.0.0.1:2554, Up)], full reachability status: [akka://cxcount@127.0.0.1:2554 -> akka://cxcount@127.0.0.1:2553: Unreachable [Unreachable] (1)]
[2022-11-01 10:51:47,973] [INFO] [akka.cluster.sbr.SplitBrainResolver] [cxcount-akka.actor.default-dispatcher-14] [akka://cxcount/system/cluster/core/daemon/downingProvider] - SBR is downing [UniqueAddress(akka://cxcount@127.0.0.1:2554,4445633720856653788)]
[2022-11-01 10:51:47,974] [INFO] [akka.cluster.Cluster] [cxcount-akka.actor.default-dispatcher-3] [Cluster(akka://cxcount)] - Cluster Node [akka://cxcount@127.0.0.1:2554] - Marking node [akka://cxcount@127.0.0.1:2554] as [Down]
[2022-11-01 10:51:47,977] [INFO] [akka.cluster.sharding.ShardRegion] [cxcount-akka.actor.default-dispatcher-14] [akka://cxcount@127.0.0.1:2554/system/sharding/Count] - Count: Self downed, stopping ShardRegion [akka://cxcount/system/sharding/Count]
[2022-11-01 10:51:47,977] [INFO] [akka.cluster.singleton.ClusterSingletonManager] [cxcount-akka.actor.default-dispatcher-14] [akka://cxcount@127.0.0.1:2554/system/sharding/CountCoordinator] - Self downed, stopping ClusterSingletonManager
[2022-11-01 10:51:47,977] [INFO] [akka.cluster.sbr.SplitBrainResolver] [cxcount-akka.actor.default-dispatcher-14] [akka://cxcount/system/cluster/core/daemon/downingProvider] - This node is not the leader any more and not responsible for taking SBR decisions.
[2022-11-01 10:51:48,234] [INFO] [akka.cluster.Cluster] [cxcount-akka.actor.default-dispatcher-5] [Cluster(akka://cxcount)] - Cluster Node [akka://cxcount@127.0.0.1:2554] - is no longer leader
[2022-11-01 10:51:48,238] [INFO] [akka.cluster.Cluster] [cxcount-akka.actor.default-dispatcher-5] [Cluster(akka://cxcount)] - Cluster Node [akka://cxcount@127.0.0.1:2554] - Node has been marked as DOWN. Shutting down myself
[2022-11-01 10:51:48,238] [INFO] [akka.cluster.Cluster] [cxcount-akka.actor.default-dispatcher-5] [Cluster(akka://cxcount)] - Cluster Node [akka://cxcount@127.0.0.1:2554] - Shutting down...
[2022-11-01 10:51:48,255] [INFO] [akka.cluster.Cluster] [cxcount-akka.actor.default-dispatcher-5] [Cluster(akka://cxcount)] - Cluster Node [akka://cxcount@127.0.0.1:2554] - Successfully shut down
[2022-11-01 10:51:48,261] [INFO] [akka.actor.CoordinatedShutdown] [cxcount-akka.actor.default-dispatcher-5] [CoordinatedShutdown(akka://cxcount)] - Running CoordinatedShutdown with reason [ClusterDowningReason]
[2022-11-01 10:51:48,269] [INFO] [akka.actor.LocalActorRef] [cxcount-akka.actor.default-dispatcher-5] [akka://cxcount/system/cluster/core/daemon] - Message [akka.cluster.InternalClusterAction$GossipTick$] from Actor[akka://cxcount/system/cluster/core/daemon#-835898627] to Actor[akka://cxcount/system/cluster/core/daemon#-835898627] was not delivered. [16] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/system/cluster/core/daemon#-835898627] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:48,276] [INFO] [akka.actor.typed.ActorSystem] [ForkJoinPool.commonPool-worker-3] [] - WeatherServer http://127.0.0.1:8081/ graceful shutdown completed
[2022-11-01 10:51:48,292] [INFO] [akka.remote.RemoteActorRefProvider$RemotingTerminator] [cxcount-akka.actor.default-dispatcher-5] [akka://cxcount@127.0.0.1:2554/system/remoting-terminator] - Shutting down remote daemon.
[2022-11-01 10:51:48,301] [INFO] [akka.remote.RemoteActorRefProvider$RemotingTerminator] [cxcount-akka.actor.default-dispatcher-5] [akka://cxcount@127.0.0.1:2554/system/remoting-terminator] - Remote daemon shut down; proceeding with flushing remote transports.
[2022-11-01 10:51:49,311] [INFO] [akka.remote.RemoteActorRef] [cxcount-akka.actor.default-dispatcher-5] [akka://cxcount@127.0.0.1:2553/system/sharding/CountCoordinator/singleton/coordinator] - Message [akka.cluster.sharding.ShardCoordinator$Internal$RegionStopped] from Actor[akka://cxcount/system/sharding/Count#315222620] to Actor[akka://cxcount@127.0.0.1:2553/system/sharding/CountCoordinator/singleton/coordinator#-1309910999] was not delivered. [17] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount@127.0.0.1:2553/system/sharding/CountCoordinator/singleton/coordinator#-1309910999] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:49,311] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-5] [akka://cxcount/deadLetters] - Message [akka.remote.artery.Flush$] from Actor[akka://cxcount/system/flush-1#178273681] to Actor[akka://cxcount/deadLetters] was not delivered. [18] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:49,311] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-5] [akka://cxcount/deadLetters] - Message [akka.remote.artery.ActorSystemTerminating] from Actor[akka://cxcount/system/remoteFlushOnShutdown#-1062780213] to Actor[akka://cxcount/deadLetters] was not delivered. [19] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:49,311] [INFO] [akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef] [cxcount-akka.actor.default-dispatcher-5] [akka://cxcount/deadLetters] - Message [akka.remote.artery.ActorSystemTerminating] from Actor[akka://cxcount/system/remoteFlushOnShutdown#-1062780213] to Actor[akka://cxcount/deadLetters] was not delivered. [20] dead letters encountered. If this is not an expected behavior then Actor[akka://cxcount/deadLetters] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2022-11-01 10:51:49,337] [INFO] [akka.remote.RemoteActorRefProvider$RemotingTerminator] [cxcount-akka.actor.default-dispatcher-3] [akka://cxcount@127.0.0.1:2554/system/remoting-terminator] - Remoting shut down.

However, if I don't send any requests to the "2553 server" and simply stop it, the "2554 server" does not shut down.

A two-node cluster is problematic for the Split Brain Resolver (SBR): killing either node leaves half of the cluster unreachable, and the surviving node cannot tell a crash from a network partition. I recommend you run at least 3 nodes, or look into which SBR options could work better for a 2-node cluster.
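
As a rough sketch of the three-node suggestion, the configuration could look something like the following. The third node on port 2555 is hypothetical and not part of your setup, and the `split-brain-resolver` values shown are, as far as I know, just the defaults written out explicitly (your log shows the decision roughly 20 seconds after the node was marked unreachable, which matches `stable-after = 20s`). With `keep-majority` and two nodes there is no majority, and as I understand the documented tie-break, the side holding the lowest address is the one that is kept, which would explain why 2554 downed itself after 2553 disappeared.

```
akka.cluster {
  # Three seed nodes, so that losing one still leaves a majority of two.
  # Port 2555 is an assumed third node, not part of the original setup.
  seed-nodes = [
    "akka://cxcount@127.0.0.1:2553",
    "akka://cxcount@127.0.0.1:2554",
    "akka://cxcount@127.0.0.1:2555"
  ]

  # Presumably already enabled, since the log contains SBR decisions.
  downing-provider-class = "akka.cluster.sbr.SplitBrainResolverProvider"

  split-brain-resolver {
    # Keep the side that can still see more than half of the members.
    active-strategy = keep-majority
    # How long membership must be stable before SBR takes a decision.
    stable-after = 20s
  }
}
```

With three nodes, force-stopping 2553 should leave 2554 and the third node as the majority, so they would down 2553 and the shards could be reallocated to the remaining nodes.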

Thanks, I think I understand where the problem is now. :rofl: