Hey guys, we have a cluster with 3 nodes and we are using a rolling update as the deployment strategy. We are currently on Akka 2.6.3 and Akka Management 1.0.5.
The situation goes like this: we have the cluster with all three nodes healthy and responding to incoming requests. At some point we trigger a deploy and the old nodes get stuck in Leaving, which prevents the new nodes from joining.
Disclaimer: this behaviour doesn't occur on every deploy. Some deploys go fine and others end up in this situation.
We are using akka-dns and deploying through Kubernetes. I'm attaching the logs from Kibana and the cluster status from the Akka Management API.
In the following gist you have the logs from Kibana. I first made a deploy at 18:34:22.487Z, which finished OK at 18:38:12.333Z. Then at 19:31:15.179Z I made another deploy, and that one caused the problem.
The same gist also contains the response from the Akka Management API.
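In case it helps with reading the Akka Management response, here is a minimal Python sketch that lists the members whose status is not Up. It assumes the response has the shape returned by the cluster membership route (a `members` array whose entries carry `node` and `status` fields). The function name, the sample payload and the node addresses are hypothetical and are not taken from our cluster.

```python
import json


def members_not_up(cluster_status: dict) -> list:
    """Return (node address, status) pairs for every member that is not 'Up'."""
    return [
        (member.get("node"), member.get("status"))
        for member in cluster_status.get("members", [])
        if member.get("status") != "Up"
    ]


# Hypothetical sample shaped like a cluster membership response.
# The addresses below are made up for illustration only.
sample = json.loads("""
{
  "members": [
    {"node": "akka://app@10.0.0.1:25520", "status": "Leaving"},
    {"node": "akka://app@10.0.0.2:25520", "status": "Up"},
    {"node": "akka://app@10.0.0.3:25520", "status": "Joining"}
  ],
  "unreachable": []
}
""")

for node, status in members_not_up(sample):
    print(f"{node} is {status}")
```

Running something like this against the response saved during a failed deploy should make it easy to see which nodes are still in Leaving and which ones never got past Joining.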