Akka (typed actors) cluster: 2 nodes in the same JVM don't join the cluster

I’ve created an akka cluster with this config on node 1 (flattened):

akka.cluster.roles : [
    "distmem-server"
]
akka.logging-filter : "akka.event.slf4j.Slf4jLoggingFilter"
akka.remote.netty.tcp.hostname : "distmem.lan" <--- this exists in my hosts file and points to localhost, but tried 127.0.0.1 too
akka.cluster.pub-sub.gossip-interval : "150ms"
akka.loglevel : "warning"
akka.loggers : [
    "akka.event.slf4j.Slf4jLogger"
]
akka.actor.provider : "cluster"
akka.remote.netty.tcp.port : 2551
akka.dispatchers.split-dispatcher.executor : "fork-join-executor"
akka.cluster.seed-nodes : [
    "akka.tcp://distmem@distmem.lan:2551"
]
akka.jvm-exit-on-fatal-error : false
akka.dispatchers.split-dispatcher.fork-join-executor.parallelism-min : 1
akka.dispatchers.split-dispatcher.fork-join-executor.parallelism-max : 8
akka.dispatchers.split-dispatcher.fork-join-executor.parallelism-factor : 2
akka.dispatchers.split-dispatcher.type : "Dispatcher"
akka.remote.retry-gate-closed-for : 500
akka.remote.netty.tcp.maximum-frame-size : "32MB"
akka.dispatchers.split-dispatcher.throughput : 32
akka.cluster.jmx.multi-mbeans-in-same-jvm : "on"

The config is the same on node 2 apart from:

akka.remote.netty.tcp.port : 2552

I start node 1 (the seed node), sleep for 2 seconds, and then start node 2. Each node listens for MemberUp events, and I get one MemberUp event on each node, but each event is only for the node itself:

INFO  c.a.d.Actors akka.tcp://distmem@distmem.lan:2551 : Joined the cluster : Member(address = akka.tcp://distmem@distmem.lan:2551, status = Up)
INFO  c.a.d.Actors akka.tcp://distmem@distmem.lan:2552 : Joined the cluster : Member(address = akka.tcp://distmem@distmem.lan:2552, status = Up)

My actor system name is “distmem”.

So it seems each node creates a separate cluster? Or maybe they don’t talk to each other? I am not getting any errors or exceptions. Any ideas what is wrong?

Thanks

You probably have the node itself as the first seed-nodes element on each node, meaning that each one will join itself. Try with the same seed-nodes config on both nodes:

akka.cluster.seed-nodes : [
    "akka.tcp://distmem@distmem.lan:2551", "akka.tcp://distmem@distmem.lan:2552"
]

More info in the documentation: https://doc.akka.io/docs/akka/current/cluster-usage.html#joining-to-seed-nodes. When moving to production I’d recommend Akka Management bootstrap.
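
As a rough sketch of sharing one seed-nodes list when both nodes are started from the same JVM: the object `NodeConfigs` and the helper `forPort` are hypothetical names, and the keys are the ones from the question. Only the port differs between the nodes; everything else falls back to the shared settings.

```scala
import com.typesafe.config.{Config, ConfigFactory}

object NodeConfigs {
  // Settings shared by every node, so both see the identical seed-nodes list.
  val shared: Config = ConfigFactory.parseString(
    """
    akka.actor.provider = "cluster"
    akka.remote.netty.tcp.hostname = "distmem.lan"
    akka.cluster.seed-nodes = [
      "akka.tcp://distmem@distmem.lan:2551",
      "akka.tcp://distmem@distmem.lan:2552"
    ]
    """)

  // Per-node config: override only the port, fall back to the shared settings.
  def forPort(port: Int): Config =
    ConfigFactory
      .parseString(s"akka.remote.netty.tcp.port = $port")
      .withFallback(shared)
}
```

Each actor system would then be started with `NodeConfigs.forPort(2551)` or `NodeConfigs.forPort(2552)`, typically with a further fallback to `ConfigFactory.load()` for the remaining defaults.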

Thanks for the reply, I believe I have the correct setup, both nodes use:

akka {
  cluster {
    # Note: the test box should have a decl of distmem.lan under /etc/hosts
    seed-nodes = [
      "akka.tcp://distmem@distmem.lan:2551"
    ]

I just verified by logging the config of each akka system. For the one on port 2552, it is indeed:

akka.cluster.seed-nodes : [
    "akka.tcp://distmem@distmem.lan:2551"
]
akka.remote.netty.tcp.port : 2552

I also start the one on 2551 first, wait 2 seconds, and then start the one on 2552, so that every node can find the seed node.

OK, try to follow the Akka Cluster logs to see what’s going on. They are prefixed with "Cluster Node" (on info level).

I think these are the relevant log entries:

2018-03-16 18:12:20,308 INFO a.c.Cluster(akka://distmem) -Cluster Node [akka.tcp://distmem@distmem.lan:2551] - Received InitJoin message from [Actor[akka.tcp://distmem@distmem.lan:2552/system/cluster/core/daemon/joinSeedNodeProcess-1#315565863]] to [akka.tcp://distmem@distmem.lan:2551]
2018-03-16 18:12:20,308 INFO a.c.Cluster(akka://distmem) -Cluster Node [akka.tcp://distmem@distmem.lan:2551] - Sending InitJoinAck message from node [akka.tcp://distmem@distmem.lan:2551] to [Actor[akka.tcp://distmem@distmem.lan:2552/system/cluster/core/daemon/joinSeedNodeProcess-1#315565863]]
2018-03-16 18:12:20,354 DEBUG a.s.Serialization(akka://distmem) -Using serializer [akka.cluster.protobuf.ClusterMessageSerializer] for message [akka.cluster.InternalClusterAction$InitJoinAck]
2018-03-16 18:12:20,366 DEBUG a.a.LocalActorRefProvider(akka://distmem) -resolve of path sequence [/system/cluster/core/daemon/joinSeedNodeProcess-1#315565863] failed

So 2551 does try to reply to 2552, but resolving the path fails, as shown above. That is a system path, so I would expect it to be available.

Is it OK to have the same system name twice in the same JVM? I am starting both actor systems like this:

val system = ActorSystem(guardianActor, "distmem", config)

I don’t think that is the problem. Is that the last Cluster Node log message?

I’ve created a bare-minimum project at https://github.com/kostaskougios/akka-cluster-test (clone it and sbt run should do the trick).

I still get the issue. Here is the full log:

---------------- Starting server 1 -----------------------------------
INFO a.e.s.Slf4jLogger Slf4jLogger started
DEBUG a.e.EventStream logger log1-Slf4jLogger started
DEBUG a.e.EventStream Default Loggers started
INFO a.r.Remoting Starting remoting
INFO a.r.Remoting Remoting started; listening on addresses :[akka.tcp://distmem@127.0.0.1:2551]
INFO a.r.Remoting Remoting now listens on addresses: [akka.tcp://distmem@127.0.0.1:2551]
INFO a.c.Cluster(akka://distmem) Cluster Node [akka.tcp://distmem@127.0.0.1:2551] - Starting up…
INFO a.c.Cluster(akka://distmem) Cluster Node [akka.tcp://distmem@127.0.0.1:2551] - Registered cluster JMX MBean [akka:type=Cluster]
INFO a.c.Cluster(akka://distmem) Cluster Node [akka.tcp://distmem@127.0.0.1:2551] - Started up successfully
INFO a.c.Cluster(akka://distmem) Cluster Node [akka.tcp://distmem@127.0.0.1:2551] - Node [akka.tcp://distmem@127.0.0.1:2551] is JOINING, roles [distmem-server, dc-default]
INFO a.c.Cluster(akka://distmem) Cluster Node [akka.tcp://distmem@127.0.0.1:2551] - Leader is moving node [akka.tcp://distmem@127.0.0.1:2551] to [Up]
DEBUG a.s.Serialization(akka://distmem) Using serializer [akka.cluster.ddata.protobuf.ReplicatorMessageSerializer] for message [akka.cluster.ddata.Replicator$Internal$DataEnvelope]
INFO a.c.Cluster(akka://distmem) Cluster Node [akka.tcp://distmem@127.0.0.1:2551] - Trying to join [akka.tcp://distmem@127.0.0.1:2551] when already part of a cluster, ignoring
INFO t.Actors akka.tcp://distmem@127.0.0.1:2551 : Joined the cluster : Member(address = akka.tcp://distmem@127.0.0.1:2551, status = Up)
---------------- Starting server 2 -----------------------------------
INFO a.e.s.Slf4jLogger Slf4jLogger started
DEBUG a.e.EventStream logger log1-Slf4jLogger started
DEBUG a.e.EventStream Default Loggers started
INFO a.r.Remoting Starting remoting
INFO a.r.Remoting Remoting started; listening on addresses :[akka.tcp://distmem@127.0.0.1:2552]
INFO a.r.Remoting Remoting now listens on addresses: [akka.tcp://distmem@127.0.0.1:2552]
INFO a.c.Cluster(akka://distmem) Cluster Node [akka.tcp://distmem@127.0.0.1:2552] - Starting up…
WARN a.c.Cluster(akka://distmem) Could not register Cluster JMX MBean with name=akka:type=Cluster as it is already registered. If you are running multiple clusters in the same JVM, set ‘akka.cluster.jmx.multi-mbeans-in-same-jvm = on’ in config
INFO a.c.Cluster(akka://distmem) Cluster Node [akka.tcp://distmem@127.0.0.1:2552] - Started up successfully
DEBUG a.s.Serialization(akka://distmem) Using serializer [akka.cluster.ddata.protobuf.ReplicatorMessageSerializer] for message [akka.cluster.ddata.Replicator$Internal$DataEnvelope]
INFO a.c.Cluster(akka://distmem) Cluster Node [akka.tcp://distmem@127.0.0.1:2552] - Node [akka.tcp://distmem@127.0.0.1:2552] is JOINING, roles [distmem-server, dc-default]
INFO a.c.Cluster(akka://distmem) Cluster Node [akka.tcp://distmem@127.0.0.1:2552] - Leader is moving node [akka.tcp://distmem@127.0.0.1:2552] to [Up]
INFO t.Actors akka.tcp://distmem@127.0.0.1:2552 : Joined the cluster : Member(address = akka.tcp://distmem@127.0.0.1:2552, status = Up)
DEBUG a.r.Remoting Associated [akka.tcp://distmem@127.0.0.1:2551] <- [akka.tcp://distmem@127.0.0.1:2552]
DEBUG a.r.EndpointWriter Associated [akka.tcp://distmem@127.0.0.1:2552] -> [akka.tcp://distmem@127.0.0.1:2551]
DEBUG a.s.Serialization(akka://distmem) Using serializer [akka.cluster.protobuf.ClusterMessageSerializer] for message [akka.cluster.InternalClusterAction$InitJoin]
DEBUG a.r.EndpointWriter Drained buffer with maxWriteCount: 50, fullBackoffCount: 1, smallBackoffCount: 0, noBackoffCount: 0 , adaptiveBackoff: 1000
INFO a.c.Cluster(akka://distmem) Cluster Node [akka.tcp://distmem@127.0.0.1:2551] - Received InitJoin message from [Actor[akka.tcp://distmem@127.0.0.1:2552/system/cluster/core/daemon/joinSeedNodeProcess-1#-1663174884]] to [akka.tcp://distmem@127.0.0.1:2551]
INFO a.c.Cluster(akka://distmem) Cluster Node [akka.tcp://distmem@127.0.0.1:2551] - Sending InitJoinAck message from node [akka.tcp://distmem@127.0.0.1:2551] to [Actor[akka.tcp://distmem@127.0.0.1:2552/system/cluster/core/daemon/joinSeedNodeProcess-1#-1663174884]]
DEBUG a.s.Serialization(akka://distmem) Using serializer [akka.cluster.protobuf.ClusterMessageSerializer] for message [akka.cluster.InternalClusterAction$InitJoinAck]
DEBUG a.a.LocalActorRefProvider(akka://distmem) resolve of path sequence [/system/cluster/core/daemon/joinSeedNodeProcess-1#-1663174884] failed
---------------- Terminating -----------------------------------------

I would expect the log message:
INFO t.Actors akka.tcp://distmem@127.0.0.1:2552 : Joined the cluster : Member(address = akka.tcp://distmem@127.0.0.1:2552, status = Up)

to come up twice for each of the cluster nodes, with 2551 registering with 2552 and 2552 registering with 2551, no?

Thanks

Node 2551 won’t register with 2552. It is already part of the cluster, so instead it will welcome 2552. That message is not showing up in your logs.

In a previous message you posted some logs where we could see the InitJoin and InitJoinAck handshake. That’s good and suggests that your config is correct. Nevertheless, it would be useful if you pasted the full config of both nodes here.

You also wrote:

Is it ok to have the same system name in the same jvm?

That’s not a problem, but I hope you are only doing it as an experiment. It makes no sense to have a cluster formed within a single JVM.

The problem is that the actor makes each node join itself, and therefore you get two separate clusters:

val address = cluster.selfMember.address
cluster.manager ! Join(address)

At the same time you have the seed-nodes configuration, which will also try to join, but that will be superseded by the Join message. If you remove that Join message and only use seed-nodes, I think it will be all good.
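
A minimal sketch of what the guardian could look like without the explicit self-join, assuming the typed API as of Akka 2.6 (`Behaviors.setup`, `Cluster(ctx.system).subscriptions`); some names differ in older 2.5.x releases. The names `ClusterListener` and `startNode` are hypothetical, and the log line only imitates the one from the question:

```scala
import akka.actor.typed.{ActorSystem, Behavior}
import akka.actor.typed.scaladsl.Behaviors
import akka.cluster.ClusterEvent.MemberUp
import akka.cluster.typed.{Cluster, Subscribe}
import com.typesafe.config.Config

object ClusterListener {
  // Guardian behavior: subscribes to MemberUp events and logs them.
  def apply(): Behavior[MemberUp] = Behaviors.setup[MemberUp] { ctx =>
    val cluster = Cluster(ctx.system)

    // Deliberately no `cluster.manager ! Join(cluster.selfMember.address)`:
    // joining is left entirely to the configured akka.cluster.seed-nodes.
    cluster.subscriptions ! Subscribe(ctx.self, classOf[MemberUp])

    Behaviors.receiveMessage[MemberUp] { case MemberUp(member) =>
      ctx.log.info(s"${cluster.selfMember.address} : Joined the cluster : $member")
      Behaviors.same
    }
  }

  // Starts one node; `config` is assumed to carry that node's port and the seed-nodes list.
  def startNode(config: Config): ActorSystem[MemberUp] =
    ActorSystem(ClusterListener(), "distmem", config)
}
```

With both systems started this way, each node should end up logging a MemberUp event for both members once the second node has been welcomed by the seed node.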


Patrik, thanks, that did the trick and now the cluster works OK.
Renato, I am using 2 actor systems in the same JVM just for testing.