Failed to set up a cluster using Akka Cluster Bootstrap without specifying seed nodes

I’m trying to run a Lagom service in production mode in an Akka cluster that is configured via Akka Cluster Bootstrap, as described in https://www.lagomframework.com/documentation/1.5.x/scala/Cluster.html. (I was able to run the app by specifying seed nodes manually.) However, I have not managed to start the service with Cluster Bootstrap. I have the following setup:

application.conf (only the cluster-related settings):
akka.management.cluster.bootstrap {
  # example using kubernetes-api
  contact-point-discovery {
    discovery-method = akka.discovery
#    discovery-method = config
    service-name = "lagom-scala"
    required-contact-point-nr = 0
  }
}

An application loader that mixes in AkkaDiscoveryComponents in production mode, as described here (https://www.lagomframework.com/documentation/1.5.x/scala/AkkaDiscoveryIntegration.html):

class LagomscalaLoader extends LagomApplicationLoader {

  override def load(context: LagomApplicationContext): LagomApplication =
    new LagomscalaApplication(context) with AkkaDiscoveryComponents

  override def loadDevMode(context: LagomApplicationContext): LagomApplication =
    new LagomscalaApplication(context) with LagomDevModeComponents

  override def describeService = Some(readDescriptor[LagomscalaService])
}

I get the following logs when required-contact-point-nr is set to 0:

2019-10-28T23:48:54.867Z [info] akka.management.cluster.bootstrap.internal.BootstrapCoordinator [sourceThread=application-akka.actor.default-dispatcher-26, akkaTimestamp=23:48:54.867UTC, akkaSource=akka.tcp://application@192.168.0.34:2552/system/bootstrapCoordinator, sourceActorSystem=application] - Looking up [Lookup(lagom-scala,None,Some(tcp))]
2019-10-28T23:48:54.886Z [info] akka.management.cluster.bootstrap.internal.BootstrapCoordinator [sourceThread=application-akka.actor.default-dispatcher-22, akkaTimestamp=23:48:54.886UTC, akkaSource=akka.tcp://application@192.168.0.34:2552/system/bootstrapCoordinator, sourceActorSystem=application] - Located service members based on: [Lookup(lagom-scala,None,Some(tcp))]: [], filtered to []
2019-10-28T23:48:55.957Z [info] akka.management.cluster.bootstrap.LowestAddressJoinDecider [sourceThread=application-akka.actor.default-dispatcher-16, akkaTimestamp=23:48:55.957UTC, akkaSource=LowestAddressJoinDecider(akka://application), sourceActorSystem=application] - Exceeded stable margins without locating seed-nodes, however this node 192.168.0.34:8558 is NOT the lowest address out of the discovered endpoints in this deployment, thus NOT joining self. Expecting node [] (out of []) to perform the self-join and initiate the cluster.

When I set required-contact-point-nr to 2 (default), I get the following logs:

2019-10-29T00:15:57.846Z [info] akka.management.cluster.bootstrap.internal.BootstrapCoordinator [sourceThread=application-akka.actor.default-dispatcher-23, akkaTimestamp=00:15:57.846UTC, akkaSource=akka.tcp://application@192.168.0.34:2552/system/bootstrapCoordinator, sourceActorSystem=application] - Looking up [Lookup(lagom-scala,None,Some(tcp))]
2019-10-29T00:15:57.865Z [info] akka.management.cluster.bootstrap.internal.BootstrapCoordinator [sourceThread=application-akka.actor.default-dispatcher-4, akkaTimestamp=00:15:57.865UTC, akkaSource=akka.tcp://application@192.168.0.34:2552/system/bootstrapCoordinator, sourceActorSystem=application] - Located service members based on: [Lookup(lagom-scala,None,Some(tcp))]: [], filtered to []
2019-10-29T00:15:58.299Z [info] akka.management.cluster.bootstrap.LowestAddressJoinDecider [sourceThread=application-akka.actor.default-dispatcher-3, akkaTimestamp=00:15:58.299UTC, akkaSource=LowestAddressJoinDecider(akka://application), sourceActorSystem=application] - Discovered [0] contact points, confirmed [0], which is less than the required [2], retrying
2019-10-29T00:15:58.599Z [warn] akka.cluster.sharding.ShardRegion [sourceThread=application-akka.actor.default-dispatcher-4, akkaTimestamp=00:15:58.597UTC, akkaSource=akka.tcp://application@192.168.0.34:2552/system/sharding/kafkaProducer-greetings, sourceActorSystem=application] - kafkaProducer-greetings: No coordinator found to register. Probably, no seed-nodes configured and manual cluster join not performed? Total [1] buffered messages.

I use Akka 2.5.25 with the default configuration, apart from the settings shown above. The following startup logs might also be relevant:

2019-10-29T00:15:44.987Z [info] akka.remote.Remoting [sourceThread=main, akkaTimestamp=00:15:44.987UTC, akkaSource=akka.remote.Remoting, sourceActorSystem=application] - Remoting now listens on addresses: [akka.tcp://application@192.168.0.34:2552]
2019-10-29T00:15:45.276Z [info] akka.cluster.Cluster(akka://application) [sourceThread=application-akka.actor.default-dispatcher-2, akkaSource=akka.cluster.Cluster(akka://application), sourceActorSystem=application, akkaTimestamp=00:15:45.275UTC] - Cluster Node [akka.tcp://application@192.168.0.34:2552] - No seed-nodes configured, manual cluster join required, see https://doc.akka.io/docs/akka/current/cluster-usage.html#joining-to-seed-nodes
2019-10-29T00:15:46.411Z [info] akka.management.cluster.bootstrap.ClusterBootstrap [sourceThread=main, akkaTimestamp=00:15:46.411UTC, akkaSource=ClusterBootstrap(akka://application), sourceActorSystem=application] - Using self contact point address: http://192.168.0.34:8558
2019-10-29T00:15:48.164Z [info] akka.management.scaladsl.AkkaManagement [sourceThread=application-akka.actor.default-dispatcher-24, akkaSource=AkkaManagement(akka://application), sourceActorSystem=application, akkaTimestamp=00:15:48.163UTC] - Bound Akka Management (HTTP) endpoint to: 192.168.0.34:8558
2019-10-29T00:15:48.286Z [info] akka.management.cluster.bootstrap.internal.BootstrapCoordinator [sourceThread=application-akka.actor.default-dispatcher-24, akkaSource=akka.tcp://application@192.168.0.34:2552/system/bootstrapCoordinator, sourceActorSystem=application, akkaTimestamp=00:15:48.285UTC] - Locating service members. Using discovery [akka.discovery.aggregate.AggregateServiceDiscovery], join decider [akka.management.cluster.bootstrap.LowestAddressJoinDecider]
2019-10-29T00:15:48.772Z [info] play.core.server.AkkaHttpServer [] - Listening for HTTP on /0:0:0:0:0:0:0:0:9000

So I think there is a mismatch between the ports, but I couldn’t figure out how to fix it. Thanks for your help.

PS: The question is also available on Stack Overflow (I cannot paste the link, and sorry for cross-posting).

Is this being deployed to Kubernetes? If not, what kind of service discovery mechanism do you have in your production environment?

Cluster Bootstrap relies on querying an external source of truth to get a list of the nodes that should form a cluster. In this case, it is doing a DNS query for lagom-scala and getting zero results back. Are you expecting this hostname to be found in DNS?
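
For reference, the source of truth does not have to be DNS. Below is a minimal sketch, assuming Akka Discovery's `config` method is acceptable for a local run: the endpoints are listed statically, so the lookup for lagom-scala no longer comes back empty. The host and port are taken from the "Using self contact point address" log line above, and `required-contact-point-nr = 1` is an assumption that only makes sense for a single-node local test, not for production.

```
# Static endpoint list for Akka Discovery's config method (local testing only).
akka.discovery.config.services {
  lagom-scala {
    endpoints = [
      # Akka Management HTTP port (8558), not the remoting port (2552)
      { host = "192.168.0.34", port = 8558 }
    ]
  }
}

akka.management.cluster.bootstrap.contact-point-discovery {
  service-name = "lagom-scala"
  discovery-method = config
  # Assumption: one node only, so a single confirmed contact point is enough
  required-contact-point-nr = 1
}
```

With a single listed endpoint, this node should be the lowest address among the discovered contact points and is therefore the one expected to self-join and start the cluster.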

Thanks a lot for your answer. I was really looking for some help.

I’m trying to deploy this locally as a first step. I built only the first service in the hello-world example (i.e. the hello-impl service) with sbt dist and ran it via ./lagom-scala-impl-1.0-SNAPSHOT/bin/lagom-scala-impl -Dplay.http.secret.key=changemeasdf.

BTW, please note that I also tried to include service-name-mappings configuration as described in the documentation, like:

lagom.akka.discovery.service-name-mappings {
    service-name-mappings {
        lagom-scala {
            lookup = "lagom-scala" # I also tried full SRV names for the `lookup` field, such as `_http._tcp.lagom-scala` or `_http._tcp.lagomscala`, but none of them worked.
            scheme = http
        }
    }
}

As far as I could follow the documentation, this configuration should have started the service by forming a cluster with dynamic configuration, as explained at: Lagom ServiceLocator using deterministic hostnames.

I have already read the relevant Lagom and Akka documentation, but I’m not sure what I’m missing. You mention DNS as a standalone component. Do I need to set up something related to it in my local environment? (If so, could you please point me to some relevant documentation?) I was thinking the locator would provide the DNS-related capabilities.

Lagom doesn’t provide the DNS server; it only performs the DNS queries and expects the server to be provided by your deployment environment. For a local test, using seed nodes as you had originally done is the simplest way to configure a cluster.
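
A minimal sketch of the seed-node approach, assuming the actor system name (application), protocol (akka.tcp), and address from the "Remoting now listens on addresses" log line earlier in the thread; adjust these to match your own setup.

```
# Static seed node for a local test: the node lists itself, so it can form a one-node cluster.
akka.cluster.seed-nodes = [
  "akka.tcp://application@192.168.0.34:2552"
]
```

The same value can also be passed as a system property when starting the service, e.g. -Dakka.cluster.seed-nodes.0=akka.tcp://application@192.168.0.34:2552, which avoids hard-coding a local address in application.conf.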