Akka management health checks not running in Lagom 1.5

In theory, Lagom 1.5 enables health checks by default. However, I could not get them running except in combination with cluster bootstrapping.
How can I enable them in non-clustered services?

Hi @ale222,

Lagom 1.5.0-RC2 should start the Akka Management HTTP server and expose the Health Check routes anytime you start a Lagom server.
When your Lagom service is also an Akka Cluster, then the Health Checks will use the Cluster membership status as part of the decision process to determine if the node is healthy.
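
For reference, here is a sketch of the Akka Management settings behind those routes. It assumes the Akka Management 1.0.x defaults, so please check the reference.conf of the version on your classpath before relying on it:

```hocon
akka.management {
  http {
    # Assumed default management port; Kubernetes probes must target this port.
    port = 8558
  }
  health-checks {
    # Readiness route, served at http://<host>:8558/ready
    readiness-path = "ready"
    # Liveness route, served at http://<host>:8558/alive
    liveness-path = "alive"
  }
}
```

With these values, a readiness probe on path /ready and a liveness probe on path /alive should reach the health check routes without any extra configuration.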

Are you saying that the port is not exposed, or that it’s exposed but has no health check routes added, or…? Do you happen to have a reproducer for the issue?

We’re about to release Lagom 1.5.0 and it’d be great if we could get this fixed (if there’s an issue) before the release.

Thanks!

BTW: for the moment, Lagom doesn’t use Akka Management HTTP in Dev Mode (only in Prod).

Are you observing this issue in Prod?

Hi Ignasi, thanks for your reply.

That’s on prod, in Kubernetes (GKE). I have two services. The one with Akka Cluster works as expected.
The other one is a stateless service without Akka Cluster. I simply wanted to benefit from the Akka Management health check routes (which, from my understanding, return a 200 status code by default) instead of writing my own.
GKE reports “Readiness probe failed: HTTP probe failed with statuscode: 500”
Here are my application.conf and kubernetes.yml for the service in question.

play {
  application.loader = com.trg.mt.search.SearchApplicationLoader

  // To properly setup the CORSFilter, please refer to https://playframework.com/documentation/2.5.x/CorsFilter
  // This example is only meant to show what's required for Lagom to use CORS.
  filters.cors {
    // review the values of all these settings to fulfill your needs. These values are not meant for production.
    pathPrefixes = ["/api"]
    allowedOrigins = ["http://localhost:3000"]
    allowedHttpMethods = null
    allowedHttpHeaders = ["Accept"]
    exposedHeaders = []
    supportsCredentials = false
    preflightMaxAge = 6 hour
  }
}

akka.actor.enable-additional-serialization-bindings = on

# The properties below override Lagom default configuration with the recommended values for new projects.
#
# Lagom has not yet made these settings the defaults for backward-compatibility reasons.

# Enable the serializer provided in Akka 2.5.8+ for akka.Done and other internal
# messages to avoid the use of Java serialization.
akka.actor.serialization-bindings {
  "akka.Done"                 = akka-misc
  "akka.actor.Address"        = akka-misc
  "akka.remote.UniqueAddress" = akka-misc
}

lagom.circuit-breaker {
  elasticsearch-circuitbreaker {
    # Possibility to disable a given circuit breaker.
    enabled = on

    # Number of failures before opening the circuit.
    max-failures = 10

    # Duration of time after which to consider a call a failure.
    call-timeout = 10s

    # Duration of time in open state after which to attempt to close
    # the circuit, by first entering the half-open state.
    reset-timeout = 15s
  }
}
play {
  server {
    pidfile.path = "/dev/null"
  }

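  # Unquoted so that HOCON substitutes the environment variable; inside quotes it stays a literal string.
  http.secret.key = ${APPLICATION_SECRET}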
}

akka {
  discovery.method = akka-dns
}

lagom.akka.discovery.service-name-mappings.elastic-search.lookup = ${?ELASTICSEARCH_SERVICE_NAME}

apiVersion: "apps/v1"
kind: Deployment
metadata:
  name: search-impl
spec:
  replicas: 1
  selector:
    matchLabels:
      app: search-impl

  template:
    metadata:
      labels:
        app: search-impl
    spec:
      containers:
        - name: search-impl
          image: "us.gcr.io/mt-lagom/search-impl:latest"
          env:
            - name: JAVA_OPTS
              value: "-Xms256m -Xmx256m -Dconfig.resource=prod-application.conf"
            - name: APPLICATION_SECRET
              valueFrom:
                secretKeyRef:
                  name: mt-application-secret
                  key: secret
            - name: KAFKA_SERVICE_NAME
              value: "_broker._tcp.reactive-sandbox-kafka.default.svc.cluster.local"
            - name: ELASTICSEARCH_SERVICE_NAME
              value: "_http._tcp.reactive-sandbox-elasticsearch.default.svc.cluster.local"
            - name: REQUIRED_CONTACT_POINT_NR
              value: "1"
          ports:
            - name: management
              containerPort: 8558
              protocol: TCP
            - name: remoting
              containerPort: 2552
              protocol: TCP
          readinessProbe:
            httpGet:
              path: "/ready"
              port: management
            periodSeconds: 10
            failureThreshold: 10
            initialDelaySeconds: 20
          livenessProbe:
            httpGet:
              path: "/alive"
              port: management
            periodSeconds: 10
            failureThreshold: 10
            initialDelaySeconds: 20
          resources:
            limits:
              memory: 512Mi
            requests:
              cpu: 0.25
              memory: 512Mi
---
apiVersion: v1
kind: Service
metadata:
  name: search-impl
spec:
  ports:
    - name: http
      port: 80
      targetPort: 9000
  selector:
    app: search-impl
  type: LoadBalancer
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-reader
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "watch", "list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: read-pods
subjects:
  - kind: User
    name: system:serviceaccount:default:default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io

Let me know if you need additional info.

Hi @ale222,

Does the liveness probe work? I guess it does. If it’s working, it means that Akka Management has started and is serving on the management port (8558).
Another indication that it’s running is that you are getting a 500 error: it’s throwing an exception when doing the check.

I’m wondering if it’s trying to use the cluster membership check for some reason and then failing. I need to be able to reproduce it.

Can you share with us the build.sbt file and the list of components you are adding to SearchApplicationLoader?

Thanks,

Renato

Sure @octonato,
I’ve just run curl against /alive, which works fine, and against /ready, which returns Not Healthy.
So here is the sbt file:

organization in ThisBuild := "com.trg"

// the Scala version that will be used for cross-compiled libraries
scalaVersion in ThisBuild := "2.12.8"

val playJsonDerivedCodecs = "org.julienrf" %% "play-json-derived-codecs" % "4.0.0"
val macwire = "com.softwaremill.macwire" %% "macros" % "2.3.2" % "provided"
val cassandraDriverExtras = "com.datastax.cassandra" % "cassandra-driver-extras" % "3.5.1"

val akkaDiscoveryServiceLocator = "com.lightbend.lagom" %% "lagom-scaladsl-akka-discovery-service-locator" % "1.0.0"
val akkaDiscoveryKubernetesApi = "com.lightbend.akka.discovery" %% "akka-discovery-kubernetes-api" % "1.0.0"

val scalaTest = "org.scalatest" %% "scalatest" % "3.0.7" % Test
val scalaTestPlusPlay = "org.scalatestplus.play" %% "scalatestplus-play" % "3.1.2" % Test
val mockito = "org.mockito" % "mockito-core" % "2.23.4" % Test

ThisBuild / scalacOptions ++= List("-encoding", "utf8", "-deprecation", "-feature", "-unchecked", "-Xfatal-warnings")

def dockerSettings = Seq(
  dockerUpdateLatest := true,
  dockerBaseImage := "adoptopenjdk/openjdk8",
  dockerUsername := sys.props.get("docker.username"),
  dockerRepository := sys.props.get("docker.registry")
)

// Update the version generated by sbt-dynver to remove any + characters, since these are illegal in docker tags
version in ThisBuild ~= (_.replace('+', '-'))
dynver in ThisBuild ~= (_.replace('+', '-'))

lazy val root = (project in file("."))
  .settings(name := "mt")
  .aggregate(
    `set-api`,
    `set-impl`,
    `search-api`,
    `search-impl`
  )
  .settings(
    commonSettings: _*
  )
  .settings(
    publish in Docker := Unit
  )

lazy val `set-api` = (project in file("set-api"))
  .settings(commonSettings: _*)
  .settings(
    libraryDependencies ++= Seq(
      lagomScaladslApi,
      playJsonDerivedCodecs,
      "org.sangria-graphql" %% "sangria-play-json" % "1.0.5"
    )
  )

lazy val `set-impl` = (project in file("set-impl"))
  .enablePlugins(LagomScala)
  .settings(commonSettings: _*)
  .settings(
    libraryDependencies ++= Seq(
      lagomScaladslPersistenceCassandra,
      lagomScaladslKafkaBroker,
      lagomScaladslTestKit,
      cassandraDriverExtras,
      akkaDiscoveryServiceLocator,
      akkaDiscoveryKubernetesApi,
      "org.sangria-graphql" %% "sangria" % "1.4.2",
      "com.github.tkqubo" % "html-to-markdown" % "0.7.2",
      macwire,
      filters,
      scalaTest
    )
  )
  .settings(dockerSettings)
  .settings(lagomForkedTestSettings)
  .dependsOn(`set-api`)

lazy val `search-api` = (project in file("search-api"))
  .settings(commonSettings: _*)
  .settings(
    libraryDependencies ++= Seq(
      lagomScaladslApi,
      playJsonDerivedCodecs
    )
  )

lazy val `search-impl` = (project in file("search-impl"))
  .enablePlugins(LagomScala)
  .settings(commonSettings: _*)
  .settings(
    libraryDependencies ++= Seq(
      lagomScaladslPersistenceCassandra,
      lagomScaladslKafkaClient,
      lagomScaladslTestKit,
      akkaDiscoveryServiceLocator,
      macwire,
      filters,
      scalaTest
    )
  )
  .settings(dockerSettings)
  .settings(lagomForkedTestSettings)
  .dependsOn(`search-api`, `set-api`)

def evictionSettings: Seq[Setting[_]] = Seq(
  // This avoids a lot of dependency resolution warnings to be showed.
  // They are not required in Lagom since we have a more strict whitelist
  // of which dependencies are allowed. So it should be safe to not have
  // the build logs polluted with evictions warnings.
  evictionWarningOptions in update := EvictionWarningOptions.default
    .withWarnTransitiveEvictions(false)
    .withWarnDirectEvictions(false)
)

def commonSettings: Seq[Setting[_]] = evictionSettings ++ Seq(
  javacOptions in Compile ++= Seq("-encoding", "UTF-8", "-source", "1.8"),
  javacOptions in(Compile, compile) ++= Seq("-Xlint:unchecked", "-Xlint:deprecation", "-parameters", "-Werror"),
  scalacOptions ++= List("-encoding", "utf8", "-feature", "-deprecation", "-unchecked"),

  testOptions in Test ++= Seq(
    // Show the duration of tests
    Tests.Argument(TestFrameworks.ScalaTest, "-oD")
  )
)

lagomCassandraCleanOnStart in ThisBuild := false

// ------------------------------------------------------------------------------------------------

// register 'elastic-search' as an unmanaged service on the service locator so that at 'runAll' our code
// will resolve 'elastic-search' and use it. See also com.example.com.ElasticSearch
lagomUnmanagedServices in ThisBuild += ("elastic-search" -> "http://127.0.0.1:9200")

lagomKafkaPropertiesFile in ThisBuild :=
  Some((baseDirectory in ThisBuild).value / "project" / "kafka-server.properties")

and the SearchApplicationLoader

import com.lightbend.lagom.scaladsl.akka.discovery.AkkaDiscoveryComponents
import com.lightbend.lagom.scaladsl.broker.kafka.LagomKafkaClientComponents
import com.lightbend.lagom.scaladsl.devmode.LagomDevModeComponents
import com.lightbend.lagom.scaladsl.server.{LagomApplication, LagomApplicationContext, LagomApplicationLoader}
import com.softwaremill.macwire._
import com.trg.elasticsearch.response.SearchResult
import com.trg.elasticsearch.{ElasticSearchIndexedStore, Elasticsearch}
import com.trg.mt.search.api.SearchService
import com.trg.mt.search.impl.{BrokerEventConsumer, SearchServiceImpl}
import com.trg.mt.set.api.SetService
import play.api.libs.ws.ahc.AhcWSComponents
import play.api.mvc.EssentialFilter
import play.filters.cors.CORSComponents

abstract class SearchApplication(context: LagomApplicationContext)
  extends LagomApplication(context)
    with LagomKafkaClientComponents
    with CORSComponents
    with AhcWSComponents {

  lazy val indexedStore: IndexedStore[SearchResult] = wire[ElasticSearchIndexedStore]

  lazy val elasticSearch = serviceClient.implement[Elasticsearch]

  lazy val setService = serviceClient.implement[SetService]

  lazy val searchService: SearchServiceImpl = wire[SearchServiceImpl]

  override lazy val lagomServer = serverFor[SearchService](searchService)

  override val httpFilters: Seq[EssentialFilter] = Seq(corsFilter)

  wire[BrokerEventConsumer]

}

class SearchApplicationLoader extends LagomApplicationLoader {

  override def load(context: LagomApplicationContext) =
    new SearchApplication(context) with AkkaDiscoveryComponents

  override def loadDevMode(context: LagomApplicationContext) =
    new SearchApplication(context) with LagomDevModeComponents

  override def describeService = Some(readDescriptor[SearchService])
}

and thanks for your help.

I think I already see the problem in search-impl: the dependency on lagomScaladslPersistenceCassandra.
I’ll remove it and test.

After removing lagomScaladslPersistenceCassandra the service doesn’t start.
The stacktrace:

Picked up JAVA_TOOL_OPTIONS: 
Oops, cannot start the server.
java.lang.ClassNotFoundException: akka.remote.UniqueAddress
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at akka.actor.ReflectiveDynamicAccess.$anonfun$getClassFor$1(ReflectiveDynamicAccess.scala:22)
	at scala.util.Try$.apply(Try.scala:213)
	at akka.actor.ReflectiveDynamicAccess.getClassFor(ReflectiveDynamicAccess.scala:21)
	at akka.serialization.Serialization.$anonfun$bindings$3(Serialization.scala:419)
	at scala.collection.TraversableLike$WithFilter.$anonfun$map$2(TraversableLike.scala:742)
	at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:234)
	at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:465)
	at scala.collection.TraversableLike$WithFilter.map(TraversableLike.scala:741)
	at akka.serialization.Serialization.<init>(Serialization.scala:417)
	at akka.serialization.SerializationExtension$.createExtension(SerializationExtension.scala:16)
	at akka.serialization.SerializationExtension$.createExtension(SerializationExtension.scala:13)
	at akka.actor.ActorSystemImpl.registerExtension(ActorSystem.scala:962)
	at akka.actor.ActorSystemImpl.$anonfun$loadExtensions$1(ActorSystem.scala:995)
	at scala.collection.Iterator.foreach(Iterator.scala:941)
	at scala.collection.Iterator.foreach$(Iterator.scala:941)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at akka.actor.ActorSystemImpl.loadExtensions$1(ActorSystem.scala:993)
	at akka.actor.ActorSystemImpl.loadExtensions(ActorSystem.scala:1007)
	at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:882)
	at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:870)
	at akka.actor.ActorSystemImpl._start(ActorSystem.scala:870)
	at akka.actor.ActorSystemImpl.start(ActorSystem.scala:891)
	at akka.actor.ActorSystem$.apply(ActorSystem.scala:246)
	at play.api.libs.concurrent.ActorSystemProvider$.start(Akka.scala:196)
	at play.api.libs.concurrent.ActorSystemProvider$.start(Akka.scala:145)
	at com.lightbend.lagom.scaladsl.server.ActorSystemProvider$.start(LagomApplicationLoader.scala:268)
	at com.lightbend.lagom.scaladsl.server.LagomApplication.actorSystem$lzycompute(LagomApplicationLoader.scala:245)
	at com.lightbend.lagom.scaladsl.server.LagomApplication.actorSystem(LagomApplicationLoader.scala:244)
	at com.lightbend.lagom.scaladsl.server.AkkaManagementComponents.$init$(AkkaManagementComponents.scala:24)
	at com.lightbend.lagom.scaladsl.server.LagomApplication.<init>(LagomApplicationLoader.scala:228)
	at com.trg.mt.search.SearchApplication.<init>(SearchApplicationLoader.scala:18)
	at com.trg.mt.search.SearchApplicationLoader$$anon$4.<init>(SearchApplicationLoader.scala:42)
	at com.trg.mt.search.SearchApplicationLoader.load(SearchApplicationLoader.scala:42)
	at com.trg.mt.search.SearchApplicationLoader.load(SearchApplicationLoader.scala:39)
	at com.lightbend.lagom.scaladsl.server.LagomApplicationLoader.load(LagomApplicationLoader.scala:77)
	at play.core.server.ProdServerStart$.start(ProdServerStart.scala:57)
	at play.core.server.ProdServerStart$.main(ProdServerStart.scala:29)
	at play.core.server.ProdServerStart.main(ProdServerStart.scala)

hi @ale222,

I think the first failure was related to lagomScaladslPersistenceCassandra. Because you had that dependency in your cake, Lagom uses the Cluster, but since your YAML and application.conf were not taking that into consideration, the node fails to form the cluster and the readiness probe fails as well.
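
If the service did need the Cluster, its production configuration would have to tell Akka Cluster Bootstrap how to find its peers. A minimal sketch, assuming Kubernetes API discovery, the akka-discovery-kubernetes-api dependency on the search-impl classpath (in the build above only set-impl has it), and the default pod label selector of app=<service-name>:

```hocon
akka.management.cluster.bootstrap {
  contact-point-discovery {
    # Assumption: peers are discovered through the Kubernetes API.
    discovery-method = kubernetes-api
    # Assumption: matches the `app` label on the Deployment's pods.
    service-name = "search-impl"
    # Reuses the environment variable already set in the Deployment.
    required-contact-point-nr = ${REQUIRED_CONTACT_POINT_NR}
  }
}
```

For a service that is meant to stay stateless, removing the dependency that pulls in the Cluster is the simpler route.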

Now that you have removed the dependency, I think you are hitting another issue. I’m afraid that Akka Management requires the Akka Remote dependency but expects it to be provided. Lagom won’t include that dependency unless you use the Cluster (directly or indirectly). I will look into it first thing tomorrow morning. It’s late here.

I’ll keep you posted.

Thanks @octonato, I’ve just tested it out of curiosity with the akka-remote dependency added and it works like a charm.

I was about to write to you about this. I picked a Lagom app without clustering and it works as expected. I don’t know why this is happening on your app, but it should not require Akka Remote.

I guess you have some other dependency that is adding an Akka Management route that requires Akka Remote.

Hi @octonato,
the problem was this configuration:

# Enable the serializer provided in Akka 2.5.8+ for akka.Done and other internal
# messages to avoid the use of Java serialization.
akka.actor.serialization-bindings {
  "akka.Done"                 = akka-misc
  "akka.actor.Address"        = akka-misc
  "akka.remote.UniqueAddress" = akka-misc
}

which I replaced with

akka.actor.allow-java-serialization=off

and everything works fine without akka-remote dependency.

Thanks for your time.

I solved the same issue (Lagom in k8s).
Just add this to application.conf:

akka.management {
  health-checks {
    readiness-checks {
      # Default health check for cluster. Overwrite the setting to replace it with
      # your implementation or set it to "" (empty string) to disable this check.
      cluster-membership = ""
    }
  }
}

@vgordievskiy, that’s interesting. Maybe there is a bug in it. Lagom is supposed to add the health checks, but only include the cluster health check when the cluster is enabled.

Maybe we got it wrong and the combination of the cluster health check without the cluster enabled (and therefore without Akka Remote) is causing the failure.

I created an issue to investigate this further:
https://github.com/lagom/lagom/issues/1917

My problem was a Not Healthy response when calling /ready. I tested it on microk8s.

@vgordievskiy was it on a service with cluster? If not, then you should check the dependencies. My problem was what @octonato pointed out above.

Besides that, I also had this configuration of the Akka serializer in my application.conf:

akka.actor.serialization-bindings {
  "akka.Done"                 = akka-misc
  "akka.actor.Address"        = akka-misc
  "akka.remote.UniqueAddress" = akka-misc
}

which caused java.lang.ClassNotFoundException: akka.remote.UniqueAddress.
Since I removed that, everything works as expected. So for me it was a configuration problem and not a framework issue.

@octonato it was in a cluster. I got a message on a pod like this:

Readiness probe failed: HTTP probe failed with statuscode: 500

I called /ready with curl and got a 500 error with 'Not Healthy'. Google brought me to this thread with these keywords.