Kalix with ElasticSearch Handshake error

Hi,

We’ve created a small microservice with Kalix that listens for audit events and creates entries in an ElasticSearch database.

I’m using elastic4s as a dependency, with Akka HTTP as its client.
When running the Kalix microservice locally (in development mode and in a Docker container), I’m able to connect to our deployed instance of ElasticSearch.

As soon as we deploy this microservice to the Kalix platform, we get the following error
javax.net.ssl.SSLHandshakeException: Remote host terminated the handshake
when it tries to contact the ElasticSearch instance in the cloud.

I’ve also tried the JavaClient, with the same result.

Doing a simple GET to any endpoint on ElasticSearch with a standalone WSClient or with java.net.URI shows the same behaviour. Other secured endpoints (not ElasticSearch) work as expected.

Any insights on why we get this only on the deployed instance of the microservice and how we could solve this issue?

Kind regards,
Gilles

Do you have a cause from the stack trace with some additional details explaining why the handshake was terminated?

Hi Johan,

The following stacktrace is when testing a call with java.net.URI

java.base/sun.security.ssl.SSLSocketImpl.handleEOF(Unknown Source)
java.base/sun.security.ssl.SSLSocketImpl.decode(Unknown Source)
java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(Unknown Source)
java.base/sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
java.base/sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
java.base/sun.net.www.protocol.https.HttpsClient.afterConnect(Unknown Source)
java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source)
java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
java.base/java.net.URLConnection.getContent(Unknown Source)
java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getContent(Unknown Source)
com.enprove.audit.Main$.$anonfun$main$1(Main.scala:58)
scala.util.Try$.apply(Try.scala:210)
com.redacted.audit.Main$.main(Main.scala:58)
com.redacted.audit.Main.main(Main.scala)

The following stacktrace is when testing a call with a Standalone WSClient

java.net.ConnectException: https://redacted.es.europe-west1.gcp.cloud.es.io:9243

play.shaded.ahc.org.asynchttpclient.netty.channel.NettyConnectListener.onFailure(NettyConnectListener.java:179)
play.shaded.ahc.org.asynchttpclient.netty.channel.NettyConnectListener$1.onFailure(NettyConnectListener.java:151)
play.shaded.ahc.org.asynchttpclient.netty.SimpleFutureListener.operationComplete(SimpleFutureListener.java:26)
play.shaded.ahc.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
play.shaded.ahc.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570)
play.shaded.ahc.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:549)
play.shaded.ahc.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490)
play.shaded.ahc.io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615)
play.shaded.ahc.io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:608)
play.shaded.ahc.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117)
play.shaded.ahc.io.netty.handler.ssl.SslHandler.setHandshakeFailure(SslHandler.java:1788)
play.shaded.ahc.io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:1065)
play.shaded.ahc.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:260)
play.shaded.ahc.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:246)
play.shaded.ahc.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:239)
play.shaded.ahc.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405)
play.shaded.ahc.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:260)
play.shaded.ahc.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:246)
play.shaded.ahc.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901)
play.shaded.ahc.io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:818)
play.shaded.ahc.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
play.shaded.ahc.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
play.shaded.ahc.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:497)
play.shaded.ahc.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
play.shaded.ahc.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
play.shaded.ahc.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
java.base/java.lang.Thread.run(Unknown Source)

Is there no "Caused by" line showing the root cause? Is "Remote host terminated the handshake" the only exception message in there?

Hi Johan,

Yeah, those are the printed stacktraces for both cases.
They don’t output any more information.
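
For anyone hitting the same wall, here is a minimal sketch of how the full cause chain of an exception could be printed, in case the logging only shows the outermost message. `CauseChain`, `causes` and `describe` are hypothetical names, not part of Kalix or elastic4s. Separately, the JVM's own TLS tracing can be switched on with the standard JSSE system property `-Djavax.net.debug=ssl:handshake`; how to pass JVM options to a deployed Kalix service is not covered in this thread.

```scala
import scala.annotation.tailrec

object CauseChain {
  // Collects the throwable and all of its nested causes, outermost first.
  // Stops at a null cause or if a throwable appears twice (cyclic chain).
  def causes(t: Throwable): List[Throwable] = {
    @tailrec
    def loop(current: Throwable, acc: List[Throwable]): List[Throwable] =
      if (current == null || acc.exists(_ eq current)) acc.reverse
      else loop(current.getCause, current :: acc)
    loop(t, Nil)
  }

  // Renders the chain as text: class name and message of each throwable.
  def describe(t: Throwable): String =
    causes(t)
      .map(c => s"${c.getClass.getName}: ${c.getMessage}")
      .mkString("\n  caused by ")
}
```

Calling `println(CauseChain.describe(e))` in the failure branch of the `Try` would print every nested cause, if there is one.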

We’ve already integrated with other third-party services over HTTP and did not have this issue with them. We’ve also tried to enforce TLS 1.1 as a test:

javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
java.base/sun.security.ssl.HandshakeContext.<init>(Unknown Source)
java.base/sun.security.ssl.ClientHandshakeContext.<init>(Unknown Source)
java.base/sun.security.ssl.TransportContext.kickstart(Unknown Source)
java.base/sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
java.base/sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
java.base/sun.net.www.protocol.https.HttpsClient.afterConnect(Unknown Source)
java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source)
java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
java.base/java.net.URLConnection.getContent(Unknown Source)
java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getContent(Unknown Source)
com.enprove.audit.Main$.$anonfun$main$1(Main.scala:64)
scala.util.Try$.apply(Try.scala:210)
com.enprove.audit.Main$.main(Main.scala:64)
com.enprove.audit.Main.main(Main.scala)

When enforcing TLS 1.2, we get the same error as we get without enforcing any TLS version.
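
For reference, one way to pin the protocol version for such a test is sketched below. This is an assumption about how the enforcement could be done, not the code that was actually used here. `Tls12Probe`, `tls12Context` and `open` are hypothetical names; the `javax.net.ssl` calls are standard JDK API.

```scala
import java.net.URI
import javax.net.ssl.{HttpsURLConnection, SSLContext}

object Tls12Probe {
  // Builds a context for the "TLSv1.2" protocol name. With the stock JDK
  // provider this is expected to cap negotiation at TLS 1.2 (assumption:
  // other security providers may behave differently).
  def tls12Context(): SSLContext = {
    val ctx = SSLContext.getInstance("TLSv1.2")
    ctx.init(null, null, null) // default key managers, trust managers, randomness
    ctx
  }

  // Prepares an HTTPS connection that uses the context above.
  // Nothing is sent over the network until the caller connects or reads.
  def open(url: String): HttpsURLConnection = {
    val conn = URI.create(url).toURL.openConnection().asInstanceOf[HttpsURLConnection]
    conn.setSSLSocketFactory(tls12Context().getSocketFactory)
    conn
  }
}
```

Reading from `Tls12Probe.open(url).getInputStream` inside a `Try` then triggers the handshake with the pinned context.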

In the deployed Kalix service I’ve also printed out the supported TLS versions with the following snippet:

import javax.net.ssl.SSLContext

println(SSLContext.getDefault.getSupportedSSLParameters.getProtocols.mkString(", "))

With the following output:

TLSv1.3, TLSv1.2, TLSv1.1, TLSv1, SSLv3, SSLv2Hello
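
Note that `getSupportedSSLParameters` lists what the JVM could be configured to use, not what it uses by default. A small sketch that prints both lists side by side follows; `TlsProtocols` is a hypothetical name. On recent JDKs the older protocols are typically disabled through the `jdk.tls.disabledAlgorithms` security property, which would be consistent with the "No appropriate protocol" error seen when forcing TLS 1.1, although that is an assumption about this particular runtime.

```scala
import javax.net.ssl.SSLContext

object TlsProtocols {
  // Protocols the default context is able to support if explicitly enabled.
  def supported: Set[String] =
    SSLContext.getDefault.getSupportedSSLParameters.getProtocols.toSet

  // Protocols the default context actually offers without extra configuration.
  def enabled: Set[String] =
    SSLContext.getDefault.getDefaultSSLParameters.getProtocols.toSet

  def main(args: Array[String]): Unit = {
    println(s"supported: ${supported.toList.sorted.mkString(", ")}")
    println(s"enabled:   ${enabled.toList.sorted.mkString(", ")}")
  }
}
```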

Just as a reference, the outputs below are from a locally running Docker instance of the service. The 401 is expected behaviour, as the connection requires authentication.

{"timestamp":"2022-05-19T13:12:43.962Z","thread":"main","logger":"com.enprove.audit.Main$","message":"java.io.IOException: Server returned HTTP response code: 401 for URL: https://redacted.es.europe-west1.gcp.cloud.es.io:9243","context":"default","severity":"WARN"}

{"timestamp":"2022-05-19T13:12:43.964Z","thread":"main","logger":"com.enprove.audit.Main$","message":"Server returned HTTP response code: 401 for URL: https://redacted.es.europe-west1.gcp.cloud.es.io:9243","context":"default","severity":"WARN"}

{"timestamp":"2022-05-19T13:12:43.964Z","thread":"main","logger":"com.enprove.audit.Main$","message":"starting the Kalix service","context":"default","severity":"INFO"}

{"timestamp":"2022-05-19T13:12:44.381Z","thread":"kalix-akka.actor.default-dispatcher-5","logger":"akka.event.slf4j.Slf4jLogger","message":"Slf4jLogger started","context":"default","severity":"INFO"}

It looks like this was caused by the Kalix service mesh intercepting that call. The mesh has bypasses for various ports and kinds of services, but apparently port 9243 for Elastic Cloud was missing. Elastic Cloud also supports port 443, so using that could be a workaround until we have updated the port listings in the mesh config.
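
A minimal sketch of the port workaround, assuming the endpoint is kept as a URL string in the service configuration. `ElasticEndpoint` and `withPort` are hypothetical helper names; only `java.net.URI` from the JDK is used.

```scala
import java.net.URI

object ElasticEndpoint {
  // Returns the same endpoint with only the port replaced.
  def withPort(endpoint: String, port: Int): String = {
    val u = URI.create(endpoint)
    new URI(u.getScheme, u.getUserInfo, u.getHost, port,
            u.getPath, u.getQuery, u.getFragment).toString
  }
}
```

For example, `ElasticEndpoint.withPort("https://redacted.es.europe-west1.gcp.cloud.es.io:9243", 443)` yields the same host on port 443, which can then be passed to whichever client is in use.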
