Akka HTTP Client connection pools and reverse proxies : a puzzling use case

Hello again,

I would like to share an observation we have had with the use of Akka HTTP client in a context where an F5 reverse proxy distributes traffic to 2 different Play Servers (let’s call them A and B), internally listening resp. on the ports 9000 and 9001, but both sharing the same public hostname. The redirect happens based on the local path of the HTTP request on that public host. The F5 is a full proxy and sets up separate connections for the client and the servers.

Public requests use HTTPS, the TLS is terminated on the F5 and internally the Play servers are using the HTTP protocol, not HTTPS.

The use case is a sequence where first a GET request is submitted to server A (internally routed to an IP address and port 9000), followed by another request - a POST request - to the same public host, which should then redirect to server B (internally routed to an IP address and port 9001).

The first request checks if the user is eligible for POST’ing to server B and if so, it tries to POST a message to server B, provided server A confirms the user can be posted to.

The observation is:

  1. The first request runs successfully to completion (GET request to server A)
  2. The second POST however fails, with a message (POST request to server B)

Outgoing request stream error akka.stream.StreamTcpException: The connection closed with error: An existing connection was forcibly closed by the remote host

In the server B logs, we find the following message
Illegal request, responding with status ‘400 Bad Request’: Unsupported HTTP method: The HTTP method started with 0x16 rather than any known HTTP method. Perhaps this was an HTTPS request sent to an HTTP endpoint?

It is worth noting that this problem only presents itself with the Akka HTTP Client, perhaps due to the way the connection pool works ?

Our application uses Http().singleRequest. Another setup with a dedicated host connection pool did not resolve it, nor a setup with a Source queue, but the result is the same.

Before thinking it might be something with the F5, it is worth noting that we can successfully execute this sequence with other Scala HTTP clients (we did a test with STTP and a few other backends, and with these it is working fine. Although our application uses the Akka HTTP client, not SoftwareMill’s, when we switched to STTP with Akka Streams as backend, we noticed that, we see the same error pop up.

Could someone explain why we are making this observation. Is there some special care needed when using Akka HTTP client ? Am I missing something ?

For us this seems to indicate a specific issue that exists only with Akka HTTP client.

Turns out this only works if you enforce the creation of 2 separate pools.

So, I got this working when I created (by means of a different configuration) 2 separate host-level connection pools, one for the traffic that is to hit server A and another one which will hit server B.

It seems as if connections cannot be reused when eventually you are getting your requests routed to a different endpoint by the intermediary.

Is this a correct observation. Can this be explained by the Lightbend team based on the implementation of the connection pools ?

PS : the other open question is : why does server B think it is receiving an HTTPS request on its HTTP port ? The F5 connection pool is configured to use HTTP and target port 9000 while adding the X-FORWARDED-PROTO header.

Thanks for the report. You could try enabling akka.http.client.log-unencrypted-network-bytes = 1000 on the client and the same on the server, to see what’s going on there.

Hard to say what’s going on, since ultimately the proxy seems to be the client to the server app. One thing that could be different between akka-http and other clients is how connections are reused, especially for POST requests.

Hi,

Thanks for picking this up.

I did a quick check by applying the trace option you suggested. Especially the server side is interesting.

The following is observed:

  • When the 400 Bad Request is thrown, the trace indeed shows the payload in an unreadable format (as if it is still encrypted). This I guess explains why the server thinks it is receiving an HTTPS request on an HTTP port. I wonder how it is possible this encrypted payload reaches the Play server. Something rather peculiar is going on, it seems.
  • Another observation is that if I try to run this sequence a number of times (20), I did see one get through correctly and I saw 1 successful post complete. When it works you can see the payload in a readable format in the trace with the tracing option you asked me to activate.

Looks like somehow at the level of the F5 there is some kind of disruption which appears to correct itself after some time. I will have to talk to our F5 admin to get a better picture of what’s going on at his level, but that will be for later this week. I will get back to you on this once I have more detailed information.

05-04-2021T08:47:41.320+0000 [08:47:41.320UTC] DEBUG [Materializer] - [server-plain-text FromNet] Element: SessionBytes ByteString(142 bytes)
16 03 01 00 89 01 00 00 85 03 03 12 08 CF 0F 7D | …}
13 EC 50 63 DC 1F E2 A2 7A 87 46 18 CE C8 EF 19 | …Pc…z.F…
17 1E BF E4 06 FA B7 46 45 CC 64 00 00 2C 00 9F | …FE.d…,…
00 9E 00 6B 00 39 00 67 00 33 00 16 00 9D 00 9C | …k.9.g.3…
00 3D 00 35 00 3C 00 2F 00 0A C0 30 C0 2F C0 28 | .=.5.<./…0./.(
C0 14 C0 27 C0 13 C0 12 00 FF 01 00 00 30 00 0A | …’…0…
00 06 00 04 00 17 00 18 00 0B 00 02 01 00 00 0D | …
00 1C 00 1A 04 01 05 01 06 01 04 02 05 02 06 02 | …
04 03 05 03 06 03 02 01 02 02 02 03 01 01 | …
05-04-2021T08:47:41.320+0000 [08:47:41.320UTC] WARN [ActorSystemImpl] - Illegal request, responding with status ‘400 Bad Request’: Unsupported HTTP method: The HTTP method started with 0x16 rather than any known HTTP method. Perhaps this was an HTTPS request sent to an HTTP endpoint?
05-04-2021T08:47:41.321+0000 [08:47:41.321UTC] DEBUG [Materializer] - [server-plain-text ToNet ] Element: SendBytes ByteString(168 bytes)
48 54 54 50 2F 31 2E 31 20 34 30 30 20 42 61 64 | HTTP/1.1 400 Bad
20 52 65 71 75 65 73 74 0D 0A 44 61 74 65 3A 20 | Request…Date:
4D 6F 6E 2C 20 30 35 20 41 70 72 20 32 30 32 31 | Mon, 05 Apr 2021
20 30 38 3A 34 37 3A 34 31 20 47 4D 54 0D 0A 43 | 08:47:41 GMT…C
6F 6E 6E 65 63 74 69 6F 6E 3A 20 63 6C 6F 73 65 | onnection: close
0D 0A 43 6F 6E 74 65 6E 74 2D 54 79 70 65 3A 20 | …Content-Type:
74 65 78 74 2F 70 6C 61 69 6E 3B 20 63 68 61 72 | text/plain; char
73 65 74 3D 55 54 46 2D 38 0D 0A 43 6F 6E 74 65 | set=UTF-8…Conte
6E 74 2D 4C 65 6E 67 74 68 3A 20 32 33 0D 0A 0D | nt-Length: 23…
0A 55 6E 73 75 70 70 6F 72 74 65 64 20 48 54 54 | .Unsupported HTT
50 20 6D 65 74 68 6F 64 | P method

Yes, that’s a TLS ClientHello packet, i.e. probably the start of a HTTPS connection.

Yeah, it’s a bit weird how some requests manage to get through correctly but some not, maybe even depending on the original client used.

while still using Http.singleRequest, I got this working in the following way :slight_smile:

  1. create 2 different connection pool settings (and notice I added this socket option as to not reuse addresses)

val connectionPoolsettingsClientAPI = ConnectionPoolSettings(system).withMaxRetries(2).withConnectionSettings(ClientConnectionSettings(system).withSocketOptions(List(ReuseAddress(false))))
val connectionPoolSettingsReceiverAPI = ConnectionPoolSettings(system).withMaxRetries(4).withConnectionSettings(ClientConnectionSettings(system).withSocketOptions(List(ReuseAddress(false))))

I had to choose some setting to ensure I had 2 different configs,so I put it more retries for the POST operation than for the GET operation.

This appears to work without losses.

Without the ReuseAddress = false, I did see some intermittent socket issues in the logs.

Could you comment on this based on the inner workings of the Akka HTTP connection pool ?

  1. When the client code targets server A or server B, use a different connection pool, by adding the explicit setting parameter with each singleRequest invocation.

A few notes:
The test consists of running the GET-POST sequence N number of times and trying to post a message to server B, after first doing a GET to server A.

When I increase the number of iterations to 256 and I do not put in the ReuseAddress(false) option, only 133 succeeded and at some point, this ERROR message is logged.

[ERROR] [04/06/2021 14:52:51.384] [ClientAPISystem-akka.actor.default-dispatcher-11] [ClientAPISystem/Pool(shared->https://<SERVER_ADDRESS>t:443)] [13 (Unconnected)] Slot execution failed. That's probably a bug. Please file a bug at https://github.com/akka/akka-http/issues. Slot is restarted.
java.lang.IllegalStateException: cannot create children while terminating or terminated
	at akka.actor.dungeon.Children.makeChild(Children.scala:287)
	at akka.actor.dungeon.Children.attachChild(Children.scala:51)
	at akka.actor.dungeon.Children.attachChild$(Children.scala:50)
	at akka.actor.ActorCell.attachChild(ActorCell.scala:447)
	at akka.stream.impl.ExtendedActorMaterializer.actorOf(ActorMaterializerImpl.scala:68)
	at akka.stream.impl.TlsModulePhase.materializeAtomic(PhasedFusingActorMaterializer.scala:944)
	at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:492)
	at akka.stream.impl.SubFusingActorMaterializerImpl.materialize(ActorMaterializerImpl.scala:129)
	at akka.stream.impl.SubFusingActorMaterializerImpl.materialize(ActorMaterializerImpl.scala:117)
	at akka.stream.scaladsl.RunnableGraph.run(Flow.scala:629)
	at akka.http.impl.engine.client.pool.NewHostConnectionPool$HostConnectionPoolStage$$anon$1.akka$http$impl$engine$client$pool$NewHostConnectionPool$HostConnectionPoolStage$$anon$$openConnection(NewHostConnectionPool.scala:556)
	at akka.http.impl.engine.client.pool.NewHostConnectionPool$HostConnectionPoolStage$$anon$1$Slot.openConnection(NewHostConnectionPool.scala:392)
	at akka.http.impl.engine.client.pool.SlotState$UnconnectedState.onNewRequest(SlotState.scala:182)
	at akka.http.impl.engine.client.pool.SlotState$UnconnectedState.onNewRequest$(SlotState.scala:181)
	at akka.http.impl.engine.client.pool.SlotState$Unconnected$.onNewRequest(SlotState.scala:192)
	at akka.http.impl.engine.client.pool.NewHostConnectionPool$HostConnectionPoolStage$$anon$1$Event$.$anonfun$onNewRequest$1(NewHostConnectionPool.scala:154)
	at akka.http.impl.engine.client.pool.NewHostConnectionPool$HostConnectionPoolStage$$anon$1$Slot.runOneTransition$1(NewHostConnectionPool.scala:259)
	at akka.http.impl.engine.client.pool.NewHostConnectionPool$HostConnectionPoolStage$$anon$1$Slot.loop$1(NewHostConnectionPool.scala:342)
	at akka.http.impl.engine.client.pool.NewHostConnectionPool$HostConnectionPoolStage$$anon$1$Slot.updateState(NewHostConnectionPool.scala:351)
	at akka.http.impl.engine.client.pool.NewHostConnectionPool$HostConnectionPoolStage$$anon$1$Slot.onNewRequest(NewHostConnectionPool.scala:222)
	at akka.http.impl.engine.client.pool.NewHostConnectionPool$HostConnectionPoolStage$$anon$1.dispatchRequest(NewHostConnectionPool.scala:122)
	at akka.http.impl.engine.client.pool.NewHostConnectionPool$HostConnectionPoolStage$$anon$1.onPush(NewHostConnectionPool.scala:87)
	at akka.stream.impl.fusing.GraphInterpreter.processPush(GraphInterpreter.scala:523)
	at akka.stream.impl.fusing.GraphInterpreter.execute(GraphInterpreter.scala:409)
	at akka.stream.impl.fusing.GraphInterpreterShell.runBatch(ActorGraphInterpreter.scala:606)
	at akka.stream.impl.fusing.GraphInterpreterShell$AsyncInput.execute(ActorGraphInterpreter.scala:485)
	at akka.stream.impl.fusing.GraphInterpreterShell.processEvent(ActorGraphInterpreter.scala:581)
	at akka.stream.impl.fusing.ActorGraphInterpreter.akka$stream$impl$fusing$ActorGraphInterpreter$$processEvent(ActorGraphInterpreter.scala:749)
	at akka.stream.impl.fusing.ActorGraphInterpreter$$anonfun$receive$1.applyOrElse(ActorGraphInterpreter.scala:764)
	at akka.actor.Actor.aroundReceive(Actor.scala:539)
	at akka.actor.Actor.aroundReceive$(Actor.scala:537)
	at akka.stream.impl.fusing.ActorGraphInterpreter.aroundReceive(ActorGraphInterpreter.scala:671)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:614)
	at akka.actor.ActorCell.invoke(ActorCell.scala:583)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:268)
	at akka.dispatch.Mailbox.run(Mailbox.scala:229)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:241)
	at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

With the ReuseAddress(false) option, all 256 iterations complete successfully.

The error message indicates that the ActorSystem is terminated before the test is finished which is unlikely to be directly affected by the given setting (ReuseAddress which shouldn’t have any effect at all on the client side).

Hi,

Correct yes, I notice that I am calling coord shutdown in one of the exception handlers. So, indeed ; you can forget about that trace entry. It is induced by the shutdown.

Still, the observation remains : without that setting it does not work without losses, with that option it does work without losses.

Perhaps this is the more interesting type of log entry I see for the ones that fail:

[2021-04-06 15:45:54,846] [DEBUG] [akka.http.impl.engine.client.PoolGateway] [ClientAPISystem-akka.actor.default-dispatcher-4] [ClientAPISystem/Pool(shared->https://<SERVER_NAME>:443)] - [9 (Failed)] State change from [WaitingForResponseDispatch] to [Failed(akka.stream.StreamTcpException: Tcp command [Connect(<SERVER_NAME>:443,None,List(ReuseAddress(true), KeepAlive(true)),Some(10 seconds),true)] failed because of java.net.SocketException: Invalid argument: no further information)]. Closing the existing connection.

It is this kind of entry that led me to try with that option (ReuseAddress(false)), which did lead to no losses when running this sequence N times.

I think we can forget about my previous remarks about that socket option. Dead end and eventually did not seem to have a direct relation with the outcome.

Let’s perhaps summarize where I am with this now.

I am currently having this working provided one defines 2 separate connection pools, 1 for Server A, 1 for Server B requests. Without the 2 connection pools we are looking at the issue described in the opening statement of the discussion.

Currently, when I am not tweaking default pool settings, my assumption is that Akka HTTP client executes with a parallelism of 4 (max-connections) and a maximum allowed number of open requests of 32 (max-open-requests) and I am thinking that the max-open-requests applies across both connection pools.

I assume that for a component like this that is written to just talk to 2 endpoints, but may have growing traffic to handle, the way to support higher loads would be to increase the ‘max-open-requests’ value in the configuration.

I assume that Lightbend recommends to move to a HostConnectionPool setup or perhaps a Source queue instead of Http.singleRequest in order to provide more buffering options and take advantage of backpressure. Is there any additional advise you could give ?

PS : still awaiting the return of our F5 admin to get his view on the situation.

To clarify, requests are counted for each host pool separately.

Higher max-open-requests is needed if you temporarily need to submit more requests to the pool than it can instantly handle. A higher setting doesn’t really affect normal operation much if you don’t overload the pool/backend (averaged over some period). However, during overload bigger buffers can hurt. That can go up to the point that the pool doesn’t make any useful progress any more because all requests are delayed into timeout just by the buffering time. So, you need to do some tests to find the smallest value that still works.

No, currently the best way to use the pool is to use Http().singleRequest, all the other APIs are build on top of it and often rather add more idiosyncrasies. If you have the choice of using either cachedHostConnectionPool in a flow or mapAsync()(Http().singleRequest), use the latter.

Hello again,

Thanks for the suggestions and remarks.

We indeed stuck with Http().singleRequest and we just ensured to have a different pool for each target system (by supplying a different settings parameter to the call).

I am still waiting for our F5 administrator to look into what is really going wrong at the level of our reverse proxy. Will keep you posted on that, once we can get his attention.

1 Like