We would like some advice on tracing akka-http.
We believe we have an issue with connections being closed at random, presumably by the akka-http server. It has existed for more than a year, but it only started to actually affect us a couple of months ago.
What is happening:
We have an Amazon Classic Load Balancer that forwards HTTP requests to our proxy server, which is built on top of akka-http. Sometimes (roughly once per ~800K requests, which works out to ~100-150 occurrences per day) the ELB returns a 504 GATEWAY_TIMEOUT error with almost zero backend processing time and minimal information (as if the connection had been closed prematurely). We don’t see any requests in our proxy logs at the moment of such errors; it looks like our async request handler is not called when this happens.
We couldn’t find any correlation between a request’s URL, verb, or payload size and its “cancellation”. Most likely the issue is caused by a “previous” request, not the one that got the 504, and we have no idea how to track that.
We upgraded akka-http several times, from 10.0.3 on Scala 2.11 up to the latest 10.1.6 on 2.12; the problem persisted across all these changes.
Our current guesses are:
- The backend server closes the connection to the ELB prematurely by sending a RST.
- The backend sends a response without a “Connection: close” header but closes the connection anyway, causing the ELB to return a 504 for some subsequent request.
- The backend closes the connection without sending a FIN or RST packet (?)
We can collect any logs and tcpdumps on the client side and on the host machine. What info, logs, or metrics can we gather from akka-http that might help us track down the issue? Any suggestions?
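For reference, below is a sketch of the akka settings we understand to be related to connection lifetime and low-level logging, and which we could enable for a capture window. The values are illustrative assumptions, not our current production config, and we are assuming `log-unencrypted-network-bytes` is available in the akka-http version we run.

```
akka {
  # Actor-system log level; the debug-level output from the settings below
  # will not appear unless this is DEBUG.
  loglevel = "DEBUG"

  # Fine-grained logging of the akka-io TCP implementation (off by default).
  io.tcp.trace-logging = on

  http.server {
    # The server closes a connection that has been idle for this long.
    # 60 s is the library default; shown here only to make it explicit.
    idle-timeout = 60 s

    # Hexdump logging of HTTP traffic, limited to N bytes per chunk of data
    # (off by default). May expose sensitive payloads, so only for short captures.
    log-unencrypted-network-bytes = 100
  }
}
```

We list `idle-timeout` only because it is one of the ways the server closes a connection on its own; we have not confirmed that it is involved. If other settings or hooks would be more useful than these, please point us to them.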