Play Framework streaming responses cause "too many open files" errors

streams

(Tarik) #1

In my project I use streaming: I get the response from my API as a Source[ByteString, _] via WSClient over HTTP, then send it to the client as a Result. When I look at the output of the Linux command lsof, I see a lot of TCP socket connections. Their count keeps increasing and eventually the system shuts down. I wrote my code according to the official Play Framework site. My Play Framework version is 2.5.10. Here is the code example from the Play site that causes this problem:

https://www.playframework.com/documentation/2.5.x/ScalaWS#Processing-large-responses

def downloadFile = Action.async {

  // Make the request
  ws.url(url).withMethod("GET").stream().map {
    case StreamedResponse(response, body) =>

      // Check that the response was successful
      if (response.status == 200) {

        // Get the content type
        val contentType = response.headers.get("Content-Type").flatMap(_.headOption)
          .getOrElse("application/octet-stream")

        // If there's a content length, send that, otherwise return the body chunked
        response.headers.get("Content-Length") match {
          case Some(Seq(length)) =>
            Ok.sendEntity(HttpEntity.Streamed(body, Some(length.toLong), Some(contentType)))
          case _ =>
            Ok.chunked(body).as(contentType)
        }
      } else {
        BadGateway
      }
  }
}

I didn’t modify this code; I used it exactly like this. I would appreciate your help. Thanks.


(Rich Dougherty) #2

Are the counts of open sockets something that seems reasonable to you or does it look like there’s a resource leak? Have you configured your OS limit for open files up to the limit you want to reach?


(Tarik) #3

I have enough resources. When I watch the CPU and memory usage graphs, I don’t see any anomaly. Increasing the OS limits is not a solution, because the open file count always increases. Let me tell you how I noticed this:

Database ----> API ------> Frontend -------> End user(client)

This is the architecture of a web site project. The API and the Frontend were both developed with Play Framework, and they are completely separate projects. I have some files such as JPEG, PDF, etc., which I keep in the database as BLOBs. There are some endpoint addresses that clients use to download them. When you request these addresses via a web browser or some other tool, the Frontend makes a request to the API via WSClient over HTTP (the code above). The API gets the file from the DB as an InputStream or byte array and passes it to the Frontend using the Ok.chunked() method. The Frontend gives you the file, and finally you download it.

I wrote some simple code in Scala and Play Framework. It is something like JMeter, which is used for load testing. Using this code I made requests to the file download addresses (all of them) repeatedly, and started to watch the OS open file count. I noticed that the open file count of the Frontend project always increases and never decreases, even after I stop making requests. When I restart the JVM (Frontend), the open file count returns to normal. For other types of addresses (such as web pages, JSON, etc.) there is no abnormal behaviour like this.

The code above causes this every time. I think some hung TCP socket connections remain in the system, because I see so many TCP socket connections in the lsof output on my operating system.

Thanks for your help.
Kind regards.


(Rich Dougherty) #4

Can you explain more about the load testing you’re doing? E.g. concurrency limits, keep-alive settings, how connections are opened/closed, etc. I believe open sockets count as open files. If you run netstat, can you check whether the socket state looks right? I believe sockets can stay open for a while depending on how they are closed.

To debug the behaviour, try making a single request and look at the socket state/open file count. Does the usage go up and then come back down after the request is made? This will tell us if there is a leak under normal conditions.

Finally you may want to capture heap dumps and inspect them in a tool like VisualVM or YourKit. That will verify things like number of sockets/streams/files open from the JVM’s point of view, which may be more informative than a simple OS file descriptor count.
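As a lighter-weight complement to a heap dump, the JVM can report its own open file descriptor count. The following is a minimal sketch, assuming a Unix-like JVM where the platform MXBean implements com.sun.management.UnixOperatingSystemMXBean; the object name FdCount and the method openFileDescriptors are hypothetical names chosen for illustration. On platforms without that bean (for example Windows) it returns None.

```scala
import java.lang.management.ManagementFactory
import com.sun.management.UnixOperatingSystemMXBean

// Hypothetical helper: reports how many file descriptors (files and sockets)
// this JVM process currently has open, as seen from inside the JVM.
object FdCount {
  def openFileDescriptors(): Option[Long] =
    ManagementFactory.getOperatingSystemMXBean match {
      // Only Unix-like JVMs expose the descriptor count through this bean.
      case unix: UnixOperatingSystemMXBean => Some(unix.getOpenFileDescriptorCount)
      case _                               => None
    }

  def main(args: Array[String]): Unit =
    openFileDescriptors() match {
      case Some(n) => println(s"open file descriptors: $n")
      case None    => println("descriptor count not available on this platform")
    }
}
```

Logging this value before and after a single download request would show whether the count returns to its starting point once the request has finished.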


(Tarik) #5

My load testing is simple. I make HTTP requests repeatedly (for example three times per second) to some file download addresses (for example www.something .com/downloadFile). My system settings are the defaults. I don’t have any special settings.

Debugging the way you describe is almost impossible for me. There are so many requests to the application that I can’t trace my single request. But I can definitely say that the code above (the downloadFile method example) causes the open file count to increase and never decrease. I also tried this scenario on another Play Framework project and the result is the same.

I use wilyweb to monitor my system stats. I don’t see anything abnormal.

I ran the netstat command, but it gives statistics only for our private network. When I look at the states, I mostly see ESTABLISHED. But the total number of rows in this command’s output is not high. The high count comes from the lsof command. Here is one row of the lsof output.

Here are the column titles:
COMMAND PID TID USER FD TYPE DEVICE SIZE/OFF NODE NAME

Here is the example row:
java 31198 20492 mw 702u sock 0,6 0t0 862298196 protocol: TCP

There are very many rows just like the one above.


(Rich Dougherty) #6

Can you try running a single request in your application and see if it causes a leak, i.e. if it allocates a socket but doesn’t close it? You don’t need to run lots of requests. I’m assuming you can run the application on your local machine, so hopefully that’s true.


(Tarik) #7

I did what you said on my local machine. Let me explain what I did, step by step.

I ran the project on port 9000.
I opened a cmd window.

I ran the command to see the open file count.
I made one request to the file download address.
I ran the command to see the open file count again, and repeated these steps again and again.

With every request I saw the open file count increase by 1 or 2. These files are TCP connections. I will add the screenshots.

2556 is the process ID. As you can see, after every single request the TCP connection count increases and never goes down. At the beginning this count was 70. I will also add the command output row by row so that you can see what it looks like.

I am adding another picture. I see that the TCP connections between the frontend and the API are increasing. tsahin:9001, which means localhost:9001, is my API’s address. I explained the architecture above.


(Tarik) #8

I also noticed something else. If I run this scenario using a web browser, I don’t get the same result. When I make requests from a web browser the system looks fine and the open file count doesn’t increase. The scenario above occurs only if I make the requests from program code. Here is the simple Scala code that makes the requests:

wsClient.url("www. something .com/downloadFile").withMethod("GET").stream()

I trigger this code to make the requests. It does not run inside my project; I run it in a different local project.


(Rich Dougherty) #9

Do you do anything with the Future[Response] returned by stream(). The connections may be being kept open waiting for you to read the response body.
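To make the point about reading the response body concrete, here is a minimal sketch of a load-test call that drains the streamed body instead of discarding the returned Future. It assumes Play 2.5's WSClient and an Akka Streams Materializer are available; the object name LoadTestRequest and the method fetchAndDiscard are hypothetical, and whether this removes the growing socket count in this particular setup is untested.

```scala
import akka.Done
import akka.stream.Materializer
import akka.stream.scaladsl.Sink
import play.api.libs.ws.WSClient
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical load-test helper, not part of the original project.
object LoadTestRequest {

  // Issues one streamed GET and reads the body to the end, discarding the bytes.
  // The returned Future completes once the body Source has completed.
  def fetchAndDiscard(ws: WSClient, url: String)(
      implicit mat: Materializer, ec: ExecutionContext): Future[Done] =
    ws.url(url).withMethod("GET").stream().flatMap { response =>
      // Without this step nothing ever pulls from the body Source.
      response.body.runWith(Sink.ignore)
    }
}
```

Comparing the open file count with and without the Sink.ignore step would show whether the unread body on the load-test side is what keeps the connections open.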


(Tarik) #10

Actually, I already described this above. I take the Future[Response] and pass it to the client. You can see that in my first post. I recommend reading all of my posts carefully from beginning to end. I think this is a bug in Play Framework, because I wrote my code according to the official Play Framework web site. I wonder if anyone else has faced this situation.

In summary, if I stream the response using program code (not a web browser), connections on the system hang. Would you be able to reproduce this scenario with your developer team and examine the problem in detail?


(Rich Dougherty) #11

It might be a bug in the framework or the docs, but it would be helpful to debug it fully first before spending developer time reproducing it. We’ve made progress since the first post: you can now reproduce the problem on a single connection rather than only as part of a load test, and you’ve identified that it works in a web browser.

I suspect that there might be an issue with the fact that the body Source is not being totally consumed, and thus the socket connection is staying open. You could debug this by consuming the Source directly, e.g. into a Sink.ignore and checking if the connection issue is fixed.

Also it would be useful to verify which branch of the if and match you go to in your code. If you exit via BadGateway then perhaps the Source is unconsumed, leading to a leak (and a bug in the Play docs).
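Both debugging suggestions can be combined in one variant of the documentation example. This is a sketch only, assuming Play 2.5.x with an injected WSClient and Materializer; the class name DownloadController, the url value, and the log messages are hypothetical. It logs which branch is taken and drains the body into Sink.ignore on the non-200 path, so that path cannot leave the Source unconsumed.

```scala
import javax.inject.Inject
import akka.stream.Materializer
import akka.stream.scaladsl.Sink
import play.api.Logger
import play.api.http.HttpEntity
import play.api.libs.concurrent.Execution.Implicits.defaultContext
import play.api.libs.ws.{StreamedResponse, WSClient}
import play.api.mvc.{Action, Controller}

// Hypothetical controller for debugging; the url value is a placeholder.
class DownloadController @Inject()(ws: WSClient)(implicit mat: Materializer)
    extends Controller {

  private val url = "http://localhost:9001/download" // assumed API address

  def downloadFile = Action.async {
    ws.url(url).withMethod("GET").stream().map {
      case StreamedResponse(response, body) =>
        if (response.status == 200) {
          val contentType = response.headers.get("Content-Type")
            .flatMap(_.headOption).getOrElse("application/octet-stream")
          response.headers.get("Content-Length") match {
            case Some(Seq(length)) =>
              Logger.debug("branch: sendEntity with Content-Length")
              Ok.sendEntity(HttpEntity.Streamed(body, Some(length.toLong), Some(contentType)))
            case _ =>
              Logger.debug("branch: chunked")
              Ok.chunked(body).as(contentType)
          }
        } else {
          Logger.debug(s"branch: BadGateway, upstream status ${response.status}")
          // Drain the unused body so it does not stay unconsumed.
          body.runWith(Sink.ignore)
          BadGateway
        }
    }
  }
}
```

If the socket count still grows while only the chunked branch is logged, the unconsumed-body explanation for the BadGateway path can be ruled out.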


(Greg Methvin) #12

Are you using the WSClient provided by Play through dependency injection, and if not how is the WSClient being created?

If you’d like others to help debug, it would be very useful to provide a small sample project reproducing the problem.


(Tarik) #13

Yes, I use it through dependency injection. I will prepare a sample project.

It goes to this branch:


(Tarik) #14

If I consume the Source via Sink.foreach, write it into a byte array, and then give this byte array as the response, the issue is fixed. Here is the code:

def getByteArray(url: String)(implicit lang: Lang): Future[(Array[Byte], Map[String, Seq[String]])] = {
  wSClient.url(url).withMethod("GET").stream().flatMap { res =>
    val response = (res.headers, res.body)
    if (response._1.status == 200) {
      val out = new ByteArrayOutputStream()
      val sink = Sink.foreach[ByteString] { b => out.write(b.toArray) }
      response._2.runWith(sink).andThen {
        case result =>
          out.close()
          result.get
      }.map(a => (out.toByteArray, response._1.headers))
    } else throw new Exception("status error while streaming:" + url + " status : " + response._1.status)
  }
}

Then I take the byte array and send the response to the client as a stream using this code (I left out irrelevant parts like logging):

Ok.chunked(Source.fromPublisher(Streams.enumeratorToPublisher(Enumerator(byteArray).andThen(Enumerator.eof)))).withHeaders(headers.toList:_*)

BUT this is not what I want, because consuming the result, converting it into a byte array, and only then sending the response is not a good approach. While these steps are running, clients wait idle; the download does not start. These steps also put strain on the system in terms of CPU and memory usage. It is not usable for downloading big files.

The approach in my first post is good for system resources and for the client experience, but it causes the problem I mentioned before (open TCP connections). I will try some other methods such as Sink.ignore. I still think there is a problem with the code on the official Play Framework site.

If so, how can the framework handle this situation? Is there any configuration or some other way you can suggest? For example, a configuration for the maximum lifetime of a TCP connection.


(Tarik) #15

I prepared an example.

Here is my API example code. It runs on port 9001.

download address for API

controller class for API

Here is my frontend project example code. It runs on port 9000.

download address for frontend project

controller class for frontend project

Here is my load test code. I trigger this code repeatedly.
wsClient.url("http://localhost:9000/downloadTestFile2").withMethod("GET").stream()

These samples produce the results I mentioned above in the frontend project, which runs at localhost:9000.


(Rich Dougherty) #16

Thanks for debugging with Sink.foreach. I agree this is not how you would want to write code, but it is useful for debugging the issue.

In your original code, can you confirm whether the Ok.sendEntity(HttpEntity.Streamed(body, ...)) or Ok.chunked(body) code is running? Whichever one of these is running is where we should look for bugs.

Can you also confirm that you’re fully receiving the file that you’re streaming? This will tell us whether the Source is even being fully read at all, which may tell us whether there’s a bug in Play WS or Play’s result code.


(Tarik) #17

Ok.chunked(body).as(contentType) ----> This is the part of the code that runs.

Actually, I don’t care whether I receive the file or not. I only make the requests and do nothing with the responses. I will be examining this issue today and will keep you informed.


(Rich Dougherty) #18

I only make the requests and do nothing with the responses. I will be examining this issue today and will keep you informed.

Thanks. If you receive the whole file via chunked but the original Source isn’t closed then that suggests that at least all the data is being read from the Source. That means that we may have a bug in our Source which doesn’t close the underlying socket even when fully read.