Framework Performance Process

Following up from the conversation in playframework#8374:

Some references can’t be linked directly in this post since, as a new member, I can only post two links, but I have some observations that I’d like feedback on:

  • Some parts of Play have had, and probably still have, performance-impacting “bugs” that can easily go unnoticed (e.g. playframework#8374, playframework#8335)

  • Other heavily used components offer some pretty easy, low-hanging optimization fruit (e.g. playframework#8375, play-ws#251)

In the case of bugs, it would be nice if there were an easier way to spot these earlier. In the case of optimizations that can be easily implemented, it would be nice if someone spent a little time looking for them and making those optimizations.

In all of the tickets listed above I was able to identify the problematic parts of the code because I have access to a staging service running a profiling agent that can produce flame graphs. That service routinely has long stretches where the only incoming traffic is “health check” traffic, which means that for certain periods of time (weekends) I can produce flame graphs from a Play application where the only “work” being done and measured comes from framework overhead rather than the application.
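For anyone who wants to try something similar locally, the core of that setup is just attaching a profiling agent to the JVM running the Play application. A minimal sketch in sbt, assuming async-profiler is installed under /opt/async-profiler (the path and agent options here are illustrative, not our actual configuration):

    // build.sbt sketch: fork the application and attach the async-profiler
    // agent so that a CPU flame graph is written when the JVM exits.
    // The agent path and options are assumptions for illustration only.
    run / fork := true
    run / javaOptions += "-agentpath:/opt/async-profiler/lib/libasyncProfiler.so=start,event=cpu,file=flamegraph.html"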

Beyond the tickets linked above, we’ve also fixed more than a couple of issues that we created ourselves in our own middleware, but the observation stands: having flame graphs for “framework overhead” is super useful for noticing unexpected work patterns that point to straightforward resolution paths.

My question to this community is the following: how do we go about setting up a process within the Play development cycle so that performance “bugs” and improvements can be noticed earlier, without depending on the broader community to carefully monitor the performance of their own applications and submit patches upstream (although people should definitely be doing that on their own too)? For example:

  • Is it possible that Prune can gather profiling data in addition to benchmark data and surface that somewhere?
  • Does it make sense to set performance goals for Play itself, so that some expected baseline latency and throughput targets can be used to make harder, more subjective trade-off decisions (e.g. does it make sense to switch to akka-http as the default backend)?
  • More broadly, should a group of engineers (either at Lightbend, via community efforts, or both) be formed to look into areas for improvement? How would such a group operate?

Looking forward to hearing the thoughts and feedback from others about how we can make Play better in this regard.

4 Likes

Hi James

So Play has been running without decent performance regression testing for a while now. In the end Prune proved too hard to keep running. When I wrote Prune I got quite fancy and made it support backtesting, etc., but this ended up making Prune very fragile because it needed deep support for building Play for every revision. A small change to the Play build system or project structure and Prune would fall over. Prune needed to support all versions simultaneously, so it got very cumbersome. It also controlled the whole build, so it was complicated to debug build failures.

Last year I decided that maintaining Prune was too hard, so I wrote some JMH tests to use instead. These can be run easily by anyone who can use sbt. The JMH tests have actually been running every night since October 30 last year, and I’ve been logging results as JSON in a directory. I just haven’t had a chance to look at the results yet!

There are only a few JMH tests currently. I wrote a few microbenchmark tests while debugging performance issues, e.g. these tests. If we’d had a microbenchmark for rendering responses then we would have caught the AsciiBitSet issue that you raised recently. The other microbenchmark is a “realistic” test that uses actual sockets to send and receive a request.
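For illustration, a response-rendering microbenchmark could look roughly like the sketch below. The class name and the exact calls are assumptions made for this post, not the actual tests in the Play repository, and a real test would also need to exercise the code path that serializes headers onto the wire (which is where AsciiBitSet is used).

    import java.util.concurrent.TimeUnit

    import org.openjdk.jmh.annotations._
    import play.api.mvc.{Result, Results}

    // Illustrative only: measures building a Result with a text body and an
    // extra header. Returning the Result keeps JMH from treating the work
    // as dead code.
    @State(Scope.Benchmark)
    @BenchmarkMode(Array(Mode.AverageTime))
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    class ResponseRenderingBench {

      private val body = "x" * 1024

      @Benchmark
      def okWithHeader(): Result =
        Results.Ok(body).withHeaders("X-Example" -> "value")
    }

With sbt-jmh a test like this can be run locally (e.g. via the Jmh / run task), and JMH’s -rf json option can write the results to a file for the kind of nightly logging mentioned above.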

I think these JMH tests would be a good basis for tracking Play’s performance on an ongoing basis. Developers can easily write new tests and run them locally. What’s missing is regular automated testing, regular reporting on those tests, and maybe collecting some extra profiling information, as you suggest. It would be good to add more tests like these for things like JSON handling, JDBC, Play WS, etc.
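As a sketch of what a JSON-handling test might look like (again purely illustrative, just using play-json’s public API):

    import java.util.concurrent.TimeUnit

    import org.openjdk.jmh.annotations._
    import play.api.libs.json.{JsValue, Json}

    // Illustrative only: round-trips a small document through play-json.
    @State(Scope.Benchmark)
    @BenchmarkMode(Array(Mode.Throughput))
    @OutputTimeUnit(TimeUnit.SECONDS)
    class JsonBench {

      private val raw = """{"name":"example","tags":["a","b","c"],"count":42}"""
      private val parsed: JsValue = Json.parse(raw)

      @Benchmark
      def parse(): JsValue = Json.parse(raw)

      @Benchmark
      def stringify(): String = Json.stringify(parsed)
    }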

If you’re interested in working together on Play performance, I’d certainly be interested in collaborating. I’m sure there would be others interested too. I’m open to any ideas for how to structure this collaboration.

– Rich

5 Likes

Well, I think it would be good if it were possible to share some kind of setup for something like that.

I’m also pretty sure that Play has a lot of cases where performance could be improved. But it’s not always as easy as just looking at some flame graphs.

1 Like

We’ve been capturing JMH results since late last year. I wrote a script to generate a CSV from the results and uploaded it here: https://docs.google.com/spreadsheets/d/1cSkhlCe_xaJxfrqw2WGIrTmGrTqYTN-PpeWNunqThuM. I made some comments about commits that seemed to speed things up or slow them down.
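In case it’s useful to anyone doing the same thing, the general shape of that kind of script is roughly the sketch below (simplified, and not the actual script; the directory layout and output columns are assumptions):

    import java.io.File
    import java.nio.file.{Files, Paths}

    import play.api.libs.json.{JsValue, Json}

    // Simplified sketch: read JMH JSON result files from a directory and
    // emit one CSV row per benchmark per file.
    object JmhResultsToCsv extends App {
      // Assumed location of the nightly *.json result files.
      val resultsDir = new File("jmh-results")

      val rows = for {
        file   <- resultsDir.listFiles().toSeq.filter(_.getName.endsWith(".json")).sortBy(_.getName)
        result <- Json.parse(Files.readAllBytes(file.toPath)).as[Seq[JsValue]]
      } yield {
        val name  = (result \ "benchmark").as[String]
        val score = (result \ "primaryMetric" \ "score").as[Double]
        val unit  = (result \ "primaryMetric" \ "scoreUnit").as[String]
        s"${file.getName},$name,$score,$unit"
      }

      Files.write(
        Paths.get("jmh-results.csv"),
        ("file,benchmark,score,unit" +: rows).mkString("\n").getBytes("UTF-8")
      )
    }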

1 Like