How to configure a separate thread pool (execution context) and use it

Hi guys,

We have a Lagom service that handles API calls that run a CPU-intensive simulation and return the result to the caller. This service sometimes emits “heartbeat interval growing” warning messages in the log.

I am wondering whether there is a way in Akka and Lagom to dedicate a separate thread pool / execution context to these API calls.
I looked at this documentation, but it requires us to instantiate the Actor directly, so I am a bit reluctant to use this approach yet.

Also, my understanding of the Akka gossip protocol is that Akka always reserves a fixed number of threads for heartbeat exchange to prevent this growing heartbeat interval issue. So I am not sure why this warning is happening in the first place.

Thanks,

Joo

These are the WARN log messages we get when the simulation endpoint is called by multiple callers at the same time.

WARN a.r.PhiAccrualFailureDetector - heartbeat interval is growing too large for address akka.tcp://application@xcvlskdfjalskdfjlasjdfaskjf: 4469 millis

How is the code that performs this calculation invoked? Is it part of a persistent entity command processing, or executed outside an entity?

Lagom doesn’t currently provide a way to run entities on a different dispatcher, but in general we would recommend avoiding very expensive computations within the command handler if possible. Does the computation both depend on the current state of the entity and emit new events to change the state? If you don’t need the current state, then you can move the computation to before the command is sent. If you do need the current state but don’t need to change it, then you can use a read-only command to get the state and then perform the simulation outside the entity. Another possibility might be to move the simulation to a read-side processor. If none of these options are suitable, another option is to customize the cluster dispatcher so that it isn’t blocked by computations running on the default dispatcher.

Being able to customize the dispatcher that Lagom persistent entities use would be a nice enhancement, but it looks like it would require a change to the framework.

Also, my understanding of the Akka gossip protocol is that Akka always reserves a fixed number of threads for heartbeat exchange to prevent this growing heartbeat interval issue. So I am not sure why this warning is happening in the first place.

This will become true in Akka 2.6, but isn’t in current stable versions. Right now, the default dispatcher is used for the heartbeats unless you configure it otherwise as described in the link above.
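
For reference, a dedicated cluster dispatcher on Akka 2.5 can be sketched in application.conf roughly as follows. The `akka.cluster.use-dispatcher` setting is Akka's own; the dispatcher name `cluster-dispatcher` and the parallelism values are illustrative assumptions and should be tuned for the actual deployment.

```
# Route cluster internals (heartbeats, gossip) to their own dispatcher
akka.cluster.use-dispatcher = cluster-dispatcher

# Small dedicated pool; the name and sizes below are assumptions, not recommendations
cluster-dispatcher {
  type = "Dispatcher"
  executor = "fork-join-executor"
  fork-join-executor {
    parallelism-min = 2
    parallelism-max = 4
  }
}
```

With this in place, heartbeats no longer compete for threads with work running on the default dispatcher.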

Even with a dedicated cluster dispatcher, blocking the default dispatcher is undesirable. It won’t cause the same cluster heartbeat stability issues, but could result in unpredictable performance in other areas of your application.
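
One way to keep the default dispatcher free is to run the simulation on its own bounded thread pool. The following is a minimal sketch using only the Java standard library, not Lagom's or Akka's API: the class `SimulationOffload`, the method names, the pool size, and the stand-in `runSimulation` body are all hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SimulationOffload {

    // Hypothetical dedicated pool for CPU-bound work. Leaving one core free
    // for everything else is an assumption, not a tuned value.
    private static final ExecutorService SIMULATION_POOL =
        Executors.newFixedThreadPool(
            Math.max(1, Runtime.getRuntime().availableProcessors() - 1),
            runnable -> {
                Thread thread = new Thread(runnable, "simulation-pool");
                thread.setDaemon(true); // do not keep the JVM alive on shutdown
                return thread;
            });

    // Stand-in for the real simulation: sums the squares of 1..steps.
    static long runSimulation(int steps) {
        long total = 0;
        for (int i = 1; i <= steps; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Runs the simulation on the dedicated pool and returns immediately
    // with a stage that completes when the computation finishes.
    static CompletionStage<Long> simulateAsync(int steps) {
        return CompletableFuture.supplyAsync(() -> runSimulation(steps), SIMULATION_POOL);
    }
}
```

In Lagom's Java API a service call returns a `CompletionStage`, so a stage like the one from `simulateAsync` could in principle be returned from the call handler. Because the pool is fixed in size, concurrent requests queue up behind it rather than taking threads from other dispatchers.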

Hi Tim,

Thanks for your reply. Out of all the options, it seems like

another option is to customize the cluster dispatcher so that it isn’t blocked by computations running on the default dispatcher

would be the most viable?

Just to confirm I got this right: if we change the cluster dispatcher, that would effectively isolate Akka cluster activity from everything else (command handlers, read-side processors, etc.)?

Yep, you got it. As I mentioned, though, you might still see other problems by running blocking or CPU-intensive tasks on the default dispatcher, so it might not be enough to solve everything.

Great! That’s good enough as a quick fix for cluster stability while we explore the other methods.

Thanks again!