Heartbeat interval is growing too large when sending large payload between nodes

weicheng113 · October 17, 2019, 8:10am

Hi,

My Akka cluster processes messages from a queue. For each message I run a calculation that produces a large Update payload, which is sent to its corresponding persistent actor (which may be on a different machine). Message processing runs inside an Akka stream, which limits how many messages are processed in parallel. The Update payload can be quite large, up to 50 MB. I observed high CPU on akka.remote.default-remote-dispatcher, and “Heartbeat interval is growing too large” messages started to appear. I was reading https://petabridge.com/blog/large-messages-and-sockets-in-akkadotnet/ on this topic. Since heartbeat messages are also sent through akka.remote.default-remote-dispatcher, I think the big Update payload is the problem. I still need an Akka stream to control and limit parallel message processing. Do you have a design recommendation for such a system? Thanks.

====Update
I am currently trying out Artery with artery.advanced.idle-cpu-level = 1 (it seems this option no longer exists, and only artery.advanced.aeron.idle-cpu-level = 1 does?). But I am still receiving “heartbeat interval is growing too large for address” with this configuration:

akka.remote {
  artery {
    enabled = on
    transport = tcp
    large-message-destinations = [
      "/user/myActor1"
    ]
    advanced {
      aeron.idle-cpu-level = 1
      idle-cpu-level = 1
      maximum-large-frame-size = 50 MiB
      large-buffer-pool-size = 5
    }
  }
}

Cheng

patriknw · October 19, 2019, 1:54pm

When using transport = tcp, the Aeron settings, such as idle-cpu-level, are not used.

I think the tcp transport is better for large payloads.

50 MB is very large for a single message, even when using the large-message channel. Something other than the transfer itself could be disturbing the heartbeats when messages are that large. It is difficult to guess without investigation.

We plan to support bulk transfers with StreamRefs, but that is not ready yet.

The typical recommendation is to send such large payloads over a side channel, e.g. HTTP, instead of as actor messages.
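To make the side-channel idea concrete, here is a minimal sketch using only the JDK (the built-in com.sun.net.httpserver.HttpServer and the Java 11+ java.net.http.HttpClient), not any Akka API. The names SideChannelSketch, PayloadStore, PayloadRef, publish and fetch are hypothetical, the store is an in-memory map, and everything runs on loopback purely for illustration. The assumption is that the producing node keeps the large payload and serves it over HTTP, while the actor message carries only a small reference that the receiver resolves itself.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Arrays;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch: keep the large payload out of the actor message and pass a small reference instead. */
class SideChannelSketch {

    /** Small message that would travel over remoting in place of the payload (hypothetical type). */
    static final class PayloadRef {
        final URI location;
        final int sizeBytes;

        PayloadRef(URI location, int sizeBytes) {
            this.location = location;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Producer side: holds payloads in memory and serves them over HTTP. */
    static final class PayloadStore implements AutoCloseable {
        private static final String PREFIX = "/payloads/";
        private final Map<String, byte[]> payloads = new ConcurrentHashMap<>();
        private final HttpServer server;

        PayloadStore() {
            try {
                // Port 0 lets the OS pick a free port; loopback is an assumption for the demo.
                server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            server.createContext(PREFIX, exchange -> {
                String id = exchange.getRequestURI().getPath().substring(PREFIX.length());
                byte[] body = payloads.get(id);
                if (body == null) {
                    exchange.sendResponseHeaders(404, -1); // -1 means no response body
                } else {
                    exchange.sendResponseHeaders(200, body.length);
                    try (OutputStream out = exchange.getResponseBody()) {
                        out.write(body);
                    }
                }
                exchange.close();
            });
            server.start();
        }

        /** Stores the payload and returns the small reference to send as the actor message. */
        PayloadRef publish(byte[] payload) {
            String id = UUID.randomUUID().toString();
            payloads.put(id, payload);
            int port = server.getAddress().getPort();
            return new PayloadRef(URI.create("http://127.0.0.1:" + port + PREFIX + id), payload.length);
        }

        @Override
        public void close() {
            server.stop(0);
        }
    }

    /** Consumer side: resolves the reference by downloading the payload over HTTP. */
    static byte[] fetch(PayloadRef ref) {
        HttpRequest request = HttpRequest.newBuilder(ref.location).GET().build();
        try {
            HttpResponse<byte[]> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofByteArray());
            if (response.statusCode() != 200) {
                throw new IllegalStateException("unexpected status " + response.statusCode());
            }
            return response.body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = new byte[5 * 1024 * 1024]; // stand-in for the large Update payload
        Arrays.fill(payload, (byte) 7);
        try (PayloadStore store = new PayloadStore()) {
            PayloadRef ref = store.publish(payload);
            byte[] received = fetch(ref);
            System.out.println("fetched " + received.length + " bytes, intact="
                + Arrays.equals(payload, received));
        }
    }
}
```

In a real cluster the reference would need a host name reachable from the other nodes, and the store would need eviction once the receiver has fetched the payload; both are left out of this sketch.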

weicheng113 · October 20, 2019, 3:09am

Thanks for the reply, @patriknw. If the payload goes through a side channel such as HTTP, I would need to set up an extra HTTP server on each node to act as a forwarding proxy, which is not a clean solution (the message itself already carries the destination ActorRef). I was thinking about sending a TriggeringCalculation message to the persistent actor and letting it spread the calculation locally on its node, which would avoid the network transfer. But this has two disadvantages: it is hard to control the number of parallel calculations (which I can do with a stream), and it ties the calculation logic to the persistent actor (which is not ideal).

Thanks, Cheng
