How to use local in-memory transport to send messages from actor A to actor B?

0x3E6 · June 30, 2023, 2:23am

I have two actors, objManagerActor and objActor. I send messages to objManagerActor, which then dispatches them to objActor. I am measuring TPS in objActor and the result is about 2.4 million (the default-dispatcher thread count is 2). I want to improve the TPS.

It seems that Akka transports messages over TCP. How can I improve the throughput?

johanandren · June 30, 2023, 7:02am

Hi 0x3E6

There is no networking involved in local actor messaging. For a cluster, TCP or UDP can be used, but Akka does not use RMI at all. What you see is likely caused by connecting a debugger or profiler to the JVM.

To get a baseline of what you could potentially reach on the hardware you are running your app on, you can run this JMH benchmark with sbt. First check out the Akka repo, then update the number of cores in the test to match your machine, and run jmh from the sbt shell with akka-bench-jmh/jmh:run -f 1 -w5 -i5 .*ActorBenchmark.*. Reading through the benchmark actors can also give you a hint about what kind of message flow you would need to reach such numbers, which you can compare with your own flow for insights:

https://github.com/akka/akka/blob/main/akka-bench-jmh/src/main/scala/akka/actor/ActorBenchmark.scala

/*
 * Copyright (C) 2014-2023 Lightbend Inc. <https://www.lightbend.com>
 */

package akka.actor

import java.util.concurrent.TimeUnit

import scala.concurrent.Await
import scala.concurrent.duration._

import BenchmarkActors._
import com.typesafe.config.ConfigFactory
import org.openjdk.jmh.annotations._

object ActorBenchmark {
  // Constants because they are used in annotations
  final val threads = 8 // update according to cpu
  final val numMessagesPerActorPair = 1000000 // messages per actor pair

This file has been truncated.

For the record, on my current Apple M1 Max, with JDK 17, the bench does ~250 million messages per second. A real application that actually does something of business value, rather than just ping-ponging messages, is unlikely to reach such a high number though.

0x3E6 · July 1, 2023, 9:19am

Ok, thank you! I’ll try the benchmark. It would be great if the documentation included suggestions on the number of actors and on CPU and memory configuration.

johanandren · July 3, 2023, 6:56am

Since Akka is a very general-purpose library that can be used to implement any number of very different use cases, it is hard to give any general recommendation other than to think carefully about the message flows, which actors may be bottlenecks, and how heavy each type of actor will be.

I’d argue that it is relatively seldom that Akka actors or the message passing are the problematic overhead when implementing systems that solve actual business problems. In cases where they are, the granularity may be too fine: each actor does so little actual work, and/or is so short-lived, that the time spent starting, stopping and passing messages dominates what the CPU cycles are spent on.

0x3E6 · July 4, 2023, 1:48am

It sounds like you are saying that ActorRef.tell takes a lot of time because the granularity may be too fine. I created ten thousand actors on my 4-core computer.
When the number of actors far exceeds the number of machine cores, will Akka performance decrease quickly?

davidogren · July 4, 2023, 2:35am

No. There is effectively no correlation between the number of actors and performance. I’ve run lots of tests with millions of actors and 4 cores.

johanandren · July 4, 2023, 7:27am

However, as I said, if each actor does very little work in response to every message, the logic to enqueue messages in a mailbox in a thread-safe fashion may very well dominate what the CPU cycles are used for. If you ran a profiler against the message sending benchmark I mentioned earlier, you would likely see that most of the time is spent in tell, since all the actors do is ping-pong messages.

Note that profiling and benchmarking are hard, and it is easy to draw the wrong conclusions, so make sure to verify your findings carefully. For example, change your logic and re-run your benchmark to check that the result changes in the way you expected. It is also important to warm up the JVM before running the actual benchmark so that you get numbers for a system that has been running for a while (with the hot paths JIT-compiled).
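As a rough illustration of the warmup point, here is a minimal sketch in plain Scala (no Akka) that discards a few warmup rounds before taking measurements. The object name, the `opsPerSecond` helper, the round counts and the placeholder `work` function are all assumptions made for this example. A harness like JMH handles warmup, forking and dead-code elimination far more carefully, so treat this only as a picture of the idea.

```scala
// Minimal throughput measurement with a warmup phase (plain Scala, no Akka).
// All names here are hypothetical; `work` stands in for the code under test.
object WarmupSketch {
  /** Runs `work` exactly `n` times and returns operations per second. */
  def opsPerSecond(n: Int)(work: Int => Unit): Double = {
    val start = System.nanoTime()
    var i = 0
    while (i < n) { work(i); i += 1 }
    // Guard against a zero elapsed time on very short runs.
    val elapsedNanos = math.max(1L, System.nanoTime() - start)
    n.toDouble * 1e9 / elapsedNanos
  }

  def main(args: Array[String]): Unit = {
    var sink = 0L // printed at the end so the work is not trivially removable
    val work: Int => Unit = i => sink += i % 7

    // Warmup: results are discarded, this only gives the JIT time to compile the hot path.
    (1 to 5).foreach(_ => opsPerSecond(1000000)(work))

    // Measurement: several rounds, so that run-to-run variance is visible.
    val rounds = (1 to 5).map(_ => opsPerSecond(1000000)(work))
    rounds.foreach(r => println(f"$r%.0f ops/s"))
    println(s"sink = $sink")
  }
}
```

If the measured rounds differ a lot from each other, or change when the warmup count is raised, the numbers are probably not yet trustworthy.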

0x3E6 · July 4, 2023, 8:59am

Ok. I use Akka as a stream processing framework because it is more convenient for programming custom complex logic than stream processing engines such as Flink. In each actor, some incremental aggregation operations are performed on every message, and the results may be output.

0x3E6 · July 4, 2023, 9:01am

In this scenario, can the throughput be improved only by optimizing the mailbox?

johanandren · July 4, 2023, 12:11pm

Fusing together multiple aggregation operations into one actor, so that it does more work per message, would be one thing to consider. This is how Akka Streams works behind the scenes: operators are fused together into a single actor, which avoids passing messages between them unless strictly needed.
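A minimal sketch of what manual fusing could look like, assuming Akka Typed (the akka-actor-typed module) is on the classpath. The `FusedAggregator` object, its message types and the choice of count/sum/max as aggregates are hypothetical and only serve as illustration. Instead of one actor per aggregate, each receiving its own copy of every message, a single actor updates all aggregates for each incoming message:

```scala
import akka.actor.typed.{ActorRef, Behavior}
import akka.actor.typed.scaladsl.Behaviors

// Hypothetical example: three aggregations fused into one actor.
object FusedAggregator {
  sealed trait Command
  final case class Sample(value: Double) extends Command
  final case class GetStats(replyTo: ActorRef[Stats]) extends Command

  // All aggregates live in one immutable state value.
  final case class Stats(count: Long, sum: Double, max: Double) {
    def add(v: Double): Stats = Stats(count + 1, sum + v, math.max(max, v))
  }
  val empty: Stats = Stats(0L, 0.0, Double.NegativeInfinity)

  def apply(): Behavior[Command] = running(empty)

  private def running(stats: Stats): Behavior[Command] =
    Behaviors.receiveMessage {
      case Sample(v) =>
        // One message send per sample, one state update covering every aggregate.
        running(stats.add(v))
      case GetStats(replyTo) =>
        replyTo ! stats
        Behaviors.same
    }
}
```

With this shape, each sample costs one mailbox enqueue rather than one per aggregation step. The trade-off is that the fused steps no longer run in parallel with each other.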

davidogren · July 4, 2023, 6:35pm

“improved only by optimizing the mailbox?”

I’m not sure what you would even mean by this. Effectively, (excluding details such as the interfaces) the default mailbox is a queue. There are fancier options for mailboxes, such as bounded mailboxes and prioritized mailboxes, but the default mailbox is essentially a simple queue and as “optimized” as conceivable.

I’m in agreement with Johan: I think you really need to start by evaluating your profiling/benchmarking. I’m definitely concerned that you are spending more of your CPU on your profiling tools than on executing work.

What are the results of the ActorBenchmark on your hardware if you just run it without changes? (You give no indication of the size of your hardware.) That should give you an idea of the upper limit of your TPS in terms of actor messages on that hardware. That should be a pretty absurdly high number, but if it’s still not high enough your option would be to fuse work together so that it requires fewer messages. As Johan mentions, Akka Streams will do some of that for you automatically, but you could also do it manually in your application logic.

Roiocam · October 11, 2023, 3:44am

An actor only runs on one thread at a time. In your case, you may want to increase the number of actors to improve concurrency (depending on how many CPU cores you have).

The most important thing is to understand the bottleneck. What is the throughput of objManagerActor and of objActor, respectively? In your case, with only two actors, this is equivalent to a producer-consumer pattern with two threads. Akka has no performance magic for this case.
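One way to increase the number of actors could look like the sketch below, which assumes Akka Typed (akka-actor-typed) on the classpath. `ObjManager`, `ObjWorker`, the `Update` message and the per-id running sum are hypothetical stand-ins for the objManagerActor/objActor pair from the question. The manager spawns a fixed number of workers and picks one by hashing the object id, so messages for the same id always reach the same worker:

```scala
import akka.actor.typed.{ActorRef, Behavior}
import akka.actor.typed.scaladsl.Behaviors

// Hypothetical worker: keeps a running sum per object id.
object ObjWorker {
  final case class Update(objId: String, value: Double)

  def apply(): Behavior[Update] = running(Map.empty)

  private def running(sums: Map[String, Double]): Behavior[Update] =
    Behaviors.receiveMessage { msg =>
      running(sums.updated(msg.objId, sums.getOrElse(msg.objId, 0.0) + msg.value))
    }
}

// Hypothetical manager: forwards each update to one of `workers` children.
object ObjManager {
  // The same id always maps to the same index in [0, workers).
  def workerIndex(objId: String, workers: Int): Int =
    math.floorMod(objId.hashCode, workers)

  def apply(workers: Int): Behavior[ObjWorker.Update] =
    Behaviors.setup { context =>
      val refs: Vector[ActorRef[ObjWorker.Update]] =
        Vector.tabulate(workers)(i => context.spawn(ObjWorker(), s"obj-worker-$i"))
      Behaviors.receiveMessage { msg =>
        refs(workerIndex(msg.objId, workers)) ! msg
        Behaviors.same
      }
    }
}
```

Note that the manager is still a single actor, so if forwarding is the bottleneck, adding workers behind it will not help. That is why measuring the throughput of each actor separately comes first.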
