Garbage collection and actors

Hello! First post. I have created a pattern where I spawn an actor to aggregate results from other actors. Once its task is completed, it sends the results back to the requesting actor and terminates. The system handles a large volume of these requests, which results in a large number of these actors being created and destroyed. I am concerned about the garbage collection of so many short-lived actors and the impact on performance from stop-the-world collections. Questions: 1. Is there an elegant way to recycle/reuse these actors and avoid generating so many of them? 2. When an actor is shut down, is it garbage collected, or are there conditions where it might not be?
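For context, here is roughly what one of these aggregator actors looks like (a simplified sketch in Akka classic Scala, with made-up message types):

```scala
import akka.actor.{Actor, ActorRef, Props}

// Hypothetical messages, for illustration only
final case class Part(value: Long)
final case class Result(total: Long)

// Spawned per request: collects `expected` parts, replies, then stops itself
class Aggregator(expected: Int, replyTo: ActorRef) extends Actor {
  private var total = 0L
  private var received = 0

  def receive: Receive = {
    case Part(value) =>
      total += value
      received += 1
      if (received == expected) {
        replyTo ! Result(total)
        context.stop(self) // actor terminates once its aggregation is done
      }
  }
}

object Aggregator {
  def props(expected: Int, replyTo: ActorRef): Props =
    Props(new Aggregator(expected, replyTo))
}
```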

If you can model an aggregation as an actor instance, you could likely also model it as a state machine inside a single actor and use a pool of them. Whether that gives much less GC overhead depends a lot on the type of aggregation and how many other objects are created during each one; actors are relatively lightweight, but if you could represent the state with, for example, two primitive fields, that would definitely be less GC intense than allocating and starting a new actor for each operation. Make sure to measure (JMH, for example, can give you a nice allocation rate metric).

It will essentially become a pool, so you will have to implement logic in a parent actor for starting n aggregators, distributing work over them, handling the case where all aggregators are busy, etc.
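A rough sketch of what that could look like, assuming the classic Scala API and a made-up Job/JobResult protocol (a SmallestMailboxPool router is one way to get the "send it to the least busy aggregator" behaviour, but you could also hand-roll the distribution in a parent actor):

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.routing.SmallestMailboxPool

// Hypothetical protocol, for illustration only
final case class Job(values: Seq[Long])
final case class JobResult(total: Long)

// One long-lived aggregator; its state is a single primitive field that is
// reset per job, so nothing extra is allocated per request
class PooledAggregator extends Actor {
  private var total = 0L

  def receive: Receive = {
    case Job(values) =>
      total = 0L
      values.foreach(total += _)
      sender() ! JobResult(total)
  }
}

object AggregationPool extends App {
  val system = ActorSystem("aggregation")
  // Fixed pool of reusable aggregators; the router routes each job to the
  // routee with the smallest mailbox
  val pool = system.actorOf(
    SmallestMailboxPool(8).props(Props[PooledAggregator]()),
    "aggregators")
  pool ! Job(Seq(1L, 2L, 3L))
}
```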

Once the actor has completely stopped, the actor system no longer holds any references to it, so the same GC rules as for any other object apply. Since all other access to the actor should go through its ActorRef, which is decoupled from the actual instance, the only way other objects could keep it alive would be antipatterns such as a thread, future, etc. accessing the internals of the actor.
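For illustration, a sketch of that kind of antipattern (names are made up):

```scala
import akka.actor.Actor
import scala.concurrent.Future

class Leaky extends Actor {
  import context.dispatcher // ExecutionContext for the Future

  private var counter = 0

  def receive: Receive = {
    case "work" =>
      // Antipattern: the Future body closes over `this`, so another thread
      // races on `counter` and can keep the actor instance reachable (and
      // therefore not garbage collected) even after the actor has stopped
      Future {
        counter += 1
      }
  }
}
```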

Thanks, this is helpful. I have been trying to run two separate yet identical actor “clouds” on the same server, in an attempt to reduce the impact of GC, and using a remote router to distribute jobs to them. My thinking is to have several distinct clouds of actors working so that if any one of them goes into a GC stop-the-world cycle, the work will be redirected to the others, especially if I use shortest-queue logic on the remote router. The problem is that these separate groups appear to GC together. Is the Java memory model sharing memory between the JVMs running the clouds of actors? Or is this just a coincidence?

GC is completely JVM-local, so that would be coincidental, unless the heaps have the same size and the allocation patterns are identical, I guess.

Are you sure it is GC that is the problem? Sharing the same server could cause both (or all) processes to starve at the same time if they share CPU/IO and that becomes a bottleneck. If you haven’t already, you can enable GC logging with -XX:+PrintGCDetails on JDK 8, or -Xlog:gc*:file=/some/path on JDK 9+.

To me it sounds more like you are allocating too much memory to the JVM (Xmx), which causes the machine to start swapping. That would cause the stop-the-world behaviour you are describing. I’ve experienced this.

Just a guess, of course. But I think it’s unlikely that the actors or the garbage collector alone are “stopping the world”. Properly configured, they shouldn’t cause worries unless you are doing some low-level optimizations.

Thanks. I’m running a video analytics app. This particular instance is processing over a million frames an hour. I’ve got Xmx at 64G and Xms at 32G and a bunch of other settings recommended by a friend. I reduced Xmx to 32G, 24G, and 18G and am still seeing this issue; anything below 18G results in more pauses. Is there a good resource out there that provides tips on tuning JVMs based on GC logs? I’m running Ubuntu 18.04 server and Java 11 Corretto.


I’ve got Xmx at 64G and Xms at 32G

How much memory does the machine have?

256G


I ran the system with GC logging enabled and processed the resulting log at gceasy.io, and it showed no issue that would explain the long (30-60 sec) stop-the-world events I am seeing four times an hour. The analysis indicated that GC appears healthy and there are no major memory leaks, so it doesn’t look like GC is the cause. Do you have any other suggestions for how to see which system resources might be getting temporarily starved and causing this?
