Alpakka not using more than one CPU core

rbr1589 · February 10, 2021, 4:13am

Hi,
We have created an Alpakka stream that consumes Kafka messages from a topic and then processes them. The messages are processed in parallel using mapAsyncUnordered with a configured parallelism. The Kafka lag for the consumer keeps increasing, but the application uses only one CPU core. I have changed the consumer's dispatcher to akka.actor.default-dispatcher, which uses a fork-join executor, expecting it to use more than one CPU core. My application is running on a machine with 32 cores.
Please find the configured settings below:

akka.kafka.consumer.use-dispatcher = "akka.actor.default-dispatcher"

Consumer stream code:

Consumer.DrainingControl<Done> control =
    Consumer.committableSource(consumerSettings, Subscriptions.topics(topic))
        .buffer(500, OverflowStrategy.backpressure())
        // Deserialize the response from JSON to a Java object
        .mapAsyncUnordered(5, /* deserialize the output */)
        // Process it and perform some calculations
        .mapAsyncUnordered(5, /* process it and perform some calculations */)
        // Do something and return the consumer offset
        .mapAsyncUnordered(5, /* do something and return the consumer offset */)
        // Commit the offset
        .toMat(Committer.sink(committerSettings.withMaxBatch(100)), Consumer::createDrainingControl)
        .run(materializer);

The stream runs in an Akka cluster, with the load balanced across nodes by a shared consumer group id. The application also has a typed actor system that triggers the requests, with a group router that shares the load across the cluster. Each triggered request is sent to a microservice as a Kafka message, and the response comes back as a Kafka message that is processed by the stream. These messages do not need to be processed in order, hence the use of mapAsyncUnordered.

I tried increasing the parallelism, even to 100, but didn't see a change.

Thanks in advance

seglo · February 10, 2021, 4:40pm

Is the issue that you never observe more than 1 core being used, or that the parallelism value is not actually running operations in parallel?

rbr1589 · February 11, 2021, 8:21am

I never observed more than one core being used. Doesn't that answer the second point as well, or am I misunderstanding? If the operations were running in parallel, we would expect more than one core to be used, right?

seglo · February 12, 2021, 4:26pm

There could be a number of reasons why additional cores are unused. Ultimately it’s up to the JVM and your OS to determine when a thread should be assigned to another core.

If you have 32 cores at your disposal and the transformations are mainly CPU bound then you could set the parallelism in mapAsyncUnordered to 32 (or a bit higher).
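Raising the parallelism only helps if the function passed to mapAsyncUnordered really hands its work to another thread. The operator calls that function on the stream's own thread, so if the work is done before the CompletionStage is returned, everything stays sequential. The following is a minimal JDK-only sketch (no Akka involved) of that difference; the class name, `busyWork` and both helper methods are hypothetical and only stand in for the processing steps in the original stream.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.IntStream;

public class ParallelismCheck {

    // Hypothetical stand-in for a CPU-bound step such as deserialization.
    static long busyWork(int seed) {
        long acc = seed;
        for (int i = 0; i < 1_000_000; i++) {
            acc = acc * 31 + i;
        }
        return acc;
    }

    // The work runs on the calling thread; the future is already
    // completed by the time it is created, so nothing runs in parallel.
    static Set<String> threadsUsedInline(int tasks) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < tasks; i++) {
            names.add(Thread.currentThread().getName());
            CompletableFuture.completedFuture(busyWork(i)).join();
        }
        return names;
    }

    // The work is submitted to a separate pool, so several tasks
    // can be in flight on different threads at the same time.
    static Set<String> threadsUsedAsync(int tasks, int poolSize) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            CompletableFuture<?>[] futures = IntStream.range(0, tasks)
                .mapToObj(i -> CompletableFuture.supplyAsync(() -> {
                    names.add(Thread.currentThread().getName());
                    return busyWork(i);
                }, pool))
                .toArray(CompletableFuture[]::new);
            CompletableFuture.allOf(futures).join();
        } finally {
            pool.shutdown();
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("inline threads: " + threadsUsedInline(8));
        System.out.println("async threads:  " + threadsUsedAsync(8, 4));
    }
}
```

Logging the thread name inside each mapAsyncUnordered function in the real stream, in the same way, should show whether the stages ever leave a single thread.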

Can you share your changes to the dispatcher config?

Have you tried increasing the throughput of your application?

Is your application running within a Linux cgroup with any memory or CPU limits?
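A quick way to check what the JVM itself believes is available is to print the processor count at startup. This is only a sketch with a hypothetical class name; it assumes a container-aware JVM, which derives this value from cgroup CPU limits, so a result of 1 on a 32-core host would point to such a limit.

```java
import java.util.concurrent.ForkJoinPool;

public class VisibleCores {

    // Number of processors the JVM reports as usable by this process.
    static int visibleCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("availableProcessors = " + visibleCores());
        // The JDK common pool is sized from the same value, which makes
        // it a handy cross-check.
        System.out.println("commonPool parallelism = "
            + ForkJoinPool.commonPool().getParallelism());
    }
}
```

Thread pools that are sized relative to the processor count, such as a fork-join executor configured with a parallelism factor, will shrink accordingly when this number is low.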

General note: It’s not advisable to use mapAsyncUnordered if you rely on ordering of your messages. Since you’re using the Committer.sink it’s possible that offsets could be committed out of order and you may skip the processing of some records when a rebalance or shutdown occurs.

baltiyskiy · February 17, 2021, 10:15am

How many partitions does the topic have? If you have only one partition, and you are committing offsets manually, then you will only be able to process 1 message at a time — regardless of the parallelism in the middle; Kafka will only give you messages one by one.
