Throughput and batching with Alpakka Kafka



I’m trying to migrate an ETL pipeline from a plain Java solution to Akka Streams, but I’m running into some sticking points.
Input comes from a UDP socket and output goes to Kafka.
One main WTF right now is batching at the producer. I’m used to being able to set the max batch size on the producer and have it handle batching automatically. With Akka Kafka as a sink, the batch size seems to be ignored, and all it does is backpressure on each message. Is this the desired behavior? I’ve partially worked around this with the MultiMessage construct. How does this compare to native Kafka producer batching?
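
For reference, the native producer batching mentioned above is normally driven by two standard Kafka producer settings: batch.size (upper bound in bytes for a per-partition batch) and linger.ms (how long the producer waits for more records before sending). A minimal sketch of such a configuration follows. The class name, method name, broker address and values are assumptions made up for illustration, not taken from the pipeline in question, and a real producer would also need serializer settings.

```java
import java.util.Properties;

class ProducerBatchingConfig {
    // Builds the batching-related producer settings.
    // The key names are standard Kafka producer configs; the values passed in
    // are illustrative assumptions, not tuning recommendations.
    static Properties batchingProps(String bootstrapServers, int batchSizeBytes, long lingerMs) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Maximum size in bytes of a single per-partition batch.
        props.setProperty("batch.size", Integer.toString(batchSizeBytes));
        // How long to wait for more records before sending a partial batch.
        props.setProperty("linger.ms", Long.toString(lingerMs));
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical broker address and values, for demonstration only.
        Properties props = batchingProps("localhost:9092", 65536, 20);
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Whether the sink ends up honouring these values in a given setup is exactly the open question here; the sketch only pins down which settings are meant.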

(Gergő Törcsvári) #2

I have no idea how Kafka or the Kafka sink works, but from how you described the problem, you may want to use groupedWithin.
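
The idea behind groupedWithin(n, d) is to emit a group once n elements have been collected or the time window d has passed, whichever happens first. The sketch below is a plain-Java model of that size-or-time rule, not Akka's implementation: the class and method names are made up for illustration, arrival times are passed in explicitly so the result is deterministic, and a group is closed by time only when the next element arrives (the real operator uses a timer).

```java
import java.util.ArrayList;
import java.util.List;

class SizeOrTimeBatcher {
    // Groups values by arrival time. A group is closed when it reaches
    // maxSize elements, or when an element arrives windowMillis or more
    // after the first element of the current group. Empty groups are
    // never produced.
    static List<List<String>> group(long[] arrivalMillis, String[] values,
                                    int maxSize, long windowMillis) {
        List<List<String>> groups = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long windowStart = 0;
        for (int i = 0; i < values.length; i++) {
            // Time limit reached: close the current group before adding.
            if (!current.isEmpty() && arrivalMillis[i] - windowStart >= windowMillis) {
                groups.add(current);
                current = new ArrayList<>();
            }
            if (current.isEmpty()) {
                windowStart = arrivalMillis[i];
            }
            current.add(values[i]);
            // Size limit reached: close the current group immediately.
            if (current.size() == maxSize) {
                groups.add(current);
                current = new ArrayList<>();
            }
        }
        // Flush whatever is left when the input ends.
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        long[] arrivals = {0, 10, 20, 30, 500, 510};
        String[] values = {"a", "b", "c", "d", "e", "f"};
        // prints [[a, b, c], [d], [e, f]]
        System.out.println(group(arrivals, values, 3, 100));
    }
}
```

In an Akka Streams pipeline the operator would sit between the source and the Kafka sink, so the sink sees one group per emission instead of one message; each group could then presumably be wrapped in the MultiMessage construct mentioned above.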