Are produced messages ordered when using alpakka-kafka's `flexiFlow` and default settings?

Hi David,

Maybe the discussion in “Throughput and batching with Alpakka Kafka” is helpful to you?

Short summary:

  • The mapAsync part that you found is independent of the order of publishing. In itself, it only guarantees that you will receive the Results (i.e. the success notifications of the publishing process) in the same order as the messages you put into it. (If the Producer were using mapAsyncUnordered instead, this would not be guaranteed. The parallelism setting affects the size of the internal buffer that mapAsync uses, but that buffer uses head-of-line blocking to guarantee ordering.)

  • The order of publishing is managed by the underlying KafkaProducer from the kafka-clients library. The alpakka flexiFlow will pass messages to the KafkaProducer in the same order they arrive, then collect the futures that track publishing success for each message and hand them back to you once they’re done. The KafkaProducer will usually keep the order, but make sure to read the documentation for the producer configs retries, enable.idempotence and max.in.flight.requests.per.connection at https://kafka.apache.org/documentation/. In particular, the default for max.in.flight.requests.per.connection is 5 and I don’t think alpakka overrides this by default. Unless the producer runs with idempotence enabled (enable.idempotence=true), a failed batch that gets retried can end up behind a later batch, so you need to check and possibly adjust your configuration to retain ordering.

  • Finally - although I’m certain you’re aware of that - “publishing in order” only preserves order for downstream consumers if your partitioning is the same for the source and target topic, as two messages consumed from the same source partition will inherently lose their relative order if they are published to two different partitions downstream.
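
To make the second point a bit more concrete, here is a minimal sketch of the two producer configurations that, as far as I understand the Kafka producer docs, keep the order even when sends are retried. The class and method names (OrderedProducerConfig, singleInFlight, idempotent) and the broker address are made up for illustration; only the config keys are real Kafka producer settings. It uses plain java.util.Properties, so nothing here is specific to alpakka.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka or Alpakka: builds producer
// properties for the two ordering-safe setups discussed above.
public class OrderedProducerConfig {

    // Option 1: allow only one unacknowledged request per connection.
    // A retried batch can then never be overtaken by a later one,
    // at the cost of throughput.
    public static Properties singleInFlight(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("max.in.flight.requests.per.connection", "1");
        return props;
    }

    // Option 2: idempotent producer. Kafka documents that this requires
    // acks=all and at most 5 in-flight requests per connection.
    public static Properties idempotent(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "all");
        props.setProperty("max.in.flight.requests.per.connection", "5");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        System.out.println(singleInFlight("localhost:9092"));
        System.out.println(idempotent("localhost:9092"));
    }
}
```

With alpakka you would not build a Properties object yourself; I believe the usual route is to set the same keys on your ProducerSettings (for example via withProperty) or in the kafka-clients section of the producer configuration, but please check that against the alpakka-kafka docs for your version.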

Hope that helps!