Why TransactionalSource needs isolation.level to be read_committed?

laymain · January 22, 2019, 1:22pm

Hi there,

I was wondering why the TransactionalSource is forcing isolation.level to ‘read_committed’:

  /**
   * We set the isolation.level config to read_committed to make sure that any consumed messages are from
   * committed transactions. Note that the consuming partitions may be produced by multiple producers, and these
   * producers may either use transactional messaging or not at all. So the fetching partitions may have both
   * transactional and non-transactional messages, and by setting isolation.level config to read_committed consumers
   * will still consume non-transactional messages.
   */
  private val txConsumerSettings = consumerSettings.withProperty(
    ConsumerConfig.ISOLATION_LEVEL_CONFIG,
    IsolationLevel.READ_COMMITTED.toString.toLowerCase(Locale.ENGLISH)
  )

I understand why it is safe and a good practice to use transactions in all the steps of a streaming flow,
but I don’t understand why the isolation.level is forcibly set to ‘read_committed’ instead of just as a default value.
Why shouldn’t I be able to use transactions when I’m consuming non-transactional messages?

I have to admit that the comment is unclear, especially the following part

by setting isolation.level config to read_committed consumers will still consume non-transactional messages

seglo · January 23, 2019, 9:46am

Hi @laymain ,

The reason this consumer property is enforced is to ensure that this transactional workload will not consume any messages that were produced by a previous upstream transaction that was aborted. For example, if the upstream transactional workload produced a message, but has not yet committed it and then it was consumed by this transactional workload then there is no guarantee that the message will be part of a successfully committed or aborted transaction. In other words, using read_committed is required to enforce Kafka exactly-once semantics guarantees across a pipeline of transactional workloads in serial.

This is consistent with Kafka Streams when using the EoS (Exactly-once Semantics) processing.guarantee config.

Additionally, consumers are configured with isolation.level="read_committed" and producers are configured with retries=Integer.MAX_VALUE , enable.idempotence=true , and max.in.flight.requests.per.connection=1 per default.

The Kafka Streams EoS KIP design document provides a more complete explanation.

Stream thread’s embedded consumer should set the isolation.level config to read_committed to make sure that any consumed messages are from committed transactions. Note that the consuming partitions may be produced by multiple producers, and these producers may either use transactional messaging (for example, if they are Streams’ embedded producers) or not at all. So the fetching partitions may have both transactional and non-transactional messages, and by setting isolation.level config to read_committed consumers will still consume non-transactional messages.

I have to admit that the comment is unclear, especially the following part

by setting isolation.level config to read_committed consumers will still consume non-transactional messages

In other words, messages produced by producers that are not part of a transaction can be consumed when read_committed is used because those messages are immutable and there’s no risk of them being “rolled back” as part of an aborted transaction.

Hope that helps!

seglo · January 23, 2019, 9:56am

More details about read_committed from the Exactly-once Semantics KIP design doc: https://docs.google.com/document/d/11Jqy_GjUGtdXJK94XGsEIK7CP1SnQGdp2eF0wSw9ra8/edit#bookmark=id.nogouvgyxtup

laymain · January 23, 2019, 10:10am

Thanks for you answer.

That’s what I was missing! So, we can say that messages in a topic have 3 potential “status”:

uncommitted
committed
non-transactional

I think I have an issue with my producers that feed my input topic, they’re not using transactions
but I can’t consume the produced messages using a TransactionalSource (no problem using
a CommittableSource).

Is there any way to see the “status” of messages in a topic?

Topic		Replies	Views
Commiting offsets for Kafka messages that are filtered Akka Streams & Alpakka akka , kafka , streams	1	1172	October 23, 2020
Using multiple sources with Transactional.sink Akka Streams & Alpakka	3	1070	January 10, 2020
Support of transactional partitoned source using multiple transactional sinks Akka Streams & Alpakka	1	261	July 17, 2023
Transational push of a Kafka Producer Akka Streams & Alpakka kafka	2	930	January 14, 2019
In-correct documentation	0	400	August 26, 2021

Why TransactionalSource needs isolation.level to be read_committed?

Related Topics