Sync up a new service and subscribe for updates

What is actually the preferred way to both sync up a new service A with data from another service B and also subscribe for further updates?
As I understand it, if A subscribes to B’s Kafka topic it starts consuming from the “head” of the topic and does NOT consume messages from the “past”.
Therefore the only 2 solutions I can think of so far are these:

  1. Create a new topic just for A and let Lagom fill it with the messages generated from the beginning of the event journal. It seems a bit ugly to me to create a new topic for every new subscriber.
  2. Subscribe to the topic AND additionally fetch the current state of B (via its HTTP API). This requires some ordering/sequence number in both the Kafka messages and the HTTP response data, so that messages received on the Kafka topic can be correctly dropped if they are “older” than the state received in the current-state response.
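
For option 2, the drop logic itself is small. A minimal sketch (all names here are hypothetical; it assumes every message and the HTTP snapshot carry a monotonically increasing sequence number from B’s journal):

```scala
// Hypothetical event type: every update carries the journal sequence
// number it was produced from.
final case class Update(seqNr: Long, payload: String)

object SyncUp {
  // Keep only topic messages that are newer than the state already
  // reflected in the HTTP snapshot; anything at or below the snapshot's
  // sequence number has already been applied and is dropped.
  def dropStale(snapshotSeqNr: Long, incoming: List[Update]): List[Update] =
    incoming.filter(_.seqNr > snapshotSeqNr)
}
```

In a real subscriber this filter would run per entity, since journal sequence numbers are typically per persistent entity rather than global.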

Both solutions have drawbacks.

I think that’s a very common problem. Is there already a good solution for it?

I am hoping to find a way to replay a Kafka topic for specific consumers only. My new service A would then receive all messages generated from the beginning of the event journal, while other consumers would “ignore” those “old” messages they have already seen.

It will consume from the beginning, but the oldest events may have been deleted from the topic, depending on the configured retention time.
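
If full replay from the topic is the goal, the topic’s retention has to be configured accordingly. A sketch of the relevant topic-level settings (the topic name is a placeholder); with `cleanup.policy=compact`, Kafka keeps at least the latest record per key instead of deleting by age:

```
# Topic-level configuration (assumed topic name: service-b-events)
retention.ms=-1          # -1 = never delete records by age
# or, alternatively:
cleanup.policy=compact   # keep at least the latest record per message key
```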

Thanks for the reply!

I still don’t fully understand the Kafka behaviour.
For example:
What would happen if I reset the “topicProducer” offset for a given topic of service B to 0?
I understand that service B would “re-produce” messages for this topic from the beginning of the event journal.
But I don’t understand whether those messages would simply be appended to the partition(s) of this topic or whether the partition(s) would be “mutated”. What I mean by mutated: I can see that both a Message and an Offset are specified when producing a topic message. Is that Offset used to place the message at a given position in the partition?
For example, if I have 5 events in my journal, would re-producing the topic messages from the journal put them into the Kafka partition at offsets 0, 1, 2, 3, 4, or would they simply be appended to the partition?

I am asking this because I would like to know whether I can just “repopulate” the topic from the event journal and let new services/topic consumers do a full sync-up, while existing consumers would not re-consume those “old” messages (and therefore not re-apply possible side effects).

Note that there are two offsets in play here. One is the offset in the event journal that is used by the topicProducer. Another is the Kafka consumer offset, which is a separate offset managed by Kafka for each consumer of the topic.

When you start a new consumer (a new service) it will consume all messages from the topic, using its own offset, unless messages have already been deleted from the topic (the retention I mentioned previously).
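
Which end of the topic a brand-new consumer group starts from is governed by the consumer’s `auto.offset.reset` setting. A sketch of the relevant consumer properties (the group id is a placeholder):

```
# Consumer configuration for the new service A (assumed group id)
group.id=service-a
auto.offset.reset=earliest   # a group with no committed offset starts at the oldest retained message
enable.auto.commit=true      # committed offsets make the group's position durable across restarts
```

Each consumer group commits its own offsets, which is why a new service can read from the beginning without disturbing existing subscribers.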

Resetting the journal offset to re-produce all events to the same topic doesn’t make much sense, because then there would be duplicate events in the topic and those duplicates would be received by the existing consumers.
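
Rather than re-producing events, Kafka can replay a topic for one consumer group by resetting that group’s committed offset; other groups are untouched. A sketch using Kafka’s bundled CLI (group, topic, and broker address are placeholders):

```
# Rewind only service A's consumer group to the oldest retained message
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group service-a --topic service-b-events \
  --reset-offsets --to-earliest --execute
```

This still replays only what is retained in the topic, so it does not remove the need for long retention or compaction if a full history is required.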

There is some other good info here: