Possibility of storing the events and offsets of the persistent entities in the message broker (Kafka)

I was wondering if it would be possible to use the message broker API (or a new API) to store all the events (and maybe the offsets) used by the persistence layer and the read side, as a partial replacement for Cassandra's duties.

I see a better fit for the events to be stored/read in a system like Kafka, but at the same time I understand the complexity of depending on two systems (e.g. Cassandra and Kafka) for a single responsibility like state reconstruction from snapshots.

Apologies if this question is totally out of place or total nonsense (my knowledge of Lagom is limited at the moment); I just felt that Kafka could be a better fit in some cases. But maybe the place for this is not the message broker API, and it should instead be an additional implementation/configuration of the PersistentEntity.

There are probably a lot of blockers for something like this to be possible, things like the way partitions are handled, so please enlighten me.

Thanks

@darkxeno I have thought about the same thing many times. Check this out.

So if I understood correctly, that means transactional storage of the events is required, perhaps so that sequential entity updates work correctly at the persistent entity layer, as a failsafe mechanism.
But even if that's the case, there could be ways to overcome it using the new Kafka transactional API: https://www.confluent.io/blog/transactions-apache-kafka/
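
For illustration, here is a minimal sketch of what an atomic multi-event write could look like with Kafka's transactional producer (plain Java client used from Scala; the topic name, ids and payloads are made up):

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

object TransactionalJournalWrite extends App {
  val props = new Properties()
  props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
  // A stable transactional.id turns on idempotence and allows
  // all-or-nothing commits across several records.
  props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "order-entity-writer-1")
  props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
  props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)

  val producer = new KafkaProducer[String, String](props)
  producer.initTransactions()

  try {
    producer.beginTransaction()
    // Keying by entity id keeps one entity's events ordered within a partition.
    producer.send(new ProducerRecord("order-events", "order-42", """{"type":"OrderPlaced"}"""))
    producer.send(new ProducerRecord("order-events", "order-42", """{"type":"StockReserved"}"""))
    producer.commitTransaction() // both records become visible, or neither
  } catch {
    case e: Exception =>
      // discards the whole batch (fatal errors such as a fenced producer
      // would instead require closing the producer)
      producer.abortTransaction()
      throw e
  } finally {
    producer.close()
  }
}
```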

Do you know the main reason behind the requirement for atomic storage of multiple events? I only found one or two mentions of it in the documentation, but I don't remember finding the actual reason behind it.

Is it the all-or-nothing storage for cascading changes (which doesn't sound very adequate for CQRS aggregates), or is it only to ensure state consistency and avoid the possibility of losing events?

Thanks

@darkxeno I did not have time to study all of this in detail, so I cannot comment on everything.
But regarding atomic storage of multiple events, I believe it refers to the ordered, all-or-nothing batch insert of a single entity instance's events.
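
For what it's worth, this is where it shows up in Lagom's Scala API: all events emitted for one command go through thenPersistAll as a single atomic batch. A rough sketch, with command and event types invented for illustration:

```scala
import akka.Done
import com.lightbend.lagom.scaladsl.persistence.PersistentEntity
import com.lightbend.lagom.scaladsl.persistence.PersistentEntity.ReplyType

// Hypothetical command and events, purely for illustration
final case class PlaceOrder(items: List[String]) extends ReplyType[Done]
sealed trait OrderEvent
final case class OrderPlaced(items: List[String]) extends OrderEvent
final case class StockReserved(items: List[String]) extends OrderEvent

class OrderEntity extends PersistentEntity {
  override type Command = PlaceOrder
  override type Event   = OrderEvent
  override type State   = List[String]

  override def initialState: State = Nil

  override def behavior: Behavior =
    Actions()
      .onCommand[PlaceOrder, Done] {
        case (PlaceOrder(items), ctx, _) =>
          // Both events are persisted in one atomic batch: if only the first
          // survived a crash, replay would rebuild an inconsistent state.
          ctx.thenPersistAll(OrderPlaced(items), StockReserved(items)) { () =>
            ctx.reply(Done)
          }
      }
      .onEvent {
        case (OrderPlaced(items), _)   => items
        case (StockReserved(_), state) => state
      }
}
```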

From my perspective, the biggest advantage of Kafka as journal storage, in the context of Lagom, is read-side creation scalability.
With Cassandra storage, for example, read-side creation (the read-side processor) scalability depends on a predefined, static number of event shards that cannot be changed (because of the event tagging algorithm). The read-side processor's event query is very heavy when the number of shards is big.
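
For context, that static shard count is fixed in the event definition itself. A minimal sketch of Lagom's sharded tags on the same hypothetical OrderEvent (the count of 10 is arbitrary):

```scala
import com.lightbend.lagom.scaladsl.persistence.{AggregateEvent, AggregateEventShards, AggregateEventTag}

object OrderEvent {
  // Chosen once at design time: changing it later breaks the tag-to-event
  // mapping, so read-side parallelism stays capped at NumShards.
  val NumShards = 10
  val Tag: AggregateEventShards[OrderEvent] = AggregateEventTag.sharded[OrderEvent](NumShards)
}

sealed trait OrderEvent extends AggregateEvent[OrderEvent] {
  override def aggregateTag: AggregateEventShards[OrderEvent] = OrderEvent.Tag
}
```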

From my perspective, Cassandra journal storage in combination with Kafka would be the best match (regarding flexibility and scaling):

  1. store the journal in Cassandra, as it is now
  2. use the journal for the write side and for publishing to Kafka (using a topic producer), as it is now
  3. expose a Kafka-consumer-based read-side processor, which has no such scaling limitation (sketched below)
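
A rough Scala sketch of steps 2 and 3, reusing the hypothetical OrderEvent.Tag from the previous snippet. The topic name, group id, updateView function and String payloads are placeholders, and step 3 is sketched with Alpakka Kafka rather than anything Lagom ships today:

```scala
import akka.Done
import akka.actor.ActorSystem
import akka.kafka.scaladsl.Consumer
import akka.kafka.{ConsumerSettings, Subscriptions}
import akka.stream.Materializer
import akka.stream.scaladsl.Sink
import com.lightbend.lagom.scaladsl.api.broker.Topic
import com.lightbend.lagom.scaladsl.broker.TopicProducer
import com.lightbend.lagom.scaladsl.persistence.PersistentEntityRegistry
import org.apache.kafka.common.serialization.StringDeserializer

import scala.concurrent.{ExecutionContext, Future}

// Step 2: publish the Cassandra journal to Kafka through the tagged
// event stream, as Lagom's TopicProducer already does today.
class OrderTopics(registry: PersistentEntityRegistry) {
  def orderEvents: Topic[String] =
    TopicProducer.taggedStreamWithOffset(OrderEvent.Tag) { (tag, fromOffset) =>
      registry
        .eventStream(tag, fromOffset)                 // resumes from the stored offset
        .map(env => (env.event.toString, env.offset)) // (message, offset) pairs
    }
}

// Step 3: a read-side processor consuming the Kafka topic directly;
// parallelism is bounded by partition count, not by the event shard count.
class KafkaReadSide(updateView: String => Future[Done])
                   (implicit system: ActorSystem, mat: Materializer, ec: ExecutionContext) {

  private val settings =
    ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
      .withBootstrapServers("localhost:9092")
      .withGroupId("order-read-side") // add consumers in the group to scale out

  def run(): Unit =
    Consumer
      .committableSource(settings, Subscriptions.topics("order-events"))
      .mapAsync(1) { msg =>
        updateView(msg.record.value()).map(_ => msg.committableOffset)
      }
      .mapAsync(1)(_.commitScaladsl()) // the offset lives in Kafka, not Cassandra
      .runWith(Sink.ignore)
}
```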