Mental model about akka-persistence-cassandra read journal in combination with guaranteed delivery

(Jo Vanthournout) #1

I have a mental model about how akka-persistence-cassandra works. First of all, it would be great to see this model either confirmed or corrected. Continuing from the mental model I have a particular question (below the mental model)

Mental model

  • Two persistent actors A and B can write different events a and b to cassandra with the same tag using akka-persistence-cassandra
  • Both events a and b get a timebased UUID as offset upon the moment they are written to cassandra.
  • If another persistent actor C queries cassandra using queryByTag, it queries not cassandra itself but a “materialized view” of the events which will be eventual consistent.
  • If this other persistent actor C queries the materialized view, it returns a AkkaStreams stream.
  • The stream contains only those events that arrive/are written “after” the last event that was emitted on the stream, or for the example: If actor C queries and receives event a, but after a was emitted on the stream, b arrives in the materialized view in cassandra with an offset that indicates that b was written before a, then b will not be emitted on the stream.
  • It is possible to give these late-arriving events some more time by tweaking the property eventual-consistency-delay.
  • However, this is only best effort. By using the eventual-consistency-delay, you minimize the chances of missing an event, but there still is a chance that an event arrives too late.

Is this model correct or flawed?
_Is it correct that the setting eventual-consistency-delay is replaced in the latest version with gap-timeout and offset-scanning-period?

My question

  • Different persistent actors in different microservices kan write different (broadcast)events in cassandra. All these broadcast events get the same tag.
  • All these broadcast events also get an offset upon arrival in cassandra, which in the case of akka-persistence-cassandra is a timebased UUID.
  • The same or other microservices are interested in some of these broadcast events. They “listen” for broadcast events in which they are interested by performing a query by tag and filtering out the event types in which that microservice is interested.
  • I see however that (without specifying an eventual-consistency-delay) some messages are not picked up by the listener (they are not returned by the akkaStreams stream). The missing message however is present in cassandra.
  • In the application, all broadcast events have to be delivered at least once (no prob if they are delivered twice or more)
  • In the application, all microservices can cope with out of order events. So the arrival order in cassandra is not that important

Is it possible to configure, tweak or design the listeners in such a way, that all broadcast events with the same tag are eventually emitted on the Stream at least once?

(Patrik Nordwall) #2

In later versions of akka-persistence-cassandra the materialized view is replaced by an ordinary table that is updated by the plugin. It can still be seen as a materialized view, but it’s not using Cassandra’s mechanism for that. There is a rough explanation of how it works here.

I think the new property is called gap-timeout. There will be no missing events for a persistenceId that has already had some events, e.g. b1, b2, b3, delayed b4, b5. Then b5 will not be delivered until b4 is seen. I’m slightly uncertain about first event for new persistenceId, e.g. c1. I think it will be scanned during the gap-timeout, at least.

If you are using the old version of the plugin, and can’t upgrade yet you should consider using the delayed-event-timeout but make sure you read the documentation here:

Specifically that all events of a persistenceId must be tagged. That limitation doesn’t exist in the new version.