Getting a very large number of missing events for currentEventsByTag query

Hi,

I encountered a very odd issue when using the Cassandra persistence query currentEventsByTag.

Relevant log: java.lang.RuntimeException: 97124 missing tagged events for tag [TAG]. Failing without search.

My configuration is pretty basic and follows the reference conf:

events-by-tag {
  bucket-size = "Hour"
  eventual-consistency-delay = 2s
  flush-interval = 25ms
  pubsub-notification = on
  first-time-bucket = "20201001T00:00"
  max-message-batch-size = 20
  scanning-flush-interval = 5s
  verbose-debug-logging = true
  max-missing-to-search = 5000
  gap-timeout = 5s
  new-persistence-id-scan-timeout = 0s
}

We are deploying on AWS Keyspaces; at the moment there are 7-8 persistent actors.
We are using currentEventsByTag (with an Offset) to implement a data polling endpoint.

This issue only happens when the offset is set to the very end of the previous hour, i.e. the 59th minute and 31+ seconds of the previous hour. Once the clock moves into the next hour, retrying the same query no longer causes this error.
Example: if it is now 14:05:00 and I try to get data from 13:59:59, I get an error in the logs saying that 97124 events are missing. After we “fixed” this, it turned out that there were only 59 events for that query, which we confirmed in the database.
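
To make the timing concrete, here is a minimal sketch of the hour-bucket arithmetic for the example above. It only illustrates that, with bucket-size = "Hour", the offset and the current time fall into different buckets. It is not the plugin's implementation and it does not explain the missing-event count. The class name HourBuckets, its methods, and the date 2020-10-01 are hypothetical, chosen for illustration.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper: illustrates hour-sized time buckets only,
// it does not reproduce the Cassandra plugin's internal logic.
public class HourBuckets {

    // Start of the hour-long bucket (UTC) that contains the given instant.
    public static Instant hourBucket(Instant t) {
        return t.truncatedTo(ChronoUnit.HOURS);
    }

    // Number of hour buckets between offset and now, counting both ends.
    public static long bucketsSpanned(Instant offset, Instant now) {
        return ChronoUnit.HOURS.between(hourBucket(offset), hourBucket(now)) + 1;
    }

    public static void main(String[] args) {
        // Assumed date; only the times of day come from the example above.
        Instant offset = Instant.parse("2020-10-01T13:59:59Z");
        Instant now = Instant.parse("2020-10-01T14:05:00Z");

        System.out.println("offset bucket: " + hourBucket(offset));
        System.out.println("now bucket:    " + hourBucket(now));
        System.out.println("buckets spanned: " + bucketsSpanned(offset, now));
    }
}
```

In this sketch the offset lands in the last second of the 13:00 bucket while "now" is in the 14:00 bucket, so the failing query is the one that starts right before a bucket boundary.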

There were two ways to “fix” this: either set max-missing-to-search to 100k or 1M, which would be fine, I guess, if I knew that the number of missing events would never grow beyond 97124, or set new-persistence-id-scan-timeout = 0s, per advice in the Akka Gitter chat.

I chose the latter, but I am still not sure that it’s the correct fix.

I would appreciate any feedback, as I couldn’t find anything about this case in the documentation. Thanks.
