Do I have to delete snapshots after event migration?

Hi,

I am getting the following errors after making some major event structure changes:

[SnapshotMetadata(PortfolioEntity|xxxxx-2aab-478a-xxxxxx-beb1e071f82a,1512,1524909743413)] (1 of 3), trying older one.
Caused by: [com.lightbend.lagom.scaladsl.playjson.JsonSerializationFailed: Failed to de-serialize bytes with manifest [com.stashaway.trading.portfolio.impl.portfolio.ces.PortfolioState]
errors:
    /allocationTargetRecords(1)/allocationTarget/strategyId: ValidationError(List(error.path.missing),WrappedArray())
    /allocationTargetRecords(0)/allocationTarget/strategyId: ValidationError(List(error.path.missing),WrappedArray())
    /portfolio/strategyId:

Can this be fixed by deleting the snapshots from the database? Does it mean we have to delete the snapshots table every time we make a structural event change?

Thanks,

Hi @lejoow,

You can delete snapshots whenever you want, as long as you keep all the events of the journal forever. Snapshots are just a performance optimisation.

This being said, you can use Lagom’s Schema Evolution so that old-format persisted data is updated in-flight.
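
For reference, a state migration along the lines of the Schema Evolution docs can be registered roughly like this. This is only a minimal sketch: PortfolioState and strategyId are taken from the error above, while the registry name and the "UNKNOWN" default are hypothetical. In the error the missing strategyId is nested under allocationTarget and portfolio, so a real transform would have to update those nested objects rather than the top level.

import com.lightbend.lagom.scaladsl.playjson.{JsonMigration, JsonSerializer, JsonSerializerRegistry}
import play.api.libs.json._

object PortfolioSerializerRegistry extends JsonSerializerRegistry {
  override def serializers = Vector(
    JsonSerializer[PortfolioState]
    // ... plus the serializers for your events
  )

  // Bump the state's current version to 2 and fill in the field that is
  // missing from version-1 JSON before Play JSON tries to read it.
  private val portfolioStateMigration = new JsonMigration(2) {
    override def transform(fromVersion: Int, json: JsObject): JsObject =
      if (fromVersion < 2) json + ("strategyId" -> JsString("UNKNOWN")) // hypothetical default
      else json
  }

  override def migrations = Map[String, JsonMigration](
    classOf[PortfolioState].getName -> portfolioStateMigration
  )
}

With something like that in place, both old events and old snapshots of PortfolioState should be upgraded during deserialization.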

Cheers,


Thanks @ignasi35
But I got that error even though I had created the Transformation in the JsonSerialRegistry class. The error disappeared when I deleted the snapshot of that entity and restarted the service without any code changes.

Hmm, that’s interesting. Transformations should work for both event and state serializers. If transformations are not applied on snapshot (state) deserialization, this could be a bug.

Interesting… We are still using the older version of Lagom (1.3.5). Are you aware of any bug fixes that were introduced to cover this?

What do you need from me to assess whether this is a bug or not?

Thanks

The only alternative I can think of is a reproducer on GitHub with two commits. The first commit uses a case class State with a certain shape, and the second commit adds a new field to case class State along with migrations. We would then be able to check out the first commit, runAll, emit some commands until a snapshot is produced, stop, check out the second commit, runAll over the existing Cassandra data, and see a failure/success.
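
For example, the two commits could differ only in the shape of the State case class, roughly like this (names are invented for illustration; commit 1's shape is shown as a comment since both versions cannot coexist in one file):

// Commit 1: the original shape, persisted into snapshots by runAll.
// final case class State(count: Int, items: List[String])

// Commit 2: the same class with a new field, plus a JsonMigration
// (as in the registry sketch above) supplying a default for old JSON.
final case class State(count: Int, items: List[String], label: String)

object State {
  import play.api.libs.json._
  implicit val format: Format[State] = Json.format[State]
}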

I think we could use sbt new lagom/lagom-scala.g8 and I’d also tune the number of events before a snapshot is taken to 3 or 5 instead of the default 100.
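
For tuning the snapshot frequency, the setting (assuming Lagom's standard persistence configuration) would go in application.conf:

# application.conf -- take a snapshot after (at least) this many persisted
# events instead of the default 100
lagom.persistence.snapshot-after = 5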

Makes sense?

Hi Ignasi,

While I was investigating, I found something else to be somewhat strange.

When I ran the following query on the snapshots table:

select * from portfolio.snapshots
where persistence_id = 'PortfolioEntity|7d0273b8-8xxxxxx071-6b33ef4afcf7'

I found that snapshots were NOT generated at exactly every 100th sequence number.

[screenshot of query results showing snapshot sequence_nr values]

As you can see, some of the sequence_nr values are at 601, 704, 805, and so on.

How is this possible when we configured snapshots to be taken every 100 events?

Also, what is quite interesting is this error message:

Failed to load snapshot [SnapshotMetadata(PortfolioEntity|xxxx-8070-xxxx-b071-6b33ef4afcf7,1405,1522220657955)] (3 of 3), last attempt.
Caused by: [com.lightbend.lagom.scaladsl.playjson.JsonSerializationFailed: Failed to de-serialize bytes with manifest [com.stashaway.trading.portfolio.impl.portfolio.ces.PortfolioState]

Do you notice how the error message says it failed at snapshot number 1405? Don’t you think it is quite strange that it did not have any issue processing snapshots until 1405? I checked snapshots 1305, 1205 and 1005. They have no structural differences from snapshot 1405. If something in snapshot 1405 is causing the failure, the process should have failed on snapshots 1305 and 1205 as well.

Or perhaps it is reading the snapshots in descending order, meaning that snapshots 1606 and 1506 are processed first, and then 1405…

This can happen if, for instance, you have 99 events saved and you get a command that emits two or more events at once. In that case your event count jumps to 101 or more and you get this effect.

There is no harm.

I’m surprised by the fact that it’s reading snapshot 1405, and also by the first error you posted here that says “trying older one”. That suggests it read 1606 and failed, probably read 1506 and failed as well, and is now trying to read 1405.

I didn’t know we had this cascading behaviour. It’s probably a feature of the Cassandra plugin I was not aware of.

(edit: what @octonato said :point_up: . You can stop reading, it’s duplicate. )

IIRC the 100 means that a snapshot is taken not before 100 events have accumulated. Sometimes a command causes many events; in that case, the events can cross the x % 100 boundary, but the snapshot happens after the last event in the Seq.

For example, if all your commands emitted 3 events, your snapshots would be at 102 (the first multiple of 3 past 100), 204 (the first multiple of 3 that is at least 100 events past 102), and so on.
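
To visualize that, here is a small, purely illustrative simulation (not Lagom's actual code) of the boundary-crossing behaviour, assuming every command emits 3 events:

object SnapshotBoundarySimulation extends App {
  // Illustrative only: a snapshot is taken when a batch of persisted events
  // crosses the snapshot-after boundary, and it is recorded at the sequence
  // number of the last event in that batch, not at the exact multiple.
  val snapshotAfter = 100
  val eventsPerCommand = 3

  var seqNr = 0L
  var lastSnapshotAt = 0L
  val snapshotSeqNrs = scala.collection.mutable.ListBuffer.empty[Long]

  for (_ <- 1 to 200) {              // 200 commands, 3 events each
    seqNr += eventsPerCommand
    if (seqNr - lastSnapshotAt >= snapshotAfter) {
      snapshotSeqNrs += seqNr        // snapshot at the last event of the batch
      lastSnapshotAt = seqNr
    }
  }

  println(snapshotSeqNrs.take(5).mkString(", "))
}

Running it prints 102, 204, 306, 408, 510, matching the example above; with commands that emit varying numbers of events you get irregular values like the 601, 704, 805 seen in the query results.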