Eventual consistency, while being consistent

Hey,

I have the following scenario:

  • Kick off Lagom Process 1, which eventually saves something to the database via a read-side processor.
  • As soon as that happens, I want to execute Lagom Process 2, which queries the database and depends on the data from Process 1 already being there.

Now, if I understand correctly, there is no guarantee as to when the data from Process 1 will be written, so I cannot hardcode a constant delay between the two processes.

What is the best way to deal with this?

You could have Process 1 publish an event to a topic on the message bus, and have Process 2 subscribe to that topic (see https://www.lagomframework.com/documentation/1.4.x/scala/MessageBroker.html).

The event could either carry the needed information (Event-Carried State Transfer, see https://martinfowler.com/articles/201701-event-driven.html), or Process 2 could just be notified (Event Notification) and then call an API of Process 1 to get the needed information.

The data should be transferred in either an event or through an API call. Process 2 should never access the database of Process 1 directly (see for example the first principle of https://isa-principles.org).
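To make the two options concrete, here is a minimal plain-Scala sketch contrasting Event-Carried State Transfer with Event Notification. This is not actual Lagom API; the case classes, `Patterns` object, and `fetchValue` helper are all made up for illustration:

```scala
// Plain-Scala sketch; these names are illustrative, not actual Lagom API.

// Event-Carried State Transfer: the event carries everything Process 2 needs.
final case class PortfolioValueCalculated(portfolioId: String, value: BigDecimal)

// Event Notification: the event only signals that something happened;
// the subscriber then calls an API of Process 1 for the details.
final case class PortfolioUpdated(portfolioId: String)

object Patterns {
  // Stand-in for Process 1's data, exposed through an API.
  private val store = scala.collection.mutable.Map("p1" -> BigDecimal(100))

  // Hypothetical API of Process 1.
  def fetchValue(portfolioId: String): BigDecimal = store(portfolioId)

  // Process 2 under Event-Carried State Transfer: no callback needed.
  def handleCarried(e: PortfolioValueCalculated): BigDecimal = e.value

  // Process 2 under Event Notification: calls back into Process 1's API.
  def handleNotification(e: PortfolioUpdated): BigDecimal = fetchValue(e.portfolioId)
}
```

Either way, Process 2 never touches Process 1's database directly; the data arrives via the event itself or via the API call.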

I had a similar requirement in the past.

I calculated the value of a portfolio and saved it to the read-side database; then I needed another process to read that value from the database and create some other data from it.

I tried many tricks, including putting the thread to sleep for 20 seconds before the query was fired.

Long story short, I figured I had to get away from this kind of linear, sequential way of designing my processes and start thinking from a more event-centric perspective.

I solved the problem by publishing the calculated portfolio value to a Kafka topic and letting the other service start its process once it receives the new portfolio value. I think what Lutz explained above is exactly this.
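The shape of that solution can be sketched with an in-memory stand-in for the Kafka topic (plain Scala, no Lagom or Kafka dependencies; the `Topic` class and names are illustrative only). The key point is that Process 2 runs as a reaction to the published value, so there is no sleep and no polling:

```scala
import scala.collection.mutable.ListBuffer

// Illustrative event: the calculated portfolio value, carried in the message.
final case class PortfolioValue(portfolioId: String, value: BigDecimal)

// In-memory stand-in for a Kafka topic: publish pushes to all subscribers.
final class Topic[A] {
  private val subscribers = ListBuffer.empty[A => Unit]
  def subscribe(f: A => Unit): Unit = subscribers += f
  def publish(a: A): Unit = subscribers.foreach(_(a))
}

object Demo {
  val derived = ListBuffer.empty[BigDecimal]

  def run(): Unit = {
    val topic = new Topic[PortfolioValue]
    // Process 2: triggered by the arrival of the value, derives other data.
    topic.subscribe(v => derived += v.value * 2)
    // Process 1: calculates the portfolio value and publishes it.
    topic.publish(PortfolioValue("p1", BigDecimal(42)))
  }
}
```

With a real broker the subscriber would of course run in another service, but the control flow is the same: the downstream work starts when the message arrives, not after an arbitrary delay.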


I want to follow up on this discussion.

So is the solution to this waiting for the data to be available in the database?

Let’s say I process all the events on the read side once every 5 seconds; then I will have a delay of up to 5 seconds?

Let’s say that process A adds something to a list, and after that I want to retrieve the list (including the new item that was added). I should wait 5 seconds to get the data, right?

No, I’d just reiterate what lutzh and lejoow said: the answer is explicitly NOT to wait, because there’s no way to know how long to wait. The read-side processor might be backlogged, for example.

The answer is to find some other event-driven way to solve the problem. As lutzh said, that might be sending a Kafka message with all of the information, or it might be publishing an event that triggers a callback to an API.

Thanks for the answer.
When should I send this Kafka message with all the information? It would have to be before the read side processes the event (otherwise I won’t have the new information yet, and it would be the same as waiting).
By “all the information” (in the example I’m describing), do we mean all the elements of the list?