How to handle duplication and data loss in Akka and Kafka


(Nautiydp) #1

We are consuming JSON messages from a Kafka topic using Akka. We have one topic with one partition.
The JSON data belongs to three banks, and every bank has 5 entities.
After consuming the data, we need to parse it by bank and corresponding entity, and then push it into Salesforce under the corresponding bank and its 5 entities.
We are using Kafka with Akka Streams and actors. It is working fine, but the new requirement is to reduce the number of hits on the database.
So we have created a batch for every entity of each bank, and once an entity's batch reaches a particular size we flush it to Salesforce.
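To make the batching concrete, here is a minimal sketch in plain Java with no Akka or Kafka calls. It is an illustration, not the actual code: the class name `EntityBatcher`, the `bank/entity` key format, and the flush callback standing in for the Salesforce push are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical illustration: buffers records per bank/entity and flushes full batches.
class EntityBatcher {
    private final int batchSize;
    // Stands in for the push to Salesforce (assumption, not a real client call).
    private final BiConsumer<String, List<String>> flush;
    private final Map<String, List<String>> buffers = new HashMap<>();

    EntityBatcher(int batchSize, BiConsumer<String, List<String>> flush) {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize must be >= 1");
        this.batchSize = batchSize;
        this.flush = flush;
    }

    // Adds one parsed message; flushes that entity's batch once it reaches batchSize.
    void add(String bank, String entity, String json) {
        String key = bank + "/" + entity;
        List<String> batch = buffers.computeIfAbsent(key, k -> new ArrayList<>());
        batch.add(json);
        if (batch.size() >= batchSize) {
            buffers.remove(key);
            flush.accept(key, batch);
        }
    }

    // Number of records buffered in memory and not yet flushed.
    int pending() {
        int n = 0;
        for (List<String> b : buffers.values()) n += b.size();
        return n;
    }

    public static void main(String[] args) {
        EntityBatcher batcher = new EntityBatcher(2,
            (key, batch) -> System.out.println("flush " + key + " size=" + batch.size()));
        batcher.add("bankA", "entity1", "{\"id\":1}");
        batcher.add("bankA", "entity2", "{\"id\":2}");
        batcher.add("bankA", "entity1", "{\"id\":3}"); // entity1 reaches size 2 and is flushed
        System.out.println("pending=" + batcher.pending());
    }
}
```

Anything still counted by `pending()` exists only in memory, so those records are the ones at risk if the consumer stops after their offsets have already been committed.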
But the problem we are facing is data loss if the consumer goes down. How can we handle this situation? Any suggestions will be appreciated.