Hi, we have a messaging platform built on Akka 2.5, using Akka Cluster and Distributed Pubsub. The cluster currently has 25 servers.
The scenario is as follows:
- Actor1 created in Server1 subscribes to a topic Chat1.
- Actor2 created in Server2 publishes a message to Chat1, about 100ms after the subscription.
- Sometimes the first message is not received by Actor1, but subsequent messages always are.
We concluded that this happens because a subscription takes some time to propagate to all the nodes of the cluster. These are the actions we took to work around it:
- Decreased the gossip-interval from 1s (the default) to 50ms.
- Added a further 400ms delay before publishing, giving the cluster 500ms in total to register the subscription. This reduced how often the issue occurs, but it is still fairly frequent (roughly 1 in 6 times).
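For reference, the gossip change would look roughly like the fragment below in application.conf. This is a sketch that assumes the setting in question is the Distributed Pubsub mediator's `akka.cluster.pub-sub.gossip-interval`, and not the cluster membership setting `akka.cluster.gossip-interval`; both default to 1s.

```hocon
akka.cluster.pub-sub {
  # How often the mediator gossips its subscription registry to a peer.
  # Default is 1s; 50ms is the value described above.
  gossip-interval = 50ms
}
```

The extra 400ms wait is applied in application code before publishing, so it does not appear in the configuration.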
So, a few questions:
- Is it expected for Pubsub to take more than 500ms to propagate a subscription in a cluster of just 25 nodes, all on a private network in the same data centre?
- Are there additional Akka configuration settings that can help reduce the time taken for subscription propagation?
- What are our options for monitoring the average time Pubsub takes to propagate a subscription within the cluster? This would help us estimate the right delay to introduce (if one is needed at all).
- If the delay described above is expected, are there any workarounds that others have used in the past to overcome this issue?
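To put the first question in concrete terms: 500ms at a 50ms gossip interval is a budget of 10 gossip rounds. The toy model below is a rough way to reason about whether that budget should be enough. It is only a sketch under loud assumptions, not Akka's actual protocol: every node contacts one random peer per interval, both sides end up knowing the subscription if either did at the start of the round, and network or processing latency is ignored. The function names and parameters are hypothetical, chosen for illustration.

```python
import random


def rounds_until_target_informed(n_nodes, source, target, rng):
    """Count gossip rounds until `target` has learned a subscription made on `source`."""
    informed = {source}
    rounds = 0
    while target not in informed:
        rounds += 1
        snapshot = set(informed)  # what each node knew when the round started
        for node in range(n_nodes):
            # Assumption: each node picks exactly one random peer per round.
            peer = rng.choice([p for p in range(n_nodes) if p != node])
            # Assumption: push-pull exchange, so both sides know afterwards.
            if node in snapshot or peer in snapshot:
                informed.add(node)
                informed.add(peer)
    return rounds


def fraction_within_budget(n_nodes=25, gossip_interval_ms=50,
                           budget_ms=500, trials=1000, seed=1):
    """Fraction of trials in which node 1 learns of node 0's subscription in time."""
    rng = random.Random(seed)
    max_rounds = budget_ms // gossip_interval_ms
    hits = sum(
        rounds_until_target_informed(n_nodes, 0, 1, rng) <= max_rounds
        for _ in range(trials)
    )
    return hits / trials
```

Calling `fraction_within_budget()` with the defaults models our setup (25 nodes, 50ms interval, 500ms budget). Only the publisher's node is tracked as the target, on the assumption that it is the one whose view of the subscription matters for the first publish. If the real failure rate is much worse than such a model suggests, that would point to something other than the gossip interval alone.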