Cluster design strategy


(George Soler) #1

Hello, I would like to ask for some cluster high-level design advice.

I have been trying to figure out if Akka would be a viable alternative to Quasar to write a distributed Java application, even though Akka does not support Java as well as it supports Scala. But writing the application in Scala is simply not an option.

So I have been studying Akka for a few weeks but I’m still having problems deciding on the best design approach for my Akka use case.

At the core of the design I would like to set four independent processes running in four different hosts on the same network. The purpose is not to distribute client load among the four processes, but rather to have each process perform a specifically dedicated unique job. The system does not require libraries to be shared across the four applications (not like a “distributed monolith” the way I read it).

As a requirement, I need actors in the four systems to be able to communicate with each other in order to update their dynamic state. The state updates may originate from any of the four processes. A state is a shallow POJO which contains around 40 primitive fields. Eventual consistency for the states of (let’s say ten thousand) client actors is not a problem as long as it doesn’t take an update more than around 4 seconds to propagate among all four systems.

I have the impression that Akka clusters are more geared toward distributing client load than distributing functionality. So I am still struggling with the following questions-

1- Does the 4-second max latency requirement sound unrealistic for updating states in 10,000 actors?

2- Would Akka cluster distributed data be a good solution? Or would distributed publish-subscribe be a better choice for this use case?

3- Should I use Akka routing? Is it an alternative to distributed data or to pub-sub in this case? Or does it work in addition to either?

4- Would it make sense to make each system’s top-level actor a singleton?

5- One application.conf file with four roles would start four members in each cluster node (16 members in all four processes). Since I am not concerned with redundancy and I only need to assign one role per process, would it be better to have four different config files each with a single different role? But wouldn’t that cause four separate clusters to spin up independently and thus prevent the actors from communicating state updates between processes (even if I include a seed role and the cluster names are the same)?

It’s a big topic, so I’m not asking for definitive answers. But if you can point out which of these strategies have better chance of success it would be a big help.

Thank you so much.


(Matt Howard) #2

Yea 10K objects @ ~40 attributes each sounds reasonable to sync within a few seconds at face value. Normally IO would be the most limiting factor in something like that, so let’s say at a conservative average of maybe 20 bytes per attribute you’re looking at pushing 8M bytes across the wire in <4 seconds which is reasonable even in subseconds.

Delivery guarantees would be one of the biggest impacts on IO and therefore throughput. If these need to be absolutely guaranteed then your architecture becomes a bit more complex and there is another level of IO that can really diminish throughput. The options range from persisted actors writing to a local file/journal handling the delivery guarantee to persisted actors writing to a database to using a message broker like kafka or any mq.

If you don’t need an absolute guarantee then probably distributed pub sub is a good fit but has at-most-once semantics. A broadcast router would effectively do the same thing but may take a little more effort to manage the creation and possibly registration of the routees across the cluster. In either case you can achieve effectively once semantics by adding acknowledgements and retries but again that will impact throughput.

As for distributed data I have only tinkered with it. But CRDTs are really meant for managing concurrent updates. So for example if Node 1 needs to update object A at the same time that Node 2 might also update object A then an ORMap or ORMultimap might save you a lot of headaches. Some CRDTs also support deltas so each change doesn’t send the entire object across the wire but only the changed attribute(s). If you aren’t changing all 40 attributes regularly then this becomes a very compelling solution. I’ve never done it but I assume you could add a persisted actor whose state is a CRDT in order to get guaranteed delivery.

For the top level actor these are effectively singletons because they are started by the actor system on startup. You could certainly create a pool of them if needed but I wouldn’t do that by default without some specific need. Kind of depends what that root will be doing, but it mostly exists to manage lifecycle of child actors so being a singleton is kind of nice.

For the application.conf - I’ve never tried but my understanding is that as long as the actor systems share the same name and seed nodes then they will form a single cluster. So no need for every conf to be identical on each node.

FWIW I started a large project with Akka cluster several years ago and we went all Java with no troubles. I’m starting another one now and am very seriously considering going mostly Scala if I can get my team trained up enough. But it doesn’t have to be a binary choice - we’ve discussed putting most of the Akka code in Scala so we can use more functional constructs and some convenient traits in actors, but maybe that code calls business logic in Java classes. Java is trying to become Scala lately so I’m finding my Java devs are more and more comfortable with it.

Hope that helps.


(Matt Howard) #3

Sorry I meant to ask why you need the state distributed - I took that as a requirement but there may be uses for sharding that could simplify your system. While sharding to partition load is very useful it also kind of abstracts away the location of actors. So for example if Object A needs to be retrieved by a message that could arrive on any node of the cluster then with cluster sharding you can basically work with it as if it were local. Under the hood if it actually is sharded on a different node then the ShardRegion handles that communication for you. That comes at the expense of potential messaging overhead when you work with your entities but eliminates the messy replication and delivery guarantees needed to to that at write time rather than read time. If they mutate very often then replication becomes more of a headache.

I have a similar sounding requirement on my new project and we probably will have to replicate on write but I’ve tried very hard to find ways around that.


(George Soler) #4

Matt, what an awesome and useful reply! Thank you so much!

I meant to ask why you need the state distributed - I took that as a requirement but there may be uses for sharding that could simplify your system.

There is no absolute requirement that the system be distributed in this sharding sense, so your suggestion is an interesting one. The only reason I hadn’t considered sharding is because I am quite new to Akka and I haven’t learned why/how to use cluster sharding yet. Until your reply sharding seemed like a low priority for me because I was under the impression it is mainly useful for distributing external client load among nodes rather than eliminating or replacing the need for inter node messaging. But sharding may indeed allow a node to work locally as you suggest (ShardRegion handles that communication for you). On the other hand, I don’t need strict message delivery guarantees -just the replication, so maybe sharding would just add more complexity without a reasonable payoff?

Delivery guarantees would be one of the biggest impacts on IO and therefore throughput.
If you don’t need an absolute guarantee then probably distributed pub sub is a good fit but has at-most-once semantics.

No need to guarantee each delivery. A state update message dropped every now and then is totally acceptable. It would only become a problem if say >5 consecutive messages were all dropped for the same exact client, not a likely occurrence if I read you correctly.

A broadcast router would effectively do the same thing but may take a little more effort to manage the creation and possibly registration of the routees across the cluster.

I will then forego the router and first try a design with simple pub-sub since I don’t need acknowledgements and retries. If a router would improve throughput dramatically, then I would consider it. My goal is to keep the design as simple as possible, but not more.

CRDTs are really meant for managing concurrent updates.
If you aren’t changing all 40 attributes regularly then this becomes a very compelling solution.

Thanks for another useful bit of info. I don’t think it will be necessary to update nodes with strict concurrency. This is a soft-real-time application -soft enough that pub-sub and eventual consistency (<4 sec) should work ok.

For the top level actor these are effectively singletons because they are started by the actor system on startup.

Here I’ve arrived at the same conclusion, thanks. No need to form a pool. What I’m trying first is to have just one gateway cluster node that runs a singleton top-level actor to manage the external inbound data connections. I know it’s a choke point but data subscription costs and design complexity make this the first choice to try out.

Once again, thanks for your help!