Read-only Persistent Actor

I’m thinking about the following: say I have a live application that collects data and represents the processed data as a persistent actor. The persistent actor’s events are written to a distributed Cassandra database.

Now I have a second application, which does not collect data, but where I want to query/read the state of the previously named persistent actor. So rather than implementing CQRS by writing the processed actor’s state to a database, I want to have a read-only representation of the same actor in the JVM of the second application. I suppose it would be good for resilience if the two applications did not have to join the same cluster, although they could still share code.

Any thoughts on this, and ideas on how this could best be implemented in practical terms?

Sharing the same database between two applications creates tight coupling around the serialized form of the events. I think that is a quite important caveat to consider: the database is now a form of public API and will require the same care with regard to compatible versioning of the format as, say, an HTTP endpoint. For this reason the general recommendation would be to keep the database as an internal detail of each application and share state some other way. Publishing each event to Kafka or something like it, as a projection rather than a stored state, could be one option that provides looser coupling and a clearer border between “data on the outside” and “data on the inside”.
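As a rough illustration of that option, here is a minimal sketch of streaming a persistent actor’s events out to Kafka, assuming Alpakka Kafka and Akka Persistence Query with the Cassandra plugin. The topic name, persistence id, and the `toWireFormat` mapping are hypothetical, and offset tracking is left out, so a real projection would need more care:

```scala
import akka.actor.ActorSystem
import akka.kafka.ProducerSettings
import akka.kafka.scaladsl.Producer
import akka.persistence.cassandra.query.scaladsl.CassandraReadJournal
import akka.persistence.query.PersistenceQuery
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.common.serialization.StringSerializer

object EventsToKafka {
  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("write-side")

    val readJournal = PersistenceQuery(system)
      .readJournalFor[CassandraReadJournal](CassandraReadJournal.Identifier)

    val producerSettings =
      ProducerSettings(system, new StringSerializer, new StringSerializer)
        .withBootstrapServers("localhost:9092")

    // Hypothetical mapping from the internal event type to a stable,
    // versioned wire format: this mapping is the actual public API surface.
    def toWireFormat(event: Any): String = event.toString

    // Live stream of this actor's events; note there is no offset tracking
    // here, so a restart would republish from the first event.
    readJournal
      .eventsByPersistenceId("my-entity-id", 0L, Long.MaxValue)
      .map { envelope =>
        new ProducerRecord[String, String](
          "my-entity-events", envelope.persistenceId, toWireFormat(envelope.event))
      }
      .runWith(Producer.plainSink(producerSettings))
  }
}
```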

On the concrete technical side, if you still want to do it, you could have a flag on your EventSourcedBehavior to make sure it never returns an effect that persists events. Another alternative, perhaps clearer/cleaner, would be to use the eventsByPersistenceId query and share only the state-building logic, feeding it deserialized events for the read-only variant, as sketched below.
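A minimal sketch of that second alternative, again assuming Akka Persistence Query with the Cassandra plugin; the `DataCollected` event, the `State` model, and the persistence id are made up for illustration:

```scala
import akka.actor.ActorSystem
import akka.persistence.cassandra.query.scaladsl.CassandraReadJournal
import akka.persistence.query.PersistenceQuery

// Shared between the two applications (shared code, not a shared cluster).
sealed trait Event
final case class DataCollected(value: Int) extends Event

final case class State(total: Int) {
  // The single place where state is built from events: the write side's
  // EventSourcedBehavior uses this as its event handler, and the stream
  // below reuses it.
  def applyEvent(event: Event): State = event match {
    case DataCollected(value) => copy(total = total + value)
  }
}

object ReadOnlyReplica {
  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("read-side")

    val readJournal = PersistenceQuery(system)
      .readJournalFor[CassandraReadJournal](CassandraReadJournal.Identifier)

    // Live, deserialized stream of the actor's events, folded into state.
    readJournal
      .eventsByPersistenceId("my-entity-id", 0L, Long.MaxValue)
      .map(_.event.asInstanceOf[Event])
      .scan(State(total = 0))((state, event) => state.applyEvent(event))
      .runForeach(state => println(s"Read-only state is now: $state"))
  }
}
```

Note that this still reads the first application’s journal directly, so the serialization caveat above applies in full. It does, however, keep receiving new events as they are written, whereas a read-only EventSourcedBehavior would only see events replayed at recovery.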


Thanks for the insightful response. Not turning the database into an API makes a lot of sense. I thought about this question for a while and ended up settling on a gRPC-based microservice architecture rather than a distributed monolith, exactly for the reasons of coupling that you outlined.

So instead of having read-only instances of this actor, other microservices can read from the singleton actor through a gRPC service. I can still use cluster capabilities within the service. If scalability were an issue, there could be one write node and multiple read-only nodes, using the methods that you described. Other services could retain (cache) the immutable data in memory as appropriate. But realistically, that will never be required for this particular use case. Interesting considerations nonetheless.
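For the read path inside such a service, the gRPC handler can simply ask the cluster singleton. A rough sketch assuming Akka Cluster Typed and a hypothetical command protocol; the gRPC plumbing itself would be generated from the protobuf definition and is omitted here:

```scala
import akka.actor.typed.scaladsl.AskPattern._
import akka.actor.typed.{ActorRef, ActorSystem, Behavior, Scheduler}
import akka.cluster.typed.{ClusterSingleton, SingletonActor}
import akka.util.Timeout

import scala.concurrent.Future
import scala.concurrent.duration._

// Hypothetical command protocol of the persistent actor.
sealed trait Command
final case class GetTotal(replyTo: ActorRef[Int]) extends Command

// entityBehavior would be the shared EventSourcedBehavior.
class DataService(system: ActorSystem[_], entityBehavior: Behavior[Command]) {
  private implicit val timeout: Timeout = 3.seconds
  private implicit val scheduler: Scheduler = system.scheduler

  // init is idempotent: every node gets a proxy to the one live instance.
  private val singleton: ActorRef[Command] =
    ClusterSingleton(system).init(SingletonActor(entityBehavior, "data-collector"))

  // Called from the generated gRPC service implementation for read requests.
  def currentTotal(): Future[Int] =
    singleton.ask(GetTotal.apply)
}
```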

Kafka also sounds interesting, but as a single developer I’d first have to study it.

That sounds like a reasonable design. Kafka, or some other messaging bus, would allow you to detach the two services so that both do not need to be running at the same time for the “projection” to be available, but on the other hand it adds operational complexity to the system.


That’s also interesting. So a resilient messaging bus decouples not only the implementations but also their availability. Thanks for sharing this concept.

My first attempt at this project was crushed under the weight of doing too much too early (you can count operational complexity toward that), so I’ll be more careful to take the simple route this time.