Help with Akka Actors reading and parsing XML files. Design

Hi,
I have a directory with XML files.
Each file needs to be parsed into an object that is then turned into an Actor with the state of the content from the file.
The Actor can then receive update change messages and when sent a “Save” command will write the XML file back to the disk.
One of the main reasons for using Actors is the requirement that only one person at a time can make changes to a file. An Actor will track who is making the changes and, if no changes arrive for about half an hour, can mark the document as open for edit again.
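
A minimal sketch of the lock-tracking state described above, in plain Java with no Akka dependency. `EditLock`, its method names, and the 30-minute value are hypothetical and chosen only for illustration; in an actor this object would be part of the actor's private state and `now` would come from the clock when a message is handled.

```java
import java.time.Duration;
import java.time.Instant;

/** Tracks which editor currently holds a document (hypothetical helper, not an Akka class). */
final class EditLock {
    private final Duration idleTimeout;
    private String holder;        // null means the document is open for edit
    private Instant lastActivity; // time of the holder's most recent change

    EditLock(Duration idleTimeout) {
        this.idleTimeout = idleTimeout;
    }

    /** Returns true if the given user holds the lock after this call. */
    boolean tryAcquire(String user, Instant now) {
        expireIfIdle(now);
        if (holder == null || holder.equals(user)) {
            holder = user;
            lastActivity = now; // every accepted change also refreshes the idle timer
            return true;
        }
        return false; // someone else is editing: the caller would serve a read-only view
    }

    /** Returns the current holder, or null if the document is open for edit. */
    String holder(Instant now) {
        expireIfIdle(now);
        return holder;
    }

    /** Frees the lock once the holder has been idle for at least the timeout. */
    private void expireIfIdle(Instant now) {
        if (holder != null && Duration.between(lastActivity, now).compareTo(idleTimeout) >= 0) {
            holder = null;
        }
    }

    public static void main(String[] args) {
        EditLock lock = new EditLock(Duration.ofMinutes(30));
        Instant t0 = Instant.parse("2020-01-01T10:00:00Z");
        System.out.println(lock.tryAcquire("alice", t0));                    // true
        System.out.println(lock.tryAcquire("bob", t0.plusSeconds(60)));      // false
        System.out.println(lock.tryAcquire("bob", t0.plusSeconds(31 * 60))); // true
    }
}
```

Because an actor handles one message at a time, state like this needs no synchronization when it lives inside the actor.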

I am saving the commands and the final file.
I will have a FilesSupervisor Actor that will have a command for loading the content of a file and turning it into an Actor for that specific file.

Obviously, working with files is blocking! Where do I do the file handling and parsing?

Option One:
Have a worker Actor whose job it is to read the file and parse it into an object. This object is then sent as a message to the XMLFile Actor.
The same goes for a worker actor that writes the final object to the disk.
I can create a special dispatcher for these actors (since they do blocking things) and put them behind a router to have more of them if needed.

Option Two:
The XMLFileActor is created with a name and id. It finds the file on the disk, then reads and parses it into an object that becomes the state of the actor.
This can be done with the Alpakka library (I think) and is non-blocking, since Alpakka manages the blocking operations - is this correct?
Is option two a good idea and if so, is there an example of using Alpakka in this scenario in Java?

Please advise. What is the recommended way to handle Actors that need to have their initial state loaded from a file containing XML that needs to be parsed, then changed, and then saved?

Thanks,
Tamara.

Hi,

Obviously, working with files is blocking! Where do I do the file handling and parsing?

Actually, if you can use Akka Streams to read and write files, the blocking is either avoided or at least hidden, so you do not have to care about it.

If you are not using streams but the blocking JDK file operations, you are correct that you should isolate those actors on a separate dispatcher so as not to starve the other actors in the system.
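
To illustrate the isolation idea without depending on Akka, here is a minimal sketch in which a plain JDK thread pool stands in for the dedicated dispatcher. It is an analogy, not Akka API: `BlockingXmlLoader` and the pool size are hypothetical. In Akka itself, the equivalent would be a dispatcher defined in configuration and assigned to the blocking actors, for example through `Props.withDispatcher`.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Runs blocking XML reads on their own small pool (hypothetical helper, not an Akka class). */
final class BlockingXmlLoader {
    // Dedicated threads: a slow disk can only stall this pool, not the rest of the application.
    private final ExecutorService blockingPool;

    BlockingXmlLoader(int threads) {
        this.blockingPool = Executors.newFixedThreadPool(threads);
    }

    /** Reads and parses the file on the dedicated pool; the caller gets a future immediately. */
    CompletableFuture<Document> load(Path file) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                // Blocking call: file read plus DOM parsing.
                return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(file.toFile());
            } catch (Exception e) {
                throw new CompletionException(e);
            }
        }, blockingPool);
    }

    void shutdown() {
        blockingPool.shutdown();
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("doc", ".xml");
        Files.write(tmp, "<article><title>Hello</title></article>".getBytes(StandardCharsets.UTF_8));
        BlockingXmlLoader loader = new BlockingXmlLoader(2);
        Document doc = loader.load(tmp).get();
        System.out.println(doc.getDocumentElement().getTagName()); // article
        loader.shutdown();
        Files.delete(tmp);
    }
}
```

An actor would typically have the completed result delivered to itself as a message rather than waiting on the future inside its message handler.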

What is the recommended way to handle Actors that need to have their initial state loaded from a file containing XML that needs to be parsed, then changed, and then saved?

I think both options you mentioned sound quite reasonable, if you are sure actors are the right abstraction. If I were you, I'd sit down and list the pros and cons of each (what can fail and how, in what way each makes things faster or slower, where the bottlenecks are, etc.).

But if the task is something like a batch or stream process (for N files: parse, transform and write each, as fast as the files on the write side can be written, possibly doing a few in parallel), then using only Akka Streams could make more sense, leading to an app that is clearer and easier to reason about than one that models the same thing with actors.

Thank you Johan!
I do need the Actor abstraction since the content of the files is updated by backend users. I need to support the fact that once a file is “opened for edit” another editor can’t make changes but only receives a “read only” view. In addition, by using Actors and saving the change commands, I can provide fairly decent “undo” functionality.
I love coding with Actors, but I understand that ETL scenarios are best implemented with simple streams.
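
One way the "undo" idea could look, as a minimal plain-Java sketch with no Akka dependency. `UndoableDocument` and the field-based document model are assumptions made for illustration; the real parsed XML object will look different. Each applied change is recorded together with the value it replaced, so the most recent change can be reversed.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Document state plus a history of applied changes (hypothetical model, not an Akka class). */
final class UndoableDocument {
    /** One applied change, with the previous value needed to reverse it. */
    private static final class Change {
        final String field;
        final String previous; // null if the field did not exist before the change

        Change(String field, String previous) {
            this.field = field;
            this.previous = previous;
        }
    }

    private final Map<String, String> fields = new HashMap<>();
    private final Deque<Change> history = new ArrayDeque<>();

    /** Applies a change and remembers how to undo it. Values are assumed to be non-null. */
    void set(String field, String value) {
        Objects.requireNonNull(value, "value");
        history.push(new Change(field, fields.get(field)));
        fields.put(field, value);
    }

    /** Reverses the most recent change; returns false if there is nothing to undo. */
    boolean undo() {
        if (history.isEmpty()) {
            return false;
        }
        Change last = history.pop();
        if (last.previous == null) {
            fields.remove(last.field);
        } else {
            fields.put(last.field, last.previous);
        }
        return true;
    }

    String get(String field) {
        return fields.get(field);
    }

    public static void main(String[] args) {
        UndoableDocument doc = new UndoableDocument();
        doc.set("title", "Draft");
        doc.set("title", "Final");
        doc.undo();
        System.out.println(doc.get("title")); // Draft
    }
}
```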

If I can load the content of the file with a stream inside the actor, using Akka Streams or Alpakka, I think this would give me a nice solution.
I might offload writing the file back to the disk (S3) using a separate actor.

Thanks again for your reassurances and assistance.
You guys truly rock!

Tamara.