How to use supervision/failure handling properly with asks?

Alright, because this problem is rather complex, let me start out with a design sketch:

This sketch is a heavily simplified version of the real thing, but I think it shows everything you need to know to understand my issue.

Let me start by saying that, in retrospect, perhaps we shouldn’t have chosen this design at all. Sadly, at the moment I only have time for bugfixes and can’t do an entire redesign; I’ll have time for that later. Right now I’m looking for a quick fix that fits into what I have here.

Okay, so the main challenge here was that when a request is sent into the Akka HTTP route, we need to gather several pieces of information from a rather bad database implementation that doesn’t support transactions of any kind and returns a Scala Future on every request. We started out with a Flow-based design, but all the mapping of Futures and checking whether the result was still valid quickly got out of hand, so we opted for Actors instead.

Now, what happens if everything goes well is what you see above. The HTTP route’s request enters a Flow that sets up an EntityProcessorActor and asks it for a result. The EntityProcessorActor does a bunch of db requests (one is shown), pipes each result to self, and whenever it gets a message, adds it to its own state. One db query returns a bunch of ids for subentities. Those need their own processing before being added to the EntityProcessorActor’s state, which is done in child actors (SubEntityProcessorActors) that work very similarly to the main actor.

Whenever an actor processes a message, it checks if its state is complete, and if so, sends a message to the original sender of its first request, and then stops itself. In case of a SubEntityProcessorActor sending a message to its parent, the EntityProcessorActor handles it like any other and adds it to the state. In case of the EntityProcessorActor completing, it sends the answer to the original ask, the Flow does some more steps that aren’t important for my problem, and the route responds with a 200 OK.

If the actor never completes, the ask times out and we can handle that.
However, something weird seems to happen when one of the child actors throws an Exception. Ideally we want to forward that Exception all the way back to the Flow so we can return some error HTTP response code.

We tried to set up something like that with an Escalate SupervisorStrategy, but it doesn’t seem to work. The ask just times out and that’s it. Worse than that, the EntityProcessorActor either restarts or just never shuts down. So if the API user sends another request to process “abcd”, it fails immediately because the Flow is unable to create a new actor with the name “abcd”, as the old one is still around somewhere.

My question is: how can we make the actor shut down instead of restarting when it or one of its children throws an exception, and is there any way to return that exception to the Flow through the ask?

One way could be to use a stopping supervision strategy in combination with watching each child actor after spawning it in the EntityProcessorActor. This doesn’t let you send the specific exception back to the route, but that may be a good thing: it is likely better to log it and just tell the requester that “something” went wrong than to send a stack trace back three-tier-style. After all, what can the user do about an internal error except possibly retry?
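Here is a rough sketch of what that could look like with classic Akka actors. The message types (`Process`, `SubEntityIds`, `SubResult`, `EntityResult`) and the constructor parameters `lookupIds` and `subProps` are made up for illustration, since I don't know the real protocol. It also assumes that subentity ids are unique and valid actor names, and that a child sends its result before it stops, so the result reaches the parent before any `Terminated` message. The idea is that the parent answers the ask with a `Status.Failure`, which fails the asker's Future instead of letting it time out, and then stops itself so the name becomes free again.

```scala
import akka.actor.{Actor, ActorRef, Props, Status, SupervisorStrategy, Terminated}
import akka.pattern.pipe
import scala.concurrent.Future

// Hypothetical protocol; these names are illustrative, not from the original code.
final case class Process(id: String)
final case class SubEntityIds(ids: Seq[String])
final case class SubResult(id: String, value: String)
final case class EntityResult(id: String, subResults: Map[String, String])

class EntityProcessorActor(
    lookupIds: String => Future[Seq[String]], // stands in for the db query (assumption)
    subProps: String => Props                 // builds a SubEntityProcessorActor (assumption)
) extends Actor {
  import context.dispatcher // ExecutionContext for map/pipeTo

  // Stop a failing child instead of restarting it or escalating.
  override val supervisorStrategy: SupervisorStrategy =
    SupervisorStrategy.stoppingStrategy

  private var entityId = ""
  private var replyTo: ActorRef = context.system.deadLetters
  private var pending = Set.empty[ActorRef]
  private var results = Map.empty[String, String]

  def receive: Receive = {
    case Process(id) =>
      entityId = id
      replyTo = sender() // the temporary actor behind the ask
      lookupIds(id).map(SubEntityIds(_)).pipeTo(self)

    case SubEntityIds(ids) =>
      pending = ids.map { subId =>
        val child = context.watch(context.actorOf(subProps(subId), subId))
        child ! Process(subId)
        child
      }.toSet
      completeIfDone()

    case SubResult(subId, value) =>
      context.unwatch(sender()) // a normal stop of this child is not a failure
      pending -= sender()
      results += subId -> value
      completeIfDone()

    case Terminated(child) =>
      // A still-watched child stopped before delivering its result.
      fail(new IllegalStateException(s"sub-entity ${child.path.name} failed"))

    case Status.Failure(cause) =>
      // A piped Future failed, e.g. the db query itself.
      fail(cause)
  }

  private def completeIfDone(): Unit =
    if (pending.isEmpty) {
      replyTo ! EntityResult(entityId, results)
      context.stop(self)
    }

  private def fail(cause: Throwable): Unit = {
    replyTo ! Status.Failure(cause) // fails the Future returned by ask
    context.stop(self)
  }
}
```

This only covers failures in children and in piped Futures. If the EntityProcessorActor itself throws, its own supervisor decides what happens. If it is created with `system.actorOf`, that supervisor is the user guardian, which restarts by default as far as I know. In that case you could either create the actor under a parent that also uses a stopping strategy, or set `akka.actor.guardian-supervisor-strategy = "akka.actor.StoppingSupervisorStrategy"` in the configuration.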