FlowId in logs For debugging

Hi,
At work we are using lagom for last 3 months. We would like
to add flowId feature in logger i.e. (Logs will contain flowId which
will help us track from which flow the unit was invoked). This will come
in handy for debugging, like if we want to track what happened to a request and it was interacting with multiple services during its flow, using the flowId we can track the same request in other services as well. An example would be If a request in
service A with flowId “flowIdA” invokes service B then the flowId for
that request in service B will be prefixed by flowIdA (flowIdA:flowIdB)
and so on.

    **A sample snippet**
 public class TestOneServiceImpl implements TestOneService{
        
                @Override
            public ServiceCall<TestOneInput, Done> testCallToSecond() {
                return incomingStr -> {
                    logger.info("incoming String in testOne is "+ incomingStr.getInput());
                    return twoService.testCallToThird("Hello").invoke().thenApply(returnStr -> {
                        return returnStr;
                    });
                };
            }
                
        }
        
        public class TestTwoServiceImpl implements TestTwoService{
        
                @Override
            public ServiceCall<NotUsed, Done> testCallToThird(String incomingStr) {
                return notUsed -> {
                    logger.info("incoming String in testSecond is "+ incomingStr.getInput());
                    return thirdService.testMe("flowIdAppended").invoke().thenApply(returnStr -> {
                        return returnStr;
                    });
                };
            }
                
        }

In this case the logs will look something like

04:14:55.387 [info] [Flow-101102010] incoming String in testOne is test
04:14:55.387 [info] [Flow-101102010:121700433] incoming String in testSecond is hello
04:14:55.387 [info] [Flow-101102010:121700433:234777649] incoming String in testThird is flowIdAppended

This can be implemented by

  • Having a new logger module which will internally use slf4j but will add additional MDC attribute “FlowId” from ThreadLocal.
  • FlowId has to be passed across services and this can be done using headers
  • Since we write the code in asynchronous manner, The flowId needs to be passed to the futures as well. This is the part which i am not sure how to implement. On way could be we return an extended implementation of Promise which will wrap flowId in it. Then before executing the onComplete/exceptionally part we will set the flowId in threadLocal. But i guess this would require changes in Play Framework as it is creating the future object during service calls. I am not able to think of any alternatives to do the same.
  • We we want flowId to pass to ReadSides as well the the flowId has to be wrapped into events of PersistentEntity which looks like a dirty hack to me as we are adding data to events which previously contained only business related data. Same is the case for services which are listening to kafka events.

I would like someone to clear my doubts which are mentioned above :slightly_smiling_face:. There is also a possibility that we actually do not need a flowId feature in lagom, in that case i would like to know how users debug a complete flow of a request when we communicate across services.

I’ll start by mentioning the Lightbend Telemetry product (also known as “Cinnamon”) which is part of the commercial Lightbend subscription and includes support for this kind of distributed tracing and context propagation.

Implementing this yourself is of course possible, but would be a pretty big effort. The approach used by Cinnamon is based on bytecode instrumentation performed by a Java agent.

Cinnamon doesn’t fully cover everything you are talking about. For example, it does not propagate the request ID through read-side processors and Kafka topics.

Kafka now has support for message headers that you could use for this (since version 0.11) but so far, Lagom only supports this on the consumer side and does not yet support an API for producing messages with headers. This would be a nice addition to the framework, and if you’d like to contribute it, that would be very welcome. Start a new discussion topic if you need some help.

Akka Persistence doesn’t have any facility for adding key/value metadata to an event, so it would need to be part of the persisted event data. One option might be to use a custom serializer that adds it to the event data and restores it on the read side, but as you mention, propagating this across async boundaries could be tricky.

@TimMoore - Thanks. We were thinking instead of implementing the feature, if lagom could be integrated with other Monitoring and Troubleshooting platforms as well ?

What did you have in mind exactly?