I have a task to write items coming from an Akka stream into a database. The database (DynamoDB from AWS) supports batching, and I have already implemented this using batch (batch • Akka Documentation).
However, from the documentation I see that it will only group stream elements into batches when the downstream backpressures. I'm not sure that the client I'm using to access the database will actually backpressure, and I also don't want the batching logic to depend on it. I considered using buffer (Buffers and working with rate • Akka Documentation) instead, but it seems that it would only emit downstream once the buffer is full? That is also not really what is needed, because I don't want the buffer to wait indefinitely for all 25 items (the AWS limit on batch size).
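To make my concern concrete, here is a plain-Scala sketch (no Akka, names are mine) of how I understand batch's semantics: elements aggregate only while the downstream is busy, so if the downstream is always ready, every "batch" degenerates to a single element.

```scala
// Hypothetical simulation of my understanding of batch, not the Akka internals:
// with backpressure, elements aggregate into one batch (up to the max);
// without backpressure, each element is emitted as a 1-element batch.
def simulateBatch(elements: Seq[Int], downstreamBusy: Boolean): Seq[Vector[Int]] =
  if (downstreamBusy) Seq(elements.toVector) // downstream busy: elements pile up into one batch
  else elements.map(Vector(_))               // downstream always ready: batches of size 1

// No backpressure: three 1-element batches, which is what I want to avoid.
assert(simulateBatch(Seq(1, 2, 3), downstreamBusy = false) == Seq(Vector(1), Vector(2), Vector(3)))
// Backpressure: one batch of three.
assert(simulateBatch(Seq(1, 2, 3), downstreamBusy = true) == Seq(Vector(1, 2, 3)))
```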
What I need is something like the Kafka client's behavior: batching with a size limit (exactly like the batch streaming method), but also with a "linger" timeout, so that the stream doesn't produce 1-element batches every time it receives an element while there is no backpressure from the DB, but instead waits some minimal time for more elements to join the batch.
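To pin down the semantics I'm after, here is a plain-Scala sketch (all names like maxSize and lingerMs are mine, not from any API): a batch is emitted either when it reaches the size limit or when the linger time has elapsed since its first element arrived.

```scala
// Sketch of the desired "size limit + linger timeout" batching semantics.
// events: (arrivalTimeMs, element) pairs in arrival order.
def batchWithLinger(
    events: Seq[(Long, Int)],
    maxSize: Int,
    lingerMs: Long
): Seq[Vector[Int]] = {
  val batches = Seq.newBuilder[Vector[Int]]
  var current = Vector.empty[Int]
  var firstArrival = 0L
  for ((t, e) <- events) {
    // Flush if the open batch has lingered long enough.
    if (current.nonEmpty && t - firstArrival >= lingerMs) {
      batches += current
      current = Vector.empty
    }
    if (current.isEmpty) firstArrival = t
    current :+= e
    // Flush when the size limit (e.g. 25 for DynamoDB) is reached.
    if (current.size == maxSize) {
      batches += current
      current = Vector.empty
    }
  }
  if (current.nonEmpty) batches += current
  batches.result()
}

// Size limit triggers first for a burst, linger flushes the straggler.
assert(
  batchWithLinger(Seq((0L, 1), (1L, 2), (2L, 3), (100L, 4)), maxSize = 3, lingerMs = 50)
    == Seq(Vector(1, 2, 3), Vector(4))
)
```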
Is this possible to do with either of those streaming methods? If not, is there an alternative?
Additionally, I wanted to ask for clarification on what exactly this line from the batch docs means:
Will eagerly pull elements, this behavior may result in a single pending (i.e. buffered) element which cannot be aggregated to the batched value.
Why can't a single buffered element be aggregated? It seems easy to me: just call the seed method and there you go, a batch with a single element. What exactly does "pending" mean? That the element will be wrapped with seed and sent downstream? Or that it will wait in the buffer indefinitely until a second element arrives? I think this document needs a more detailed explanation…
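To show what I mean, here is my mental model of seed and the aggregate function as plain Scala values (hypothetical shapes, not the Akka internals), where a lone element can trivially become a batch of one:

```scala
// My mental model of batch's two callbacks, for an Int element type:
// seed turns the first element into a batch, aggregate appends to it.
val seed: Int => Vector[Int] = item => Vector(item)
val aggregate: (Vector[Int], Int) => Vector[Int] = (batch, item) => batch :+ item

// A single pending element could just be seeded into a 1-element batch...
assert(seed(7) == Vector(7))
// ...and later elements folded in as usual.
assert(aggregate(seed(7), 8) == Vector(7, 8))
```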