Alpakka S3 GetObject metadata as materialized value

DISCLAIMER: I’m fairly new to Akka Streams, so this is my best understanding of the behaviour.

Alpakka S3 version 4.0.0 deprecated S3.download in favour of S3.getObject. The relevant parts of the signatures:

@deprecated("Use S3.getObject instead", "4.0.0")
def download(...): Source[Option[(Source[ByteString, NotUsed], ObjectMetadata)], NotUsed]
// vs 
def getObject(...): Source[ByteString, Future[ObjectMetadata]]

The trouble I’m finding is that, in order to access the materialized metadata, you first need to consume at least one chunk of the ByteString stream. The tests show what I mean:

val s3Source: Source[ByteString, Future[ObjectMetadata]] = S3.getObject(bucket, bucketKey)
val (metadataFuture, dataFuture) = s3Source.toMat(Sink.head)(Keep.both).run()
...

HttpResponse(
  entity = HttpEntity(
    metadata.contentType
      .flatMap(ContentType.parse(_).toOption)
      .getOrElse(ContentTypes.`application/octet-stream`),
    metadata.contentLength,
    s3Source
  )
)

But in order to consume the entire stream (as opposed to just Sink.head), you’d need to run the source again, which is what the HttpEntity above will do, correct? That would be two network requests, right? In that case you’d be better off calling getObjectMetadata yourself instead.
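
To make the concern concrete, here is a minimal sketch that does not touch S3 at all. fakeGetObject is a hypothetical stand-in with the same shape as S3.getObject (a Source[ByteString, Future[...]]), and it assumes, as I do above, that each materialization issues its own request. The metadata is reduced to a plain String, and the counter only exists to make the number of simulated requests visible.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Keep, Sink, Source}
import akka.util.ByteString

import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object DoubleMaterializationDemo {
  // Counts how often the simulated "request" is issued.
  val requests = new AtomicInteger(0)

  // Hypothetical stand-in for S3.getObject (NOT the Alpakka implementation):
  // every materialization triggers one simulated request, and the
  // materialized Future[String] plays the role of Future[ObjectMetadata].
  def fakeGetObject(): Source[ByteString, Future[String]] =
    Source.lazySource { () =>
      requests.incrementAndGet()
      Source(List(ByteString("chunk-1"), ByteString("chunk-2")))
        .mapMaterializedValue(_ => "text/plain")
    }

  def run(): Int = {
    implicit val system: ActorSystem = ActorSystem("demo")
    try {
      val source = fakeGetObject()

      // First run: only here to get hold of the materialized "metadata".
      val (metadataFuture, headFuture) = source.toMat(Sink.head)(Keep.both).run()
      Await.result(metadataFuture, 3.seconds)
      Await.result(headFuture, 3.seconds)

      // Second run: what a consumer of the full body (e.g. HttpEntity) would do.
      Await.result(source.runFold(ByteString.empty)(_ ++ _), 3.seconds)

      requests.get()
    } finally Await.result(system.terminate(), 3.seconds)
  }
}
```

Under these assumptions DoubleMaterializationDemo.run() returns 2: one simulated request for the Sink.head run and one for the run that reads the whole body.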

The deprecated download method doesn’t suffer from this, since you only consume the outer Source, which gives you the metadata along with another Source for the body. This is clear from the previous version of the tests:

val s3File: Source[Option[(Source[ByteString, NotUsed], ObjectMetadata)], NotUsed] =
      S3.download(bucket, bucketKey)

val Some((data: Source[ByteString, _], metadata)) =
  s3File.runWith(Sink.head).futureValue: @nowarn("msg=match may not be exhaustive")

(Granted, the rest of the old test consumed the body anyway before passing it to HttpEntity, but you could instead simply pass the data: Source[ByteString, _] as-is to HttpEntity.)
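
For comparison, the nested-source shape can be sketched the same way. fakeDownload is again a hypothetical stand-in (it only mirrors the type of S3.download, with a String in place of ObjectMetadata), so this illustrates the shape of the API rather than how Alpakka actually behaves.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import akka.util.ByteString

import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.Await
import scala.concurrent.duration._

object NestedSourceDemo {
  // Counts how often the simulated "request" is issued.
  val requests = new AtomicInteger(0)

  // Hypothetical stand-in for S3.download (NOT the Alpakka implementation):
  // one simulated request yields the metadata plus a Source for the body.
  def fakeDownload(): Source[Option[(Source[ByteString, NotUsed], String)], NotUsed] =
    Source.lazySingle { () =>
      requests.incrementAndGet()
      Option((Source(List(ByteString("chunk-1"), ByteString("chunk-2"))), "text/plain"))
    }

  // Returns (metadata, body as text, number of simulated requests).
  def run(): (String, String, Int) = {
    implicit val system: ActorSystem = ActorSystem("demo")
    try {
      Await.result(fakeDownload().runWith(Sink.head), 3.seconds) match {
        case Some((body, metadata)) =>
          // The inner Source could equally be handed to HttpEntity unconsumed.
          val bytes = Await.result(body.runFold(ByteString.empty)(_ ++ _), 3.seconds)
          (metadata, bytes.utf8String, requests.get())
        case None =>
          ("", "", requests.get())
      }
    } finally Await.result(system.terminate(), 3.seconds)
  }
}
```

Here the outer Source is materialized once, so the simulated request count stays at 1 even though both the metadata and the full body are read.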

In summary, while the download API might not be as simple or ergonomic to use, I’d argue that there was a very good reason for its design, and that it shouldn’t really be deprecated.

Am I missing something obvious with this assessment?

Some explanation for the change can be found in the issue (and the PR fixing it): "Memory leak when using S3.download to ONLY check if object in S3 exists" (Issue #2866, akka/alpakka on GitHub).

Thanks for that context.

Clearly the same concerns as mine were raised, but they were just brushed aside.