Hello, we use CassandraSession#selectAll to query our Cassandra cluster for some additional data separate from akka-persistence. Because of some application requirements, we block on the result before continuing. We had a case in production where the CompletionStage (we’re using the Java DSL) never completed and caused the entire thread to block indefinitely due to the lack of a timeout.
    "application-cassandra-plugin-default-dispatcher-7889" #29169 prio=5 os_prio=0 tid=0x00007fae38042000 nid=0x7abe waiting on condition [0x00007fadd8e0f000]
       java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x0000000732b05e58> (a java.util.concurrent.CompletableFuture$Signaller)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
        at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
        at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
        at com.twilio.application.ItemDAOCassandraImpl.get(ItemDAOCassandraImpl.java:66)
        ... snip ...
Where the code is just:
    final Select select = QueryBuilder.select()
        .from(KEYSPACE, TABLE)
        .where(QueryBuilder.eq(ACCOUNT_ID, accountId.getValue()))
        .limit(1);

    return cassandraSession
        .selectAll(select)
        .thenApply(rows -> convertToProtocol(rows))
        .toCompletableFuture()
        .get();
.thenApply() is simply object conversion using
There’s no evidence of any problem with our Cassandra cluster (network, timeouts, errors, etc.), and looking at the SelectSource I see nothing obvious in akka-persistence-cassandra that could cause this, though I’m not the most well-versed in Akka Streams.
While we already have plenty of fixes to prevent this from happening again (as well as moving to CassandraSession#selectOne, as that fits our use case better), I’m hoping to understand why selectAll might not have returned, so that we can reproduce it and ensure all of our improvements account for this failure mode.
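To illustrate the missing timeout mentioned above, here is a minimal sketch of bounding the blocking wait. It is not our production code: BoundedWait and await are hypothetical names, and a plain CompletableFuture that is never completed stands in for the stage returned by selectAll. It only shows that CompletableFuture#get(long, TimeUnit) releases the calling thread when the stage never completes. It does not explain why the stage stalled, and it does not cancel the underlying query.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BoundedWait {

    // Hypothetical helper (not part of akka-persistence-cassandra):
    // blocks for at most `timeout` and returns Optional.empty() if the
    // stage has not completed by then, instead of parking forever.
    static <T> Optional<T> await(CompletionStage<T> stage, long timeout, TimeUnit unit) {
        try {
            return Optional.ofNullable(stage.toCompletableFuture().get(timeout, unit));
        } catch (TimeoutException e) {
            // The stage is still pending; the caller decides whether to retry or fail.
            return Optional.empty();
        } catch (ExecutionException e) {
            // The stage failed; surface its cause as an unchecked exception.
            throw new CompletionException(e.getCause());
        } catch (InterruptedException e) {
            // Preserve the interrupt flag for code further up the stack.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a stage that never completes, as in the incident above.
        CompletableFuture<String> never = new CompletableFuture<>();
        System.out.println(await(never, 100, TimeUnit.MILLISECONDS).isPresent()); // prints false

        // A stage that has already completed returns its value immediately.
        CompletionStage<String> done = CompletableFuture.completedFuture("row");
        System.out.println(await(done, 100, TimeUnit.MILLISECONDS).get()); // prints row
    }
}
```

In the DAO above, the equivalent change would be to replace the final .get() with a get that takes a timeout and to handle the TimeoutException explicitly.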
- akka 2.5.13
- akka-persistence-cassandra 0.85
- Cassandra 3.0.9
I can give more details if needed; I’m not sure what would be relevant.