NoHostAvailableException

(franz) #1

Hi,

Does anybody know what causes this and how to deal with it?

02:18:23.418 [error] com.lightbend.lagom.internal.javadsl.persistence.PersistentEntityActor [sourceThread=my-microservice-impl-application-akka.actor.default-dispatcher-16, akkaTimestamp=18:18:23.413UTC, akkaSource=akka.tcp://my-microservice-impl-application@127.0.0.1:50720/system/sharding/MyMicroserviceEntity/42/DUMMY, sourceActorSystem=my-microservice-service-impl-application] - Persistence failure when replaying events for persistenceId [MyMicroserviceEntityDUMMY]. Last known sequence number [0]
java.util.concurrent.ExecutionException: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /127.0.0.1:9042 (com.datastax.driver.core.exceptions.UnavailableException: Not enough replicas available for query at consistency QUORUM (2 required but only 1 alive)))
        at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:503)
        at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:462)
        at akka.persistence.cassandra.package$$anon$1.$anonfun$run$1(package.scala:18)
        at scala.util.Try$.apply(Try.scala:209)
        at akka.persistence.cassandra.package$$anon$1.run(package.scala:18)
        at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /127.0.0.1:9042 (com.datastax.driver.core.exceptions.UnavailableException: Not enough replicas available for query at consistency QUORUM (2 required but only 1 alive)))
        at com.datastax.driver.core.RequestHandler.reportNoMoreHosts(RequestHandler.java:213)
        at com.datastax.driver.core.RequestHandler.access$1000(RequestHandler.java:49)
        at com.datastax.driver.core.RequestHandler$SpeculativeExecution.findNextHostAndQuery(RequestHandler.java:277)
        at com.datastax.driver.core.RequestHandler$SpeculativeExecution.retry(RequestHandler.java:441)
        at com.datastax.driver.core.RequestHandler$SpeculativeExecution.processRetryDecision(RequestHandler.java:419)
        at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:635)
        at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1075)
        at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:998)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:647)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:582)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:499)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:461)
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        ... 1 common frames omitted

Thanks,
Franz

(Alan Klikic) #2

Hi @franz,

At what scale are you running Cassandra?
From the log I would say 3 (the journal query consistency is set to QUORUM, which in your case needs 2 replicas).
Check this config:
reference.conf

Br,
Alan

(Alan Klikic) #3

The log is related to replaying events to build the entity state, so you should check the write-side configuration.

(franz) #4

Thanks @aklikic. I’ve figured it out. One of my teammates configured a Cassandra replication factor of 2, but I only had one node running locally.

To fix it, I’ve moved the replication factor of 2 to our application.prod.conf (so that the dev application.conf uses just 1).
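
For anyone else hitting the same "2 required but only 1 alive" message, here is a minimal sketch of the quorum arithmetic behind it. QUORUM in Cassandra is a majority of the replicas, i.e. floor(RF / 2) + 1. The class and method names below are made up for illustration and are not part of Lagom or the DataStax driver.

```java
public class QuorumCheck {

    // A quorum is a majority of the replicas: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // A QUORUM query can only succeed if at least a quorum of replicas is alive.
    static boolean canServeQuorum(int replicationFactor, int aliveReplicas) {
        return aliveReplicas >= quorum(replicationFactor);
    }

    public static void main(String[] args) {
        // Compare replication factors 1..3 against a single local node.
        for (int rf = 1; rf <= 3; rf++) {
            System.out.println("RF=" + rf
                    + " quorum=" + quorum(rf)
                    + " ok with 1 node: " + canServeQuorum(rf, 1));
        }
    }
}
```

With a single local node, only a replication factor of 1 passes. Replication factors of 2 and 3 both need 2 live replicas, which is why the error looks the same in either case.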

Thanks!

(Darshak Sheth) #5

@aklikic @franz: I have a question regarding journal writes. We are testing the behaviour of write-retries and read-retries in the Cassandra journal, and the question is whether they work as intended in the case of a NoHostAvailableException. We are following the test steps below and are seeing journal writes fail (since the persistent actor dies).

Scenario:
1) Start the application (this causes the keyspaces to be created)
2) Send an event (which starts the persistent actor)
3) The event is processed and the data appears in the persistence table
4) Kill Cassandra
5) Send an event
6) Wait for a minute and start Cassandra (the write failed: there is no data in the persistence table for the new event, and we see a NoHostAvailableException)

Note: There is only one Cassandra node.

My understanding of write-retries is that the journal will attempt the write operation against Cassandra as many times as the value we set for write-retries (e.g. write-retries = 12345).
Please correct me if my understanding is wrong, and share your thoughts on this.
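
To make that mental model concrete, here is a minimal sketch of a bounded retry loop in plain Java. It is only an illustration of "try up to N more times, then give up". It is not the Cassandra journal plugin's actual retry logic, and the names `withRetries` and `flakyStore` are hypothetical. The point is that a finite number of immediate retries can be used up long before a stopped database comes back.

```java
import java.util.concurrent.Callable;

public class BoundedRetry {

    // Hypothetical helper: runs op once plus up to `retries` more times,
    // with no delay between attempts, and rethrows the last failure.
    static <T> T withRetries(int retries, Callable<T> op) throws Exception {
        if (retries < 0) {
            throw new IllegalArgumentException("retries must be >= 0");
        }
        Exception last = null;
        for (int attempt = 0; attempt <= retries; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    // Simulated store that is "down" for the first `failures` calls.
    static Callable<String> flakyStore(int failures) {
        int[] calls = {0};
        return () -> {
            if (calls[0]++ < failures) {
                throw new IllegalStateException("no host available (simulated)");
            }
            return "written";
        };
    }

    public static void main(String[] args) throws Exception {
        // Outage shorter than the retry budget: the write eventually succeeds.
        System.out.println(withRetries(3, flakyStore(2)));

        // Outage longer than the retry budget: all attempts fail.
        try {
            withRetries(3, flakyStore(10));
        } catch (IllegalStateException e) {
            System.out.println("gave up: " + e.getMessage());
        }
    }
}
```

If the real retries behave anything like this, a large write-retries value alone would not bridge a one-minute outage unless the attempts are spread over that time.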

[ Consider me a newbie in this area :) ]

(Tim Moore) #6

@dsheth28 when you restart the Cassandra node, is it running on the same or a different IP address and port? If the IP address changes, then it sounds like the same issue being discussed in this other topic:

and this issue:

The problem is that, when the Cassandra cluster completely shuts down and then comes back up on unknown IP addresses, the Cassandra driver is unable to reconnect without restarting.

(Darshak Sheth) #7

@TimMoore After restarting, Cassandra is running on the same IP address and port.