Cassandra - Stream error during end-stage of decommission
I got the following exception during the end stage of a node's decommission:
Exception in thread "main" java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
    at org.apache.cassandra.service.StorageService.unbootstrap(StorageService.java:2946)
    at org.apache.cassandra.service.StorageService.decommission(StorageService.java:2903)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
    at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
    at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487)
    at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420)
    at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848)
    at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
    at sun.rmi.transport.Transport$1.run(Transport.java:177)
    at sun.rmi.transport.Transport$1.run(Transport.java:174)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
    at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
    at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
    at org.apache.cassandra.service.StorageService.unbootstrap(StorageService.java:2941)
    ... 36 more
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
    at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
    at com.google.common.util.concurrent.Futures$4.run(Futures.java:1160)
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
    at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
    at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
    at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:216)
    at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:191)
    at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:331)
    at org.apache.cassandra.streaming.StreamSession.convict(StreamSession.java:600)
    at org.apache.cassandra.gms.FailureDetector.interpret(FailureDetector.java:237)
    at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:643)
    at org.apache.cassandra.gms.Gossiper.access$700(Gossiper.java:64)
    at org.apache.cassandra.gms.Gossiper$GossipTask.run(Gossiper.java:170)
    at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:75)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    ... 3 more
This exception trace came from the nodetool decommission command itself. No exception was logged in the decommissioned node's system.log.
On the receiving node, the exception was as follows:
INFO [NonPeriodicTasks:1] 2014-06-02 04:40:53,101 SecondaryIndexManager.java (line 146) Index build of [myks.mycf] complete
ERROR [NonPeriodicTasks:1] 2014-06-02 04:40:53,240 CassandraDaemon.java (line 198) Exception in thread Thread[NonPeriodicTasks:1,5,main]
java.lang.RuntimeException: Outgoing stream handler has been closed
    at org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:170)
    at org.apache.cassandra.streaming.StreamSession.maybeCompleted(StreamSession.java:620)
    at org.apache.cassandra.streaming.StreamSession.taskCompleted(StreamSession.java:566)
    at org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:120)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
A brief background
I have 4 nodes in the Solr DC, including the node I'm decommissioning. 2 of the 3 remaining nodes completed their streams and index builds a few days ago, and the last one (the slowest) completed its index build a few hours ago. As shown in that node's log above, the exception happened right after the index build completed.
The question
Can I assume the node has been decommissioned despite the exception? As far as I can tell, all the files the last node was supposed to receive were transferred (nodetool netstats showed 100% for all files before the exception happened). I'm thinking the stream error pertains to a failure to close the stream session, because I noticed that even though the streams completed days ago, the session was kept open until the index build was done (the proof being that netstats was still showing output during the several days the index build was running). I need to confirm whether this is correct and whether I can safely delete the data files on the decommissioned node.
Some additional info
- DSE 4.0.3 (Cassandra 2.0.7)
- vnodes enabled
- CentOS 6 x86_64
- nodetool status and nodetool gossipinfo still show the decommissioned node as "leaving"
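To keep an eye on that last point, the output of nodetool status can be scanned for nodes still in the Leaving state. This is only a minimal sketch: it assumes each node row starts with the usual two-letter code (U/D for up/down, then N/L/J/M for normal/leaving/joining/moving), and the helper name `leaving_nodes` and the sample text are hypothetical, not taken from the cluster above.

```python
import re

# Hypothetical helper; not part of nodetool or Cassandra.
# Assumes each node row in `nodetool status` output starts with a two-letter
# code: U/D (up/down) followed by N/L/J/M (normal/leaving/joining/moving).
NODE_ROW = re.compile(r"^([UD])([NLJM])\s+(\S+)")


def leaving_nodes(status_output):
    """Return the addresses of nodes whose state letter is L (Leaving)."""
    addresses = []
    for line in status_output.splitlines():
        match = NODE_ROW.match(line.strip())
        if match and match.group(2) == "L":
            addresses.append(match.group(3))
    return addresses


# Illustrative sample only; addresses, loads and host IDs are made up.
SAMPLE = """\
Datacenter: Solr
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns   Host ID  Rack
UN  10.0.0.1   1.2 GB    256     25.0%  aaaa     rack1
UL  10.0.0.4   900 MB    256     25.0%  dddd     rack1
"""

if __name__ == "__main__":
    print(leaving_nodes(SAMPLE))
```

Feeding it the captured text of a real nodetool status run in place of `SAMPLE` would list any node that gossip still reports as leaving; an empty list would suggest the ring no longer sees the node in that state.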
I think you hit one of the many streaming bugs present in the 2.0 days. Maybe this one:
https://issues.apache.org/jira/browse/cassandra-8343
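If it's not that one, there are a few others. In any case, the main advice is to get the cluster onto the latest 4.8 release. If that's not practical in the short term for operational reasons, at least move to the latest 4.0 release. There are a lot of fixes between your version and the current one: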
https://docs.datastax.com/en/datastax_enterprise/4.0/datastax_enterprise/rndse40.html
The 4.0.7 fix list alone is eye-watering!