CentOS - HPC compute node not running jobs


I don't have a lot of information, so please let me know what I can do to diagnose this.

My HPC has a few compute nodes, and one of the jobs I submitted last night paused after a few hours of runtime. I checked qstat this morning and found that it had made no progress since I last checked it yesterday. The other nodes seem to be processing jobs fine.

I deleted the job and resubmitted it, and it appears to be in the queue, though there are no other jobs scheduled ahead of it.

gstat shows that the node has no processes lined up and that the node is active.

qstat -s says "not running: draining system to allow starving job to run"

If it's helpful, this is set up in a CentOS 6.5 environment.

What else can I do to diagnose the issue?

It turns out that Torque scripts running for more than 24 hours cause a pause to be placed on other jobs submitted to the scheduler. I needed to kill the responsible job, and everything fell into place.
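
For anyone hitting the same thing, here is a rough sketch of how the long-running job could be spotted. It assumes that `qstat -f` prints a "Job Id:" line for each job followed by a "resources_used.walltime = HH:MM:SS" line, which is how Torque normally reports it. The function name long_running_jobs is made up for this example, and the 24-hour default only mirrors what was observed above.

```shell
# Hypothetical helper (not part of Torque): reads `qstat -f` style output
# on stdin and prints the ID and walltime of every job that has used more
# than the given number of hours (default 24, an assumption taken from
# the behaviour described above).
long_running_jobs() {
    awk -v limit="${1:-24}" '
        # Remember the ID of the job whose attributes follow.
        /^Job Id:/ { id = $3 }
        # Walltime lines look like: resources_used.walltime = 25:13:02
        /resources_used\.walltime =/ {
            split($3, t, ":")
            secs = t[1] * 3600 + t[2] * 60 + t[3]
            if (secs > limit * 3600) print id, $3
        }
    '
}

# Example use on a cluster where qstat is available:
#   qstat -f | long_running_jobs 24
# A job listed here can then be removed with: qdel <job id>
```

Whether killing the job is the right fix depends on the site's scheduler settings, so treat the output as a list of candidates to look at rather than jobs to delete automatically.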

