Mailing List Archives
[Condor-users] Can't get rid of a job
- Date: Mon, 04 Apr 2005 22:05:52 -0700
- From: Cathy Pfister <cathyjp@xxxxxxxxxx>
- Subject: [Condor-users] Can't get rid of a job
Hello,
I'm currently playing around with Condor to see how it works, and got into
a situation where I can't remove a job from the queue. I have one
machine set up as central manager and submitter, and another as
executor. I've been able to submit and complete jobs in the vanilla
universe and everything seemed OK, except that I see the jobs constantly
cycle between suspended and unsuspended like the manual warns about on
Linux.
Anyway, then I tried to see if I could specify input and output files to
transfer (there's a shared file system, so it's not really necessary),
and intentionally specified an output file that I knew wouldn't be
found. It looked like the job completed (several suspends later) based
on logging that my script did, but the job stayed in the queue. I tried
to release it but Condor says it can't be released (either with -all or
user name). I then shut down the condor daemons on both machines (which
evicted the job) and restarted them, but the job was still there. It
appears to be trying to run the job over and over again, but can't ever
transfer the output file. The job id never changes.
How can I get rid of the-job-that-won't-die? I was only fooling around!
Thanks,
Cathy