Re: [Condor-users] Can't get rid of a job
- Date: Tue, 5 Apr 2005 00:40:39 -0500
- From: Jaime Frey <jfrey@xxxxxxxxxxx>
- Subject: Re: [Condor-users] Can't get rid of a job
On Apr 5, 2005, at 12:05 AM, Cathy Pfister wrote:
Hello,

I'm currently playing around with Condor to see how it works, and got
into a situation where I can't remove a job from the queue. I have
one machine set up as central manager and submitter, and another as
the execute machine. I've been able to submit and complete jobs in the
vanilla universe and everything seemed OK, except that I see the jobs
constantly cycle between suspended and unsuspended, as the manual
warns about on Linux.

Anyway, then I tried to see if I could specify input and output files
to transfer (there's a shared file system, so it's not really
necessary), and intentionally specified an output file that I knew
wouldn't be found. It looked like the job completed (several
suspends later) based on logging that my script did, but the job
stayed in the queue. I tried to release it, but Condor says it can't
be released (either with -all or with my user name). I then shut down
the Condor daemons on both machines (which evicted the job) and
restarted them, but the job was still there. It appears to be trying
to run the job over and over again, but can never transfer the output
file. The job id never changes.

How can I get rid of the-job-that-won't-die? I was only fooling
around!

You use condor_release for jobs in the held state ('H' status in
condor_q). One way for a job to become held is for you to run
condor_hold on it. Your job is oscillating between idle ('I') and
running ('R'). What you're looking for is condor_rm, which will remove
the job from the queue.
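
As a rough illustration of the distinction above, here is a minimal Python sketch that picks a command from the status letter shown by condor_q. The function names, the want_removed flag, and the job id "12.0" are hypothetical and used only for illustration; the sketch builds the command line and does not run it.

```python
# Status letters as described above for condor_q output.
HELD, IDLE, RUNNING = "H", "I", "R"


def release_applies(status):
    """condor_release only makes sense for a job in the held state."""
    return status == HELD


def pick_command(status, job_id, want_removed):
    """Return the command line (as a list) suited to the job's state.

    job_id is a placeholder such as "12.0"; want_removed is a
    hypothetical flag saying the job should leave the queue entirely.
    """
    if want_removed:
        # condor_rm removes the job from the queue.
        return ["condor_rm", job_id]
    if release_applies(status):
        # A held job can be put back into the queue to run again.
        return ["condor_release", job_id]
    # Idle or running jobs are not held, so there is nothing to release.
    raise ValueError("job %s is not held; condor_release does not apply" % job_id)


if __name__ == "__main__":
    # The job in this thread cycles between idle and running and should go away.
    print(" ".join(pick_command(RUNNING, "12.0", want_removed=True)))
```

For the job described in this thread, which is never in the held state, the sketch always ends up at condor_rm.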
+----------------------------------+---------------------------------+
| Jaime Frey | Public Split on Whether |
| jfrey@xxxxxxxxxxx | Bush Is a Divider |
| http://www.cs.wisc.edu/~jfrey/ | -- CNN Scrolling Banner |
+----------------------------------+---------------------------------+