[Condor-users] Bug with TotalJobSuspendTime?
- Date: Thu, 19 Jan 2006 11:05:22 -0800
- From: "Finch, Ralph" <rfinch@xxxxxxxxxxxx>
condor -version
$CondorVersion: 6.7.13 Nov 7 2005 $
$CondorPlatform: INTEL-WINNT50 $
# suspend job on VM1 if keyboard is touched
# and VM2 has a Condor job or high load;
# but don't suspend if job suspension time exceeds limit
SUSPEND = (VirtualMachineID == 1) \
&& ($(KeyboardBusy) ) \
&& ( (vm2_Activity == "Busy") || (vm2_LoadAvg > $(HighLoad)) ) \
&& (TotalJobSuspendTime <= $(MaxSuspendTime))
The ClassAd expression above, from our condor_config.local, worked before
(6.7.13 I think) but no longer does. After some testing, I found that the
last line, the one involving TotalJobSuspendTime, is the problem. The
behavior is peculiar:
- If a job that has never been suspended tries to suspend, the StartLog
reports this error,
1/19 10:44:51 ERROR "Can't evaluate SUSPEND" at line 1061 in file
..\src\condor_startd.V6\Resource.C
and kills all jobs on that machine.
- If I comment out that line and reconfig Condor on that machine, then
the job suspends properly.
- If I then uncomment the line and reconfig again, the job again suspends
properly.
In other words, once TotalJobSuspendTime has been defined, the line
works OK.
So then I tried this line:
&& (TotalJobSuspendTime =!= UNDEFINED) && (TotalJobSuspendTime <= $(MaxSuspendTime))
but got the same error on new jobs.
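A possible variant, offered only as an untested sketch: the stated intent is to allow suspension while the accumulated suspend time is within the limit, so an undefined TotalJobSuspendTime (a job that has never been suspended) could be treated as "within the limit". The =?= ("is identical to") operator always yields TRUE or FALSE, and TRUE || UNDEFINED evaluates to TRUE in ClassAd logic. Whether this avoids the "Can't evaluate SUSPEND" error on 6.7.13 is an assumption and has not been confirmed here.

```
# Untested sketch: same expression as above, except the last line
# accepts an undefined TotalJobSuspendTime instead of comparing it directly.
SUSPEND = (VirtualMachineID == 1) \
&& ($(KeyboardBusy) ) \
&& ( (vm2_Activity == "Busy") || (vm2_LoadAvg > $(HighLoad)) ) \
&& ( (TotalJobSuspendTime =?= UNDEFINED) || (TotalJobSuspendTime <= $(MaxSuspendTime)) )
```

By contrast, the =!= form tried above makes the whole expression FALSE for a job that has never been suspended, so even if it evaluated cleanly it would prevent the first suspension.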
Ralph Finch, P.E.
Dept. of Water Resources
Bay-Delta Office, Room 215-13
Sacramento, CA 95814
916-653-7552
rfinch@xxxxxxxxxxxx