Re: [Condor-users] windows xp log off kills jobs
- Date: Mon, 31 Dec 2007 12:45:03 +0000
- From: "Matt Hope" <matthew.hope@xxxxxxxxx>
- Subject: Re: [Condor-users] windows xp log off kills jobs
On Dec 28, 2007 3:52 PM, Finch, Ralph <rfinch@xxxxxxxxxxxx> wrote:
> > What are the values of SUSPEND and PREEMPT on these machines?
>
> WANT_SUSPEND = TRUE
> PREEMPT = FALSE
> PREEMPTION_REQUIREMENTS = FALSE
> KILL = FALSE
> # suspend job on VM1 if keyboard is touched
> # and VM2 has a Condor job or high load;
> # but don't suspend if job suspension time exceeds limit
> SUSPEND = (VirtualMachineID == 1) \
>     && ($(KeyboardBusy)) \
>     && ((vm2_Activity == "Busy") || (vm2_LoadAvg > $(HighLoad))) \
>     && (((TotalJobSuspendTime =!= UNDEFINED) && (TotalJobSuspendTime <= $(MaxSuspendTime))) \
>         || (TotalJobSuspendTime =?= UNDEFINED))
>
> > It is possible the standard 'kick a job off this machine if
> > the owner wants to use it' routines are kicking in.
> > You may wish to change that behaviour...
>
> We try to suspend jobs in our pool when interactive use is wanted with
> the above settings. This has worked properly for a couple of years and
> works now; when keyboard activity happens the job on VM1 is suspended.
> Anyway, why would logging OFF a machine result in killing jobs even if
> we had SUSPEND and PREEMPT incorrect? :-(
>
> Ralph Finch
> 916-653-7552
>
>
>
> > -----Original Message-----
> > From: condor-users-bounces@xxxxxxxxxxx
> > [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Matt Hope
> > Sent: Friday, December 28, 2007 7:09 AM
> > To: Condor-Users Mail List
> > Subject: Re: [Condor-users] windows xp log off kills jobs
> >
> > On Dec 27, 2007 10:00 PM, Finch, Ralph <rfinch@xxxxxxxxxxxx> wrote:
> > > condor -version
> > > $CondorVersion: 6.8.3 Jan 5 2007 $
> > > $CondorPlatform: INTEL-WINNT50 $
> > >
> > > I am submitting jobs from machine1 to a pool, all windows xp. If I
> > > then remote login to a machine running my jobs--say machine2--then
> > > logoff, the jobs on machine2 are killed and new jobs restart a few
> > > minutes later from the idle jobs in the pool. Damn annoying, as you
> > > can guess.
> > >
> > > In this thread
> > >
> > > https://lists.cs.wisc.edu/archive/condor-users/2004-November/msg00076.shtml
> > >
> > > the poster had the same problem but seemed to think it was only Java
> > > jobs. Mine are not Java; my executable is a Windows .bat file which
> > > then runs a compiled exe. He had a kludgy solution to his Java jobs
> > > which I doubt would work with mine, plus it seems a serious deficiency
> > > and should have a better solution. I believe I'm not the first
> > > person to hit this problem, so is there a good solution?
> >
> > What are the values of SUSPEND and PREEMPT on these machines?
Hmm... Are you using RunAsOwner? If so, does it happen if you run a job
and then someone else logs on and then off?
Clutching at straws here...
Matt
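
For anyone reading the quoted SUSPEND expression, here is a rough Python sketch of its boolean logic. It is only an illustration, not how Condor evaluates ClassAds: the function name and parameters are hypothetical, UNDEFINED is modelled as None, every attribute other than TotalJobSuspendTime is assumed to be defined, and the values used for HighLoad and MaxSuspendTime in the usage lines are made up.

```python
from typing import Optional


def should_suspend(vm_id: int,
                   keyboard_busy: bool,
                   vm2_activity: str,
                   vm2_load_avg: float,
                   high_load: float,
                   total_job_suspend_time: Optional[float],
                   max_suspend_time: float) -> bool:
    """Hypothetical mirror of the quoted SUSPEND expression."""
    # vm2 has a Condor job or a high load.
    # Assumption: the string comparison is treated as case-insensitive here.
    vm2_loaded = (vm2_activity.lower() == "busy") or (vm2_load_avg > high_load)

    # Either the job has never been suspended (attribute undefined, here None)
    # or its accumulated suspension time is still within the limit.
    within_limit = (total_job_suspend_time is None
                    or total_job_suspend_time <= max_suspend_time)

    # Only the job on VM1 is suspended, and only when the keyboard is in use.
    return vm_id == 1 and keyboard_busy and vm2_loaded and within_limit


if __name__ == "__main__":
    # Made-up thresholds: HighLoad = 0.5, MaxSuspendTime = 600 seconds.
    print(should_suspend(1, True, "Busy", 0.1, 0.5, None, 600))
    print(should_suspend(1, True, "Busy", 0.1, 0.5, 700, 600))
```

Note that nothing in this expression refers to logon or logoff events; it only decides when the job on VM1 is suspended.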