Re: [HTCondor-users] how to change requested memory (cpus) for running job



Dear all,

Does the lack of an answer mean that no expert is around these days, or
is it simply not possible with HTCondor to change any ClassAds of
a running job?

The idea is just to change the reserved memory so that the available
memory decreases and no other job with a big memory request can start, which
could crash the machine or a long-running job. The available memory should not
go to 0 if there is enough memory available, and it should simply increase
again when the job finishes.
Therefore a reread of the reservedMemory ClassAd on the start machine,
without killing any job, seems to be perfect, if possible.
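
To make the intended behaviour concrete, here is a minimal sketch of the
accounting described above. It is a toy model only, not HTCondor's actual
bookkeeping: the Machine class, the job id "1234.0" and the 192GB node size
are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass, field

GB = 1024  # work in MB, so 1 GB = 1024 MB


@dataclass
class Machine:
    """Toy model of one execute node (hypothetical, not HTCondor code)."""
    total_mb: int
    reserved_mb: dict = field(default_factory=dict)  # job id -> reserved MB

    def available_mb(self) -> int:
        # Available memory is whatever is not reserved by running jobs.
        return self.total_mb - sum(self.reserved_mb.values())

    def can_start(self, request_mb: int) -> bool:
        # A new job may start only if its request fits into what is left.
        return request_mb <= self.available_mb()

    def start(self, job_id: str, request_mb: int) -> None:
        if not self.can_start(request_mb):
            raise ValueError("not enough available memory")
        self.reserved_mb[job_id] = request_mb

    def resize(self, job_id: str, new_mb: int) -> None:
        # Change the reservation of a running job without stopping it.
        extra = new_mb - self.reserved_mb[job_id]
        if extra > self.available_mb():
            raise ValueError("not enough available memory")
        self.reserved_mb[job_id] = new_mb

    def finish(self, job_id: str) -> None:
        # The reservation is released when the job ends.
        del self.reserved_mb[job_id]


node = Machine(total_mb=192 * GB)
node.start("1234.0", 60 * GB)                  # job starts with 60GB reserved
fits_before_resize = node.can_start(80 * GB)   # 132GB left
node.resize("1234.0", 120 * GB)                # reservation raised while running
fits_after_resize = node.can_start(80 * GB)    # only 72GB left
node.finish("1234.0")
fits_after_finish = node.can_start(80 * GB)    # all 192GB available again
```

In this model an 80GB job fits before the resize, is kept out while the
120GB reservation is in place, and fits again once the long job has finished.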

We are working on checkpointing our jobs, but for some of them it does not
seem to be possible.

Any ideas would be welcome.

Harald
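
For reference, the condor_qedit call discussed in the quoted messages below
can be sketched like this. The job id "1234.0" is a made-up example, and the
helper function is only an illustration; RequestMemory is given in megabytes.
As noted in the thread, on 8.4 this only edits the job ClassAd in the queue
and does not change the reservation of an already running job.

```python
import shlex

MB_PER_GB = 1024  # RequestMemory is expressed in megabytes


def qedit_request_memory(job_id: str, memory_gb: int) -> list:
    """Build the condor_qedit argument list that sets RequestMemory for one job.

    job_id is a cluster.proc id; "1234.0" below is a hypothetical example.
    """
    return ["condor_qedit", job_id, "RequestMemory", str(memory_gb * MB_PER_GB)]


cmd = qedit_request_memory("1234.0", 120)
command_line = shlex.join(cmd)
print(command_line)  # condor_qedit 1234.0 RequestMemory 122880
```

The list form can be passed to subprocess.run on a submit host if one wants
to execute it instead of printing it.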

On Monday 23 January 2017 16:21:00 Harald van Pee wrote:
> Hi Jason,
> 
> Yes, it's condor_qedit, not qalter. qalter works for PBS/Torque even
> for a running job; condor_qedit just changes RequestMemory but does not
> change any reservation for a running job.
> 
> Harald
> 
> On Monday 23 January 2017 16:11:21 Jason Patton wrote:
> > Oh, I just noticed the disclaimer about *running* jobs. Not sure about
> > changing the ClassAd of running jobs.
> > 
> > Jason Patton
> > 
> > On Mon, Jan 23, 2017 at 9:09 AM, Jason Patton <jpatton@xxxxxxxxxxx> wrote:
> > > Harald,
> > > 
> > > Yes! Check out condor_qedit:
> > > http://research.cs.wisc.edu/htcondor/manual/v8.4/condor_qedit.html
> > > 
> > > Jason Patton
> > > 
> > > On Mon, Jan 23, 2017 at 9:04 AM, Harald van Pee <pee@xxxxxxxxxxxxxxxxx> wrote:
> > >> Hi all,
> > >> 
> > >> is it possible to change the reserved memory for a running job?
> > >> 
> > > >> The problem is, we have a cluster with very long-running jobs (8 weeks
> > > >> on average) in the vanilla universe. We never kill any job
> > > >> automatically.
> > >> 
> > > >> Now it can happen that a user reserves 60GB for his job and finds out,
> > > >> after one week of running, that it will need 120GB. Most often this is
> > > >> no problem because there is enough memory available.
> > > >> But it would be a problem if another job starts and requests another
> > > >> 60GB. We could avoid this if at least the administrator could just
> > > >> change the RequestMemory to 120GB.
> > > >> With qalter this is possible for an idle job in the queue, but what
> > > >> can I do for a running job?
> > >> 
> > >> Any suggestions?
> > >> 
> > >> We use condor 8.4.10.
> > >> 
> > >> Best regards
> > >> Harald
> > >> 
> > >> 
> > >> _______________________________________________
> > >> HTCondor-users mailing list
> > >> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
> > >> with a
> > >> subject: Unsubscribe
> > >> You can also unsubscribe by visiting
> > >> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
> > >> 
> > >> The archives can be found at:
> > >> https://lists.cs.wisc.edu/archive/htcondor-users/