
RE: [Condor-users] how to terminate jobs automatically



If your jobs will only ever terminate either normally or in response to a vacate at the end of the night, then trap the vacate signal and exit immediately - Condor will treat this as a successful vacate.
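
A rough sketch of the trapping side, in Python. It assumes the vacate reaches the job as SIGTERM, which is an assumption about the platform; on Windows the notification may arrive differently and would need hooking up accordingly. VacateFlag, run_until_vacated and main are made-up names for illustration.

```python
import signal
import sys


class VacateFlag:
    """Records that a vacate request has been seen."""

    def __init__(self):
        self.requested = False

    def handler(self, signum, frame):
        # Keep the handler tiny: just note that a vacate was requested.
        self.requested = True


def run_until_vacated(steps, flag):
    """Run each step in turn, stopping early once a vacate was requested.

    Returns the number of steps that were completed.
    """
    completed = 0
    for step in steps:
        if flag.requested:
            break
        step()
        completed += 1
    return completed


def main():
    flag = VacateFlag()
    # Assumption: the vacate is delivered as SIGTERM.
    signal.signal(signal.SIGTERM, flag.handler)
    steps = [lambda: None] * 3  # hypothetical stand-in for the real work
    run_until_vacated(steps, flag)
    # Exit straight away, whether the work finished or a vacate came in.
    sys.exit(0)
```

The job's entry point would call main(). The flag is only checked between steps, so the steps need to be short enough for the job to get out within whatever grace period it is given.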

Then set your jobs up to transfer files on vacate.

This is not perfect since the job will remain in the queue.

Alternatively, have your jobs keep track of the time themselves (perhaps making the cutoff time an additional argument) and have them kill themselves (nicely if possible, with a message to that effect) a minute or so before Condor would, to allow for clock differences.
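
One of the simple ways, sketched in Python. The cutoff is assumed to be passed to the job as an "HH:MM" string and the one-minute margin is just the figure suggested above; compute_deadline and time_is_up are made-up names. The deadline is worked out once at start-up so that a check made after the cutoff cannot roll over to the next day.

```python
from datetime import datetime, timedelta


def compute_deadline(start, cutoff_hhmm, margin=timedelta(minutes=1)):
    """Return the moment the job should stop itself.

    start:       when the job began (naive local time)
    cutoff_hhmm: time of day at which the pool closes, e.g. "08:30"
    margin:      how long before the cutoff to stop, for clock differences
    """
    hour, minute = (int(part) for part in cutoff_hhmm.split(":"))
    cutoff = start.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if cutoff <= start:
        # The cutoff time of day has already passed today, so the next
        # occurrence is tomorrow morning.
        cutoff += timedelta(days=1)
    return cutoff - margin


def time_is_up(now, deadline):
    """True once the job should write its output and exit."""
    return now >= deadline
```

The job would call compute_deadline(datetime.now(), cutoff_argument) once at start-up and then test time_is_up(datetime.now(), deadline) between units of work.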

There are simple ways of doing this, as well as extremely fast but unpleasant ones if performance is really an issue, all with sufficient granularity to hit a minute without problems...

What you describe is not really very easy with Condor, since there are many reasons for a job to be vacated from a machine, so knowing that it is due to the time is more the responsibility of the job itself than of Condor...

Not to mention the question of what to do with jobs that ran for a while, were vacated in favour of a better job, and then the night ends...

If you are running vanilla (as opposed to standard) universe jobs, which you have to be on Windows, and you require the output irrespective of whether the job completed successfully or not, the simplest solution is to write the output you care about directly to the network / database and deal with restarts yourself (again using a central database for run counters etc).

In this way you can also layer vacate-like functionality on top in future, by serializing sufficient info to restart, either at regular checkpoints or in response to the vacate signal.
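
A minimal sketch of that kind of do-it-yourself checkpointing, in Python. It assumes the state fits in a JSON-serializable dict and that the checkpoint lives in a file the job can reach after a restart (where that file sits is up to you); save_checkpoint and load_checkpoint are made-up names.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to path so that a reader never sees a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then rename it over
    # the old checkpoint in a single step.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists yet."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return default
```

On start-up the job would call load_checkpoint and carry on from whatever it finds; it would call save_checkpoint every so often and again when the vacate signal arrives.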

Note that the above solution has some unpleasant security connotations you may not be able to accept.

A vacate-lite functionality for Windows would be nice:
A separate file would be declared in the submit file; on vacate it would be copied to the shadow and then copied back out to the new machine, allowing persistence of data without needing any hacky freely available network shares or transmission of db/filesystem passwords...

The user's app would still need to know about the file and manage its own persistence, but with considerable security improvements...

Matt

> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx
> [mailto:condor-users-bounces@xxxxxxxxxxx]On Behalf Of Dr Ian C. Smith
> Sent: 22 July 2004 10:28
> To: condor-users@xxxxxxxxxxx
> Subject: [Condor-users] how to terminate jobs automatically
> 
> 
> Dear All,
> 
> I have a very simple problem but one that I cannot find
> a mention of in the Condor docs. We have a pool of Condor
> PCs running Win 2k under control of a Solaris master. The
> pool should only be available outside office hours
> (1730 - 0830). If any jobs are running at the start of the
> day (0830) they should be terminated and any output
> returned to the user.
> 
> How do I set this up in condor_config ?????
> 
> At the moment jobs go into the idle state at 0830 and
> no output is returned to the user. If jobs finish
> before this time everything is hunky dory.
> 
> many thanks,
> 
> -ian.
> 

