Dear users,

I have recently put Condor into production on our servers here. Everything runs nicely except when long runs are submitted; we have only recently started using Condor for longer runs.

All the runs affected are in the vanilla universe and over 10 hours long. We currently have 4 servers that meet the requirements for these jobs to run.

What is happening is that user1 may submit 4 jobs, which run fine for a few hours. But then user2 submits a job, and one of user1's jobs gets stopped so that user2's job can run. I understand this is a result of user priorities*. The problem is that when the job that was stopped restarts, it starts from the beginning.

Some of these longer jobs will never actually finish, given that new jobs from new users are regularly being submitted.

I have tried adding the following to condor_config.local on the four servers, but the same thing still appears to happen:

PREEMPTION_REQUIREMENTS = False

I think the best way to solve this would be to prevent runs from being stopped at all, and just let them continue running to the end. Does anyone have any suggestions?
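
One possibility, untested here and offered only as a sketch: PREEMPTION_REQUIREMENTS is evaluated by the condor_negotiator, so setting it only on the four execute servers may have no effect unless it is also set on the central manager. On the execute side, the condor_startd has a separate MAXJOBRETIREMENTTIME setting that gives a running job a grace period before it is evicted. The placement of each line and the 7-day value below are assumptions for illustration, not a verified fix for 7.0.x:

```
# On the central manager (read by the condor_negotiator):
# do not preempt a running job in favour of a user with better priority
PREEMPTION_REQUIREMENTS = False

# On each of the four execute servers (read by the condor_startd):
# let a running job continue for up to this many seconds before eviction.
# 7 days is an illustrative assumption; it should exceed the longest expected run.
MAXJOBRETIREMENTTIME = 7 * 24 * 60 * 60
```

After editing the configuration files, running condor_reconfig on the affected machines should make the daemons re-read them.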

The servers are:
$CondorVersion: 7.0.1 Feb 27 2008 BuildID: 76180 $
$CondorPlatform: INTEL-WINNT50 $

Desktop (the user that submits the jobs):
$CondorVersion: 7.0.0 Jan 22 2008 BuildID: 72173 $
$CondorPlatform: INTEL-WINNT50 $

The Condor master:
$CondorVersion: 7.0.0 Jan 22 2008 BuildID: 72173 $
$CondorPlatform: INTEL-WINNT50 $

Sorry I haven't provided any more specifics; I really don't know what would be useful and don't want to drown you in useless output! Let me know if there is anything else that would help!

Thanks for any advice,
Rob