[Condor-users] SPOOL file clash with multiple submitters
- Date: Fri, 27 Jan 2012 12:52:52 +0000
- From: "Smith, Ian" <I.C.Smith@xxxxxxxxxxxxxxx>
- Subject: [Condor-users] SPOOL file clash with multiple submitters
Hello All,
I am trying to set up multiple schedulers on our SMP central manager/submit
host along the lines suggested by Cycle Computing
(see http://www.cyclecomputing.com/wiki/index.php?title=Running_Multiple_Condor_Schedds).
This seemed to be working well until I noticed a clash between the
checkpoint files of jobs from one schedd and those of another. As far as I
can see, job IDs are not unique across separate queues, so if a user of one
scheduler has a checkpointed job with, say, ID 3.1, its checkpoint files will be in
$(SPOOL_ROOT)/3/1/cluster...
But then another user on another schedd has a job with the same ID 3.1, and it
attempts to use the same directory, which fails because of file permissions.
I've configured Condor with:
SPOOL_ROOT = /condor_scratch/spool
SCHEDD1 = $(SBIN)/condor_schedd1
SCHEDD1_ARGS = -f -local-name Q1
SCHEDD1_LOG = $(LOG)/ScheddLog.1
SCHEDD.Q1.SCHEDD_NAME = Q1@$(HOSTNAME)
SCHEDD.Q1.SPOOL = $(SPOOL_ROOT)/schedd1
SCHEDD.Q1.SCHEDD_LOG = $(SCHEDD1_LOG)
SCHEDD2 = $(SBIN)/condor_schedd2
SCHEDD2_ARGS = -f -local-name Q2
SCHEDD2_LOG = $(LOG)/ScheddLog.2
SCHEDD.Q2.SCHEDD_NAME = Q2@$(HOSTNAME)
SCHEDD.Q2.SPOOL = $(SPOOL_ROOT)/schedd2
SCHEDD.Q2.SCHEDD_LOG = $(SCHEDD2_LOG)
...etc
but the checkpoint files always seem to get written under the common $(SPOOL)
directory rather than the separate per-schedd ones, causing the clash.
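To illustrate the clash, here is a minimal Python sketch. It is not Condor's
actual code: checkpoint_dir is a hypothetical helper, and it models only the
<cluster>/<proc> part of the directory layout described above, assuming the
path is built purely from the spool directory and the job ID.

```python
from pathlib import PurePosixPath

def checkpoint_dir(spool, cluster, proc):
    """Hypothetical helper: directory for job <cluster>.<proc> under a spool dir."""
    return PurePosixPath(spool) / str(cluster) / str(proc)

SPOOL_ROOT = "/condor_scratch/spool"

# Case 1: both schedds write under one common directory.
# Job 3.1 from Q1 and job 3.1 from Q2 map to the same path.
shared_q1 = checkpoint_dir(SPOOL_ROOT, 3, 1)
shared_q2 = checkpoint_dir(SPOOL_ROOT, 3, 1)

# Case 2: each schedd uses its own spool directory (schedd1, schedd2),
# so the same job ID no longer maps to the same path.
own_q1 = checkpoint_dir(SPOOL_ROOT + "/schedd1", 3, 1)
own_q2 = checkpoint_dir(SPOOL_ROOT + "/schedd2", 3, 1)

print("common spool collides:    ", shared_q1 == shared_q2)
print("per-schedd spool collides:", own_q1 == own_q2)
```

Under that assumption, only the second case keeps job 3.1 from the two
schedds apart, which is the behaviour I was expecting from the
SCHEDD.Q1.SPOOL and SCHEDD.Q2.SPOOL settings.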
Interestingly, Condor does seem to put these files in individual directories (not
the common spool area):
job_queue.log job_queue.log.1 local_univ_execute spool_version
so it seems to be aware of SCHEDD.Q1.SCHEDD_LOG, if not SCHEDD.Q2.SPOOL.
If I take out the default spool/ directory and remove the $(SPOOL) definition,
the negotiator fails on startup. Since there's only one negotiator, I would
expect it to use a common directory?
Any suggestions would be very useful.
thanks in advance,
-ian.
---------------------------------------
Dr Ian C. Smith,
Advanced Research Computing,
University of Liverpool.