Hello...
I just did what you asked. Only one worker node is showing, but that worker was not showing a queue, so I added SCHEDD to DAEMON_LIST in its condor_config.local. Now when I submit a job from the master, it only appears in the master node's queue; the job does not show up on the worker node. STARTD is still not in DAEMON_LIST on the master node. So can you tell me whether the job is actually running on the master node or on the worker node? And how can I migrate my jobs to the other worker nodes?

Greetings

> Date: Tue, 21 May 2013 13:18:36 +0100
> From: B.Candler@xxxxxxxxx
> To: htcondor-users@xxxxxxxxxxx
> Subject: Re: [HTCondor-users] Job Scheduling
>
> On Tue, May 21, 2013 at 11:47:32AM +0000, Muak rules wrote:
> > Hello
> > I'm going to explain all that I'd done.
> > I did the configuration in /etc/condor/condor_config
> >
> > On the client machine I did the following configuration:
> > CONDOR_HOST = pucitServer.CentOSWorld.com (name of the server machine)
> > ALLOW_WRITE = $(ALLOW_WRITE), $(CONDOR_HOST)
> > COLLECTOR_HOST = 10.0.0.1 (IP address of the server)
> > DAEMON_LIST = master,startd
>
> You are using a mix of names and IP addresses. Is
> pucitServer.CentOSWorld.com the machine with IP address 10.0.0.1? Do you
> have
>
> 10.0.0.1 pucitServer.CentOSWorld.com
>
> in your /etc/hosts file?
>
> I can describe a simple config where one machine is the "master" (contains
> the job queue and is where you submit jobs) and the others are "workers"
> (where the jobs actually execute).
>
> If pucitserver.centosworld.com is the 'master', then on a 'worker' machine
> I would make condor_config.local something like this:
>
> ---- 8< ----
> ## What machine is your central manager?
>
> CONDOR_HOST = pucitserver.centosworld.com
>
> ## Other global settings
>
> UID_DOMAIN = centosworld.com
> CONDOR_ADMIN = yourmail@xxxxxxxxxxxxxx
> MAIL = /usr/bin/mail
>
> ## Pool's short description
>
> COLLECTOR_NAME = My org condor pool
>
> ## When is this machine willing to start a job?
>
> #START = TRUE
> BackgroundLoad = 0.5
> START = $(CPUIdle) || (State != "Unclaimed" && State != "Owner")
>
> ## When to suspend a job?
>
> SUSPEND = FALSE
>
> ## When to nicely stop a job?
> ## (as opposed to killing it instantaneously)
>
> PREEMPT = FALSE
>
> ## When to instantaneously kill a preempting job
> ## (e.g. if a job is in the pre-empting stage for too long)
>
> KILL = FALSE
>
> ## This macro determines what daemons the condor_master will start
> ## and keep its watchful eyes on.
> ## The list is a comma or space separated list of subsystem names.
>
> DAEMON_LIST = MASTER, STARTD
> ALLOW_WRITE = $(FULL_HOSTNAME), $(IP_ADDRESS), $(CONDOR_HOST)
>
> ## Optional: dynamic slots
>
> SLOT_TYPE_1 = cpus=100%, ram=75%, swap=100%, disk=100%
> SLOT_TYPE_1_PARTITIONABLE = True
> NUM_SLOTS_TYPE_1 = 1
> ---- 8< ----
>
> And on the 'master' node I would use the same file but change the bit from
> DAEMON_LIST onwards like this:
>
> DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, SCHEDD
> ALLOW_WRITE = $(FULL_HOSTNAME), $(IP_ADDRESS), $(CONDOR_HOST), 10.0.0.*
> # Optional if you are using dagman
> DAGMAN_MAX_SUBMITS_PER_INTERVAL = 200
> DAGMAN_SUBMIT_DELAY = 0
>
> condor_restart everywhere. Then login to the master node, check that
> "condor_status" shows the worker node(s), and then submit some jobs.
>
> If you want to make the master node run jobs as well, then I believe it
> should just be a question of adding STARTD to DAEMON_LIST.
>
> Regards,
>
> Brian.
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/
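P.S. For the question of where a job is actually executing: a quick way to check (a sketch using the standard HTCondor command-line tools on the submit node; the job ID is a placeholder) is:

```shell
# List every machine advertising a STARTD, i.e. every machine able to run
# jobs. If only the master appears here, jobs can only execute on the master.
condor_status

# Show running jobs along with the machine each one is executing on
# (the HOST column); jobs still idle have not been matched to any machine.
condor_q -run

# For an idle job, explain why it has not been matched to a worker
# (replace <job_id> with the cluster.proc number from condor_q).
condor_q -analyze <job_id>
```

If condor_q -run shows no host and the job stays idle, it is not running anywhere yet, which usually means no machine in the pool is advertising a STARTD that matches the job's requirements.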