export VERY_RNUM=$RANDOM
export _CONDOR_STARTD_RESOURCE_PREFIX=slot_${VERY_RNUM}_
export _CONDOR_LOCAL_DIR=$SCRATCH/${SLURM_NODELIST%.*}/log/${VERY_RNUM}
mkdir -p ${_CONDOR_LOCAL_DIR}/log
mkdir -p ${_CONDOR_LOCAL_DIR}/execute
condor_master -f -n compute_condor_${VERY_RNUM}
```
Thank you very much!
Best,
Seung
_______________________________________________
And I guess I left off some elements; one can also set a LOCAL_DIR
and a prefix for the glide-in using the random number:
```
export VERY_RNUM=$RANDOM
export _CONDOR_STARTD_RESOURCE_PREFIX=slot_${VERY_RNUM}_
export _CONDOR_LOCAL_DIR=/scratch.local/${USER}/${VERY_RNUM}
${_CONDOR_SBIN}/condor_master -f -n compute_condor_${VERY_RNUM}
```
Those should be the elements that keep the different condor_master instances
from interfering with one another.
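A minimal sketch of why those elements work: each glide-in gets its own LOCAL_DIR, so the per-instance lock and log paths are disjoint. The layout here mirrors the scripts above; $SCRATCH and the `$RANDOM$$` suffix are illustrative choices, not part of the original recipe:

```shell
# Give each glide-in its own LOCAL_DIR so lock/log paths never collide.
# $SCRATCH is illustrative; appending the PID ($$) to $RANDOM reduces
# the chance of two instances drawing the same number.
SCRATCH=${SCRATCH:-$(mktemp -d)}
VERY_RNUM=$RANDOM$$
LOCAL_DIR="$SCRATCH/$(hostname -s)/log/$VERY_RNUM"
mkdir -p "$LOCAL_DIR/log" "$LOCAL_DIR/execute"
# Each condor_master would then take its InstanceLock under its own tree:
echo "$LOCAL_DIR/log/InstanceLock"
```

With a unique directory per instance, no two masters ever see the same InstanceLock path.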
    Greg
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Daues, Gregory Edward <daues@xxxxxxxxxxxx>
Sent: Monday, November 6, 2023 5:59 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Multiple HTCondor workers on a single compute node
Hello,
I use the -n option with a random number in a bash script like
```
#!/bin/bash
export VERY_RNUM=$RANDOM
${_CONDOR_SBIN}/condor_master -f -n compute_condor_${VERY_RNUM}
```
but I imagine there could be other ways.
     Greg
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Seung-Jin Sul <ssul@xxxxxxx>
Sent: Monday, November 6, 2023 5:21 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Multiple HTCondor workers on a single compute node

Hi,
We use SLURM as a glide-in backend and sometimes need to run multiple HTCondor worker services on the same node. This happens when we request part of a compute node, e.g., 1 CPU and 10 GB of memory, from SLURM.
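For concreteness, a partial-node request like that might look like the following sbatch sketch; all directives and paths are illustrative assumptions, not taken from the thread:

```shell
#!/bin/bash
# Hypothetical glide-in wrapper: ask SLURM for part of a node
# (1 CPU, 10 GB) and start a uniquely named condor_master.
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=10G
export VERY_RNUM=$RANDOM
export _CONDOR_LOCAL_DIR=$SCRATCH/${SLURM_NODELIST%.*}/log/${VERY_RNUM}
mkdir -p ${_CONDOR_LOCAL_DIR}/log ${_CONDOR_LOCAL_DIR}/execute
condor_master -f -n compute_condor_${VERY_RNUM}
```

Because SLURM may pack several such jobs onto one node, every per-instance setting has to be unique, which is exactly the problem described below.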
When we try to start another instance of HTCondor on the same node, we see the following:
```
11/06/23 14:49:54 lock_file returning ERROR, errno=11 (Resource temporarily unavailable)
11/06/23 14:49:54 FileLock::obtain(1) failed - errno 11 (Resource temporarily unavailable)
11/06/23 14:49:54 ERROR "Can't get lock on "/clusterfs/jgi/scratch/dsi/aa/jaws/dori-dev/htcondor-log/n0099/log/InstanceLock"" at line 1691 in file /var/lib/condor/execute/slot1/dir_3620933/userdir/.tmpdnieob/BUILD/condor-10.2.2/src/condor_master.V6/master.cpp
```
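For context, the errno 11 (EAGAIN) above is an ordinary advisory file-lock conflict: the second condor_master cannot obtain the InstanceLock the first one holds. A minimal sketch of the same conflict with flock(1); the lock path is illustrative:

```shell
# Two processes contend for one advisory lock, just as two
# condor_masters sharing a LOCAL_DIR contend for InstanceLock.
LOCKFILE=$(mktemp)
flock -n "$LOCKFILE" sleep 3 &    # first "master" holds the lock
HOLDER=$!
sleep 1
if flock -n "$LOCKFILE" true; then
    RESULT=acquired
else
    RESULT=blocked                # second instance fails, like errno 11 above
fi
wait "$HOLDER"
echo "$RESULT"
```

Pointing each instance at its own LOCAL_DIR, as suggested above in the thread, gives each master its own lock file and avoids the contention entirely.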
How can we start multiple HTCondor worker services on a node? Any info on setting the port and on the lock file would be helpful.
Thank you!
Best,
Seung
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/