Hi Stefano,
perhaps you can try, within the job context, to evaluate Condor's
'hidden' environment variables like
${_CONDOR_SCRATCH_DIR}
If the users' ${HOME} dirs are on a shared fs, the job flow should
probably check whether the software is already unpacked/installed locally.
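A rough, untested sketch of such a wrapper (the DIRAC paths and tarball
name are just placeholders for illustration):

#!/bin/bash
# prefer the slot's scratch dir if Condor exports it, else fall back to $TMPDIR
WORKDIR="${_CONDOR_SCRATCH_DIR:-${TMPDIR:-/tmp}}"

# reuse the software if it is already unpacked on the shared $HOME,
# otherwise unpack it locally under the scratch dir
if [ -d "$HOME/DIRAC" ]; then
    SWDIR="$HOME/DIRAC"          # already installed on shared $HOME
else
    SWDIR="$WORKDIR/DIRAC"       # unpack locally under scratch
    tar -xf /opt/exp_software/common/dirac.tar -C "$WORKDIR"   # placeholder tarball
fi
export SWDIR

cd "$WORKDIR"
exec "$@"   # run the actual payload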
If the normal users' HOMEs are not strictly necessary in these cases,
it might be possible to always mount the jobs' home under scratch via a
batch node's startd config:
MOUNT_UNDER_SCRATCH = {/tmp,/var/tmp,...,} /home/user~SOMEEXPRESSION~
SINGULARITY_TARGET_DIR = /scratch
(not tried - just a guess...)
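For the home-under-scratch variant, one possible concrete form - also
untested, and assuming MOUNT_UNDER_SCRATCH is accepted as a ClassAd
expression on your version - could be:

# mount /tmp, /var/tmp and a per-user home on top of the job's scratch
# space, and map the scratch dir to /scratch inside the container
MOUNT_UNDER_SCRATCH = strcat("/tmp,/var/tmp,/home/", TARGET.Owner)
SINGULARITY_TARGET_DIR = /scratch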
Cheers,
Thomas
On 23/01/2020 21.25, Stefano Dal Pra wrote:
> Hello, it turns out that explicitly adding /home to
> SINGULARITY_BIND_EXPR solved the "disk full" problem. Now the job
> untars its stuff under /home/<user>, but I would prefer this to happen
> under the $(EXEC)/dir_<number> created by Condor at job start.
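>
> i.e. the WN config now has something like this (exact ordering aside):
>
> SINGULARITY_BIND_EXPR = "/cvmfs /tmp /opt/exp_software/common /data /home"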
>
> Stefano
>
>
>
> On 22/01/20 18:00, Stefano Dal Pra wrote:
>> Hello, I'm trying to adapt a configuration we have with LSF: jobs
>> from certain user groups are forced to run in an SL6 Singularity
>> container.
>>
>> The LSF job starter does something like this (minor details
>> stripped):
>>
>> if [[ "$USER" =~ ^(user001|user002)$ ]]; then
>>
>>     job_cmd=$*
>>     singularity exec --home $HOME --pwd $HOME \
>>         -B /cvmfs:/cvmfs -B /tmp:/tmp \
>>         -B /opt/exp_software/common:/opt/exp_software/common \
>>         -B /data:/data \
>>         /opt/exp_software/common/UI_SL6.img ${job_cmd}
>>
>> fi
>>
>> With HTCondor (8.8.4), I used the following conf on the WN:
>>
>> EXECUTE = /home/condor/execute/
>>
>> SINGULARITY_JOB = RegExp("^(user001|user002)$", TARGET.Owner)
>> SINGULARITY_IMAGE_EXPR = "/opt/exp_software/common/UI_SL6.img"
>> SINGULARITY_BIND_EXPR = "/cvmfs /tmp /opt/exp_software/common /data"
>>
>> However, I'm not sure how to pass --home $HOME --pwd $HOME, as these
>> values are set at runtime, and trying something like
>>
>> SINGULARITY_EXTRA_ARGUMENTS = "--home $HOME --pwd $HOME"
>>
>> fails (does not resolve $HOME).
>>
>> Looking at the job env I see:
>>
>> HOME=/home/user001 PWD=/home/user001
>>
>> And PWD looks wrong to me; it should rather be
>> PWD=/home/condor/execute/dir_<number>
>>
>> which is the case when running without Singularity (SINGULARITY_JOB
>> = False), where the job ends successfully. Looking inside job.err
>> I see:
>>
>> 2020-01-22 17:55:37 (245 KB/s) - `dirac-install.py' saved [91191/91191]
>>
>> tar: DIRAC/releasehistory.pdf: Wrote only 7680 of 10240 bytes
>> tar: DIRAC/RequestManagementSystem/scripts/dirac-rms-list-req-cache.py: Cannot write: No space left on device
>> [... lots more of these ...]
>>
>> Any suggestion on what I could be doing wrong? Thank you, Stefano