I wasn't looking to run them elsewhere. We have a user whose DAG jobs haven't been running and that was my initial hunch, that they were trying to run on a machine that wasn't a submit node. So I was trying to force them to stay home by selecting C6, since our submit node is C6.
At this point, I have already started working on a new C7 submit node, but now I'm curious why they aren't working if the requirements settings are ignored on those jobs.
Kevin
From: Mark Coatsworth <coatsworth@xxxxxxxxxxx>
Sent: Thursday, December 05, 2019 12:06 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Kevin Leigeb <kevin.leigeb@xxxxxxxx>
Subject: Re: [HTCondor-users] dagman behavior question
Hi Kevin, I was about to get into a series of debugging steps, but then my coworker TJ just reminded me of something important. Since scheduler universe jobs can only run on the submit machine, the Requirements expression is ignored. The job will run on your local machine regardless of what's in there.
Is there a specific reason you're looking to run scheduler universe jobs elsewhere in your pool? Let me know, there's probably a better way for you to do whatever it is you need.
Mark
On Wed, Dec 4, 2019 at 2:14 PM Kevin Leigeb via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
We have a unique situation as we're still pulling up a few nodes from CentOS 6, but we've used the trick here (https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=OsMigrationHints) to submit to C7 by default. Unfortunately, our submit server is still C6 so this is causing issues.
I tried to fix this by setting the dagman submit file to include (opsysandver == centos6), but in the resulting classad it showed centos7.
####
From the submit file
####
# Filename: blast_contigs.dag.condor.sub
# Generated by condor_submit_dag blast_contigs.dag
universe = scheduler
executable = /usr/bin/condor_dagman
getenv = True
output = blast_contigs.dag.lib.out
error = blast_contigs.dag.lib.err
log = blast_contigs.dag.dagman.log
remove_kill_sig = SIGUSR1
requirements = ( target.OpSysAndVer == "CentOS6" )
####
From the classad
####
Requirements = (Target.OpSysandVer == "CentOS7") && (TARGET.Arch == "X86_64") && (TARGET.OpSys == "LINUX") && (TARGET.Disk >= RequestDisk) && (TARGET.Memory >= RequestMemory)
####
Thanks for your help!
Kevin
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Mark Coatsworth
Sent: Wednesday, December 04, 2019 12:42 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] dagman behavior question
Hi Kevin,
Scheduler universe jobs can only run on the submit machine. They cannot get sent out to execute nodes. This behavior is baked in and cannot be changed.
As for running -no_submit and then editing the dagman submit file: this is just a regular submit file. It doesn't get any special treatment.
For the Requirements expression showing up different in the job ad: we do some manipulation behind the scenes of the Requirements expression based on other things in your submit file (and sometimes based on pool-wide settings). So it will not show up exactly as you provided it. Can you include both your modified dagman submit file, along with what you're seeing in the job ad, so we can see if this is expected behavior?
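One way to check a mismatch like this is to pull the OpSysAndVer comparison out of each Requirements string and compare them. The snippet below is only a rough sketch: the helper name os_versions and the regular expression are illustrative and not part of HTCondor, and the two strings are copied from the submit file and job ad quoted elsewhere in this thread. It assumes the clause has the simple form attribute == "value".

```python
import re
from typing import List

# Hypothetical helper, not an HTCondor API. ClassAd attribute names are
# case-insensitive, so the pattern ignores case and an optional "target." scope.
CLAUSE = re.compile(r'(?:target\.)?opsysandver\s*==\s*"([^"]+)"', re.IGNORECASE)


def os_versions(requirements: str) -> List[str]:
    """Return every value that OpSysAndVer is compared against."""
    return CLAUSE.findall(requirements)


# Requirements line as written in the edited dagman submit file.
submit_file = '( target.OpSysAndVer == "CentOS6" )'

# Requirements expression as it appeared in the job ad.
job_ad = (
    '(Target.OpSysandVer == "CentOS7") && (TARGET.Arch == "X86_64") && '
    '(TARGET.OpSys == "LINUX") && (TARGET.Disk >= RequestDisk) && '
    '(TARGET.Memory >= RequestMemory)'
)

requested = os_versions(submit_file)
effective = os_versions(job_ad)
mismatch = requested != effective

print("submit file:", requested)
print("job ad:     ", effective)
print("mismatch:   ", mismatch)
```

A mismatch here only shows that the two expressions differ. It does not say which setting rewrote the expression.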
Mark
On Wed, Dec 4, 2019 at 11:21 AM Kevin Leigeb via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
Hello All -
Running 8.8.5 on our cluster and noticed a user having issues submitting a DAG. After looking into this, I have 2 questions about the behavior of DAGman.
1) Can scheduler universe jobs run on any node in the cluster?
2) When running -no_submit and editing the dagman submit file, does this operate differently than a regular submit file?
To clarify the last point, I tried editing the requirements of the job in the submit file but looking at the ClassAd it doesn't appear that the requirements were taken into consideration.
Thanks for any help y'all can provide and let me know if there's any more information I can send to help.
Kevin
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
--
Mark Coatsworth
Systems Programmer
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin-Madison