Michael, thank you for your reply.

From your message pointing me to the "-name" option, I have been trying both the -name and -remote options of condor_submit, and they work just fine. Unfortunately I use DAG jobs, and I cannot get them to run correctly with the "-r" option of condor_submit_dag (AFAIK it is equivalent to "condor_submit -remote", but there is no equivalent to "condor_submit -name" for DAGs, right?)

I am submitting a simple a.dag file, but the DAG job just gets stuck and never runs, and I find the following in the a.dag.dagman.out file in the spool directory:

10/02/18 10:08:34 (fd:4) (pid:32476) (D_ALWAYS) DAGMAN_LOG_ON_NFS_IS_ERROR setting: False
10/02/18 10:08:34 (fd:4) (pid:32476) (D_ALWAYS) Default node log file is: <C:\condor\spool\88\0\cluster88.proc0.subproc0\.\a.dag.nodes.log>
10/02/18 10:08:34 (fd:4) (pid:32476) (D_ALWAYS) DAG Lockfile will be written to a.dag.lock
10/02/18 10:08:34 (fd:4) (pid:32476) (D_ALWAYS) DAG Input file is a.dag
10/02/18 10:08:34 (fd:4) (pid:32476) (D_ALWAYS) Parsing 1 dagfiles
10/02/18 10:08:34 (fd:4) (pid:32476) (D_ALWAYS) Parsing a.dag ...
10/02/18 10:08:34 (fd:4) (pid:32476) (D_ALWAYS) ERROR: Could not open file a.dag for input (cwd) (errno 2, No such file or directory)
10/02/18 10:08:34 (fd:4) (pid:32476) (D_ALWAYS) Removing any/all submitted HTCondor jobs...
10/02/18 10:08:34 (fd:4) (pid:32476) (D_ALWAYS) Running: C:\condor\bin\condor_rm.exe -const DAGManJobId' '=?=' '88
10/02/18 10:08:35 (fd:4) (pid:32476) (D_ALWAYS) Warning: failure: C:\condor\bin\condor_rm.exe -const DAGManJobId' '=?=' '88
10/02/18 10:08:35 (fd:4) (pid:32476) (D_ALWAYS) (my_pclose() returned 1 (errno 2, No such file or directory))
10/02/18 10:08:35 (fd:4) (pid:32476) (D_ALWAYS) ERROR: Warning is fatal error because of DAGMAN_USE_STRICT setting
10/02/18 10:08:35 (fd:4) (pid:32476) (D_ALWAYS) Aborting DAG...
10/02/18 10:08:35 (fd:4) (pid:32476) (D_ALWAYS) Writing Rescue DAG to a.dag.rescue002...

The a.dag file has certainly not been copied into that directory.
In a.dag.dagman.log I am also getting this:

        (0) Abnormal termination (signal -1073741819)

Any idea on how to fix this?

Thanks
Oscar

-----Original Message-----
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On behalf of Michael Pelletier
Sent: Tuesday, September 25, 2018 16:43
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Submitting to one of several independent pools

Oscar,

The "-name" option to condor_submit is what you're looking for:

       -name schedd_name

          Submit to the specified condor_schedd. Use this option to submit to
          a condor_schedd other than the default local one. schedd_name is
          the value of the Name ClassAd attribute on the machine where the
          condor_schedd daemon runs.

You would set up the workstation with a default scheduler, probably the production one, and then to submit for test you'd add the "-name" option to the submission to specify the hostname of the test pool's schedd.

If you want to avoid the need for the command-line option while testing, so you don't have to change options going from test to production, you can set the _CONDOR_SCHEDD_NAME environment variable to override the default scheduler setting in the workstation's configuration file.
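
As a rough sketch of the two approaches above, the helper below only builds the command line and environment for a condor_submit call; it does not run anything. The function name build_submit, the submit file job.sub, and the schedd name test-schedd.example.org are hypothetical placeholders, not values from your pools.

```python
import os


def build_submit(submit_file, schedd_name=None, use_env=False, base_env=None):
    """Return (argv, env) for a condor_submit call; nothing is executed.

    build_submit is a hypothetical helper for illustration only.
    """
    # Copy the environment so the caller's environment is never modified.
    env = dict(os.environ if base_env is None else base_env)
    argv = ["condor_submit"]
    if schedd_name and use_env:
        # Approach 2: pick the schedd through the environment variable
        # mentioned above, leaving the command line unchanged.
        env["_CONDOR_SCHEDD_NAME"] = schedd_name
    elif schedd_name:
        # Approach 1: pick the schedd with the "-name" option.
        argv += ["-name", schedd_name]
    argv.append(submit_file)
    return argv, env


# Hypothetical test-pool schedd name and submit file.
argv, env = build_submit("job.sub", "test-schedd.example.org")
```

On a machine with HTCondor installed, the result could then be passed to subprocess.run(argv, env=env, check=True). With use_env=True the command line stays the same for test and production, and only the environment differs.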
Michael V. Pelletier
Information Technology
Digital Transformation & Innovation
Integrated Defense Systems
Raytheon Company