I tried to use strace to see what was going on, but it was less than
helpful. I then tried valgrind like so:
valgrind -v condor_submit_dag
This gives a lot of output. Buried in there I found the following:
==893403== at 0x6D76270: __close_nocancel (syscall-template.S:81)
==893403== by 0x5475DB0: ??? (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403== by 0x5476022: my_popen(ArgList&, char const*, int, Env*, bool, char const*) (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403== by 0x545C3AC: Copy_macro_source_into (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403== by 0x54627FC: Parse_macros (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403== by 0x535DBF9: process_config_source (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403== by 0x5363C7A: real_config (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403== by 0x5364603: config_ex (in /usr/lib64/libcondor_utils_8_5_7.so)
==893403== by 0x40463E: main (in /usr/bin/condor_submit_dag)
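The trace shows my_popen being called while the configuration is parsed
(config_ex -> process_config_source -> Parse_macros). I downloaded the
source for 8.5.7 and realized that the /etc/condor/condor_config file
was the issue. I took a look at it and found the following: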
##
## If you've installed the condor-ec2 package, this will set TCP_FORWARDING_HOST
## to the instance's public IP and cause the startd to advertise that IP and
## the instance ID. It will also fetch and install additional config.d files
## if the instance's IAM profile is configured correctly (pointing to a single
## specific file in S3); see the manual for condor_annex for details.
##
include ifexist command into $(LOCAL_CONFIG_DIR)/49ec2-instance.config : \
/etc/condor/config.d/49ec2-instance.sh
I commented out these lines, since we don’t have the condor-ec2 package
installed on our system, and the error message went away.
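
Before commenting anything out, a quick way to confirm this kind of
problem is to check whether the script named on the include line is
actually present. The following is only a sketch: check_include_script
is a hypothetical helper, not part of HTCondor, and it assumes that the
errno=2 comes from the missing script itself.

```shell
# Hypothetical helper (not part of HTCondor): report whether the script
# named on an "include ... command" line is present and executable.
check_include_script() {
    script="$1"
    if [ -x "$script" ]; then
        echo "ok: $script"
    else
        echo "missing or not executable: $script"
    fi
}

# Path taken from the include line in /etc/condor/condor_config.
check_include_script /etc/condor/config.d/49ec2-instance.sh
```

If the script is reported as missing, the exec in the child process would
be expected to fail with "No such file or directory", which matches the
message from my_popenv.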
Hope this helps,
Steve
On Nov 14, 2016, at 1:28 PM, Pietrowicz, Stephen R <srp@xxxxxxxxxxxx> wrote:
Hi,
I’m seeing a weird error that I can’t quite figure out.
I executed the following commands:
bash-4.2$ cat simple.dag
JOB Simple srp.submit
bash-4.2$ cat srp.submit
executable = /usr/bin/hostname
universe = vanilla
input = /home/srp/short.input
output = test.out
error = test.error
log = test.log
queue
bash-4.2$ condor_submit_dag -force simple.dag
my_popenv: Failed to exec in child, errno=2 (No such file or directory)
Renaming rescue DAGs newer than number 0
-----------------------------------------------------------------------
File for submitting this DAG to HTCondor : simple.dag.condor.sub
Log of DAGMan debugging messages : simple.dag.dagman.out
Log of HTCondor library output : simple.dag.lib.out
Log of HTCondor library error messages : simple.dag.lib.err
Log of the life of condor_dagman itself : simple.dag.dagman.log
Submitting job(s).
1 job(s) submitted to cluster 297.
-----------------------------------------------------------------------
bash-4.2$
Note the “my_popenv: Failed to exec in child, errno=2 (No such file or
directory)” message.
This is under HTCondor version 8.5.7.
Any ideas why this is happening? I tried this under 8.4.9 and didn’t
see this error.
Steve