Xin,

If I am reading this correctly... Are you using python 3 or python 2 under Anaconda3?

Mary

Mary Romelfanger
Space Telescope Science Institute
Deputy Branch Manager - Data Systems Branch

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Xin Wang <xwang@xxxxxxxxxxxxx>

At this moment I feel quite positive that this is a bug in the python bindings. Is there a way that I can submit a bug report for this issue?

Thanks.
Xin

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx]
On Behalf Of Xin Wang

Hi John,

I tried your approach and used condor_submit -dump <dumpfile> to see the job classad for my submission file. It has ~80 lines, and most of them do not make any sense to me. I tried to add those extra settings to my script, but it did not help.

The error when running schedd.submit(job_ad) in my original script is below:

    condor_exec.exe: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory

which clearly indicates that something is wrong with the environment and condor cannot find the python3.6 shared libraries. The strange thing is that I did set PYTHONHOME in the environment, which is sufficient for condor_submit <submitfile> and for the job submitted using sub.queue(), but not sufficient for schedd.submit(job_ad). To confirm my idea, when I updated the environment to

    sub['environment'] = "PYTHONHOME=/my/path/to/anaconda3 LD_LIBRARY_PATH=/my/path/to/anaconda3/lib"

my script works with schedd.submit(job_ad).

Now the question is: do condor_submit and the job submitted using sub.queue() do anything extra that schedd.submit is not doing? For the job submitted using sub.queue(), I'm 100% sure that the job ran without issues, as I can see all results generated by my script. The only thing is that the output and error files specified in the condor config are not updated at all for the job.

Thank you.
Xin
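A minimal sketch of the raw-ad submission with the expanded environment described above, using the same placeholder paths from the thread; whether both variables are strictly required may depend on the Anaconda layout:

    import htcondor

    schedd = htcondor.Schedd()

    anaconda = "/my/path/to/anaconda3"  # placeholder path from the thread

    job_ad = {
        "cmd": anaconda + "/bin/python",
        "arguments": "/my/path/to/scripts/myrun.py",
        # PYTHONHOME alone was enough for condor_submit and sub.queue(),
        # but schedd.submit() also appears to need LD_LIBRARY_PATH so the
        # interpreter can find libpython3.6m.so.1.0.
        "env": "PYTHONHOME=%s LD_LIBRARY_PATH=%s/lib" % (anaconda, anaconda),
        "log": "/tmp/job.log",
        "out": "/tmp/test.log",
        "err": "/tmp/test.err",
    }

    cluster_id = schedd.submit(job_ad)

The only change from the original job_ad is the expanded "env" value.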
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx]
On Behalf Of John M Knoeller

[External Message]

First of all, the job submitted using schedd.submit(job_ad) doesn't run because the job ad is incomplete. When you use that method, you must fully specify the job classad. To see what a fully specified job classad looks like, run

    condor_submit -dump <submit_file>

For the job submitted using sub.queue(): are you sure that the job ran and produced output? When the job is submitted, your output and error files will be created as 0-size files before the job ever runs.

-tj
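To compare what the two submission paths actually place in the queue, one option is to query the schedd for the resulting job ad and inspect a few attributes. A rough sketch, assuming cluster_id holds the cluster of the job in question; the attribute names below are common job-ad spellings (Cmd, Env/Environment, Out, Err, UserLog) and may not be exhaustive:

    import htcondor

    schedd = htcondor.Schedd()
    cluster_id = 1234  # hypothetical: the cluster id returned by submit()/queue()

    attrs = ["Cmd", "Arguments", "Env", "Environment", "Out", "Err", "UserLog"]
    for ad in schedd.query("ClusterId == %d" % cluster_id, attrs):
        for name in attrs:
            # Attributes the submission never set simply print as None.
            print(name, "=", ad.get(name))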
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx]
On Behalf Of Xin Wang

I'm trying to submit jobs to condor to run some python scripts. If I generate a job file and submit it with condor_submit, everything works fine. Here is the job file:

    universe = vanilla
    environment = "PYTHONHOME=/my/path/to/anaconda3"
    executable = /my/path/to/anaconda3/bin/python
    arguments = /my/path/to/scripts/myrun.py
    log = /tmp/job.log
    output = /tmp/test.log
    error = /tmp/test.err
    queue

For the same job, I tried to submit through the python bindings, using two different methods, but had no luck with either.

Firstly, I tried htcondor.Submit with the following code:

    import htcondor

    schedd = htcondor.Schedd()
    sub = htcondor.Submit()
    sub['universe'] = 'vanilla'
    sub['environment'] = "PYTHONHOME=/my/path/to/anaconda3"
    sub['executable'] = '/my/path/to/anaconda3/bin/python'
    sub['arguments'] = '/my/path/to/scripts/myrun.py'
    sub['log'] = '/tmp/job.log'
    sub['output'] = '/tmp/test.log'
    sub['error'] = '/tmp/test.err'

    with schedd.transaction() as txn:
        sub.queue(txn)

The job was submitted without any issues, ran successfully, and the log file /tmp/job.log was generated successfully. However, output and error do not work: /tmp/test.log and /tmp/test.err are generated but with size 0 (empty).

Secondly, I tried schedd.submit with the following code:

    import htcondor

    schedd = htcondor.Schedd()
    job_ad = {
        "cmd": '/my/path/to/anaconda3/bin/python',
        "arguments": '/my/path/to/scripts/myrun.py',
        'env': "PYTHONHOME=/my/path/to/anaconda3",
        "log": '/tmp/job.log',
        "out": '/tmp/test.log',
        "err": "/tmp/test.err",
    }
    clusterId = schedd.submit(job_ad)

The job could not run. However, /tmp/test.err was generated with a proper error message:

    condor_exec.exe: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory

I suspect that the error is because the environment is not properly set, but I had no luck when I also tried to set 'environment' instead of 'env'.

How should I fix the settings so that I can submit condor tasks through the python bindings properly?

Thanks.
Xin
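One way to see what Submit.queue() adds on top of the hand-built dictionary is to ask it to return the job ad it actually queues. This is only a sketch and assumes a bindings version whose Submit.queue() accepts the ad_results argument:

    import htcondor

    schedd = htcondor.Schedd()
    sub = htcondor.Submit()
    sub['universe'] = 'vanilla'
    sub['environment'] = "PYTHONHOME=/my/path/to/anaconda3"
    sub['executable'] = '/my/path/to/anaconda3/bin/python'
    sub['arguments'] = '/my/path/to/scripts/myrun.py'
    sub['log'] = '/tmp/job.log'
    sub['output'] = '/tmp/test.log'
    sub['error'] = '/tmp/test.err'

    ads = []  # receives the ClassAd(s) generated by this queue() call
    with schedd.transaction() as txn:
        sub.queue(txn, 1, ads)

    # Every attribute printed here that is absent from the raw job_ad dict is
    # something schedd.submit() would otherwise have to be given explicitly.
    for ad in ads:
        print(ad)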