Re: [HTCondor-users] condor_dagman not creating jobs



This indeed looks like a proxy-related issue. condor_submit doesn't send the proxy to the schedd, but it does read the proxy to verify it's valid.

Can you try submitting one of the affected node jobs by hand from the command line? If that also crashes, then it'll be easier to track down the cause.
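
If it helps, here is a minimal Python sketch for recovering the argument list from a "submit command was:" line in .condor_dagman.out, so the same submit can be repeated by hand. It assumes the logged command follows POSIX shell quoting (the `' '` sequences look like quoted spaces); the helper name `submit_argv` is hypothetical, and the sample line is a shortened version of the one in your log.

```python
import shlex

# Marker that precedes the logged command in .condor_dagman.out.
PREFIX = "submit command was: "


def submit_argv(log_line):
    """Return the argument list logged for a submit attempt."""
    _, found, command = log_line.partition(PREFIX)
    if not found:
        raise ValueError("no 'submit command was:' marker in line")
    # Assumption: the log uses POSIX shell quoting, so shlex can split it.
    return shlex.split(command)


# Shortened sample line; most of the -a arguments are left out here.
line = (
    "10/27/21 12:52:35 submit command was: /usr/bin/condor_submit "
    "-a dag_node_name' '=' 'job2 -a notification' '=' 'never refit.prob.sub"
)

argv = submit_argv(line)
print(argv)
```

Running the resulting argument list from the directory that holds refit.prob.sub (for instance with `subprocess.run(argv)`) should repeat what DAGMan attempted. Arguments that appear with double quotes in the log may need their quoting checked by hand.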

 - Jaime

> On Oct 27, 2021, at 1:02 PM, Vladimir Brik <vladimir.brik@xxxxxxxxxxxxxxxx> wrote:
> 
> Hello
> 
> I've run into an issue where dagman seems to be unable to create jobs because condor_submit segfaults.
> 
> .condor_dagman.out contains:
> 10/27/21 12:52:35 ERROR: submit attempt failed
> 10/27/21 12:52:35 submit command was: /usr/bin/condor_submit -a dag_node_name' '=' 'job2 -a submit_event_notes' '=' 'DAG' 'Node:' 'job2 -a dagman_log' '=' '/mnt/scratch/tyuan/refit/./refit.prob.dag.nodes.log -a +DAGManNodesMask' '=' '"0,1,2,4,5,7,9,10,11,12,13,16,17,24,27,35,36" -a JOB=job2 -a OUTPUT_DIR' '=' '/data/user/tyuan/studies/tablemaker/refits/prob -a INPUT_DIR' '=' '/data/user/chill/photo-table -a FILE_NAME' '=' 'cascade_halftable_spice_3.2.1_flat_z0_zen100_azi180_nevents40000_0_range.fits -a DAG_STATUS' '=' '2 -a FAILED_COUNT' '=' '1 -a notification' '=' 'never -a +DAGParentNodeNames' '=' '"" refit.prob.sub
> 10/27/21 12:52:35 Job submit try 1/6 failed, will try again in >= 1 second.
> 
> dmesg contains:
> [2335469.858471] condor_submit[2260162]: segfault at a ip 00007efd3f70e2cb sp 00007ffd24306b40 error 4 in libglobus_gsi_credential.so.1.6.14[7efd3f707000+9000]
> [2335469.864387] Code: 00 48 c7 44 24 08 00 00 00 00 48 85 ff 74 07 e8 9b 93 ff ff 89 c5 4d 85 ff 74 3f 4c 8d 6c 24 08 49 8b 07 4c 89 ee 48 8b 40 20 <48> 8b 78 08 e8 bc 92 ff ff 85 c0 75 78 48 8b 03 48 8b 54 24 08 48
> 
> We are running version 9.0.6 on CentOS 8.
> 
> My simple test dags seem to be fine, so it doesn't always fail. Perhaps it has something to do with sending x509 proxies with the jobs?
> 
> Any help would be appreciated.
> 
> 
> Vlad