Hi,

I am attaching the dagman.out file. Could you please let me know whether there is any problem in DAGMan? I am using Condor 6.8.4 and am using
the same cluster ID. Could you please let me know whether it is
possible to submit the jobs the way I am trying to, using DAGMan?

Thanks,
Senthil

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of

Hi,

I am trying to submit a DAGMan job on Linux. I have sixteen batches of jobs. Each batch
in turn has 41 jobs. My requirement is that batch2 jobs
must not start until all batch1 jobs are done; similarly, batch3 jobs
must not start until all batch2 jobs are done. I created a DAGMan job like the one below.
The problem is that the DAGMan job fails randomly on batch3, batch4, etc. The
reason is that some of the batch3 jobs need input that is produced as output by some of
the batch2 jobs, and Condor complains that the file is not found:

Read so far: Submitting job(s).............................ERROR: Can't open "/u/Senthil/DAGMan/MatlabJobs/immuneic4401.txt" with flags 00 (No such file or directory)

Based on the timestamp, this file did not exist when the above error message appeared; it was created after that. How is this
happening? Does Condor DAGMan not wait until all the jobs of the
parent are done before starting the child jobs, or does it just wait for the last job of the parent
to complete before starting the child jobs? Is it possible to do what I am trying to
do with Condor DAGMan? Could you please let me know.

Thanks,
Senthil

JOB A Job_batch_1
JOB B Job_batch_2
JOB C Job_batch_3
JOB D Job_batch_4
JOB
JOB F Job_batch_6
JOB G Job_batch_7
JOB H Job_batch_8
JOB I Job_batch_9
JOB J Job_batch_10
JOB K Job_batch_11
JOB L Job_batch_12
JOB M Job_batch_13
JOB
JOB O Job_batch_15
JOB P Job_batch_16
PARENT A CHILD B
PARENT B CHILD C
PARENT C CHILD D
PARENT D CHILD E
PARENT E CHILD F
PARENT F CHILD G
PARENT G CHILD H
PARENT H CHILD I
PARENT I CHILD J
PARENT J CHILD K
PARENT K CHILD L
PARENT L CHILD M
PARENT M CHILD N
PARENT N CHILD O
PARENT O CHILD P
Retry A 10
Retry B 10
Retry C 10
Retry D 10
Retry E 10
Retry F 10
Retry G 10
Retry H 10
Retry I 10
Retry J 10
Retry K 10
Retry L 10
Retry M 10
Retry N 10
Retry O 10
Retry P 10
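
For reference, a DAG file with this linear-chain shape could be generated rather than typed by hand. The following is only a rough sketch in Python: the function name build_dag is hypothetical, and it assumes that node names run A through P and that every submit file follows the Job_batch_<n> pattern seen in the entries above. It only writes the text of the DAG file; it does not submit anything or change how DAGMan treats the jobs inside each submit file.

```python
import string


def build_dag(num_batches=16, retries=10):
    """Return the text of a linear-chain DAG file.

    Assumption: node names are A, B, C, ... and submit files are
    named Job_batch_1, Job_batch_2, ... (pattern taken from the post).
    """
    if not 1 <= num_batches <= 26:
        raise ValueError("this sketch only supports 1 to 26 single-letter nodes")

    names = list(string.ascii_uppercase[:num_batches])
    lines = []

    # One JOB line per batch: node name followed by its submit file.
    for index, name in enumerate(names, start=1):
        lines.append(f"JOB {name} Job_batch_{index}")

    # Chain the batches: each node is the parent of the next one.
    for parent, child in zip(names, names[1:]):
        lines.append(f"PARENT {parent} CHILD {child}")

    # Same retry count for every node, as in the post.
    for name in names:
        lines.append(f"Retry {name} {retries}")

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_dag(), end="")
```

Redirecting the output to a file (for example simulation.dag, the name suggested by the attachment) would give a file that could then be passed to condor_submit_dag.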
Attachment:
simulation.dag.dagman.out