[condor-users] Question about Job Held in DAGMAN
- Date: Tue, 6 Apr 2004 03:33:54 -0700 (PDT)
- From: Yuhong Feng <april_continue@xxxxxxxxx>
- Subject: [condor-users] Question about Job Held in DAGMAN
Dear Condor-users,
When I run the nodes of a DAG independently, each one
works. However, when I run them as a DAG under DAGMan, the job fails.
Would you please help me figure out the reason?
Your time and patience are highly appreciated.
The job node description file:
#mccf_a submit script
Executable = JobExec.class
transfer_input_files=/usr5/postgrad/yuhong/Condor/dagman/JobExec.class
Universe = globus
Globusscheduler = surya.ntu.edu.sg/jobmanager-condor
GlobusRSL = (condor_submit=(Universe java))
Arguments = JobExec \"/staff/yuhongf/Condor/Data/datasetsA\"
output = /usr5/postgrad/yuhong/Condor/dagman/mccf_a/client_code_mccf_a.out
error = /usr5/postgrad/yuhong/Condor/dagman/mccf_a/client_code_mccf_a.error
log = /usr5/postgrad/yuhong/Condor/dagman/mccf_a/client_Code_MCCF.log
transfer_files = ALWAYS
should_transfer_files = YES
when_to_transfer_output = on_exit
Queue 1
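In the submit file above, Executable is the only file entry given as a relative name; the others are absolute paths. The following is a minimal Python sketch for checking where such a name would resolve from a given working directory. It assumes, without confirmation from this post, that relative names are resolved against the directory the submission is made from; it is a diagnostic aid, not an established explanation of the failure. The function names relative_paths and resolve_from and the constant SUBMIT_TEXT are hypothetical, and the sample text is an excerpt of the submit file quoted above.

```python
import os

# Excerpt of the submit file quoted in this post (illustration only).
SUBMIT_TEXT = """\
#mccf_a submit script
Executable = JobExec.class
transfer_input_files=/usr5/postgrad/yuhong/Condor/dagman/JobExec.class
output = /usr5/postgrad/yuhong/Condor/dagman/mccf_a/client_code_mccf_a.out
log = /usr5/postgrad/yuhong/Condor/dagman/mccf_a/client_Code_MCCF.log
Queue 1
"""

# Submit-file keys whose values are treated as file paths in this sketch.
PATH_KEYS = {"executable", "transfer_input_files", "output", "error", "log"}


def relative_paths(submit_text):
    """Return {key: value} for path entries that do not start with '/'."""
    found = {}
    for line in submit_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment (e.g. Queue).
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.lower() in PATH_KEYS and value and not value.startswith("/"):
            found[key] = value
    return found


def resolve_from(cwd, paths):
    """Map each key to (absolute path, exists) when resolved against cwd."""
    resolved = {}
    for key, value in paths.items():
        full = os.path.normpath(os.path.join(cwd, value))
        resolved[key] = (full, os.path.exists(full))
    return resolved


if __name__ == "__main__":
    rel = relative_paths(SUBMIT_TEXT)
    # Compare the result for the directory used with condor_submit against
    # the directory used with condor_submit_dag.
    for key, (full, exists) in resolve_from(os.getcwd(), rel).items():
        print(key, full, "found" if exists else "missing")
```

Running the check once from each directory shows whether the relative name points at an existing file in both cases.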
When I run:
>condor_submit client_mccf_a.submit
it works.
However, the DAG file is:
# Corresponding to agentcore1234.xml.7.3
Job MCCF_A /usr5/postgrad/yuhong/Condor/dagman/mccf_a/client_mccf_a.submit
and when I run:
>condor_submit_dag client.dag
>condor_q
-- Submitter: kusu.sas.ntu.edu.sg : <155.69.144.127:42548> : kusu.sas.ntu.edu.sg
 ID      OWNER     SUBMITTED    RUN_TIME   ST PRI SIZE CMD
 503.0   yuhongf   4/6 18:26    0+00:04:02 R  0   3.5  condor_dagman -f -
 504.0   yuhongf   4/6 18:26    0+00:00:00 H  0   0.0  JobExec.class JobE
2 jobs; 0 idle, 1 running, 1 held
The messages in the log file are:
000 (504.000.000) 04/06 18:26:51 Job submitted from host: <155.69.144.127:42548>
    DAG Node: MCCF_A
...
017 (504.000.000) 04/06 18:27:08 Job submitted to Globus
    RM-Contact: surya.ntu.edu.sg/jobmanager-condor
    JM-Contact: X
    Can-Restart-JM: 0
...
012 (504.000.000) 04/06 18:27:08 Job was held.
    Globus error 43: the job manager failed to stage the executable
    Code 2 Subcode 43
...
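For longer logs, the hold reason can be pulled out mechanically. The following is a minimal Python sketch that splits a log in the format shown above into events and keeps those with event code 012 (the "Job was held" code in the excerpt). The names parse_events, held_events and LOG_TEXT are hypothetical, and the parsing rules (a three-digit code on the header line, indented detail lines, "..." as a terminator) are inferred from this excerpt only, not from a format specification.

```python
import re

# Log excerpt from this post, with wrapped lines joined (illustration only).
LOG_TEXT = """\
000 (504.000.000) 04/06 18:26:51 Job submitted from host: <155.69.144.127:42548>
    DAG Node: MCCF_A
...
017 (504.000.000) 04/06 18:27:08 Job submitted to Globus
    RM-Contact: surya.ntu.edu.sg/jobmanager-condor
    JM-Contact: X
    Can-Restart-JM: 0
...
012 (504.000.000) 04/06 18:27:08 Job was held.
    Globus error 43: the job manager failed to stage the executable
    Code 2 Subcode 43
...
"""

# Header line: event code, (cluster.proc.subproc), date, time, summary text.
EVENT_HEADER = re.compile(
    r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\d\d/\d\d \d\d:\d\d:\d\d) (.*)$"
)


def parse_events(log_text):
    """Split log text into a list of event dicts."""
    events = []
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line == "...":
            # Terminator: close the event being collected, if any.
            if current is not None:
                events.append(current)
                current = None
            continue
        match = EVENT_HEADER.match(line)
        if match:
            if current is not None:
                events.append(current)
            code, cluster, proc, _sub, when, summary = match.groups()
            current = {
                "code": code,
                "job": f"{int(cluster)}.{int(proc)}",
                "time": when,
                "summary": summary,
                "details": [],
            }
        elif current is not None and line:
            # Any other non-empty line is a detail of the current event.
            current["details"].append(line)
    if current is not None:
        events.append(current)
    return events


def held_events(events):
    """Keep only events whose code is 012 (job held in the excerpt above)."""
    return [event for event in events if event["code"] == "012"]


if __name__ == "__main__":
    for event in held_events(parse_events(LOG_TEXT)):
        print(event["job"], event["time"], event["summary"])
        for detail in event["details"]:
            print("   ", detail)
```

The job identifier is reduced to cluster.proc form so that it can be matched against the ID column of condor_q.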
Would you please tell me how to make this work?
Thank you very much!
Best regards,
Yuhong