Re: [Condor-users] shadow exception error?
- Date: Mon, 24 Jul 2006 09:42:26 -0700
- From: "yaoheng zhang" <yaoheng.zhang@xxxxxxxxxxxx>
- Subject: Re: [Condor-users] shadow exception error?
Hi Jun
I think you can set ALL_DEBUG = D_FULLDEBUG, or set D_FULLDEBUG in the debug
setting of a specific daemon. The log files will then contain more information.
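As a rough sketch of what that could look like in the local configuration file: the fragment below assumes the per-daemon settings follow Condor's <SUBSYS>_DEBUG naming, and that the shadow and starter are the daemons of interest because they are the two named in the error message. The file name and the choice of daemons are assumptions, not something confirmed in this thread.

```
# condor_config.local (assumed file name; use your pool's local config file)

# Option 1: full debug output for every daemon
ALL_DEBUG = D_FULLDEBUG

# Option 2 (alternative): full debug output only for the daemons named
# in the "Shadow exception" message. Choosing these two is an assumption.
SHADOW_DEBUG  = D_FULLDEBUG
STARTER_DEBUG = D_FULLDEBUG
```

After changing the configuration, running condor_reconfig on the affected machines should make the daemons pick it up. The extra detail would then be expected in the daemon logs (typically ShadowLog on the submit machine and StarterLog on the execute machine) under the directory reported by condor_config_val LOG.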
Yaoheng Zhang
-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Jun Wang
Sent: 19 July 2006 22:35
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] shadow exception error?
Dear condor-users:
I just installed condor_6.6.11 on our Linux cluster, which consists of 12 nodes
with an NFS file system.
When I ran condor_q and condor_status on the master node (central manager)
and the slave nodes (compute nodes), I got the normal screen output showing
how many jobs are running and which machines are in my pool. Then I
tried to run the test example "sh_loop" under condor-6.6.11/examples as user
condor, using condor_submit sh_loop.cmd on my master node. The job terminated
normally. However, when I submitted sh_loop.cmd on my slave node, I
got a shadow exception error message in the file sh_loop.log, as shown below:
000 (005.000.000) 07/19 22:13:57 Job submitted from host: <10.0.2.2:36083>
..
007 (005.000.000) 07/19 22:14:00 Shadow exception!
Can no longer talk to condor_starter on execute machine (10.0.2.1)
0 - Run Bytes Sent By Job
0 - Run Bytes Received By Job
..
007 (005.000.000) 07/19 22:14:01 Shadow exception!
Can no longer talk to condor_starter on execute machine (10.0.2.1)
0 - Run Bytes Sent By Job
0 - Run Bytes Received By Job
..
007 (005.000.000) 07/19 22:14:03 Shadow exception!
Can no longer talk to condor_starter on execute machine (10.0.2.1)
0 - Run Bytes Sent By Job
0 - Run Bytes Received By Job
..
007 (005.000.000) 07/19 22:14:04 Shadow exception!
Can no longer talk to condor_starter on execute machine (10.0.2.1)
0 - Run Bytes Sent By Job
0 - Run Bytes Received By Job
..
007 (005.000.000) 07/19 22:14:05 Shadow exception!
Can no longer talk to condor_starter on execute machine (10.0.2.1)
0 - Run Bytes Sent By Job
0 - Run Bytes Received By Job
..
009 (005.000.000) 07/19 22:14:40 Job was aborted by the user.
via condor_rm (by user condor)
..
Does anybody know the possible reason?
Jun Wang
junwang@xxxxxxxxxxxxxx
2006-07-19