Re: [condor-users] Dagman stalling with shadow exception messages?
- Date: Wed, 07 Apr 2004 07:23:38 +0100
- From: Mark Calleja <mcal00@xxxxxxxxxxxxx>
- Subject: Re: [condor-users] Dagman stalling with shadow exception messages?
Hi Michael,
-- ShadowLog on submit host:
4/6 21:00:27 Initializing a VANILLA shadow
4/6 21:00:27 (22190.0) (7173): Request to run on <192.168.1.111:32771> was ACCEPTED
4/6 21:00:27 (22190.0) (7173): ERROR "Can no longer talk to condor_starter on execute machine (192.168.1.111)" at line 63 in file NTreceivers.C
----------------------
Note: The above message is repeated for any render host that gets matched,
and the hosts are definitely up and visible to the submit host. In
addition, that same render host will happily render other jobs from other
dags in other people's queues.
You say that the execute node is "visible" to the submit node: to what
extent? Whenever I have seen these error messages, the cause was a
firewall somewhere blocking some of the traffic back to the submitting node.
Is your environment firewalled, or does the submitting node run its own
firewall (pfilter, etc.)?
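One rough way to check what "visible" means in practice is to attempt a plain TCP connection rather than rely on ping. The sketch below is only an illustration, not a Condor tool: the helper name can_connect is made up here, and the host and port you pass in are whatever you want to probe (for example the 192.168.1.111:32771 address from the ShadowLog above). Since the daemons may use other, dynamically assigned ports, one successful connection does not rule out a firewall, and it is worth running the check in both directions (submit node to execute node, and back).

```python
import socket
import sys


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests basic TCP
    reachability and knows nothing about Condor's own protocol.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python check_port.py <host> <port>
    target_host, target_port = sys.argv[1], int(sys.argv[2])
    state = "reachable" if can_connect(target_host, target_port) else "NOT reachable"
    print(f"{target_host}:{target_port} is {state}")
```

A refused or timed-out connection from one side but not the other would point towards filtering somewhere in between.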
Just my two cents' worth...
Mark
--
Dr Mark Calleja
Department of Earth Sciences, University of Cambridge
Downing Street, Cambridge CB2 3EQ, UK
Tel. (+44/0) 1223 333408, Fax (+44/0) 1223 333450
http://www.esc.cam.ac.uk/~mcal00