On 8/10/06, Nomura Kohei <kh-nomura@xxxxxxxxx> wrote:
> Have you set these to numbers which would give you 2 hours delay?
No. I checked my config file, and these parameters are commented out:
#POLLING_INTERVAL=5
#ALIVE_INTERVAL = 300
#MAX_SHADOW_EXCEPTIONS = 5
Is this wrong?
No, this is fine. It just means Condor uses the defaults (indicated in the file).
> What is the schedd/shadow log indicating during this time?
I attached schedd and shadow log. (ClusterID of the job is 290.0)
Please check these files.
OK, these definitely look bad:
8/3 13:41:29 Initializing a VANILLA shadow for job 290.0
8/3 13:41:30 (290.0) (372): Request to run on <192.168.0.2:3817> was ACCEPTED
8/3 15:34:57 ******************************************************
8/3 15:34:57 ** condor_shadow (CONDOR_SHADOW) STARTING UP
8/3 15:34:57 ** C:\condor\bin\condor_shadow.exe
8/3 15:34:57 ** $CondorVersion: 6.8.0 Jul 19 2006 $
8/3 15:34:57 ** $CondorPlatform: INTEL-WINNT50 $
8/3 15:34:57 ** PID = 3472
8/3 15:34:57 ** Log last touched 8/3 15:34:30
8/3 15:34:57 ******************************************************
... snip
8/3 15:41:36 (290.0) (372): condor_read(): recv() returned -1, errno = 10054, assuming failure.
8/3 15:41:36 (290.0) (372): Can no longer talk to condor_starter <192.168.0.2:3817>
8/3 15:41:37 (290.0) (372): Trying to reconnect to disconnected job
8/3 15:41:37 (290.0) (372): LastJobLeaseRenewal: 1154580099 Thu Aug 03 13:41:39 2006
8/3 15:41:37 (290.0) (372): JobLeaseDuration: 60 seconds
8/3 15:41:37 (290.0) (372): JobLeaseDuration remaining: EXPIRED!
8/3 15:41:37 (290.0) (372): Reconnect FAILED: Job disconnected too long: JobLeaseDuration (60 seconds) expired
8/3 15:41:37 (290.0) (372): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 107
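The lease lines in the log excerpt can be checked by hand. The following is only a sketch of the arithmetic the log implies, not Condor's actual code: the function name is hypothetical, and the UTC+9 offset is an assumption chosen because it makes the epoch value print as the 13:41:39 shown in the log.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the shadow log excerpt.
LAST_JOB_LEASE_RENEWAL = 1154580099   # epoch seconds
JOB_LEASE_DURATION = 60               # seconds

# Assumption: log timestamps are local time at UTC+9, the offset that
# makes the epoch value above print as 13:41:39 like the log does.
LOG_TZ = timezone(timedelta(hours=9))


def lease_seconds_remaining(last_renewal_epoch, lease_seconds, now):
    """Seconds left on the lease at `now`; negative means it has expired."""
    last_renewal = datetime.fromtimestamp(last_renewal_epoch, tz=timezone.utc)
    expiry = last_renewal + timedelta(seconds=lease_seconds)
    return (expiry - now).total_seconds()


last_renewal_local = datetime.fromtimestamp(LAST_JOB_LEASE_RENEWAL, tz=LOG_TZ)
# Time of the "Trying to reconnect" entry, taken from the log.
reconnect_attempt = datetime(2006, 8, 3, 15, 41, 37, tzinfo=LOG_TZ)
remaining = lease_seconds_remaining(
    LAST_JOB_LEASE_RENEWAL, JOB_LEASE_DURATION, reconnect_attempt
)

print("last renewal:", last_renewal_local.strftime("%m/%d %H:%M:%S"))
print("seconds since renewal:",
      (reconnect_attempt - last_renewal_local).total_seconds())
print("lease remaining:", remaining)
```

Under these assumptions the lease was last renewed ten seconds after the shadow was initialized and then not again for 7198 seconds, so a 60 second lease was long expired by the time the reconnect was attempted.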
Incidentally, did you snip a bunch of stuff from the logs after the second
line of the excerpt (the 13:41:30 "ACCEPTED" entry)?
If you did, are you sure there are no other messages related to that pid
(372) in the log?
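One way to answer that question is to pull every entry for the pid out of the full shadow log and look for the longest silence. This is a rough sketch only: the regular expression is inferred from the lines quoted in this message rather than from any documented log format, the helper names are hypothetical, and the year is assumed because the log timestamps do not carry one.

```python
import re
from datetime import datetime

# Pattern inferred from lines such as:
#   8/3 13:41:30 (290.0) (372): Request to run on ...
LINE_RE = re.compile(
    r"^(\d+)/(\d+) (\d+):(\d+):(\d+) \((\d+\.\d+)\) \((\d+)\): (.*)$"
)


def entries_for_pid(lines, pid, year=2006):
    """Return (timestamp, message) pairs for log lines tagged with `pid`."""
    found = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and int(match.group(7)) == pid:
            month, day, hour, minute, second = (
                int(match.group(i)) for i in range(1, 6)
            )
            stamp = datetime(year, month, day, hour, minute, second)
            found.append((stamp, match.group(8)))
    return found


def largest_gap(entries):
    """Return (gap_seconds, earlier_entry, later_entry) for the longest silence."""
    best = None
    for earlier, later in zip(entries, entries[1:]):
        gap = (later[0] - earlier[0]).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, earlier, later)
    return best


# Lines taken from the excerpt in this message; a real check would read
# the whole shadow log file instead.
SAMPLE = [
    "8/3 13:41:29 Initializing a VANILLA shadow for job 290.0",
    "8/3 13:41:30 (290.0) (372): Request to run on <192.168.0.2:3817> was ACCEPTED",
    "8/3 15:34:57 ** PID = 3472",
    "8/3 15:41:36 (290.0) (372): Can no longer talk to condor_starter <192.168.0.2:3817>",
    "8/3 15:41:37 (290.0) (372): Trying to reconnect to disconnected job",
]

entries = entries_for_pid(SAMPLE, 372)
gap_seconds, before, after = largest_gap(entries)
print(len(entries), "entries for pid 372; longest silence:", gap_seconds, "seconds")
```

If the full log shows the same gap as the excerpt, then nothing was logged for that shadow between the ACCEPTED entry and the failed read.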
This is starting to smell like it might be a bug, but the Condor guys
would probably be much better at debugging it from here.
Since I am planning on using JobLeases, I'll probably take a look at the
logic in a bit, but I don't have the time right now.
Matt