Re: [Condor-users] startd hangs when using job hooks
- Date: Tue, 09 Feb 2010 10:04:16 -0500
- From: Matthew Farrellee <matt@xxxxxxxxxx>
- Subject: Re: [Condor-users] startd hangs when using job hooks
Michael Moore wrote:
I am trying to implement a set of fetch and prepare hooks. However, when
testing the hooks I experience hangs of condor_startd. When startd hangs
it stops responding to requests, including Condor shutdown requests.
Only a process-level kill ends the process.
The host running the hooks is a Windows Vista host running Condor 7.4.1.
The prepare hook does take some time to run (on the order of minutes).
However, startd does not always hang during the prepare hook. Sometimes
startd hangs after the job begins executing, sometimes it doesn't hang
at all.
Has anyone else seen similar behavior? Was there a way to work around
the problem? Apparently, there was a similar problem in 7.3.2 and earlier
where a very simple fetch hook would cause startd to hang. I haven't
figured out what portion of the hook triggers this behavior; it is very
intermittent.
Thanks,
Michael Moore
There are a few known issues with hooks on Windows...
http://condor-wiki.cs.wisc.edu/index.cgi/search?s=hook+windows
Specifically...
http://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=422
http://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=864
Do either of those sound like your problem?
I believe one of those is related to using Windows on a machine with
many CPUs -- or at least it is more reproducible there.
Best,
matt