Mailing List Archives
Re: [Condor-users] Condor SOAP hanging schedd.
- Date: Tue, 6 Jul 2010 14:51:37 -0700
- From: Patrick Armstrong <patricka@xxxxxxx>
- Subject: Re: [Condor-users] Condor SOAP hanging schedd.
On 6-Jul-10, at 10:01 AM, Matthew Farrellee wrote:
> I imagine if the Schedd is using CPU and IO bandwidth then it's just
> a matter of the response to the getJobAds taking a long time to
> write. If this happened all the time I'd imagine maybe the Schedd is
> just slow. However, it could be that your client is periodically
> reading slowly. Maybe the client is interleaving reads with
> computation.
No, I wrote the client myself. All it does is read, and once it has
finished reading, it starts parsing the XML. What actually happened
was that my client was timing out: its timeout was set to 90 seconds,
and the schedd took about 5 minutes to generate and send the response.
My workaround for now is to set a very large timeout, which seems to
be working okay.
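
For illustration, here is a minimal sketch of that workaround using
only the Python standard library. It is not my actual client: the
function name, the default of 600 seconds, the "/" path and the empty
SOAPAction header are assumptions, and the SOAP envelope is passed in
as bytes because its exact shape depends on the schedd's WSDL. The
timeout given to http.client applies to each blocking socket
operation, so it has to exceed the longest stretch in which the schedd
sends nothing.

```python
import http.client


def fetch_soap_response(host, port, envelope, timeout_s=600.0, path="/"):
    """POST a SOAP envelope and read the whole reply before any parsing.

    All names and defaults here are illustrative assumptions.
    """
    # timeout_s bounds each blocking socket call (connect, send, recv),
    # not the total duration of the request.
    conn = http.client.HTTPConnection(host, port, timeout=timeout_s)
    try:
        conn.request(
            "POST",
            path,
            body=envelope,
            headers={
                "Content-Type": "text/xml; charset=utf-8",
                "SOAPAction": '""',  # assumed; check the service's WSDL
            },
        )
        response = conn.getresponse()
        # Read everything first; XML parsing happens only afterwards.
        return response.status, response.read()
    finally:
        conn.close()
```

With timeout_s set to 90, a response that takes five minutes to start
arriving would raise a timeout error instead of returning.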
I think there might actually be a bug here, since the schedd seems to
choke if the client times out. You can test this by running getJobAds
against a schedd, then canceling the request before it completes. The
schedd will just sit there spinning the CPU until the Master
eventually kills it.
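
A rough way to script that test, assuming a plain TCP connection to
the schedd's SOAP port: send the request, wait a moment, then close
the socket without reading the reply. The function name and the
linger_s parameter are hypothetical, and request_bytes must be a
complete HTTP POST carrying the getJobAds envelope, which is not
shown here because it depends on the schedd's WSDL.

```python
import socket
import time


def send_and_abandon(host, port, request_bytes, linger_s=1.0):
    """Send a request, wait briefly, then close without reading the reply.

    Hypothetical helper for reproducing a client that gives up early.
    """
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request_bytes)
        # Give the server time to start generating its response.
        time.sleep(linger_s)
    # Leaving the with-block closes the socket, so the server finds
    # its peer gone while it is still producing or writing the reply.
    return len(request_bytes)
```

After calling it, watch the schedd's CPU usage to see whether it
recovers or keeps spinning.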
It could be that I'm interpreting this wrong though. Any thoughts?
--patrick