Re: [HTCondor-users] RuntimeError: Failed to receive remote ad.
- Date: Fri, 06 Apr 2018 14:51:11 -0400
- From: Larry Martell <larry.martell@xxxxxxxxx>
- Subject: Re: [HTCondor-users] RuntimeError: Failed to receive remote ad.
On Mon, Mar 26, 2018 at 4:29 PM, Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:
> On 3/23/2018 2:41 PM, Larry Martell wrote:
>>
>> I have a python script that makes this call:
>>
>> schedd.xquery(requirements="ClusterId == %d" % id)
>>
>> Sometimes it throws an exception 'RuntimeError: Failed to receive remote
>> ad.'
>>
>
> Hi Larry,
Sorry for the lack of reply, but I am no longer working for the
company I was using condor with.
> About how often is "sometimes"... 50% of the time? 10% of the time? 1 in
> 5000000 ?
Very very infrequently. We were submitting around 4,000 jobs a day,
and this would happen maybe once every 10 days. But when it happened
it was always with the same job.
> How many queries / transactions per minute are you trying to do? For
> instance, if you submit 2,000 jobs and then subsequently do 2,000 xquery()
> calls every 10 seconds, that could be a problem.... better to submit the
> jobs and then do ONE query that fetches all 2,000 jobs every minute or
> so.... (i.e. batch your queries)
Yeah, I am doing something like that. This would be a good improvement.
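A minimal sketch of the batching idea above: one query covering many clusters rather than one call per cluster. The helper names are hypothetical, and the attribute list is passed positionally because, as far as I know, the keyword for it has been named differently across versions of the Python bindings. `schedd` is assumed to be an `htcondor.Schedd()` instance.

```python
from collections import defaultdict


def build_cluster_constraint(cluster_ids):
    """Build one ClassAd constraint matching any of the given cluster IDs."""
    ids = sorted(set(int(c) for c in cluster_ids))
    if not ids:
        return "false"  # a constraint that matches no jobs
    return " || ".join("ClusterId == %d" % c for c in ids)


def fetch_status_by_cluster(schedd, cluster_ids):
    """Fetch JobStatus for many clusters with a single schedd query.

    Returns {ClusterId: {ProcId: JobStatus}}.
    """
    if not cluster_ids:
        return {}  # nothing to ask for, so skip the round trip
    constraint = build_cluster_constraint(cluster_ids)
    # Only these three attributes are requested (the "projection").
    ads = schedd.query(constraint, ["ClusterId", "ProcId", "JobStatus"])
    by_cluster = defaultdict(dict)
    for ad in ads:
        by_cluster[ad["ClusterId"]][ad["ProcId"]] = ad["JobStatus"]
    return dict(by_cluster)
```

Calling `fetch_status_by_cluster(schedd, ids)` once a minute or so would replace a loop of per-cluster calls; for very large ID lists a simpler constraint (for example on the job owner) may be a better fit.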
> Maybe try Schedd.query() instead of xquery()? I ask because in the most
> recent versions of HTCondor, the Schedd.query() method gives more
> information about failures than Schedd.xquery(), and also query()'s
> implementation is less complex and thus less likely to have an intermittent
> failure. The only disadvantage to query() over xquery() is your python
> program may need to use more RAM as all your results are buffered in memory
> instead of streamed (probably only an issue if you are fetching many
> attributes from many thousands of jobs...).
I will keep this in mind for future work.
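For anyone who finds this thread later, here is a rough sketch of switching to query() and wrapping it in a small retry. The retry is my own addition, not something suggested above, and `query_with_retry` is a hypothetical helper. It assumes the intermittent failure surfaces as a RuntimeError, as in the original traceback, and that `schedd` is an `htcondor.Schedd()` instance.

```python
import time


def query_with_retry(schedd, constraint, attrs, attempts=3, delay=1.0):
    """Call schedd.query(), retrying a few times on RuntimeError."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            # query() hands back all matching ads at once, so the whole
            # result is held in memory rather than streamed.
            return list(schedd.query(constraint, attrs))
        except RuntimeError as err:
            last_error = err
            if attempt < attempts - 1:
                time.sleep(delay)  # brief pause before the next try
    raise last_error
```

A retry only hides a transient failure; if the same job fails every time, the last error is re-raised so it can still be investigated.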
> Finally, what version of HTCondor are you using? (always good to include
> this)
>
> p.s. it is always a good practice to add a "projection" argument to every
> call to query() or xquery() unless you truly need all 80+ attributes about
> every job.
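As an illustration of the projection advice, the original per-cluster call could look roughly like this. The attribute list and the function name are hypothetical; choose whatever attributes the script actually reads. The list is passed positionally, and `schedd` is assumed to be an `htcondor.Schedd()` instance.

```python
# Hypothetical attribute list; replace with the attributes you need.
WANTED = ["ClusterId", "ProcId", "JobStatus"]


def query_projected(schedd, cluster_id, attrs=WANTED):
    """Query one cluster, asking the schedd for only the listed attributes."""
    constraint = "ClusterId == %d" % cluster_id
    ads = schedd.query(constraint, list(attrs))
    # Copy into plain dicts; an attribute missing from an ad becomes None.
    return [{name: ad.get(name) for name in attrs} for ad in ads]
```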