It is not meaningful for UID_DOMAIN to be *. If you want it to default, just don't set it.
-tj
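As a rough illustration of the advice above, a small Python check could flag a UID_DOMAIN value that leads to this kind of trouble. The helper name check_uid_domain is hypothetical and not part of HTCondor; it only encodes the rule from this thread that the value should be a plain domain string with no * in it.

```python
def check_uid_domain(value):
    """Return a list of problems found in a UID_DOMAIN value.

    Hypothetical helper for illustration; not an HTCondor API.
    """
    problems = []
    if value is None or value.strip() == "":
        # Leaving UID_DOMAIN unset is fine: HTCondor then uses its default.
        return problems
    if "*" in value:
        problems.append("UID_DOMAIN contains '*', which is not a meaningful domain")
    return problems


if __name__ == "__main__":
    for candidate in ("*", "example.com", None):
        print(repr(candidate), check_uid_domain(candidate))
```

The current value on each node can be read with condor_config_val UID_DOMAIN and passed to the function; per this thread, the submit and compute nodes should report the same string.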
From: Biruk Mammo [mailto:birukw@xxxxxxxxxx]
Sent: Monday, March 12, 2018 2:24 PM
To: John M Knoeller <johnkn@xxxxxxxxxxx>
Cc: htcondor-users@xxxxxxxxxxx
Subject: Re: [HTCondor-users] "Failed to receive remote ad" runtime error when querying history with the python api
Changing UID_DOMAIN to a common string on the HTCondor submit and compute nodes seems to have fixed the problem. Thanks, John!
For future reference, was this a bug, or is setting UID_DOMAIN = * not supported in this way?
Yes, that seems likely.
From: Biruk Mammo [mailto:birukw@xxxxxxxxxx]
Sent: Monday, March 12, 2018 1:07 PM
To: John M Knoeller <johnkn@xxxxxxxxxxx>
Cc: htcondor-users@xxxxxxxxxxx
Subject: Re: [HTCondor-users] "Failed to receive remote ad" runtime error when querying history with the python api
Aha, thanks John!
I have no map file configured. The scheduler's configuration is as follows:
ALLOW_WRITE = $(ALLOW_WRITE), $(CONDOR_HOST)
CONDOR_HOST = condor-master
DAEMON_LIST = MASTER, SCHEDD
DISCARD_SESSION_KEYRING_ON_STARTUP = False
Is the UID_DOMAIN setting the culprit?
Yes, the problem is here:
myusername@*
The * here should be a domain name. Because it is a * instead, and * is used as a token separator, the remainder isn't being parsed correctly.
(More specifically, there should be only one * between the username and the condor version string.)
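To make the separator problem concrete, here is a minimal Python sketch. It is not the parsing code in sock.cpp; the two-field layout (user, then version) and the function names serialize and split_fields are simplifying assumptions chosen only to show why a * inside a field breaks a *-delimited record.

```python
SEPARATOR = "*"


def serialize(user, version):
    """Join two fields with '*' (hypothetical, simplified layout)."""
    return SEPARATOR.join([user, version])


def split_fields(record, expected=2):
    """Split a '*'-delimited record and insist on the expected field count."""
    fields = record.split(SEPARATOR)
    if len(fields) != expected:
        raise ValueError(
            "expected %d fields, got %d: %r" % (expected, len(fields), fields)
        )
    return fields


# A real domain name leaves exactly one '*' between the two fields.
good = serialize("myusername@example.com", "8.7.6")

# A '*' domain adds a second separator, so the reader sees an extra field.
bad = serialize("myusername@*", "8.7.6")

if __name__ == "__main__":
    print(split_fields(good))
    try:
        split_fields(bad)
    except ValueError as err:
        print("parse failure:", err)
```

The reader has no way to tell the * that came from the domain apart from the * that separates fields, which matches the symptom described above.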
So, something odd is going on in the SCHEDD when it authenticates. Do you have a map file?
-tj
From: Biruk Mammo [mailto:birukw@xxxxxxxxxx]
Sent: Saturday, March 10, 2018 10:00 PM
To: htcondor-users@xxxxxxxxxxx; John M Knoeller <johnkn@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] "Failed to receive remote ad" runtime error when querying history with the python api
Hi John, hope you had a chance to look at this.
Here is the full log line:
2.0.9:25777>*48*2*0*9CEBCCEB79FAB9851039EDEAF169AC16C98AC4C827A7CA5A*0* 0 0'
Could you please send me the [REDACTED] bit from this ToolLog message?
condor_history: getInheritedSocks from CONDOR_INHERIT is ... [REDACTED]
The error indicates that the actual contents of that message are incorrectly formatted.
thanks
-tj
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx]
On Behalf Of Biruk Mammo via HTCondor-users
Sent: Tuesday, February 27, 2018 5:11 PM
To: htcondor-users@xxxxxxxxxxx
Cc: Biruk Mammo <birukw@xxxxxxxxxx>
Subject: [HTCondor-users] "Failed to receive remote ad" runtime error when querying history with the python api
Hello HTCondor users,
I get a "Failed to receive remote ad" error when using the Python bindings to query history immediately after submitting a job. Looking into the HTCondor logs, I see the following error in ToolLog:
[Timestamp] condor_history: getInheritedSocks from CONDOR_INHERIT is ... [REDACTED]
[Timestamp] ERROR "Assertion ERROR on (*ptmp == '*')" at line 2244 in file /slots/10/dir_3701941/userdir/.tmplMkQ9O/BUILD/condor-8.7.6/src/condor_io/sock.cpp
I also see a core dump in the log directory.
This error does not occur if I wait a few seconds before invoking schedd.history. Also, there is no error if I run the history query without submitting a job.
Below is the Python code that triggers the problem.
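import htcondor

submit = htcondor.Submit({'executable': '/usr/bin/sleep', 'arguments': '300'})
schedd = htcondor.Schedd()
# Submit one job inside a schedd transaction.
with schedd.transaction() as txn:
    submit.queue(txn)
# Query history immediately after the submit.
print(list(schedd.history('true', ['ClusterId'], 10)))
# RuntimeError: Failed to receive remote ad.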
Is there something I am missing? Thanks in advance for your help!