I meant to include the condor_who -daemons output from the execute node:
Daemon      Alive  PID   PPID  Exit
------      -----  ---   ----  ----
Master      yes    7570  1     no
SharedPort  no     7604  no    no
Startd      yes    7605  7570  no
JK
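
For anyone who wants to check this kind of table mechanically, here is a minimal Python sketch. It assumes the whitespace-separated layout shown above, and the sample text is copied from this message. The function names parse_daemon_table and not_alive are made up for illustration and are not part of HTCondor.

```python
# Sample copied from the execute-node output quoted in this thread.
SAMPLE = """\
Daemon Alive PID PPID Exit
------ ----- --- ---- ----
Master yes 7570 1 no
SharedPort no 7604 no no
Startd yes 7605 7570 no
"""


def parse_daemon_table(text):
    """Return one dict per daemon row, keyed by the header names."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        if set(ln) <= set("- "):
            continue  # skip the dashed separator row
        fields = ln.split()
        if len(fields) != len(header):
            continue  # ignore rows that do not match the header width
        rows.append(dict(zip(header, fields)))
    return rows


def not_alive(rows):
    """Names of daemons whose Alive column is anything other than 'yes'."""
    return [r["Daemon"] for r in rows if r["Alive"].lower() != "yes"]


if __name__ == "__main__":
    print(not_alive(parse_daemon_table(SAMPLE)))
```

On the sample above this reports SharedPort as the only daemon not marked alive.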
> On Aug 18, 2023, at 3:38 PM, Justin Killebrew via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
>
> condor_who -daemons on the central manager (also configured as submit role) shows:
>
> Daemon      Alive  PID   PPID  Exit
> ------      -----  ---   ----  ----
> Collector   yes    1608  1494  no
> Master      yes    1494  1     no
> Negotiator  yes    1609  1494  no
> Schedd      yes    1610  1494  no
> SharedPort  yes    1607  1494  no
>
> This looks correct but on the execute machine, StartLog has several
> ERROR: AUTHENTICATE:1003:Failed to authenticate with any method
> and
> SECMAN: required authentication with collector failed
>
> The central manager CollectorLog shows similar errors:
> DC_AUTHENTICATE: required authentication of 192.168.1.5 failed
>
> The firewall isn't active. Where else should I look?
>
> condor_status returns nothing on the central manager. Is this because it doesn't see any execute machines?
>
>
> Thanks,
> JK
>
>
>
>> On Aug 17, 2023, at 12:28 PM, John M Knoeller <johnkn@xxxxxxxxxxx> wrote:
>>
>> One way to troubleshoot is to run
>>
>> condor_who -daemons
>>
>> on the execute node. This tool scrapes log files to determine which daemons are alive and which are not.
>>
>> If the condor_master is running, then you can use
>>
>> condor_who -quick
>>
>> which sends a query to the condor_master about the state of the other daemons.
>>
>> -tj
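
The choice between the two condor_who modes described above could be wrapped in a small Python helper. This is only a sketch: the -daemons and -quick flags are the ones named in the message, while the function names and the master_running argument are hypothetical, and the caller is assumed to already know whether the condor_master is up.

```python
import shutil
import subprocess


def condor_who_args(master_running):
    """Pick the condor_who mode described in the thread."""
    # -quick queries the condor_master, so it only helps when the master is up;
    # -daemons scrapes log files instead.
    return ["condor_who", "-quick" if master_running else "-daemons"]


def run_condor_who(master_running):
    """Run condor_who and return its stdout, or None if it is not installed."""
    if shutil.which("condor_who") is None:
        return None
    result = subprocess.run(
        condor_who_args(master_running), capture_output=True, text=True
    )
    return result.stdout
```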
>>
>> -----Original Message-----
>> From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Justin Killebrew via HTCondor-users
>> Sent: Friday, August 11, 2023 3:03 PM
>> To: Todd L Miller <tlmiller@xxxxxxxxxxx>
>> Cc: Justin Killebrew <jk@xxxxxxx>; Justin Killebrew via HTCondor-users <htcondor-users@xxxxxxxxxxx>
>> Subject: Re: [HTCondor-users] condor_status returns nothing
>>
>> The StartLog showed that /var/lib/condor/execute didn't exist. I created it and restarted condor and now condor_status works as expected.
>>
>> Thanks!
>>
>> JK
>>
>>
>>> On Aug 11, 2023, at 3:47 PM, Todd L Miller <tlmiller@xxxxxxxxxxx> wrote:
>>>
>>>> Should there be a startd running? How do I troubleshoot this installation?
>>>
>>> Yes. First thing to do is look at the MasterLog and StartLog
>>> files (which will probably be in /var/log/condor, but you can run
>>> `condor_config_val LOG` to find out for sure). From your process tree, it
>>> looks like either the master isn't starting the startd or the startd is
>>> crashing (almost?) immediately on start-up.
>>>
>>> - ToddM
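
The log check suggested above might be sketched in Python as follows. It assumes condor_config_val is on the PATH and falls back to /var/log/condor (the likely location mentioned in the message) when it is not. The helper names and the list of search strings are illustrative guesses, not anything HTCondor defines.

```python
import subprocess
from pathlib import Path


def log_dir(default="/var/log/condor"):
    """Ask condor_config_val for LOG; fall back to the default on any failure."""
    try:
        out = subprocess.run(
            ["condor_config_val", "LOG"],
            capture_output=True, text=True, check=True,
        )
        return Path(out.stdout.strip() or default)
    except (OSError, subprocess.CalledProcessError):
        return Path(default)


def error_lines(lines, needles=("ERROR", "failed", "Failed")):
    """Keep only the lines that contain one of the search strings (assumed list)."""
    return [ln.rstrip("\n") for ln in lines if any(n in ln for n in needles)]


if __name__ == "__main__":
    base = log_dir()
    for name in ("MasterLog", "StartLog"):
        path = base / name
        if path.is_file():
            with open(path, errors="replace") as fh:
                for ln in error_lines(fh):
                    print(f"{name}: {ln}")
```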
>>
>>
>> _______________________________________________
>> HTCondor-users mailing list
>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>>
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/htcondor-users/