
Re: [Condor-users] Can't find address of local schedd



Hi Robert,

Thank you for the help. After reinitializing the front-end by hand, the
problem persists:

$ condor_status
CEDAR:6001:Failed to connect to <xxx.xx.xxx.xx:xxxx>
Error: Couldn't contact the condor_collector on cluster-name.domain

Extra Info: the condor_collector is a process that runs on the central
manager of your Condor pool and collects the status of all the machines and
jobs in the Condor pool. The condor_collector might not be running, it might
be refusing to communicate with you, there might be a network problem, or
there may be some other problem. Check with your system administrator to fix
this problem.

If you are the system administrator, check that the condor_collector is
running on cluster-name.domain, check the HOSTALLOW configuration in your
condor_config, and check the MasterLog and CollectorLog files in your log
directory for possible clues as to why the condor_collector is not
responding. Also see the Troubleshooting section of the manual.

I am looking for the MasterLog and CollectorLog files, but I can't find
them. Where are they supposed to be? The Troubleshooting section of the
manual doesn't help.
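
For reference, a minimal sketch of one way to locate the log directory,
assuming the Condor binaries are in the PATH. condor_config_val is the
standard tool for querying configuration values, and LOG is the macro that
names the log directory. The function name show_master_log is hypothetical,
and nothing here is specific to a Rocks installation:

```shell
#!/bin/sh
# Hypothetical helper: print the tail of MasterLog from the configured log directory.
show_master_log() {
    # Bail out early if the Condor tools are not installed or not in PATH.
    if ! command -v condor_config_val >/dev/null 2>&1; then
        echo "condor_config_val not found in PATH" >&2
        return 1
    fi

    # Ask Condor where its log directory is (the LOG configuration macro).
    log_dir=$(condor_config_val LOG) || return 1
    echo "Log directory: $log_dir"

    # Show the last N lines (default 20) of MasterLog if it is readable.
    if [ -r "$log_dir/MasterLog" ]; then
        tail -n "${1:-20}" "$log_dir/MasterLog"
    else
        echo "No readable MasterLog in $log_dir" >&2
        return 1
    fi
}
```

Calling "show_master_log 50" (probably as root, since the log files may not
be world-readable) would print the directory and the last 50 lines of
MasterLog. CollectorLog should be in the same directory.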

Running condor_master as root doesn't help either:

# condor_master
# condor_status
CEDAR:6001:Failed to connect to ... <snip>


Thanks very much for the help

Marcelo


2009/4/13 Robert Rati <rrati@xxxxxxxxxx>:
> Looks like you need to restart condor.  You can do this by running
> "condor_master" as root.
>
> Rob
>
> Marcelo Chiapparini wrote:
>> Hello,
>>
>> I am running Condor 7.05 in a Rocks 5.1 cluster. Users were submitting
>> jobs normally, but today, suddenly, Condor doesn't accept any jobs.
>> Below is the message we obtain when trying to submit:
>>
>> $ condor_submit hello.sub
>>
>> ERROR: Can't find address of local schedd
>>
>> and with condor_q
>>
>> $ condor_q
>> Error:
>>
>> Extra Info: You probably saw this error because the condor_schedd is not
>> running on the machine you are trying to query. If the condor_schedd is not
>> running, the Condor system will not be able to find an address and port to
>> connect to and satisfy this request. Please make sure the Condor daemons are
>> running and try again.
>>
>> Extra Info: If the condor_schedd is running on the machine you are trying to
>> query and you still see the error, the most likely cause is that you have
>> setup a personal Condor, you have not defined SCHEDD_NAME in your
>> condor_config file, and something is wrong with your SCHEDD_ADDRESS_FILE
>> setting. You must define either or both of those settings in your config
>> file, or you must use the -name option to condor_q. Please see the Condor
>> manual for details on SCHEDD_NAME and SCHEDD_ADDRESS_FILE.
>>
>>
>> Apparently Condor daemons are not running any more:
>>
>> $ ps -ef | grep condor
>> 500      21305 20730  0 17:25 pts/2    00:00:00 grep condor
>>
>> I am new to Condor, so I would appreciate any help.
>>
>> Thanks in advance
>>
>> Marcelo
>> _______________________________________________
>> Condor-users mailing list
>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/condor-users/