I had a lot of problems with the
CRED in the past, and UW has made a lot of changes to it over the last year
(your earlier versions do not have all the updates, but you should be OK
with regard to this problem; we are still running 7.6.1).
I have not had your problem (our credentials
also change quarterly), but since you only see the central host listed when
you run condor_status, my guess is that there is a permissions problem
with communication between the Condor nodes and the central manager.
Check that the pool password is stored,
and first tackle the fact that you can't see all the machines when you run
condor_status, because this seems to be the underlying problem. Make sure the
global config has the proper security settings and that communication between
machines is allowed.
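As a rough sketch of the kind of settings worth checking in the global
config: the host names below are placeholders, and the values are modeled on
the condor_credd section of the Condor manual rather than on your actual
setup, so treat them as assumptions and compare against what you have.

```
# Illustrative values only; example.com names are hypothetical.

# Hosts allowed write access to the daemons. The condor_store_cred
# error message points at this setting.
ALLOW_WRITE = *.example.com

# Machine running the condor_credd (placeholder name).
CREDD_HOST = credd.example.com

# Authentication methods clients may use. PASSWORD depends on the
# pool password being stored on each machine.
SEC_CLIENT_AUTHENTICATION_METHODS = NTSSPI, PASSWORD
```

Running condor_config_val ALLOW_WRITE on a machine shows the value it is
actually using, and condor_store_cred query -c reports whether the pool
password is stored there.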
Look for a CRED dump file on the CRED
server in the Condor log directory.
If you have not done so already, you
could restart the condor service on the central manager and make sure the
CRED service is not crashing. My guess is that the CRED is ok though.
mike
From: Eric Abel <Eric.Abel@xxxxxxxxxx>
To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
Date: 02/17/2012 12:47 PM
Subject: [Condor-users] condor_credd issues
Sent by: condor-users-bounces@xxxxxxxxxxx
Fellow Condor users,
I have been wrestling with what appears to be a condor_credd problem for
about 2 days now. I have a Windows pool of about 160 CPUs, and it
has been working more or less problem free for about 9 months. Our
IT policy is to change domain passwords every quarter, and in the past
I have done this without any trouble. However, this most recent time,
after resetting my password, I could no longer submit jobs ("no password
stored for user" error on the schedd machine). When I try
condor_store_cred add
I get the error:
Operation failed. Make sure your ALLOW_WRITE setting includes this
host.
This is not a new problem, and I have followed all of the suggestions from
previous posts multiple times. In my case, nothing I do will allow
me to set the password, and furthermore, I cannot set it on any machine
in the pool, including the central host and credd host. This problem
is compounded further by the fact that when I run condor_status, I only
see the central host listed (a different issue, but the simultaneous occurrence
makes me think both problems have the same root cause). Anyway, I have
spent a long time combing log files and editing config files only to get
the same result over and over again. I am running Condor 7.6.6 on the
central host, credd host, and schedd host, and the execute nodes are a
mixture of 7.6.1-7.6.6 installations. Any suggestions would be greatly
appreciated.
Thanks,
eric