Re: [Condor-users] MPI : strange globus error, though not using globus
- Date: Fri, 2 Feb 2007 11:59:57 +0100
- From: Nicolas GUIOT <nicolas.guiot@xxxxxxx>
- Subject: Re: [Condor-users] MPI : strange globus error, though not using globus
Details: I just figured out that this happens for every job I try to submit to my pool, even non-MPI ones.
It seems to be an authentication problem, but I don't understand why, since I never had this before. The one thing I changed recently on my pool is that I had to replace the hard disk of the central manager and set the computer up again, but I had all the data on backups, and everything should be exactly the same...
I'm still using the same NIS/NFS installation, and users can still log in to each computer, with their ~HOME/ correctly set up...
Any idea what I forgot?
Nicolas
----------------
On Thu, 1 Feb 2007 16:02:56 +0100
Nicolas GUIOT <nicolas.guiot@xxxxxxx> wrote:
> Hi
>
> (FYI, I'm setting up the parallel applications, sorry to flood the list today...)
>
> So, I set up a dedicated scheduler and 2 dedicated resources. This is all on a private LAN, with nothing to do with Globus, Condor-G, or anything else that would link my pool to another one.
>
> And now, when I submit my MPI job, I get the following errors:
>
> $ condor_submit CondorMpiTest.cmd
> Submitting job(s)
> ERROR: Failed to connect to local queue manager
> AUTHENTICATE:1003:Failed to authenticate with any method
> AUTHENTICATE:1004:Failed to authenticate using GSI
> GSI:5003:Failed to authenticate. Globus is reporting error (851968:45). There is probably a problem with your credentials. (Did you run grid-proxy-init?)
> AUTHENTICATE:1004:Failed to authenticate using KERBEROS
> AUTHENTICATE:1004:Failed to authenticate using FS
>
> $ ps ax|grep cond
> 7602 ? Ss 0:02 /nfs/opt/condor_i686/sbin/condor_master
> 7603 ? Ss 0:00 condor_schedd -f
> 8120 pts/0 S+ 0:00 tail -f /scratch/condor/log/SchedLog
> 8129 pts/1 S+ 0:00 grep cond
>
> $ condor_q
> -- Submitter: seurat.lbt.ibpc.fr : <172.27.xx.xx:32795> : seurat.my.domain.fr
> ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
> 0 jobs; 0 idle, 0 running, 0 held
> ##################################
>
> - And this is what I have in the SchedLog:
>
> 2/1 15:39:08 (pid:7603) authenticate_self_gss: acquiring self credentials failed. Please check your Condor configuration file if this is a server process. Or the user environment variable if this is a user process.
>
> GSS Major Status: General failure
> GSS Minor Status Error Chain:
> globus_gsi_gssapi: Error with GSI credential
> globus_gsi_gssapi: Error with gss credential handle
> globus_credential: Valid credentials could not be found in any of the possible locations specified by thecredential search order.
> Valid credentials could not be found in any of the possible locations specified by the credential search order.
>
> Attempt 1
>
> globus_credential: Error reading host credential
> globus_sysconfig: Could not find a valid certificate file: The host cert could not be found in:
> 1) env. var. X509_USER_CERT
> 2) /etc/grid-security/hostcert.pem
> 3) $GLOBUS_LOCATION/etc/hostcert.pem
> 4) $HOME/.globus/hostcert.pem
>
> The host key could not be found in:
> 1) env. var. X509_USER_KEY
> 2) /etc/grid-security/hostkey.pem
> 3) $GLOBUS_LOCATION/etc/hostkey.pem
> 4) $HOME/.globus/hostkey.pem
>
>
>
> Attempt 2
>
> globus_credential: Error reading proxy credential
> globus_sysconfig: Could not find a valid proxy certificate file location
> globus_sysconfig: Error with key filename
> globus_sysconfig: File does not exist: /tmp/x509up_u0 is not a valid file
>
> Attempt 3
>
> globus_credential: Error reading user credential
> globus_sysconfig: Error with certificate filename: The user cert could not be found in:
> 1) env. var. X509_USER_CERT
> 2) $HOME/.globus/usercert.pem
> 3) $HOME/.globus/usercred.p12
>
>
>
>
> 2/1 15:39:09 (pid:7603) AUTHENTICATE: no available authentication methods succeeded, failing!
> 2/1 15:39:09 (pid:7603) SCHEDD: authentication failed: AUTHENTICATE:1003:Failed to authenticate with any method|AUTHENTICATE:1004:Failed to authenticate using GSI|GSI:5003:Failed to authenticate. Globus is reporting error (851968:133). There is probably a problem with your credentials. (Did you run grid-proxy-init?)|AUTHENTICATE:1004:Failed to authenticate using KERBEROS|AUTHENTICATE:1004:Failed to authenticate usingFS
> 2/1 15:39:09 (pid:7603) IO: Failed to read packet header
> 2/1 15:39:25 (pid:7603) IO: Failed to read packet header
>
> #####################################
>
> So, what is this globus/grid/proxy error doing here?
>
> What did I miss?
>
> Nicolas
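
For reading the quoted SchedLog more easily: the "SCHEDD: authentication failed" line packs the whole error stack into one pipe-separated string. Below is a minimal Python sketch that splits such a string and lists which authentication methods were attempted and failed. It assumes the `SUBSYS:code:message` layout seen in the log above; the function names `parse_error_stack` and `failed_methods` are made up for illustration and are not part of Condor.

```python
import re

def parse_error_stack(stack):
    """Split 'SUBSYS:code:message|SUBSYS:code:message|...' into tuples.

    Assumption: entries are separated by '|' and each one starts with an
    upper-case subsystem name and a numeric code, as in the quoted log.
    """
    entries = []
    for part in stack.split("|"):
        m = re.match(r"\s*([A-Z_]+):(\d+):(.*)", part, re.S)
        if m:
            entries.append((m.group(1), int(m.group(2)), m.group(3).strip()))
    return entries

def failed_methods(entries):
    """Return the method names from 'Failed to authenticate using X' entries."""
    methods = []
    for subsys, _code, msg in entries:
        m = re.search(r"Failed to authenticate using\s*(\w+)", msg)
        if subsys == "AUTHENTICATE" and m:
            methods.append(m.group(1))
    return methods

# Error stack copied from the SchedLog line quoted above.
sample = (
    "AUTHENTICATE:1003:Failed to authenticate with any method|"
    "AUTHENTICATE:1004:Failed to authenticate using GSI|"
    "GSI:5003:Failed to authenticate. Globus is reporting error (851968:133). "
    "There is probably a problem with your credentials. "
    "(Did you run grid-proxy-init?)|"
    "AUTHENTICATE:1004:Failed to authenticate using KERBEROS|"
    "AUTHENTICATE:1004:Failed to authenticate usingFS"
)

print(failed_methods(parse_error_stack(sample)))
```

Read this way, the stack shows that GSI was only the first of three methods tried, and that KERBEROS and FS failed as well, so the Globus messages are one symptom of the failure rather than the whole of it.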
----------------------------------------------------
CNRS - UPR 9080 : Laboratoire de Biochimie Theorique
Institut de Biologie Physico-Chimique
13 rue Pierre et Marie Curie
75005 PARIS - FRANCE
Tel : +33 158 41 51 70
Fax : +33 158 41 50 26
----------------------------------------------------