Re: [Condor-users] parallel universe and sshd.sh
- Date: Wed, 31 May 2006 16:05:41 +0200
- From: Nicolas GUIOT <nicolas.guiot@xxxxxxx>
- Subject: Re: [Condor-users] parallel universe and sshd.sh
Hi all
I'm coming back to this issue.
In the sshd.sh script I have by default (Condor 6.7.18, yeah I know, I plan to upgrade soon...), this line already reads:
if grep "Server listening" sshd.out > /dev/null 2>&1
But I still have a problem, and I'm seeing some very strange things:
- First, I had to modify the sshd command line: I'm on Debian stable, where sshd is only 3.8.x and doesn't understand "-oAcceptEnv", so I removed that option. Maybe that's the cause of my problem (if so, do you know a workaround?)
- Then, when I submit the job, it is reported as running (state R in condor_q), but when I check on the node, I see the following:
guiot@seurat:~/divers/MD$ tail -f /ibpc/charon/condor/execute/dir_28262/sshd.out
Disabling protocol version 1. Could not load host key
Bind to port 4465 on 0.0.0.0 failed: Address already in use.
Cannot bind any address.
guiot@seurat:~/divers/MD$ tail -f /ibpc/charon/condor/execute/dir_28264/sshd.out
Disabling protocol version 1. Could not load host key
Server listening on 0.0.0.0 port 4468.
So, as you can see, one of the processes seems to be fine and the other does not. But in fact, if I run "ps ax|grep sshd", I see neither of them running (or only the one currently being started, which changes constantly):
#ps ax|grep sshd
758 ? Ss 0:03 /usr/sbin/sshd
10819 ? Ss 0:00 sshd: root@pts/0
28727 ? SN 0:00 /usr/sbin/sshd -p4474 -oAuthorizedKeysFile=/scratch/condor/execute/dir_28262/tmp/0.key.pub -h/scratch/condor/execute/dir_28262/tmp/hostkey -De -f/dev/null -oStrictModes=no -oPidFile=/dev/null
And if I check again on the process that looked fine (tail sshd.out), it still reports that it is fine, but it's listening on a new port!
So: is this related to the change I had to make (removing -oAcceptEnv), or is it something completely separate? What could I check to solve this?
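To rule the -oAcceptEnv change in or out, one option would be to have the script pass that option only when the local OpenSSH is recent enough. The sketch below assumes that AcceptEnv first appeared in OpenSSH 3.9 (this matches what I see on 3.8.x, but please check the release notes) and that the version string looks like "OpenSSH_3.8.1p1". The helper name supports_acceptenv is hypothetical; it is not part of the stock sshd.sh.

```shell
# Hypothetical helper: exit 0 if the given OpenSSH version string is 3.9 or
# later, exit 1 if it is older, exit 2 if no version could be parsed.
# Assumption: AcceptEnv was introduced in OpenSSH 3.9.
supports_acceptenv() {
    # Extract "major minor" from a string such as "OpenSSH_3.8.1p1 Debian ..."
    ver=$(printf '%s\n' "$1" |
        sed -n 's/.*OpenSSH_\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p')
    [ -n "$ver" ] || return 2      # version string not recognised
    set -- $ver                    # split into $1 = major, $2 = minor
    major=$1
    minor=$2
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 9 ]; }
}

# Example with a literal version string; on a real node the string could
# come from "ssh -V 2>&1", assuming client and server share one version.
supports_acceptenv "OpenSSH_3.8.1p1" || echo "leave out -oAcceptEnv"
```

If the job still hangs on a node where the option is accepted, the AcceptEnv change is probably not the culprit.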
Thanks in advance
Nicolas
>
> Unfortunately the job starts 'running' but is blocked. For some reason
> it starts some connections, but does not seem to recognize them (and
> then tries again with a new port, again and again). I tried to look at
> the files to find out what might be the reason for this. In
> /usr/local/condor/libexec/sshd.sh there is a line like this :
>
> if grep "^Server listening on 0.0.0.0 port" sshd.out > /dev/null 2>&1
>
> I replaced this by :
>
> if grep "Server listening on :: port" sshd.out > /dev/null 2>&1
>
> I'm not sure at all whether it was a typo, but I had this '^' on both
> computers.
>
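The two patterns quoted above differ only in the address sshd reports (0.0.0.0 or ::), so a check that ignores the address should cover both cases, and it can also flag the "Address already in use" failure shown in my sshd.out. This is only a sketch: sshd_out_status is a hypothetical helper, not part of the stock sshd.sh, and the messages it looks for are simply the ones that appear in this thread.

```shell
# Hypothetical helper: summarise what an sshd.out file says about the
# listener, whatever address (IPv4 or IPv6) sshd reports.
sshd_out_status() {
    if grep "Server listening on .* port [0-9]" "$1" > /dev/null 2>&1; then
        echo listening
    elif grep "Address already in use" "$1" > /dev/null 2>&1; then
        echo port-in-use
    else
        # Nothing recognisable yet, or the file does not exist
        echo unknown
    fi
}
```

For example, running sshd_out_status sshd.out inside the job's execute directory would print one of the three words, which makes it easier to tell a bind failure apart from sshd simply not having started yet.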
----------------------------------------------------
CNRS - UPR 9080 : Laboratoire de Biochimie Theorique
Institut de Biologie Physico-Chimique
13 rue Pierre et Marie Curie
75005 PARIS - FRANCE
Tel : +33 158 41 51 70
Fax : +33 158 41 50 26
----------------------------------------------------