Re: [HTCondor-devel] condor_ssh_to_job is cool, but DNS would be cooler
On 3/21/13 4:01 PM, Erik Paulson wrote:
On Wed, Mar 20, 2013 at 5:19 PM, Dan Bradley <dan@xxxxxxxxxxxx> wrote:
Interesting stuff, Erik.

The bummer about condor_ssh_to_job is that condor_ssh_to_job is your ssh client - and if you have a real ssh client you want to use, you're SOL.
condor_ssh_to_job calls out to whatever ssh client you give it (and whatever sshd server the admin gives it). However, it does depend on the ability to proxy the connection (e.g. via the OpenSSH ProxyCommand option), and this may not be supported by all ssh clients. I've seen condor_ssh_to_job work with rsync, scp, sftp, and sshfs.
My dilemma is that I'm not using a stand-alone ssh client - I've got a Fabric (http://docs.fabfile.org/en/1.6/) setup, which uses its own ssh library to connect directly to the remote host. However, it looks like the underlying library may support ProxyCommand, so I might be able to convince it to fire up condor_ssh_to_job before it tries to open an ssh connection.

I will report back. (After building a more modern version of Python for submit-1.chtc.)
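If the library route pans out, one way in is OpenSSH-style client configuration: recent Fabric 1.x releases can be told to honor ~/.ssh/config (env.use_ssh_config = True), and its paramiko layer then picks up a ProxyCommand from there. A hedged sketch only - the host pattern is a placeholder, and the exact arguments condor_ssh_to_job would need when launched this way are precisely the open question:

```
# ~/.ssh/config (sketch; every name here is a placeholder)
Host condor-job-*
    # Ask ssh_to_job to carry the bytes for an externally-launched
    # client; what arguments make it do that is the unresolved part.
    ProxyCommand condor_ssh_to_job <jobid>
```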
We may have a slight plumbing problem here. ssh_to_job wants to set up the connection, and keys, and then invoke the ssh client. Now for the convoluted part: the command-line options ssh_to_job passes to the ssh client by default contain a ProxyCommand option, which invokes ssh_to_job. When this inner ssh_to_job runs, it gets the file descriptor of the connection passed to it from the outer ssh_to_job, and it then proxies this connection via its stdin/stdout for the ssh client.
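The relay pattern described above can be sketched generically. This is an illustration of how a ProxyCommand-style transport works, not HTCondor's actual code; a trivial echo child stands in for the proxy-mode ssh_to_job:

```python
# Sketch of a ProxyCommand-style transport: the client spawns a helper
# process and treats its stdin/stdout as the byte stream, instead of
# opening a TCP socket itself. The echo child below is a stand-in for
# a proxy that would relay to an already-established job connection.
import subprocess
import sys

def open_proxy(cmd):
    """Spawn the proxy; its stdin/stdout become the client's 'socket'."""
    return subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)

if __name__ == "__main__":
    echo_child = [sys.executable, "-c",
                  "import sys; sys.stdout.write(sys.stdin.read())"]
    proxy = open_proxy(echo_child)
    # Whatever the "ssh client" writes goes to the proxy's stdin; the
    # far side's bytes come back on the proxy's stdout.
    out, _ = proxy.communicate(b"SSH-2.0-demo\r\n")
    print(out.decode().strip())  # prints SSH-2.0-demo
```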
It sounds like what you want is to dispense with the top-level ssh_to_job and just have the inner one set up the connection and do the proxying. But how will your ssh client know to use the key that ssh_to_job sets up for the connection? And how will it know which username to log in as? These things are normally passed from ssh_to_job as options to the ssh client that it launches.
Anyway, once those questions are answered, here is a way to trick ssh_to_job into forming the connection and proxying it for an outer ssh client:
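The command itself did not survive in this copy of the thread; based on the description that follows, it presumably looked something like this reconstruction (the -ssh option is a real condor_ssh_to_job option, but the quoting, job ID, and host name here are guesses, not the original):

```
# Reconstruction, not the original command: the outer ssh's
# ProxyCommand invokes ssh_to_job, whose -ssh argument is just %%x,
# so instead of launching another ssh client it runs itself in proxy
# mode and relays the connection.
ssh -o 'ProxyCommand=condor_ssh_to_job -ssh "%%x" 123.0' placeholder-host
```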
Instead of launching an ssh client, it launches itself in proxy mode. (That's what the %%x expands to. I had to double the % to get it to pass through the outer ssh's parser for ProxyCommand. Your case may differ.) Clearly, if this were something we wanted to support for real, we could make the outer ssh_to_job do the proxying directly, rather than having it invoke a second copy of itself to do it.
If you try the above example, you will find that you can't log in, because the outer ssh doesn't have the right key, and it probably isn't logging in as the right user.
--Dan
-Erik
The real bummer about condor_ssh_to_job is that with the standard sshd, there is no way for us to avoid the target user's login shell being used to launch the command that is executed. So if the login shell of the user running the job is /sbin/nologin, condor_ssh_to_job doesn't work. A small patch to sshd can make it work irrespective of the user's login shell. I have mixed feelings about actually shipping that.
--Dan
On 3/20/13 1:12 PM, Erik Paulson wrote:
Jeff would kill me if I spent time implementing this, but what would be more awesome than condor_ssh_to_job would be for the schedd to speak DNS, so you could have things like <cluster>.<proc>.<scheddname>.condorjobs.cs.wisc.edu mapped to IP addresses. Ideally, you'd give the startd several IP addresses that it can map to slots, and as your job moves around, the DNS mapping is updated (obviously, with short DNS TTLs).
Have the startd also fire up an sshd for the job. Have a tool like condor_get_job_hostkey that gets the updated hostkey from the schedd when the job starts running somewhere. At job submit time, specify the public key(s) you'll want to be able to log in with.

If the job is idle, map it back to the schedd*. Have the schedd (and maybe the startd) listen on a port speaking HTTP to cough up info about the job. Alternatively, just return host not found when the job is idle.
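The schedd-speaks-DNS idea is small enough to sketch. This is a toy, not anything HTCondor ships: a responder that answers A-record queries for job names with the slot's address and a short TTL; the job table, names, and addresses are all made up for illustration:

```python
# Toy DNS responder for <cluster>.<proc>.<scheddname>... names.
# A real schedd would consult the job ad (and could answer with its
# own address while the job is idle).
import socket
import struct

# Hypothetical job -> current-execute-address table.
JOBS = {"5.0.myschedd.condorjobs.cs.wisc.edu": "10.0.0.42"}

def parse_qname(packet):
    """Read the dotted query name (question section starts at offset 12)."""
    labels, i = [], 12
    while packet[i] != 0:
        n = packet[i]
        labels.append(packet[i + 1:i + 1 + n].decode())
        i += 1 + n
    return ".".join(labels)

def make_a_response(query, ip, ttl=15):
    """Answer with one A record; a short TTL so job moves show up fast."""
    header = query[:2] + struct.pack(">HHHHH", 0x8180, 1, 1, 0, 0)
    question = query[12:]          # reuse QNAME/QTYPE/QCLASS verbatim
    answer = (b"\xc0\x0c"          # name: compression pointer to QNAME
              + struct.pack(">HHIH", 1, 1, ttl, 4)  # A, IN, TTL, RDLEN
              + socket.inet_aton(ip))
    return header + question + answer

if __name__ == "__main__":
    # One-shot demo over localhost UDP.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    qname = b"\x015\x010\x08myschedd\x0acondorjobs\x02cs\x04wisc\x03edu\x00"
    query = (b"\x12\x34" + struct.pack(">HHHHH", 0x0100, 1, 0, 0, 0)
             + qname + struct.pack(">HH", 1, 1))
    client.sendto(query, server.getsockname())
    pkt, addr = server.recvfrom(512)
    server.sendto(make_a_response(pkt, JOBS[parse_qname(pkt)]), addr)
    resp, _ = client.recvfrom(512)
    print(socket.inet_ntoa(resp[-4:]))  # prints 10.0.0.42
```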
The bummer about condor_ssh_to_job is that condor_ssh_to_job is your ssh client - and if you have a real ssh client you want to use, you're SOL.

Having a job-to-DNS mapping gets you a good part of the way with Condor as a player in the IaaS world - you could run VM Universe jobs and be able to find them again.
-Erik
*it might be kind of fun to be able to ssh to a schedd and interact with a simple shell.