Re: [HTCondor-devel] condor_ssh_to_job is cool, but DNS would be cooler


Date: Thu, 21 Mar 2013 16:53:53 -0500
From: Dan Bradley <dan@xxxxxxxxxxxx>
Subject: Re: [HTCondor-devel] condor_ssh_to_job is cool, but DNS would be cooler

On 3/21/13 4:01 PM, Erik Paulson wrote:


On Wed, Mar 20, 2013 at 5:19 PM, Dan Bradley <dan@xxxxxxxxxxxx> wrote:
Interesting stuff, Erik.


The bummer about condor_ssh_to_job is that condor_ssh_to_job is your ssh client - and if you have a real ssh client you want to use, you're SOL. 


condor_ssh_to_job calls out to whatever ssh client you give it (and whatever sshd server the admin gives it).  However, it does depend on the ability to proxy the connection (e.g. via the OpenSSH ProxyCommand option), and this may not be supported by all ssh clients.  I've seen condor_ssh_to_job work with rsync, scp, sftp, and sshfs.
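For instance, a different client program can be substituted via the -ssh option. A sketch only - the job id is illustrative, and this of course assumes a running HTCondor pool:

```shell
# Fetch files interactively from the job's sandbox over sftp instead of
# opening a shell (job id 32.0 is illustrative).
condor_ssh_to_job -ssh sftp 32.0
```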


My dilemma is that I'm not using a stand-alone ssh client - I've got a Fabric (http://docs.fabfile.org/en/1.6/) setup, which uses its own ssh library to connect directly to the remote host.

However, it looks like the underlying library may support ProxyCommand, so I might be able to convince it to fire up condor_ssh_to_job before it opens the ssh connection.

I will report back. (After building a more modern version of Python for submit-1.chtc)

We may have a slight plumbing problem here.  ssh_to_job wants to set up the connection and keys, and then invoke the ssh client.  Now for the convoluted part: the command-line options ssh_to_job passes to the ssh client by default contain a ProxyCommand option, which invokes ssh_to_job again.  When this inner ssh_to_job runs, it inherits the file descriptor of the connection from the outer ssh_to_job, and it then proxies that connection via its stdin/stdout for the ssh client.

It sounds like what you want is to dispense with the top-level ssh_to_job and just have the inner one set up the connection and do the proxying.  But how will your ssh client know to use the key that ssh_to_job sets up for the connection?  And how will it know which username to log in as?  These things are normally passed from ssh_to_job as options to the ssh client it launches.

Anyway, once those questions are answered, here is a way to trick ssh_to_job into forming the connection and proxying it for an outer ssh client:

ssh -oProxyCommand='/usr/bin/condor_ssh_to_job -ssh "/bin/sh -c %%x" jobid' whatever

Instead of launching an ssh client, condor_ssh_to_job launches itself in proxy mode; that's what the %%x expands to.  (I had to double the % to get it through the outer ssh's parser for ProxyCommand.  Your case may differ.)  Clearly, if this were something we wanted to support for real, we could make the outer ssh_to_job do the proxying directly, rather than having it invoke a second copy of itself to do it.

If you try the above example, you will find that you can't log in, because the outer ssh doesn't have the right key, and it probably isn't logging in as the right user.
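To make the trick above actually log in, the outer ssh would also need the key and username that ssh_to_job normally hands to the client it launches. A sketch only - the key path and username below are placeholders, since ssh_to_job generates the key itself and the real path is whatever it creates:

```shell
# Sketch: /path/to/generated_key and jobuser stand in for the key and login
# name that condor_ssh_to_job would normally pass to its ssh client.
ssh -i /path/to/generated_key -l jobuser \
    -oProxyCommand='/usr/bin/condor_ssh_to_job -ssh "/bin/sh -c %%x" jobid' whatever
```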

--Dan


-Erik

 
The real bummer about condor_ssh_to_job is that with the standard sshd, there is no way for us to avoid the target user's login shell being used to launch the command that is executed.  So if the login shell of the user running the job is /sbin/nologin, condor_ssh_to_job doesn't work.  A small patch to sshd can make it work irrespective of the user's login shell.  I have mixed feelings about actually shipping that.

--Dan


On 3/20/13 1:12 PM, Erik Paulson wrote:
Jeff would kill me if I spent time implementing this, but what would be more awesome than condor_ssh_to_job would be for the schedd to speak DNS, so you could have things like:

<cluster>.<proc>.<scheddname>.condorjobs.cs.wisc.edu mapped to IP addresses. Ideally, you'd give the startd several IP addresses that it can map to slots, and as your job moves around, the DNS mapping is updated (with short DNS TTLs, obviously).
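The proposed name is mechanical to construct; a quick sketch, where the cluster, proc, and schedd values are made up and only the condorjobs.cs.wisc.edu suffix comes from the proposal above:

```shell
# Compose the proposed job hostname: <cluster>.<proc>.<scheddname>.<zone>
cluster=1234; proc=0; schedd=submit-1
echo "${cluster}.${proc}.${schedd}.condorjobs.cs.wisc.edu"
# -> 1234.0.submit-1.condorjobs.cs.wisc.edu
```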

Have the startd also fire up an sshd for the job. Have a tool like condor_get_job_hostkey that gets the updated hostkey from the schedd when the job starts running somewhere. At job submit time, specify the public key(s) you'll want to be able to log in with. 

If the job is idle, map it back to the schedd*. Have the schedd (and maybe the startd) listen on a port speaking HTTP to cough up info about the job. Alternatively, just return host not found when the job is idle. 

The bummer about condor_ssh_to_job is that condor_ssh_to_job is your ssh client - and if you have a real ssh client you want to use, you're SOL. 

Having a job to DNS mapping gets you a good part of the way with Condor as a player in the IaaS world - you could run VM Universe jobs and be able to find them again.

-Erik
 
*it might be kind of fun to able to ssh to a schedd and interact with a simple shell.


_______________________________________________
HTCondor-devel mailing list
HTCondor-devel@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-devel



