Re: [HTCondor-devel] condor_ssh_to_job is cool, but DNS would be cooler


Date: Thu, 21 Mar 2013 16:01:23 -0500
From: Erik Paulson <epaulson@xxxxxxxxxxxx>
Subject: Re: [HTCondor-devel] condor_ssh_to_job is cool, but DNS would be cooler


On Wed, Mar 20, 2013 at 5:19 PM, Dan Bradley <dan@xxxxxxxxxxxx> wrote:
Interesting stuff, Erik.


The bummer about condor_ssh_to_job is that condor_ssh_to_job is your ssh client - and if you have a real ssh client you want to use, you're SOL. 


condor_ssh_to_job calls out to whatever ssh client you give it (and whatever sshd server the admin gives it).  However, it does depend on the ability to proxy the connection (e.g. via the OpenSSH ProxyCommand option), and this may not be supported by all ssh clients.  I've seen condor_ssh_to_job work with rsync, scp, sftp, and sshfs.
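For concreteness, a sketch of the rsync/sftp usage mentioned above, assuming a running job with id 32.0 (a placeholder) and OpenSSH-style tools; check the condor_ssh_to_job man page for the exact flags on your version:

```shell
# Interactive file transfer into the job sandbox:
# -ssh swaps in an alternate client (here sftp) for the default ssh.
condor_ssh_to_job -ssh sftp 32.0

# Pull a file from the job's working directory with rsync,
# using condor_ssh_to_job as the remote-shell command.
rsync -v -e "condor_ssh_to_job" 32.0:outputfile .
```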


My dilemma is that I'm not using a stand-alone ssh client - I've got a Fabric (http://docs.fabfile.org/en/1.6/) setup, which uses its own ssh library to connect directly to the remote host.

However, it looks like the underlying library may support ProxyCommand, so I might be able to convince it to fire up condor_ssh_to_job before it tries to open an ssh connection.
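For reference, the OpenSSH ProxyCommand convention that paramiko (Fabric's underlying ssh library) understands looks like the following; the host pattern and helper command here are made-up placeholders, since whether condor_ssh_to_job can be driven this way is exactly the open question:

```
# ~/.ssh/config (sketch; host pattern and helper are hypothetical)
Host *.condorjobs.example
    # OpenSSH substitutes %h with the host name and %p with the port
    ProxyCommand condor-proxy-helper %h %p
```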

I will report back. (After building a more modern version of Python for submit-1.chtc)

-Erik

 
The real bummer about condor_ssh_to_job is that with the standard sshd, there is no way for us to avoid the target user's login shell being used to launch the command that is executed.  So if the login shell of the user running the job is /sbin/nologin, condor_ssh_to_job doesn't work.  A small patch to sshd can make it work irrespective of the user's login shell.  I have mixed feelings about actually shipping that.

--Dan


On 3/20/13 1:12 PM, Erik Paulson wrote:
Jeff would kill me if I spent time implementing this, but what would be more awesome than condor_ssh_to_job would be for the schedd to speak DNS, so you could have things like:

<cluster>.<proc>.<scheddname>.condorjobs.cs.wisc.edu mapped to IP addresses. Ideally, you'd give the startd several IP addresses that it can map to slots, and as your job moves around, the DNS mapping is updated (obviously, with short DNS TTLs).

Have the startd also fire up an sshd for the job. Have a tool like condor_get_job_hostkey that gets the updated hostkey from the schedd when the job starts running somewhere. At job submit time, specify the public key(s) you'll want to be able to log in with. 

If the job is idle, map it back to the schedd*. Have the schedd (and maybe the startd) listen on a port speaking HTTP to cough up info about the job. Alternatively, just return host not found when the job is idle. 

The bummer about condor_ssh_to_job is that condor_ssh_to_job is your ssh client - and if you have a real ssh client you want to use, you're SOL. 

Having a job to DNS mapping gets you a good part of the way with Condor as a player in the IaaS world - you could run VM Universe jobs and be able to find them again.

-Erik
 
*it might be kind of fun to able to ssh to a schedd and interact with a simple shell.


_______________________________________________
HTCondor-devel mailing list
HTCondor-devel@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-devel

