[Condor-users] Easy Setup Guide
- Date: Fri, 30 Mar 2012 13:21:57 -0500
- From: Spuds <spuds1@xxxxxxxxx>
- Subject: [Condor-users] Easy Setup Guide
I inherited a cluster of Condor machines. I don't know anything about Condor and I have a mess on my hands.
Is there an easy setup guide for building a simple, no-nonsense cluster with just the basics and nothing special?
We have 3 Windows hosts and a Linux host, and I'm having all kinds of issues. I have solved some of them; others remain.
1) I can't submit jobs on Linux:
rholloway@rebelbase:~$ condor_submit submit
Submitting job(s)
ERROR: Failed to connect to local queue manager
CEDAR:6001:Failed to connect to <127.0.1.1:60211>
2) Jobs get submitted to the cluster and then show up as "held" and never do anything.
3) I get all kinds of errors in the Collector Log on what is supposed to be the master:
03/30 12:55:54 DC_AUTHENTICATE: attempt to open invalid session Dagobah:1228:1333129419:23, failing.
03/30 12:55:54 DC_AUTHENTICATE: attempt to open invalid session Dagobah:1228:1333129427:24, failing.
03/30 12:55:54 DC_AUTHENTICATE: attempt to open invalid session Dagobah:1228:1333129427:24, failing.
03/30 12:55:54 DC_AUTHENTICATE: attempt to open invalid session Dagobah:1228:1333129427:24, failing.
03/30 12:55:54 DC_AUTHENTICATE: attempt to open invalid session Dagobah:1228:1333129427:24, failing.
03/30 12:55:54 DC_AUTHENTICATE: attempt to open invalid session Dagobah:1228:1333129427:24, failing.
03/30 12:55:54 DC_AUTHENTICATE: attempt to open invalid session Dagobah:1228:1333129427:24, failing.
03/30 12:55:54 DC_AUTHENTICATE: attempt to open invalid session Dagobah:1228:1333129427:24, failing.
03/30 12:55:54 DC_AUTHENTICATE: attempt to open invalid session Dagobah:1228:1333129427:24, failing.
03/30 12:55:56 Failed to send DC_INVALIDATE_KEY to daemon at <127.0.1.1:53521>: SECMAN:2003:TCP connection to daemon at <127.0.1.1:53521> failed.
03/30 12:55:56 Failed to send DC_INVALIDATE_KEY to daemon at <127.0.1.1:53521>: SECMAN:2003:TCP connection to daemon at <127.0.1.1:53521> failed.
03/30 12:55:56 Failed to send DC_INVALIDATE_KEY to daemon at <127.0.1.1:53521>: SECMAN:2003:TCP connection to daemon at <127.0.1.1:53521> failed.
03/30 12:55:56 Failed to send DC_INVALIDATE_KEY to daemon at <127.0.1.1:53521>: SECMAN:2003:TCP connection to daemon at <127.0.1.1:53521> failed.
03/30 12:55:56 Failed to send DC_INVALIDATE_KEY to daemon at <127.0.1.1:53521>: SECMAN:2003:TCP connection to daemon at <127.0.1.1:53521> failed.
03/30 12:55:56 Failed to send DC_INVALIDATE_KEY to daemon at <127.0.1.1:53521>: SECMAN:2003:TCP connection to daemon at <127.0.1.1:53521> failed.
03/30 12:55:56 Failed to send DC_INVALIDATE_KEY to daemon at <127.0.1.1:53521>: SECMAN:2003:TCP connection to daemon at <127.0.1.1:53521> failed.
But I have no problems joining other machines to the cluster.
And if there are any contractors out there who do this for a living, I'll even pay to have someone fix the environment. We just need it to work.
If I shouldn't be running this cluster at all and have no business doing it, I'll accept that as an answer as well.