NOTE: No problem here. All machines are recognized by the central manager.
2) condor_q -analyze 2.0
-- Submitter: comparch.binghamton.edu : <128.226.128.31:39183> : comparch.binghamton.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
---
002.000:  Run analysis summary.  Of 24 machines,
     16 are rejected by your job's requirements
      8 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
    No successful match recorded.
    Last failed match: Thu Apr 26 14:32:52 2007
    Reason for last match failure: no match found
NOTE: 8 reject your job because of their own requirements
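To double-check that the counts in the summary above really account for all 24 machines, the lines can be tallied with a few lines of Python. This is only a rough sketch: the `SUMMARY` text is pasted from the output above, and `parse_analysis` is a helper name made up for this post, not part of Condor.

```python
import re

# Summary lines pasted from the condor_q -analyze output above.
SUMMARY = """\
16 are rejected by your job's requirements
8 reject your job because of their own requirements
0 match but are serving users with a better priority in the pool
0 match but reject the job for unknown reasons
0 match but will not currently preempt their existing job
0 are available to run your job
"""

def parse_analysis(text):
    """Hypothetical helper: map each reason string to its machine count."""
    counts = {}
    for line in text.splitlines():
        # Each summary line looks like "<count> <reason>".
        m = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts

counts = parse_analysis(SUMMARY)
total = sum(counts.values())
for reason, n in counts.items():
    print(f"{n:3d}  {reason}")
print("total machines accounted for:", total)
```

The two non-zero buckets are the ones to look at: 16 machines fail the job's requirements and 8 machines refuse the job because of their own requirements.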
3) condor_q -better 2.0
-- Submitter: comparch.binghamton.edu : <128.226.128.31:39183> : comparch.binghamton.edu
---
002.000:  Run analysis summary.  Of 24 machines,
     16 are rejected by your job's requirements
      8 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
    No successful match recorded.
    Last failed match: Thu Apr 26 14:32:52 2007
    Reason for last match failure: no match found
4/26 14:24:50 (pid:1888) Sent ad to central manager for condor@xxxxxxxxxxxxxxxxxxxxxxx
4/26 14:24:50 (pid:1888) Sent ad to 1 collectors for condor@xxxxxxxxxxxxxxxxxxxxxxx
4/26 14:24:50 (pid:1888) Called reschedule_negotiator()
4/26 14:24:50 (pid:1888) DaemonCore: Command received via TCP from host <128.226.128.31:42297>
4/26 14:24:50 (pid:1888) DaemonCore: received command 493 (NEGOTIATE_WITH_SIGATTRS), calling handler (doNegotiate)
4/26 14:24:50 (pid:1888) Negotiating for owner: condor@xxxxxxxxxxxxxxxxxxxxxxx
4/26 14:24:50 (pid:1888) AutoCluster:config() significant atttributes changed to JobUniverse,LastCheckpointPlatform,NumCkpts
4/26 14:24:50 (pid:1888) Checking consistency running and runnable jobs
4/26 14:24:50 (pid:1888) Tables are consistent
4/26 14:24:50 (pid:1888) Out of servers - 0 jobs matched, 1 jobs idle, 1 jobs rejected
4/26 14:29:50 (pid:1888) Sent ad to central manager for condor@xxxxxxxxxxxxxxxxxxxxxxx
4/26 14:29:50 (pid:1888) Sent ad to 1 collectors for condor@xxxxxxxxxxxxxxxxxxxxxxx
4/26 14:29:50 (pid:1888) Activity on stashed negotiator socket
4/26 14:29:50 (pid:1888) Negotiating for owner: condor@xxxxxxxxxxxxxxxxxxxxxxx
4/26 14:29:50 (pid:1888) Checking consistency running and runnable jobs
4/26 14:29:50 (pid:1888) Tables are consistent
4/26 14:29:50 (pid:1888) Out of servers - 0 jobs matched, 1 jobs idle, 1 jobs rejected
4/26 14:32:48 (pid:1888) DaemonCore: Command received via TCP from host <128.226.128.31:54711>
4/26 14:32:48 (pid:1888) DaemonCore: received command 478 (ACT_ON_JOBS), calling handler (actOnJobs)
4/26 14:32:52 (pid:1888) DaemonCore: Command received via UDP from host <128.226.128.31:35612>
4/26 14:32:52 (pid:1888) DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)
4/26 14:32:52 (pid:1888) Sent ad to central manager for condor@xxxxxxxxxxxxxxxxxxxxxxx
4/26 14:32:52 (pid:1888) Sent ad to 1 collectors for condor@xxxxxxxxxxxxxxxxxxxxxxx
4/26 14:32:52 (pid:1888) Called reschedule_negotiator()
4/26 14:32:52 (pid:1888) Activity on stashed negotiator socket
4/26 14:32:52 (pid:1888) Negotiating for owner: condor@xxxxxxxxxxxxxxxxxxxxxxx
4/26 14:32:52 (pid:1888) Checking consistency running and runnable jobs
4/26 14:32:52 (pid:1888) Tables are consistent
4/26 14:32:52 (pid:1888) Out of servers - 0 jobs matched, 1 jobs idle, 1 jobs rejected
5) StartLog on central manager:
4/26 13:57:49 ******************************************************
4/26 13:57:49 ** condor_startd (CONDOR_STARTD) STARTING UP
4/26 13:57:49 ** /home/condor/condor/sbin/condor_startd
4/26 13:57:49 ** $CondorVersion: 6.8.4 Feb 1 2007 $
4/26 13:57:49 ** $CondorPlatform: I386-LINUX_RHEL3 $
4/26 13:57:49 ** PID = 1887
4/26 13:57:49 ** Log last touched 4/26 13:57:43
4/26 13:57:49 ******************************************************
4/26 13:57:49 Using config source: /home/condor/condor/etc/condor_config
4/26 13:57:49 Using local config sources:
4/26 13:57:49 /home/condor/hosts/comparch/condor_config.local
4/26 13:57:49 DaemonCore: Command Socket at <128.226.128.31:34245>
4/26 13:57:56 New machine resource allocated
4/26 13:57:56 About to run initial benchmarks.
4/26 13:58:00 Completed initial benchmarks.
4/26 14:13:00 State change: IS_OWNER is false
4/26 14:13:00 Changing state: Owner -> Unclaimed
4/26 14:23:00 State change: IS_OWNER is TRUE
4/26 14:23:00 Changing state: Unclaimed -> Owner
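In case it helps anyone reading along, the state transitions in the StartLog excerpt above can be pulled out with a short Python script. This is just a sketch under my own assumptions: the `STARTLOG` text is pasted from the log above, and `state_changes` is a made-up helper name, not a Condor tool.

```python
import re

# The last four entries pasted from the StartLog above.
STARTLOG = """\
4/26 14:13:00 State change: IS_OWNER is false
4/26 14:13:00 Changing state: Owner -> Unclaimed
4/26 14:23:00 State change: IS_OWNER is TRUE
4/26 14:23:00 Changing state: Unclaimed -> Owner
"""

# Matches only the "Changing state: A -> B" entries.
PATTERN = re.compile(r"^(\d+/\d+ \d+:\d+:\d+) Changing state: (\w+) -> (\w+)$")

def state_changes(log_text):
    """Hypothetical helper: return (timestamp, old_state, new_state) tuples."""
    changes = []
    for line in log_text.splitlines():
        m = PATTERN.match(line.strip())
        if m:
            changes.append(m.groups())
    return changes

changes = state_changes(STARTLOG)
# The new state of the last transition is the state the startd is left in.
final_state = changes[-1][2] if changes else None
for stamp, old, new in changes:
    print(f"{stamp}: {old} -> {new}")
print("final state:", final_state)
```

In this excerpt the startd on the central manager goes Owner -> Unclaimed at 14:13:00 and back to Owner at 14:23:00, so it was in the Owner state during the failed match at 14:32:52.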
I think this is all the information needed to debug the problem.
I haven't specified any Requirements in the job submit file. I don't get any PERMISSION_DENIED errors either. My condor_config file is correct; it's all set.
I have been trying to debug this for 24 hours now.