On 11/29/2011 08:41 AM, Dan Bradley wrote:
>
> After testing to see which machines match the job, the negotiator sorts
> the matching machines and chooses the most desirable one. If it chooses
> an offline machine, it should inform the collector and update
> MachineLastMatchTime. Can you confirm from your negotiator log whether
> it is choosing the offline machine or not? From the log you posted, I
> can only see that the offline machine was selected as a candidate, not
> whether it was actually chosen.

This is unfortunately useless on a real-life pool. I'm getting close to
a hundred meg of

11/29/11 15:34:40 Job 977853.0 does match with slot8@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot9@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot3@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot4@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot3@xxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot4@xxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot3@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot4@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot10@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot5@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot11@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot12@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot6@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot13@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot7@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot14@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot8@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot15@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot16@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot3@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot4@xxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot1@xxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot2@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot3@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot4@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot5@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot6@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Job 977853.0 does match with slot7@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Rejected 977853.0 bmrbgrid@xxxxxxxxxxxxx <144.92.167.254:9617?sock=13250_c2fa_3>: no match found
11/29/11 15:34:40 Got NO_MORE_JOBS; done negotiating
11/29/11 15:34:40 Phase 4.2: Negotiating with schedds ...

-- then a job ad, then another list of "does match" slots, then

11/29/11 15:34:40 Job 977853.0 does match with slot7@xxxxxxxxxxxxxxxxxxxxxxx
11/29/11 15:34:40 Rejected 977853.0 bmrbgrid@xxxxxxxxxxxxx <144.92.167.254:9617?sock=13250_c2fa_3>: no match found
11/29/11 15:34:40 Got NO_MORE_JOBS; done negotiating
11/29/11 15:34:40 negotiateWithGroup resources used scheddAds length 0
11/29/11 15:34:40 ---------- Finished Negotiation Cycle ----------

Without any visible indication as to why "no match found": falcon and
robin are off-line.

I did manage to get one hibernating machine (falcon: the 1st one in
alphabetical order of hostnames) to wake up once today, it ran jobs for
maybe 5-10 minutes and went back to sleep. The other one (robin) never
woke up at all.

If I could log only negotiation for one specific user, maybe I could
then find something in there. As it is, I've already spent more time
than I can afford on this and I see no light at the end of the tunnel.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
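
A possible workaround for the log volume described above is to filter the negotiator log after the fact instead of restricting what gets logged. The sketch below is only an illustration, not an HTCondor feature: it assumes the log has one entry per line in the two formats shown in this message ("Job N does match with slot@host" and "Rejected N user <address>: reason"), and the function name `summarize` is made up for the example. Since the "does match" lines carry a job id but no user name, it filters by job id rather than by user.

```python
import re
from collections import Counter

# Patterns assumed from the log excerpts quoted in this message.
MATCH_RE = re.compile(r"Job (\d+\.\d+) does match with (\S+)")
REJECT_RE = re.compile(r"Rejected (\d+\.\d+) (\S+) <[^>]*>: (.*)")


def summarize(lines, job_id):
    """Collapse negotiator log lines for one job (hypothetical helper).

    Returns (hosts, rejections): hosts is a Counter mapping host name to
    the number of "does match" lines for that host; rejections is a list
    of (submitter, reason) tuples for the same job.
    """
    hosts = Counter()
    rejections = []
    for line in lines:
        m = MATCH_RE.search(line)
        if m and m.group(1) == job_id:
            # "slot3@host" -> "host"; slots on one machine are counted together.
            host = m.group(2).split("@", 1)[-1]
            hosts[host] += 1
            continue
        r = REJECT_RE.search(line)
        if r and r.group(1) == job_id:
            rejections.append((r.group(2), r.group(3).strip()))
    return hosts, rejections


# Example use (the log path is an assumption; adjust to your installation):
#   with open("NegotiatorLog") as f:
#       hosts, rejections = summarize(f, "977853.0")
#   print("falcon" in " ".join(hosts), rejections)
```

This reduces the per-slot lines to a per-host count, which makes it easier to see whether an offline host ever shows up as a candidate for the job. It does not show which machine the negotiator finally chose, because that information is not in the lines quoted here.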