Hi all,
After the 7.8.3 update I have 4 jobs out of a DAG of 8000+ stuck with:
---------------------------------------------
The Requirements expression for your job is:
( ( TARGET.Memory > 0 ) && ( .RIGHT.Memory > 0 ) ) &&
( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) &&
( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory ) &&
( TARGET.FileSystemDomain == MY.FileSystemDomain )
    Condition                         Machines Matched    Suggestion
    ---------                         ----------------    ----------
1   (
    [
    ].Memory > 0 )       0                   REMOVE
2   ( TARGET.Memory >= 7325 )         0                   MODIFY TO 1968
3   ( TARGET.Memory > 0 )             32
...
---------------------------------------------
Last time Condor pulled this TARGET.Memory requirement out of the ether,
I added "( TARGET.Memory > 0 ) && ( .RIGHT.Memory > 0 )" to the job's submit
file. That worked until now.
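As a sketch only, the addition described above would correspond to a
submit description line like the one below. This assumes the clause was
given through the standard "requirements" command; the rest of the submit
file is not shown.
---------------------------------------------
# Assumed form of the workaround line; the expression itself is quoted
# from the Requirements output above.
requirements = ( TARGET.Memory > 0 ) && ( .RIGHT.Memory > 0 )
---------------------------------------------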
The other change is that I added another machine to the pool in the middle
of the run -- a 2x2 AMD, but the stuck jobs are not on it.
What's curious is that this time all 4 jobs are stuck on one node, and
before they got stuck a whole lot of jobs successfully ran to completion
on that node.
The jobs are BLAST sequence searches, the execute nodes are all CentOS 6.3
x86_64 AMDs (2..8-core), and the whole setup's been running weekly for years.
Any suggestions?
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu