(a) let the matchmaker forget about the assignment of the job to that machine (there must be a timeout somewhere?) and
I suspect there isn't a memory -- although I could be wrong -- and that the problem, as you suggest below, is that the unwakeable machine(s) sort to the same position every time, so if you have k jobs and k unwakeable machines, you won't ever wake a machine.
(b) modify NEGOTIATOR_PRE_JOB_RANK (I suppose this is the right one) to reorder Offline machines so that this particular one gets ranked down or excluded in the next cycle (as long as there are other machines...)
(Could MachineLastMatchTime be used for (b)?
Probably.
How to balance it against LastHeardFrom which is already used to get even "wear"?
Assuming your wear-balancing is `+(k * (time() - LastHeardFrom))`, where `k` is a scaling factor depending on what else is in NEGOTIATOR_PRE_JOB_RANK, you probably want `-(l * (time() - MachineLastMatchTime))`, where `l` is a (positive, nonzero) constant less than `k`, so as not to overwhelm it.
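To see which way the two terms pull, it may help to plug in numbers before touching the negotiator. The following is only a rough Python sketch of the arithmetic, not ClassAd code: the function name, the machine names, the timestamps, and the values of `k` and `l` are all made up for illustration.

```python
def pre_job_rank(now, last_heard_from, last_match_time, k=1.0, l=0.5):
    """Evaluate the two-term expression from the text for one offline machine.

    k and l are the scaling factors from the text; the defaults are
    arbitrary illustrations, chosen only so that 0 < l < k.
    """
    wear_term = k * (now - last_heard_from)       # +(k * (time() - LastHeardFrom))
    match_term = -(l * (now - last_match_time))   # -(l * (time() - MachineLastMatchTime))
    return wear_term + match_term


now = 10_000
# Hypothetical offline machines: (LastHeardFrom, MachineLastMatchTime).
machines = {
    "matched_recently": (4_000, 9_900),
    "matched_long_ago": (4_000, 5_000),
}
ranks = {name: pre_job_rank(now, heard, matched)
         for name, (heard, matched) in machines.items()}

# List the machines from highest to lowest value.
for name, rank in sorted(ranks.items(), key=lambda item: -item[1]):
    print(name, rank)
```

With these numbers the second term subtracts more from the machine that was matched longer ago, so `matched_recently` comes out with the higher value. Since the negotiator prefers higher NEGOTIATOR_PRE_JOB_RANK values, it is probably worth checking whether that is the direction you want; if the just-matched machine is supposed to sink, the sign of the second term may need flipping.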
What else comes to mind?)
The wake-up script could record the last (k) time(s) it tried to wake up a given machine and set the unwakeable machine's START expression to FALSE?
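A minimal sketch of that bookkeeping, assuming the wake-up script is (or can call) Python. The class name, the `window` parameter, and the "k attempts within the window" rule are my own assumptions; how START actually gets set to FALSE for the machine is deliberately left out.

```python
from collections import defaultdict, deque


class WakeAttemptLog:
    """Remember the last k wake-up attempts per machine (hypothetical helper)."""

    def __init__(self, k=3, window=3600):
        self.k = k            # number of attempts to remember per machine
        self.window = window  # seconds within which k attempts count as "unwakeable"
        self._attempts = defaultdict(lambda: deque(maxlen=k))

    def record_attempt(self, machine, when):
        # The deque silently drops the oldest timestamp once it holds k entries.
        self._attempts[machine].append(when)

    def record_success(self, machine):
        # The machine came back, so forget its earlier attempts.
        self._attempts.pop(machine, None)

    def is_unwakeable(self, machine, now):
        # True only if all k remembered attempts fall inside the window.
        attempts = self._attempts.get(machine, ())
        return len(attempts) == self.k and now - attempts[0] <= self.window


log = WakeAttemptLog(k=3, window=3600)
for t in (100, 700, 1300):
    log.record_attempt("node17", t)

if log.is_unwakeable("node17", now=1400):
    print("node17: candidate for START = FALSE")
```

Clearing the record on a successful wake keeps a machine from being written off because of old failures; whether a time window is the right criterion at all is a judgment call.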
-- ToddM