Hi Brian,

Indeed, that was what I was looking for, thanks.
I was able to test things with this config:

ABSENT_REQUIREMENTS = True
ABSENT_EXPIRE_ADS_AFTER = 30*3600*24
COLLECTOR_PERSISTENT_AD_LOG = $(LOG)/AbsentLog
EXPIRE_INVALIDATED_ADS = True

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Brian Bockelman

Hi Frederic,

I believe you are looking for the "absent ads" feature:

I link to the 8.2 manual, but I believe this was introduced in 7.6.

Brian

On Jul 1, 2014, at 7:33 AM, SCHAER Frederic <frederic.schaer@xxxxxx> wrote:
Hi,

Ah, not great… I guess I could work around that with a script parsing the history (though parsing ClassAds might not be that easy for a newbie like me), or even just by building an auto-updated “nodes” file with Puppet.

I’m wondering, though, how people debug batch issues if they can’t even identify failing nodes from the batch system’s point of view. I guess people have monitoring scripts that check for the presence of a startd process (at least), and probably some other trivial things (but which ones?), in order to be sure the startd processes are correctly registered in the pool?

Regards

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Marc Volovic

You can see drained nodes with condor_status. For nodes that are down, that is a more difficult question; I'd do it using an external means.

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of SCHAER Frederic

Hi,

I’m used to Torque, in which there is a “pbsnodes -l” command that displays nodes that are down or drained. Strangely, I can’t find how to see this information in Condor: what would be the Condor way of finding it? I’m sure this can become hard when the pool is dynamic, but even then there must be traces of nodes that belonged to the pool “one day” or in the last X days?

Thanks
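
The two ideas discussed in this thread (an auto-updated "nodes" file and the absent ads feature) can be combined in a small monitoring script. The following is only a minimal sketch, not an official HTCondor tool. It assumes that condor_status is on the PATH, that absent ads are enabled on the collector as in the config at the top of the thread, and that a plain condor_status query does not return absent ads. The helper names and the idea of feeding in an expected node list are illustrative assumptions.

```python
import subprocess


def parse_machines(output):
    """Turn one-name-per-line text into a set of machine names."""
    return {line.strip() for line in output.splitlines() if line.strip()}


def condor_machines(*extra_args):
    """Run condor_status and return the set of machine names it reports.

    Assumption: condor_status is on PATH. "-af Machine" prints the
    Machine attribute of every slot ad, so the set also removes the
    per-slot duplicates.
    """
    cmd = ["condor_status", *extra_args, "-af", "Machine"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_machines(result.stdout)


def classify_nodes(expected, present, absent):
    """Split the expected nodes into up, absent, and unknown.

    expected: node names that should be in the pool (hypothetical source,
              for example a Puppet-generated nodes file)
    present:  machines currently reporting to the collector
    absent:   machines the collector has marked as absent
    """
    expected = set(expected)
    up = expected & set(present)
    marked_absent = (expected - up) & set(absent)
    unknown = expected - up - marked_absent
    return {
        "up": sorted(up),
        "absent": sorted(marked_absent),
        "unknown": sorted(unknown),
    }
```

On a submit or central manager host this could be called as classify_nodes(expected, condor_machines(), condor_machines("-absent")), where expected is read from whatever node list the site maintains. Nodes that end up under "unknown" are expected but have neither a live ad nor an absent ad, for example because they never joined the pool or their absent ad has already expired.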