Is there a way to get more information about Condor jobs, specifically where
they ran and exactly what happened, other than keeping a separate log file for
each job (e.g. Log = log_$(PROCESS).log in the submit file)? This becomes a
problem when we submit 1000+ jobs and need to know which ones failed and where
they executed. We can of course get the failures from the return codes and
error output, but it would be helpful to know exactly where each job
executed. All we have at the minute is:

    001 (021.000.000) 09/29 09:58:54 Job executing on host: <xxx.xxx.xxx.xxx:1104>

While this is useful, it would be helpful to have the execute node in the
termination event as well, rather than just the job id:

    005 (021.000.000) 09/29 09:58:55 Job terminated.
        (0) Abnormal termination (signal 53)
        (0) No core file
            Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
            Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
            Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
            Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
        0  -  Run Bytes Sent By Job
        384684  -  Run Bytes Received By Job
        0  -  Total Bytes Sent By Job
        384684  -  Total Bytes Received By Job

E.g. what about:

    005 (021.000.000) 09/29 09:58:55 Job terminated (after executing on node xxx.xxx.xxx.xxx)

This probably seems trivial, but if anyone can suggest other methods I’d be
more than happy to hear them.

Kind Regards,
Shaun
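
P.S. As a stop-gap, the pairing of termination events with execute hosts could be done after the fact with a small script. What follows is only a rough sketch, not a tested tool. It assumes the event header lines look exactly like the 001 and 005 excerpts quoted above, and that the address inside the angle brackets is IPv4 followed by a colon and port. The function name terminations_with_hosts is made up for illustration.

```python
import re

# Header line of a log event, as in the excerpts above, e.g.
# "001 (021.000.000) 09/29 09:58:54 Job executing on host: <addr:port>"
EVENT_RE = re.compile(r"^(\d{3}) \((\d+\.\d+\.\d+)\) (\S+ \S+) (.*)$")

# Address part of the 001 event text (assumption: IPv4, so the first
# colon separates the address from the port).
HOST_RE = re.compile(r"Job executing on host: <([^:>]+)")


def terminations_with_hosts(lines):
    """Yield (job_id, timestamp, host) for every 005 event.

    The host is taken from the most recent 001 event seen for the same
    job id, or "unknown" if no 001 event preceded the termination.
    """
    last_host = {}
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if not match:
            continue  # continuation line (usage, byte counts, ...)
        code, job_id, stamp, text = match.groups()
        if code == "001":
            host = HOST_RE.search(text)
            if host:
                last_host[job_id] = host.group(1)
        elif code == "005":
            yield job_id, stamp, last_host.get(job_id, "unknown")
```

Passing an open log file to it, for example `for job, when, host in terminations_with_hosts(open(path)): print(job, when, host)`, would list each terminated job next to the node it last started on. This works on a single shared log as well as on per-job logs, since events are matched by job id.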