
Re: [HTCondor-users] exit hook not always report correct ImageSize



On 1/26/2023 1:25 PM, JM wrote:
HTCondor users,

I have an exit hook that sends .job.ad to a database. However, I noticed that in some cases ImageSize is 1250 instead of the real ImageSize reported by condor_history. The jobs run much longer than 15 seconds, so I would expect the startd to have updated .job.ad. I even tried sleeping for 30 seconds in the exit hook to make sure the update happens.

Does anyone have a clue why?

Hi,

I have not positively confirmed this, but my guess is that the .job.ad file in the scratch directory is written at the start of job execution and is not rewritten each time the job ad is updated.

However, note that HTCondor passes a current, updated copy of the job ClassAd to your exit hook script via stdin [*].  Instead of having your exit hook read the .job.ad file, I suggest you use the information passed to it via stdin.  Let us know if you run into any additional problems or have further questions.  It would not be a big deal for us to patch HTCondor to update .job.ad upon job exit (i.e. before invoking the exit hook), but using standard input should do what you want today.
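
As a rough illustration of the stdin approach, here is a minimal sketch of an exit hook in Python. It assumes the job ad arrives on stdin as plain "Attribute = value" lines and parses them by hand rather than with the HTCondor Python bindings. The helper names parse_job_ad and image_size are made up for this example, and the JSON line written to stderr is only a placeholder for your own database code.

```python
#!/usr/bin/env python3
"""Sketch of an exit hook that reads the job ad from stdin, not .job.ad."""
import json
import sys


def parse_job_ad(text):
    """Parse 'Attr = value' lines into a dict of raw (unevaluated) strings.

    Assumption: one attribute per line, which is how this sketch expects
    the ad to be serialized on stdin. Lines without '=' are skipped.
    """
    ad = {}
    for line in text.splitlines():
        # Attribute names never contain '=', so splitting on the first '='
        # keeps expressions such as (OpSys == "LINUX") intact in the value.
        name, sep, value = line.partition("=")
        if not sep:
            continue
        ad[name.strip()] = value.strip()
    return ad


def image_size(ad):
    """Return ImageSize as an int, or None if it is missing or not a number."""
    try:
        return int(ad.get("ImageSize"))
    except (TypeError, ValueError):
        return None


def main():
    ad = parse_job_ad(sys.stdin.read())
    record = {
        "ClusterId": ad.get("ClusterId"),
        "ProcId": ad.get("ProcId"),
        "ImageSize": image_size(ad),
    }
    # Placeholder: replace this with the real database insert.
    sys.stderr.write(json.dumps(record) + "\n")
    return 0


# Only run when stdin is piped, as it is when HTCondor invokes the hook.
if __name__ == "__main__" and not sys.stdin.isatty():
    sys.exit(main())
```

To try the parsing outside of HTCondor, you could pipe a saved copy of a job ad into the script and compare the reported ImageSize against condor_history.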

Hope this helps,
Todd

[*] = In the manual at:
   https://htcondor.readthedocs.io/en/latest/admin-manual/hooks.html#work-fetching-hooks-invoked-by-htcondor
look for "HOOK_JOB_EXIT" and note what it says in the section "Standard input given to the hook".



-- 
Todd Tannenbaum <tannenba@xxxxxxxxxxx>  University of Wisconsin-Madison
Center for High Throughput Computing    Department of Computer Sciences
Calendar: https://tinyurl.com/yd55mtgd  1210 W. Dayton St. Rm #4257