On 10/5/23 10:03, Vikrant Aggarwal wrote:
> Hello Experts,
>
> We want to capture the signal to copy some logs before the scratch
> directory disappears after the job goes into hold status because of
> memory breach but we are unsuccessful to do it. Do we have any way to
> achieve this? We thought it was probably a job wrapper which is doing
> exec to run actual condor jobs not allowing us to capture the signal
> but that's not the case.
The Linux out-of-memory killer uses signal 9 (SIGKILL), which is
uncatchable. You could write a startd policy which evicts jobs when
their MemoryUsage reaches some percentage of the total, and if the job has
when_to_transfer_output = ON_EXIT_OR_EVICT
then the scratch directory would get copied back to the spool on the AP.
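
A minimal sketch of such a startd policy, for the execute node's
configuration. The 90% threshold and the macro name MEMORY_NEARLY_EXCEEDED
are assumptions chosen for illustration; MemoryUsage (job attribute, in MB)
and Memory (slot attribute, in MB) are the standard ClassAd attributes.
This has not been tested against your pool, so treat it as a starting point:

```
# Hypothetical helper macro: true once the job uses more than 90% of
# the slot's memory. The 0.9 factor is an assumed value; tune as needed.
MEMORY_NEARLY_EXCEEDED = (MemoryUsage =!= UNDEFINED && MemoryUsage > 0.9 * Memory)

# Evict the job when the threshold is crossed, keeping any existing
# PREEMPT policy in place.
PREEMPT = ($(PREEMPT)) || ($(MEMORY_NEARLY_EXCEEDED))
```

MemoryUsage is only updated periodically, so a job whose memory grows very
quickly may still be killed by the OOM killer before the policy fires.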
-greg