
Re: [HTCondor-users] /var/lib/condor/spool usage



On 2015-04-01 17:05, Todd Tannenbaum wrote:

Currently the schedd makes a subdirectory in SPOOL for each running job
that holds intermediate checkpoint files if the submit file for the job
requests ON_EXIT_OR_EVICT for when_to_transfer_output.  I've long wanted
the option to store these intermediate files in the home directory of
the user instead of SPOOL so that the space for these intermediate files
comes out of that user's own disk quota...

Well, I think it would fill up /home and end up in a state where the schedd cannot create any more spool subdirectories, and then what? (/home is of course NFS-mounted, and spool should be on a local drive according to TFM.) No, ideally it should stop creating them. This may not be doable, but on the other hand, if you create one up front at submit time, kind of like swap prefetch, you should be able to throttle submissions once spool space gets low.
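
To make the throttling idea concrete, here is a minimal sketch of the kind of free-space check I mean. It is not anything the schedd does today, as far as I know; the function names, the 10% threshold and the default path (taken from the subject line) are all assumptions for illustration.

```python
import shutil


def should_throttle(free_bytes, total_bytes, min_free_fraction=0.10):
    """Return True when the free fraction is below the threshold.

    Hypothetical helper: the 10% default is an arbitrary illustration,
    not an HTCondor setting.
    """
    if total_bytes <= 0:
        # Cannot tell how much room there is, so err on the side of throttling.
        return True
    return free_bytes / total_bytes < min_free_fraction


def spool_is_low(spool_dir="/var/lib/condor/spool", min_free_fraction=0.10):
    """Check the filesystem holding spool_dir (path assumed from the subject line)."""
    usage = shutil.disk_usage(spool_dir)
    return should_throttle(usage.free, usage.total, min_free_fraction)
```

A submit wrapper could call spool_is_low() and refuse or delay new submissions while it returns True. The actual spool path on a given machine is whatever `condor_config_val SPOOL` reports.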

Of course, the other interesting question is why this submit node ran just fine for a couple of years and then this afternoon decided to write ~100GB of spool all of a sudden. I temporarily shut down all condor daemons because it just keeps filling up whatever space I free up. Until I figure out what's going on, there's no point in letting condor run.
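
For tracking down which jobs are responsible, something along these lines would list the biggest subdirectories under spool. This is only a sketch: the function names and the default path are my own assumptions, and it does the same job as a `du` over the spool directory.

```python
import os


def tree_size(path):
    """Total size in bytes of files under path.

    Symlinks are not followed; each is counted by its own size.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file disappeared while we were walking
    return total


def largest_subdirs(root, top=10):
    """Return up to `top` (size_in_bytes, path) pairs, largest first."""
    sizes = []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((tree_size(entry.path), entry.path))
    sizes.sort(reverse=True)
    return sizes[:top]


if __name__ == "__main__":
    # Default path assumed from the subject line; adjust to the local SPOOL.
    for size, path in largest_subdirs("/var/lib/condor/spool"):
        print(f"{size / 2**30:8.2f} GiB  {path}")
```

Since the schedd makes one subdirectory per job, the names at the top of the list should point at the jobs to look at first.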

Dimitri