On 2015-04-01 17:05, Todd Tannenbaum wrote:
Currently the schedd makes a subdirectory in SPOOL for each running job that holds intermediate checkpoint files if the submit file for the job requests ON_EXIT_OR_EVICT for when_to_transfer_output. I've long wanted the option to store these intermediate files in the home directory of the user instead of SPOOL so that the space for these intermediate files comes out of that user's own disk quota...
Well, I think it would fill up /home and get into the state where the schedd cannot create any more spool subdirectories, and then what? (/home of course is NFS-mounted, and spool should be on a local drive according to TFM.) No, ideally it should stop creating them. This may not be doable, but on the other hand, if you create one up front on submit, kinda like swap prefetch, you should be able to throttle submissions once spool space gets low.
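To make the throttling idea concrete, here is a rough sketch of the kind of check I mean. This is not anything condor does today as far as I know; the function name, the per-job reservation, and the low-water mark are all made up for illustration.

```python
import shutil


def has_room_for_job(spool_dir, per_job_reserve, low_water_mark):
    """Hypothetical admission check (not an existing condor feature).

    Returns True if the filesystem holding spool_dir could set aside
    per_job_reserve bytes for a new job and still keep at least
    low_water_mark bytes free.
    """
    free = shutil.disk_usage(spool_dir).free
    return free - per_job_reserve >= low_water_mark
```

A submit wrapper could call something like has_room_for_job("/var/lib/condor/spool", 2 * 1024**3, 10 * 1024**3) and hold the submission when it returns False. The path and both sizes are placeholders, not recommendations.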
Of course, the other interesting question is why this submit node ran just fine for a couple of years and this afternoon decided to write ~100GB of spool all of a sudden. I temporarily shut down all condor daemons because it just keeps filling up whatever space I free up -- until I figure out what's going on, there's no point in letting condor run.
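For tracking down what is eating the space, something along these lines should list the biggest subdirectories under spool. It is only a sketch: the function names are mine, and it assumes the space is being used by ordinary files inside per-job subdirectories.

```python
import os


def dir_size(path):
    """Sum the sizes of all files below path, skipping unreadable ones."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total


def largest_subdirs(spool_dir, top=10):
    """Return up to `top` (size_in_bytes, name) pairs, largest first."""
    sizes = []
    with os.scandir(spool_dir) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((dir_size(entry.path), entry.name))
    return sorted(sizes, reverse=True)[:top]
```

Mapping the subdirectory names it reports back to job IDs is left out here, since that depends on how the spool is laid out on a given install.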
Dimitri