Hi everyone,

we have jobs that transfer a gigabyte-sized zipped tarball of data, unzip it, and start crunching. The problem we're running into is that when, say, 16 of them land on a 16-core worker node at once, the untar'ing completely chokes the drive, to the point where the condor daemons wait too long trying to write their logs and die with status 44.

I expect that's a fairly common job pattern, and I'm wondering what would be the best way to deal with it (FVVO "best"). Any suggestions?

TIA
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu