On 08/30/2017 04:09 AM, Thomas Hartmann wrote:
> Hi all,
>
> has somebody experiences using the cgroup blkio controller to limit a
> job's I/O to the disk?
>
> Background is, that a user recently send a task whose jobs were doing
> primarily merging, i.e., heavily churning on the local disk with r/w.
> When nodes got 'too many' jobs of this type, they became somewhat stuck
> in I/O wait.

We had jobs fail because of too much unzip/untarring and I added
/etc/cgconfig.d/condor.conf:

group htcondor {
        cpu {}
        cpuacct {}
        memory {}
        freezer {}
        blkio {
                blkio.throttle.write_bps_device = "8:0 104857600
8:16 104857600";
        }
}

The errors seem to have disappeared since. Note that you have to get the
major:minor for each disk you want to throttle on each node, which could
be a bit of a PITA. And the newline syntax is silly, but that's how you
specify multiple disks.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
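
Looking up the major:minor pairs on each node can probably be scripted rather than done by hand. A minimal sketch in Python, assuming the disks are visible as block device nodes (paths like /dev/sda are only examples, and both helper names are made up for illustration, not part of HTCondor or libcgroup):

```python
import os
import stat


def device_numbers(path):
    """Return (major, minor) for a block device node such as /dev/sda.

    The path is whatever the node actually uses; nothing here is
    specific to the setup described above.
    """
    st = os.stat(path)
    if not stat.S_ISBLK(st.st_mode):
        raise ValueError(f"{path} is not a block device")
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def format_throttle_value(devices, bytes_per_second):
    """Build the string for blkio.throttle.write_bps_device.

    devices is an iterable of (major, minor) pairs. Entries are joined
    with a newline, one "major:minor limit" per disk, matching the
    multi-disk syntax in the config above.
    """
    return "\n".join(
        f"{major}:{minor} {bytes_per_second}" for major, minor in devices
    )
```

Calling format_throttle_value([device_numbers(p) for p in paths], 100 * 1024 * 1024) on a node would then give the quoted value to paste into the blkio block; 100 * 1024 * 1024 is the 104857600 bytes/s (100 MiB/s) used above.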