On 08/30/2017 04:09 AM, Thomas Hartmann wrote:
> Hi all,
>
> does anybody have experience using the cgroup blkio controller to limit a
> job's I/O to the disk?
>
> Background is that a user recently sent a task whose jobs were doing
> primarily merging, i.e., heavily churning on the local disk with r/w.
> When nodes got 'too many' jobs of this type, they became somewhat stuck
> in I/O wait.
We had jobs fail because of too much unzipping/untarring, so I added the
following as /etc/cgconfig.d/condor.conf:
group htcondor {
    cpu {}
    cpuacct {}
    memory {}
    freezer {}
    blkio {
        blkio.throttle.write_bps_device = "8:0 104857600
8:16 104857600";
    }
}
The errors seem to have disappeared since.
Note that you have to get the major:minor for each disk you want to
throttle on each node, which could be a bit of a PITA. And the newline
syntax is silly, but that's how you specify multiple disks.
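
A rough sketch of how the major:minor lookup could be scripted per node,
assuming Linux exposes each disk's numbers in /sys/block/<name>/dev. The
function names, the loop/ram filter, and the choice to throttle every disk
found are illustrative assumptions, not part of the setup above. The 100 MiB/s
figure is the 104857600 used in the config.

```python
import os


def list_disks(sys_block="/sys/block"):
    """Return {device name: "major:minor"} for block devices under sys_block."""
    disks = {}
    if not os.path.isdir(sys_block):
        return disks
    for name in sorted(os.listdir(sys_block)):
        # Assumption: loop and ram devices are not worth throttling.
        if name.startswith(("loop", "ram")):
            continue
        try:
            with open(os.path.join(sys_block, name, "dev")) as f:
                disks[name] = f.read().strip()
        except OSError:
            # Entry without a readable "dev" file: skip it.
            continue
    return disks


def throttle_value(dev_numbers, bytes_per_sec):
    """Build the newline-separated "major:minor rate" value for cgconfig."""
    return "\n".join(f"{dev} {bytes_per_sec}" for dev in dev_numbers)


if __name__ == "__main__":
    limit = 100 * 1024 * 1024  # 104857600 bytes/s, as in the config above
    value = throttle_value(list_disks().values(), limit)
    print(f'blkio.throttle.write_bps_device = "{value}";')
```

On a node with only sda and sdb this should print the same two-line setting
as in the config above. The output still has to be pasted into the blkio
section by hand or by whatever config management is in use.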
--
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu