
[HTCondor-users] STARTD_ENFORCE_DISK_LIMITS problem with LVM and QEMU live migration



FYI, it looks like QEMU has an interesting limitation for a Condor KVM EP that has STARTD_ENFORCE_DISK_LIMITS enabled and uses LVM, e.g.,

Dec 27 14:17:38 hov1 QEMU[3456386]: kvm: ../util/bitmap.c:167: bitmap_set: Assertion `start >= 0 && nr >= 0' failed.
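
For anyone who wants to check whether a hypervisor host has hit the same failure, here is a minimal sketch that counts matching lines in a saved log file. The function name count_bitmap_asserts is hypothetical, and it assumes the log has already been exported to a plain text file.

```shell
#!/bin/bash
# Hypothetical helper (not part of QEMU or HTCondor): count the log lines
# that contain the bitmap_set assertion failure quoted above.
# Usage: count_bitmap_asserts /path/to/logfile
count_bitmap_asserts() {
  local logfile="$1"
  # -F matches a fixed string; -c prints the number of matching lines.
  # grep exits non-zero when nothing matches, so mask that with "|| true".
  grep -F -c "bitmap_set: Assertion" -- "$logfile" || true
}
```

On a systemd host the log could probably be exported with something like "journalctl -t QEMU > qemu.log" and then passed to count_bitmap_asserts. The QEMU tag is taken from the log line above and may differ on other platforms.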


If anyone else is running a similar configuration, or knows a QEMU developer, I am interested in confirming/denying the hypothesis:

"QEMU is vulnerable to a bitmap assertion failure during live migration of Linux KVM guests that dynamically provision filesystems with LVM."


Note, from my testing, any of the following options appears to avoid this assertion failure:
* Condor EP idle.
* STARTD_ENFORCE_DISK_LIMITS = False
* Use of LVM_BACKING_FILE rather than LVM.
* EP SPOOL on shared storage that does not need to be copied during live migration.
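
To see which of the two configuration-based workarounds is in effect on an EP, a sketch along these lines may help. The function name report_disk_knobs is hypothetical. It assumes condor_config_val is on the PATH and that it exits non-zero for an undefined knob.

```shell
#!/bin/bash
# Hypothetical helper: print the current values of the two knobs named in
# the list above. Assumes condor_config_val is on PATH and returns a
# non-zero exit status when a knob is not defined.
report_disk_knobs() {
  local knob value
  for knob in STARTD_ENFORCE_DISK_LIMITS LVM_BACKING_FILE; do
    if value=$(condor_config_val "$knob" 2>/dev/null); then
      echo "$knob = $value"
    else
      echo "$knob is not defined"
    fi
  done
}
```

Running report_disk_knobs on the EP prints one line per knob.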


And I was able to reproduce the problem outside of Condor with the following LVM stress test.

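#!/bin/bash
# LVM stress test: repeatedly create, fill, and remove a logical volume.
# Assumes a volume group named "test" and an unused /mnt mount point.

while true; do
  echo "START"
  # Create a 100G logical volume named test1 in volume group test.
  lvcreate -y -L 100G -n test1 test
  # Without -a, wipefs only lists existing signatures; it erases nothing.
  wipefs /dev/mapper/test-test1
  mkfs.ext4 /dev/mapper/test-test1
  mount /dev/mapper/test-test1 /mnt
  # Write 5000 MiB of zeros to the new filesystem.
  dd if=/dev/zero of=/mnt/tst.dat bs=1024k count=5000
  umount /mnt
  lvremove -y test/test1
  echo "SLEEP"
  sleep 60
done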


P.S. My testing was done with Proxmox, but I am curious if anyone else sees this (or not) with another VM platform.

Thanks.


--
Stuart Anderson
sba@xxxxxxxxxxx