Hi folks,
During the break, I've been reading an interesting paper from VLDB 2010:
http://infoscience.epfl.ch/record/149436/files/vldb10aether.pdf
It talks about scalability issues of write-ahead logs in DBs, but I
found the topics relevant to the I/O scalability issues in the Condor
schedd. Particularly, there are two concepts which may be applicable
(without knowing enough about the Condor code to determine if they are
or not):
1) Early-lock-release (ELR) and flush pipelining. ELR is simply
releasing the I/O resources prior to the I/O completing, but not
returning back to the calling routine until the I/O is finished; flush
pipelining is taking several commits and flushing them as one I/O
operation. Together, they can offer the same data guarantees while
greatly decreasing the number of small I/O operations performed.
Note: the kernel community calls this I/O plugging. This technique works
when there are many independent transactions occurring; it helps much
less if there are too many dependent transactions (transactions which
can't be started until the previous one finishes).
- Obviously, these techniques were designed for heavily-threaded
environments. It's not known to me whether the schedd can continue on
other work while it waits for a transaction to finish.
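To make the flush pipelining idea in (1) concrete, here is a rough Python
sketch for a single-threaded event loop. It is not Condor code and not the
paper's implementation; PipelinedLog, on_durable and the log file name are
all made up for illustration. The assumption is that the caller can queue a
commit, go do other work, and only acknowledge the transaction once a later
flush has covered it.

```python
import os
import tempfile


class PipelinedLog:
    """Hypothetical log: commits queue up, one fsync covers the whole batch."""

    def __init__(self, path):
        self._file = open(path, "ab")
        self._pending = []     # (record, on_durable) pairs awaiting a flush
        self.fsync_calls = 0   # counts physical flushes, for illustration only

    def commit(self, record, on_durable):
        # Queue the record and return at once so the caller can do other work.
        # The transaction is NOT acknowledged until on_durable() runs.
        self._pending.append((record, on_durable))

    def flush(self):
        # Write every queued record, then make them all durable with one fsync.
        if not self._pending:
            return 0
        batch, self._pending = self._pending, []
        self._file.write(b"".join(record for record, _ in batch))
        self._file.flush()
        os.fsync(self._file.fileno())
        self.fsync_calls += 1
        # Only now is it safe to tell each caller that its commit is on disk.
        for _, on_durable in batch:
            on_durable()
        return len(batch)

    def close(self):
        self.flush()
        self._file.close()


def demo():
    acked = []
    with tempfile.TemporaryDirectory() as tmp:
        log = PipelinedLog(os.path.join(tmp, "job_queue.log"))
        for job_id in range(3):
            log.commit(b"job %d submitted\n" % job_id,
                       lambda j=job_id: acked.append(j))
        acked_before_flush = list(acked)   # nothing acknowledged yet
        log.flush()                        # one fsync for three commits
        calls = log.fsync_calls
        log.close()
    return acked_before_flush, acked, calls
```

The data guarantee is unchanged because no commit is acknowledged before
its fsync; the saving comes from three commits sharing one flush. If each
commit had to wait for the previous one to be acknowledged, every batch
would hold a single record and nothing would be gained.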
2) Asynchronous commit - i.e., lie to the user about their changes being
safely on disk. The paper uses this as an anti-pattern, and works to
show techniques such as (1) can have equivalent performance. However,
there's a reason that databases (even Oracle) allow this mode to be used
- they give the power to the user to consider the value of data
integrity and recoverability versus the value of scalability. There are
certainly times when the extra factor of 2 in scalability (making the
numbers up) is worth the risk of losing 30s of status
changes. I'm not advocating that the Condor team should change their
opinions about the relative merits, or that the default should be
changed - but this should be something the site should be allowed to
decide on.
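As a rough illustration of what a site-selectable mode in (2) could look
like, here is a small Python sketch. The class, the async_commit and
max_delay parameters, and the 30s window are all hypothetical; this is not
an existing Condor setting. The default stays fully synchronous, and the
asynchronous mode bounds how much acknowledged data a crash could lose.

```python
import os
import tempfile
import time


class ConfigurableLog:
    """Hypothetical log with a per-site choice between safety and throughput."""

    def __init__(self, path, async_commit=False, max_delay=30.0,
                 clock=time.monotonic):
        self._file = open(path, "ab")
        self._async = async_commit    # False (default): fsync on every commit
        self._max_delay = max_delay   # async only: max seconds between fsyncs
        self._clock = clock           # injectable so the sketch is testable
        self._last_sync = clock()
        self.fsync_calls = 0

    def _sync(self):
        self._file.flush()
        os.fsync(self._file.fileno())
        self.fsync_calls += 1
        self._last_sync = self._clock()

    def commit(self, record):
        self._file.write(record)
        overdue = self._clock() - self._last_sync >= self._max_delay
        # Sync mode: durable before returning. Async mode: returns early, so
        # a crash may lose acknowledged records newer than the last fsync.
        # A real version would also need a timer to sync while idle.
        if not self._async or overdue:
            self._sync()

    def close(self):
        self._sync()
        self._file.close()


def count_fsyncs(async_commit):
    now = [0.0]   # fake clock, advanced by hand
    with tempfile.TemporaryDirectory() as tmp:
        log = ConfigurableLog(os.path.join(tmp, "job_queue.log"),
                              async_commit=async_commit, max_delay=30.0,
                              clock=lambda: now[0])
        for i in range(5):
            log.commit(b"status change %d\n" % i)
        now[0] = 31.0                      # the 30s window has passed
        log.commit(b"status change 5\n")
        calls = log.fsync_calls
        log.close()
    return calls
```

With six commits, the synchronous default issues six fsyncs, while the
asynchronous mode issues one once the window expires. The trade is exactly
the one described above: fewer flushes in exchange for possibly losing the
most recent window of changes.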
At any rate, the paper is a good read; hope others enjoy it.
Brian