Hi,

On 27.12.2010 16:12, slebodnik wrote:
This problem is still present. I attach a graph of the current quill memory usage. For the first 130 hours the value of QUILL_POLLING_PERIOD was "1"; we then changed it to "10". We would also like to know how much memory the quill daemon uses on your submit machines (where the schedd daemon is running).
I remember there is a presentation somewhere from Condor Week 20?? that provides lots of useful info on memory footprint vs. quill usage.
Can someone, maybe from the Condor folks, recall this presentation and point us to it? Or suggest another recommended quill setup we might use for further tests? The quill logs don't say much with QUILL_DEBUG=D_FULLDEBUG.
So far we have done some valgrind and gdb work to find the source of the trouble. Our investigation shows it is most likely a typical memory leak, and the following simplified code helps to show it:
    FILE *fp;
    while (true) {
        fp = fdopen(file_descriptor, "w");  // here is the memory leak
        // work
        wait(QUILL_POLLING_PERIOD);
    }

I've talked to Lukas about what he found, also on our development environment; see the attached plot. What is strange is that the same code on CentOS takes all the memory and crashes the virtual host in our dev environment, whereas on Debian it takes 300 MB of resident memory and that's it... Maybe it is a different glibc on CentOS vs. Debian, but Lukas can comment on this in more detail if needed.
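For reference, here is a minimal sketch of how the pattern in the simplified snippet above could be written without the leak. It is not the actual quill source; the function name poll_with_single_stream and its parameters are hypothetical and only illustrate the idea. Each fdopen() call allocates a new FILE object (plus its buffer) that is only released by fclose(), so the sketch opens the stream once, reuses it on every polling pass, and closes it at the end. Because fclose() also closes the underlying descriptor, the sketch wraps a dup() of the descriptor so the caller's descriptor stays usable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from quill: write one line per polling
 * pass to fd, using a single FILE stream for the whole loop.
 * Returns the number of completed passes, or -1 if the stream could not
 * be set up.
 */
int poll_with_single_stream(int fd, int passes, unsigned period_seconds)
{
    assert(fd >= 0);

    /* Wrap a duplicate so that fclose() below does not close the caller's fd. */
    int copy = dup(fd);
    if (copy < 0)
        return -1;

    FILE *fp = fdopen(copy, "w");   /* allocated once, outside the loop */
    if (fp == NULL) {
        close(copy);
        return -1;
    }

    int done = 0;
    for (int i = 0; i < passes; i++) {
        /* "work": stands in for whatever the daemon does each pass */
        if (fprintf(fp, "poll %d\n", i) < 0 || fflush(fp) != 0)
            break;
        done++;
        if (period_seconds > 0)
            sleep(period_seconds);  /* stands in for the polling period */
    }

    fclose(fp);                     /* frees the FILE and closes the duplicate */
    return done;
}
```

If the stream really has to be reopened on every pass, the same idea applies per iteration: dup() the descriptor, fdopen() the duplicate, and fclose() it before the wait.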
Well, we have the option to compile quill ourselves and see what is going on with memory in our test case, or we can live with the gdb hook Lukas patched into the code. However, we would be happy if one of the Condor experts could review our report and give us a hint here. Or should we open a condor-admin ticket to track the issue?
Thanks in advance!
Cheers,
Marian

PS: Happy New Year 2011 to all condor-users, and wishing successful new lines of code in the next Condor releases ;)
Thanks
Lukas Slebodnik

On Tue, 7 Dec 2010 09:41:11 +0100, Vladimir Motoska <motoska@xxxxxxxx> wrote:
Hi,
We have some issues with the quill daemon on the submitter node. We use condor 7.4.4 (x86_64 rhel5 dynamic) running on CentOS with postgresql 8.1.22-1.el5_5.1. The problem is that quill accumulates memory. Here is the memory usage log: http://pastie.org/1354799
The data are divided into 3 columns:
1st: percentage of memory used by quill
2nd: memory usage in KB
3rd: memory usage in MB
Each line represents a one-minute time stamp.
Here is also our condor_config: http://pastie.org/1354821
On all other nodes quill runs just fine. Can anybody give me a hint?
Thanks
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a subject: Unsubscribe
You can also unsubscribe by visiting https://lists.cs.wisc.edu/mailman/listinfo/condor-users
The archives can be found at: https://lists.cs.wisc.edu/archive/condor-users/
Attachment: quill_dev_mem_2.png (PNG image)