Re: [Condor-users] memory leak in Condor 7.4.2 schedd ???
- Date: Wed, 9 Jun 2010 15:08:40 +0100
- From: "Smith, Ian" <I.C.Smith@xxxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] memory leak in Condor 7.4.2 schedd ???
> Is that virtual, resident or private memory usage?
>
> Output of,
>
> top -n1 -b -p $(pidof condor_schedd)
I think some of the options may be different under Solaris, but
from what I can see most of it is resident memory:
$ top
PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
15373 root 1 59 0 31M 27M sleep 0:07 0.06% condor_schedd
(I've since restarted the schedd but this still looks a bit excessive).
> pmap -d $(pidof condor_schedd)
This is the output of pmap (we actually run two Condor instances separately; one is for Condor-G).
It looks to me as though there is ~20 MB of dynamically allocated storage (heap) used by the schedd,
which makes up most of the memory footprint.
00010000 6744K r-x-- /opt1/condor_7.4.2/sbin/condor_schedd
006B4000 544K rwx-- /opt1/condor_7.4.2/sbin/condor_schedd
0073C000 784K rwx-- [ heap ]
00800000 20480K rwx-- [ heap ]
FEF00000 608K r-x-- /lib/libm.so.2
FEFA6000 24K rwx-- /lib/libm.so.2
FF000000 1216K r-x-- /lib/libc.so.1
FF130000 40K rwx-- /lib/libc.so.1
FF13A000 8K rwx-- /lib/libc.so.1
FF160000 64K rwx-- [ anon ]
FF180000 584K r-x-- /lib/libnsl.so.1
FF222000 40K rwx-- /lib/libnsl.so.1
FF22C000 24K rwx-- /lib/libnsl.so.1
FF240000 64K rwx-- [ anon ]
FF260000 64K rwx-- [ anon ]
FF280000 16K r-x-- /lib/libm.so.1
FF292000 8K rwx-- /lib/libm.so.1
FF2A0000 240K r-x-- /lib/libresolv.so.2
FF2E0000 24K rwx-- [ anon ]
FF2EC000 16K rwx-- /lib/libresolv.so.2
FF300000 48K r-x-- /lib/libsocket.so.1
FF310000 8K rwx-- [ anon ]
FF31C000 8K rwx-- /lib/libsocket.so.1
FF320000 128K r-x-- /lib/libelf.so.1
FF340000 8K rwx-- /lib/libelf.so.1
FF350000 8K rwx-- [ anon ]
FF360000 8K r-x-- /lib/libkstat.so.1
FF372000 8K rwx-- /lib/libkstat.so.1
FF380000 8K r-x-- /lib/libdl.so.1
FF392000 8K rwx-- /lib/libdl.so.1
FF3A0000 8K r-x-- /platform/sun4u-us3/lib/libc_psr.so.1
FF3A4000 8K rwxs- [ anon ]
FF3B0000 208K r-x-- /lib/ld.so.1
FF3F0000 8K r--s- dev:32,12 ino:70306
FF3F4000 8K rwx-- /lib/ld.so.1
FF3F6000 8K rwx-- /lib/ld.so.1
FFBEC000 80K rwx-- [ stack ]
total 32160K
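To check the "~20 MB of heap" reading of the listing above, the mappings can be summed per category. The following is only a rough sketch: the function name `summarize_pmap` and the "files" bucket are made up for illustration, the sample is a few lines copied from the output above, and the parser assumes the four-column layout shown here (address, size in K, permissions, mapping).

```python
import re
from collections import defaultdict

# Matches lines such as "00800000 20480K rwx-- [ heap ]".
# Assumes the column layout of the pmap output quoted in this message.
LINE = re.compile(r"^([0-9A-Fa-f]+)\s+(\d+)K\s+(\S+)\s+(.+)$")


def summarize_pmap(text):
    """Return a dict mapping category name -> total size in KB."""
    totals = defaultdict(int)
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skips the "total ...K" line and anything unparsable
        size_kb = int(m.group(2))
        mapping = m.group(4).strip()
        if mapping.startswith("["):
            key = mapping.strip("[] ")  # "[ heap ]" -> "heap"
        else:
            key = "files"  # hypothetical bucket for file-backed mappings
        totals[key] += size_kb
    return dict(totals)


# A few lines taken from the listing above, not the full output.
SAMPLE = """\
00010000 6744K r-x-- /opt1/condor_7.4.2/sbin/condor_schedd
006B4000 544K rwx-- /opt1/condor_7.4.2/sbin/condor_schedd
0073C000 784K rwx-- [ heap ]
00800000 20480K rwx-- [ heap ]
FFBEC000 80K rwx-- [ stack ]
total 32160K
"""

if __name__ == "__main__":
    for name, kb in sorted(summarize_pmap(SAMPLE).items()):
        print(f"{name:6s} {kb:6d}K")
```

On the full listing the two heap segments come to 784K + 20480K = 21264K out of the 32160K total, i.e. roughly two thirds of the address space.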
> ?
>
> FYI, Condor uses string interning to minimize the memory footprint of
> jobs (all classads actually), but, iirc, does not always garbage collect
> the string pool. If you have a lot of jobs passing through your Schedd,
> say with large unique Environments, you could certainly see memory usage
> increase. Then of course there could just be a memory leak.
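The effect described in the quoted text can be illustrated with a toy intern pool. This is a minimal sketch of the general technique only, not Condor's actual implementation: the `InternPool` class and the example strings are hypothetical. It shows that repeated strings cost one pool entry, while unique strings grow a pool that is never cleaned up.

```python
class InternPool:
    """Toy string pool that never evicts entries (no garbage collection)."""

    def __init__(self):
        self._pool = {}

    def intern(self, s):
        # Return the pooled copy, adding the string on first sight.
        return self._pool.setdefault(s, s)

    def __len__(self):
        return len(self._pool)


pool = InternPool()

# 1000 jobs sharing one identical attribute string: one pool entry.
shared = [pool.intern("same-requirements") for _ in range(1000)]
size_after_shared = len(pool)

# 1000 jobs each with a unique string: 1000 additional entries.
unique = [pool.intern(f"unique-environment-{i}") for i in range(1000)]
size_after_unique = len(pool)

# The jobs leave the queue, but the pool keeps every string it has seen.
del shared, unique
size_after_jobs_gone = len(pool)

if __name__ == "__main__":
    print(size_after_shared, size_after_unique, size_after_jobs_gone)
```

In this sketch the pool size stays at its peak after the jobs are gone, which is the kind of growth that can look like a leak from the outside.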
All of the jobs are separate clusters but with the same requirements and,
as I said, the clustering does seem to be working fine. AFAIK everything
was OK with Condor 7.4.0 and this only surfaced when I moved to 7.4.2 to get rid of the
"long message still waiting to be closed" problem when restarting the
daemons. Incidentally, I get the same thing with a pre-release 7.4.3 from Dan.
Thanks for the speedy reply,
regards,
-ian.