Re: [Condor-users] memory leak in Condor 7.4.2 schedd ???
- Date: Thu, 24 Jun 2010 10:31:52 +0100
- From: "Smith, Ian" <I.C.Smith@xxxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] memory leak in Condor 7.4.2 schedd ???
Hi Dan,
I've copied this here:
http://pcwww.liv.ac.uk/~smithic/core.17281.Z
It's about 500 MB, so I'm not sure how much luck you will have
downloading it. As I write, the scheduler is using a stonking
1700 MB and we have only one job in the queue!
regards,
-ian.
> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Dan Bradley
> Sent: 23 June 2010 15:12
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] memory leak in Condor 7.4.2 schedd ???
>
> Ian,
>
> We might be able to tell where the problem is by looking at a core file
> from the bloated schedd process. One way to generate one is this:
>
> gdb -p <PID of schedd>
> (gdb) gcore
> (gdb) quit
>
> It will write the core file into your current working directory, so make
> sure there is enough space. Also, it will take some time (a minute or
> two, I imagine), during which the schedd will be unresponsive.
>
> --Dan
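
The space check mentioned above can be scripted before attaching gdb. What follows is only a rough sketch, not part of Condor or gdb: the helper names and the 10% headroom factor are assumptions for illustration, and the process size has to be supplied by hand (for example the SIZE column that top reports). The gdb options used (-p, -batch, -ex) are standard, but the command is only built here, not run.

```python
import shutil


def has_room_for_core(directory, process_size_bytes, headroom=1.1):
    """Return True if `directory` has enough free space for a core file.

    `headroom` is an assumed safety margin, not a gdb requirement.
    """
    free_bytes = shutil.disk_usage(directory).free
    return free_bytes >= process_size_bytes * headroom


def gcore_command(pid, core_path):
    """Build a non-interactive equivalent of: gdb -p PID, gcore, quit."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", f"gcore {core_path}"]


# Values taken from this thread: PID 17281, SIZE 1043M as shown by top.
size_bytes = 1043 * 1024 * 1024
print(has_room_for_core(".", size_bytes))
print(gcore_command(17281, "core.17281"))
```

If the check passes, the printed command list could be handed to subprocess.run; the schedd would still be unresponsive while the core is written.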
>
> Smith, Ian wrote:
> > Apologies for the rather long running thread but I've just now seen a
> > repeat of the excessive schedd memory usage described earlier.
> >
> > Running top
> >
> > PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
> >
> > 17281 root 1 59 0 1043M 1038M sleep 55:11 0.50% condor_schedd
> >
> > and pmap:
> >
> > 17281: condor_schedd -f
> > 00010000 6752K r-x-- /opt1/condor_7.4.3/sbin/condor_schedd
> > 006B6000 536K rwx-- /opt1/condor_7.4.3/sbin/condor_schedd
> > 0073C000 784K rwx-- [ heap ]
> > 00800000 1056768K rwx-- [ heap ]
> > FEF00000 608K r-x-- /lib/libm.so.2
> > FEFA6000 24K rwx-- /lib/libm.so.2
> > FF000000 1216K r-x-- /lib/libc.so.1
> > FF130000 40K rwx-- /lib/libc.so.1
> > FF13A000 8K rwx-- /lib/libc.so.1
> > FF160000 64K rwx-- [ anon ]
> > FF180000 584K r-x-- /lib/libnsl.so.1
> > FF222000 40K rwx-- /lib/libnsl.so.1
> > FF22C000 24K rwx-- /lib/libnsl.so.1
> > FF240000 64K rwx-- [ anon ]
> > FF260000 64K rwx-- [ anon ]
> > FF280000 16K r-x-- /lib/libm.so.1
> > FF292000 8K rwx-- /lib/libm.so.1
> > FF2A0000 240K r-x-- /lib/libresolv.so.2
> > FF2E0000 24K rwx-- [ anon ]
> > FF2EC000 16K rwx-- /lib/libresolv.so.2
> > FF300000 48K r-x-- /lib/libsocket.so.1
> > FF310000 8K rwx-- [ anon ]
> > FF31C000 8K rwx-- /lib/libsocket.so.1
> > FF320000 128K r-x-- /lib/libelf.so.1
> > FF340000 8K rwx-- /lib/libelf.so.1
> > FF350000 8K rwx-- [ anon ]
> > FF360000 8K r-x-- /lib/libkstat.so.1
> > FF372000 8K rwx-- /lib/libkstat.so.1
> > FF380000 8K r-x-- /lib/libdl.so.1
> > FF38E000 8K rwxs- [ anon ]
> > FF392000 8K rwx-- /lib/libdl.so.1
> > FF3A0000 8K r-x-- /platform/sun4u-us3/lib/libc_psr.so.1
> > FF3B0000 208K r-x-- /lib/ld.so.1
> > FF3F0000 8K r--s- dev:32,12 ino:70306
> > FF3F4000 8K rwx-- /lib/ld.so.1
> > FF3F6000 8K rwx-- /lib/ld.so.1
> > FFBEC000 80K rwx-- [ stack ]
> > total 1068448K
> >
> > So it does look to me as though around 1 GB of heap is allocated to the
> > schedd. Currently I have 889 jobs in total, 450 idle and 439 running,
> > which seems pretty modest.
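
The heap figure can be checked by summing the pmap segments per mapping name. This is a minimal sketch that assumes the Solaris-style pmap layout shown above (address, size in K, permissions, name); the function name and the regular expression are illustrative, and only a few lines of the listing are reproduced as sample input.

```python
import re
from collections import defaultdict

# address, size in K, permissions, mapping name (may contain spaces)
PMAP_LINE = re.compile(r"^([0-9A-Fa-f]+)\s+(\d+)K\s+(\S+)\s+(.+)$")


def sum_mappings(pmap_text):
    """Return {mapping name: total size in KB} for pmap-style text."""
    totals = defaultdict(int)
    for raw in pmap_text.splitlines():
        # Drop any mail quoting ("> > ") in front of the pmap line.
        line = raw.lstrip("> ").rstrip()
        match = PMAP_LINE.match(line)
        if match:  # header and "total" lines do not match and are skipped
            totals[match.group(4).strip()] += int(match.group(2))
    return dict(totals)


SAMPLE = """\
17281: condor_schedd -f
00010000 6752K r-x-- /opt1/condor_7.4.3/sbin/condor_schedd
006B6000 536K rwx-- /opt1/condor_7.4.3/sbin/condor_schedd
0073C000 784K rwx-- [ heap ]
00800000 1056768K rwx-- [ heap ]
FF000000 1216K r-x-- /lib/libc.so.1
FFBEC000 80K rwx-- [ stack ]
total 1068448K
"""

totals = sum_mappings(SAMPLE)
heap_kb = totals["[ heap ]"]
print(f"heap: {heap_kb} KB ({heap_kb / 1024:.0f} MB)")
```

The two heap segments add up to 1057552K of the 1068448K total, so nearly all of the process size is heap.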
> >
> > regards,
> >
> > -ian.
> >
> >
> >> Is that virtual, resident or private memory usage?
> >>
> >> Output of,
> >>
> >> top -n1 -b -p $(pidof condor_schedd)
> >> pmap -d $(pidof condor_schedd)
> >>
> >> ?
> >>
> >> FYI, Condor uses string interning to minimize the memory footprint of
> >> jobs (all classads actually), but, iirc, does not always garbage collect
> >> the string pool. If you have a lot of jobs passing through your Schedd,
> >> say with large unique Environments, you could certainly see memory usage
> >> increase. Then of course there could just be a memory leak.
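
To illustrate the effect described above, here is a toy interning pool that never evicts entries. It is not Condor's implementation, only a sketch of the general technique; the class name and the attribute strings are made up for the example.

```python
class StringPool:
    """Toy string-interning pool with no garbage collection."""

    def __init__(self):
        self._pool = {}

    def intern(self, text):
        # Return the stored copy if present, otherwise store this one.
        return self._pool.setdefault(text, text)

    def __len__(self):
        return len(self._pool)


pool = StringPool()
for job_id in range(1000):
    pool.intern("Universe=vanilla")              # shared value: stored once
    pool.intern(f"Environment=RUN_ID={job_id}")  # unique per job: one entry each

print(len(pool))  # 1001
```

Shared values cost one entry however many jobs use them, while every unique string adds an entry that stays in the pool after its job has left the queue.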
> >>
> >> Best,
> >>
> >>
> >> matt
> >> _______________________________________________
> >> Condor-users mailing list
> >> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> >> subject: Unsubscribe
> >> You can also unsubscribe by visiting
> >> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >>
> >> The archives can be found at:
> >> https://lists.cs.wisc.edu/archive/condor-users/
> >>