
Re: [Condor-devel] Thoughts on decreasing shadow memory use



Hi Brian -

Thanks for the nice investigation!  Good work.

Curious - do you and/or Burt (aka CMS) have a goal in mind in terms of the number of jobs being managed per machine?  It is pretty cheap to max out the RAM in submit nodes, and even inexpensive 1U machines can hold 24GB, allowing them to host more than 25k shadows, right?  I guess with a multi-shadow you could go north of 100k shadows on the same inexpensive server (assuming some other bottleneck doesn't prevent that)... but would you want to?

thanks
Todd

On 7/18/2012 4:38 PM, Brian Bockelman wrote:

On Jul 17, 2012, at 6:37 PM, Brian Bockelman wrote:

Hi,

When I last talked to Miron about multi-shadow, he suggested first wringing every last byte out of the current one before even proposing the multi-shadow.  So, I spent about an hour with igprof and staring at smaps.

I measured a shadow as having 360KB of heap, about 550KB total unshared space, and 274KB of data live on the heap (so about 25% waste due to fragmentation).
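
For anyone who wants to reproduce the unshared-space measurement, here's a rough sketch (a throwaway, not a polished tool) that totals the Private_Dirty fields from a shadow's /proc/<pid>/smaps:

// smaps_dirty.cpp: sum the Private_Dirty fields of a process's smaps.
// Build: g++ -o smaps_dirty smaps_dirty.cpp    Run: ./smaps_dirty <pid>
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>

int main(int argc, char **argv) {
    if (argc != 2) {
        std::fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    std::ostringstream path;
    path << "/proc/" << argv[1] << "/smaps";
    std::ifstream smaps(path.str().c_str());
    if (!smaps) {
        std::fprintf(stderr, "cannot open %s\n", path.str().c_str());
        return 1;
    }
    long total_kb = 0;
    std::string line;
    while (std::getline(smaps, line)) {
        // Each mapping reports a line like "Private_Dirty:       12 kB".
        if (line.compare(0, 14, "Private_Dirty:") == 0)
            total_kb += std::atol(line.c_str() + 14);
    }
    std::printf("Private_Dirty total: %ld kB\n", total_kb);
    return 0;
}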

Here's what I found that we could save.  List is in ascending order of difficulty to implement.
0) Turn off ClassAd caching: 55KB.
1) Copy of the job's ClassAd inside the file transfer object: 8KB
2) gethostbyaddr -> gethostbyaddr_r (including all call sites, even in the logging code!  See ExecuteEvent::writeEvent; a conversion sketch follows this list): 5KB.
3) getpwnam, getpwuid to their reentrant versions (same pattern as item 2): 2KB
4) Remove stats object from DaemonCore for shadow: 7KB
5) libcondor_utils has 156KB of dirty writable memory (non-const statics?) that can't be shared: 100KB?  This part was not included in my heap calculations, but is indeed non-shared.
6) Cleanup of auth code to reduce heap fragmentation: 5-15KB
7) Un-loading the IpVerify table after usage: 9KB.
8) The configuration subsystem.  This would be one tough nugget to crack (note: would all be shared with the multi-shadow), but is very lightly used after the shadow fires up.  70KB.
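
For items 2 and 3, the conversion is mechanical.  Here's a sketch of the pattern with made-up call sites (not the actual Condor code); the _r variants fill a caller-supplied buffer instead of pointing into a static buffer that glibc keeps allocated (and dirty) in every process:

#include <netdb.h>
#include <pwd.h>
#include <cstdio>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

// Before: struct hostent *he = gethostbyaddr(&addr, sizeof(addr), AF_INET);
static void lookup_host(const struct in_addr &addr) {
    struct hostent he, *result = 0;
    char buf[1024];  // real code should retry with a bigger buffer on ERANGE
    int herr = 0;
    if (gethostbyaddr_r(&addr, sizeof(addr), AF_INET,
                        &he, buf, sizeof(buf), &result, &herr) == 0 && result) {
        std::printf("host: %s\n", result->h_name);
    }
}

// Before: struct passwd *pw = getpwnam(name);
static void lookup_user(const char *name) {
    struct passwd pw, *result = 0;
    char buf[1024];
    if (getpwnam_r(name, &pw, buf, sizeof(buf), &result) == 0 && result) {
        std::printf("home dir: %s\n", result->pw_dir);
    }
}

int main() {
    struct in_addr loopback;
    loopback.s_addr = htonl(INADDR_LOOPBACK);
    lookup_host(loopback);
    lookup_user("root");
    return 0;
}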

Lessons learned:
- ClassAd caching does more harm than good for a single shadow (20% of heap)
- If we squeeze really hard at odds-n-ends in the heap, we can shrink the heap by 10%.  I don't think all the items listed above are plausible (especially 8).
- Non-const globals in libcondor_utils account for 25% of the total memory footprint.  There are 332 source files in libcondor_utils - whack-a-mole time?  (A const-ification sketch follows this list.)
  - Similarly, there are a few things sitting around in the other Condor libraries, but nothing as sizable.
- Obviously sharable resources for the multi-shadow (parameter subsystem, auth hash maps and tables, daemon core object) make up 50% of the heap.
- It's not immediately obvious how much the ClassAd cache will affect the multi-shadow, but I would expect a bit of sharing.  Let's estimate 50% of the current cache is sharable, or 10% of the total heap.
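
To make the whack-a-mole concrete, here's a hypothetical example of the usual fix (the names below are made up, not the real param table): a table that is never written after startup, but is declared mutable.  The only change needed is 'const':

struct ParamEntry { const char *name; const char *def; };

// Before: lands in .data; every shadow carries its own dirty copy.
ParamEntry param_table[] = {
    { "SHADOW_LOG",  "$(LOG)/ShadowLog"   },
    { "SHADOW_LOCK", "$(LOCK)/ShadowLock" },
};

// After: 'const' moves it to a read-only segment, shared and clean
// across every shadow on the machine.
// (Caveat: in a -fPIC shared library, the embedded char pointers
// still need load-time relocations, so the table lands in
// .data.rel.ro; it only becomes truly relocation-free if the strings
// are inlined as fixed-size char arrays.)
const ParamEntry param_table_const[] = {
    { "SHADOW_LOG",  "$(LOG)/ShadowLog"   },
    { "SHADOW_LOCK", "$(LOCK)/ShadowLock" },
};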

So, we can squeeze about 15% of the shadow size by continuing to shave things and turning off caching.

Assuming 10 jobs per shadow, we could realize roughly a 60% memory savings.

Both numbers become more dramatic if we can figure out who's hanging out in the data segment.

Brian

PS - all numbers have been rounded and self-consistency is limited to my ability to do mental math.

PPS - after 5 minutes with 'nm', it appears the data segment consists primarily of the parameter table.  DOH!



Hi all,

Updated estimates after playing with the shadow for more than an hour:
- All the linked libraries add up more than I originally thought (I had ignored all the small ones and just looked at the big ones in my first estimates).  With all my optimizations applied, there are 888KB of Private_Dirty memory.  I squeezed libcondor_utils's data segment down from 156KB to 96KB.  The savings are more dramatic on more recent versions of GCC (where LTO is available).  There's not much left to squeeze due to vtables, the GOT, and the PLT (see the sketch below for why those stay dirty).
- There's probably 150KB of truly unique data per shadow, once you subtract out the things I mention above.  Hence, 738KB can be thought of as "overhead".  Running 10 jobs per shadow would then cost roughly 738KB + 10 x 150KB = 2.2MB instead of 10 x 888KB = 8.9MB, about a 75% memory savings.
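
Here's a tiny illustration (hypothetical, not Condor code) of why vtables and friends resist squeezing:

// Build as a shared object: g++ -fPIC -shared -o libdemo.so demo.cpp

// Any polymorphic class gets a vtable of absolute function pointers.
// Under -fPIC those addresses are unknown until load time, so the
// vtable lands in .data.rel.ro and the dynamic linker patches it in
// each process, copy-on-writing the page.  It then shows up as
// Private_Dirty in smaps even though it is logically read-only.
struct Transfer {
    virtual ~Transfer() {}
    virtual int upload()   { return 0; }
    virtual int download() { return 0; }
};

Transfer *make_transfer() { return new Transfer; }

// By contrast, a const table of plain ints needs no relocations; it
// stays in .rodata, shared and clean:
const int retry_delays[] = { 1, 2, 4, 8, 16 };

// Check where things landed with:
//   nm -C libdemo.so | grep -i -e vtable -e retry_delays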

In all, this indicates that multi-shadow would be beneficial and that further "shadow squeezing" will have diminishing returns.

Brian




--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
Condor Project Technical Lead          1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685