HTCondor Project List Archives




[Condor-devel] Thoughts on decreasing shadow memory use



Hi,

When I last talked to Miron about multi-shadow, he suggested first wringing every last byte out of the current shadow before even proposing the multi-shadow.  So I spent about an hour running igprof and staring at smaps output.

I measured a shadow as having 360KB of heap, about 550KB total unshared space, and 274KB of data live on the heap (so about 25% waste due to fragmentation).
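For anyone wanting to repeat the smaps part of the measurement, here is a minimal sketch of how the per-process totals can be tallied.  It assumes the usual Linux /proc/<pid>/smaps layout ("Key: value kB" lines under each mapping header); the SmapsTotals struct and sum_smaps function are hypothetical names for illustration, not anything in the Condor tree.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Totals in kB, summed over every mapping in an smaps listing.
// (Hypothetical helper type, for illustration only.)
struct SmapsTotals {
    long private_dirty_kb = 0;
    long private_clean_kb = 0;
    long shared_kb = 0;  // Shared_Clean + Shared_Dirty
};

// Sum the private and shared counters from smaps-formatted text.
SmapsTotals sum_smaps(std::istream &in) {
    SmapsTotals totals;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string key;
        long kb = 0;
        // Mapping headers and VmFlags lines have no numeric second field,
        // so the extraction fails and they are skipped.
        if (!(fields >> key >> kb)) continue;
        if (key == "Private_Dirty:") {
            totals.private_dirty_kb += kb;
        } else if (key == "Private_Clean:") {
            totals.private_clean_kb += kb;
        } else if (key == "Shared_Clean:" || key == "Shared_Dirty:") {
            totals.shared_kb += kb;
        }
    }
    return totals;
}
```

To use it on a live process, open /proc/<pid>/smaps with a std::ifstream and pass the stream in.  Private_Dirty is the figure that matters most for the unshared footprint, since those pages cannot be shared with other shadows.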

Here's what I found that we could save.  The list is in ascending order of difficulty to implement.
0) Turn off classad caching: 55KB.
1) Copy of job's classad inside the file transfer object: 8KB 
2) gethostbyaddr -> gethostbyaddr_r (including all callsites, even in the logging code!  See ExecuteEvent::writeEvent): 5KB.
3) getpwnam, getpwuid to reentrant versions: 2KB
4) Remove stats object from DaemonCore for shadow: 7KB
5) libcondor_utils has 156KB of dirty writable memory (non-const statics?) that can't be shared: 100KB?  This part was not included in my heap calculations, but is indeed non-shared.
6) Cleanup of auth code to reduce heap fragmentation: 5-15KB
7) Un-loading the IpVerify table after usage: 9KB.
8) The configuration subsystem.  This would be one tough nugget to crack (note: would all be shared with the multi-shadow), but is very lightly used after the shadow fires up.  70KB.
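As an illustration of items 2 and 3, here is a rough sketch of a lookup converted to the reentrant form.  The idea, as I understand it, is that the non-reentrant calls return pointers into storage that libc keeps for the life of the process, while the _r variants write into a caller-supplied buffer that is released as soon as the caller is done.  The helper name home_dir_for_uid is made up for this example and is not an existing Condor function; getpwuid_r and sysconf are the standard POSIX calls.

```cpp
#include <cerrno>
#include <optional>
#include <pwd.h>
#include <string>
#include <sys/types.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: look up a user's home directory with the
// reentrant getpwuid_r, so the scratch buffer lives only for this call.
std::optional<std::string> home_dir_for_uid(uid_t uid) {
    // sysconf may return -1 when there is no fixed limit; fall back to 1 KB.
    long hint = sysconf(_SC_GETPW_R_SIZE_MAX);
    std::vector<char> buf(hint > 0 ? static_cast<size_t>(hint) : 1024);

    struct passwd pw;
    struct passwd *result = nullptr;
    int rc;
    // ERANGE means the buffer was too small; grow it and retry.
    while ((rc = getpwuid_r(uid, &pw, buf.data(), buf.size(), &result)) == ERANGE) {
        buf.resize(buf.size() * 2);
    }
    if (rc != 0 || result == nullptr) {
        return std::nullopt;  // lookup error or no such user
    }
    // Copy out what we need; buf is freed when this function returns.
    return std::string(result->pw_dir);
}
```

The same pattern would apply to getpwnam_r.  gethostbyaddr_r is a glibc extension with a longer argument list, so each callsite needs a bit more care there.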

Lessons learned:
- Classad caching does more harm than good for a single shadow (20% of heap)
- If we squeeze really hard at odds-n-ends in the heap, we can shrink the heap by 10%.  I don't think all the items listed above are plausible (especially 8).
- Non-const globals in libcondor_utils make up about 25% of the total memory footprint.  There are 332 source files in libcondor_utils - whack-a-mole time?
  - Similarly, there are a few things sitting around in the other Condor libraries, but nothing as sizable.
- Obviously sharable resources for the multi-shadow (parameter subsystem, auth hash maps and tables, daemon core object) make up 50% of the heap.
- It's not immediately obvious how much the ClassAd cache will affect the multi-shadow, but I would expect a bit of sharing.  Let's estimate 50% of the current cache is sharable, or 10% of the total heap.

So, we can squeeze about 15% out of the shadow size by continuing to shave things and turning off caching.

Assuming 10 jobs per shadow, we could realize a 60% memory savings per job.

Both numbers become more dramatic if we can figure out who's hanging out in the data segment.

Brian

PS - all numbers have been rounded and self-consistency is limited to my ability to do mental math.

PPS - after 5 minutes with 'nm', it appears the data segment consists primarily of the parameter table.  DOH!