Hi,

We have a test case which takes between 20 and 30 minutes to complete locally, but around 50 minutes to finish when run as a Condor job. We do not see any problem in the job log's "Partitionable Resources : Usage  Request  Allocated" table, but condor_q displays a huge SIZE for the job: 17089.8 MB.
The condor_q manual (http://research.cs.wisc.edu/htcondor/manual/current/condor_q.html) gives the definition of SIZE, i.e. it should come from MemoryUsage (if defined) or from ImageSize otherwise:

    SIZE  (Non-batch mode only) The peak amount of memory in Mbytes consumed by the job; note this value is only refreshed periodically. The actual value reported is taken from the job ClassAd attribute MemoryUsage if this attribute is defined, and from job attribute ImageSize otherwise.

condor_q shows these attributes for the job:

    MemoryUsage = ( ( ResidentSetSize + 1023 ) / 1024 )
    ImageSize = 17500000
    ImageSize_RAW = 15226024
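If I understand the units correctly (ImageSize seems to be reported in KiB), the SIZE column looks like it is simply ImageSize converted to MB:

    17500000 KiB / 1024 ≈ 17089.8 MB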
Apparently, SIZE matches the ImageSize attribute of this job. So why does this job have such a huge ImageSize?

Based on the manual (http://research.cs.wisc.edu/htcondor/manual/v7.6/7_3Running_Condor.html#SECTION008310000000000000000), I added

    Requirements = Memory > 2100

to the submit file, but after this change the job takes more than 6 hours to complete.

I hope someone can answer some of my questions or give me some hints about what is going on:

1. Why is this Condor job's run time always about twice the run time on the local machine?
2. How is SIZE calculated?
3. Why does the simple addition of "Requirements = Memory > 2100" affect the run time so dramatically?

Thank you for your time and help in advance,
Zhuo
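P.S. In case it is useful, this is roughly what the submit file looks like after the change; everything except the Requirements line is a placeholder rather than the actual test:

    # minimal sketch of the submit description file (executable/log names are placeholders)
    executable   = run_test.sh
    log          = test.log
    output       = test.out
    error        = test.err
    # the only line added relative to the original submit file:
    Requirements = Memory > 2100
    queue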