[HTCondor-users] Memory requests increasing
- Date: Thu, 21 Feb 2013 15:47:06 +0000
- From: "Rochford, Steve" <s.rochford@xxxxxxxxxxxxxx>
- Subject: [HTCondor-users] Memory requests increasing
We're running Condor 7.8.2 and seeing that some jobs never complete. The log file below is from a job using Abaqus. I submit the job via Condor and it gets picked up by a machine. Provided that no one reboots the machine, the file gets processed in about 3 hours on a machine with 4GB of RAM. There's a lot of swapping to disk, but it all works.
I'm not sure I understand what the log below is telling me. The final lines are easy: the user aborted the job because nothing had happened. But is there anything significant about the increasing "ResidentSetSize"?
Steve
000 (1299.000.000) 02/17 13:56:02 Job submitted from host: <155.198.30.249:58189>
...
001 (1299.000.000) 02/17 13:57:14 Job executing on host: <155.198.72.65:50149>
...
006 (1299.000.000) 02/17 13:57:23 Image size of job updated: 1
1 - MemoryUsage of job (MB)
128 - ResidentSetSize of job (KB)
...
006 (1299.000.000) 02/17 14:02:25 Image size of job updated: 2582400
2522 - MemoryUsage of job (MB)
2582400 - ResidentSetSize of job (KB)
...
006 (1299.000.000) 02/17 14:07:27 Image size of job updated: 3298708
3222 - MemoryUsage of job (MB)
3298708 - ResidentSetSize of job (KB)
...
006 (1299.000.000) 02/17 14:12:27 Image size of job updated: 3446956
3367 - MemoryUsage of job (MB)
3446956 - ResidentSetSize of job (KB)
...
006 (1299.000.000) 02/17 14:17:26 Image size of job updated: 3446972
3367 - MemoryUsage of job (MB)
3446972 - ResidentSetSize of job (KB)
...
006 (1299.000.000) 02/17 14:37:32 Image size of job updated: 3446980
3367 - MemoryUsage of job (MB)
3446980 - ResidentSetSize of job (KB)
...
009 (1299.000.000) 02/20 21:55:54 Job was aborted by the user.
via condor_rm (by user jw508)
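
For anyone who wants to look at the trend in the numbers above, here is a minimal Python sketch that pulls the 006 (image size update) events out of the log excerpt and lists them side by side. The regular expressions and the name parse_updates are illustrative assumptions and not part of HTCondor; they are written against the layout of this excerpt only. The last printed column is ResidentSetSize converted to MB and rounded up, which appears to be how the MemoryUsage figures in this log relate to it.

```python
import math
import re

# The 006 events copied from the user log above.
LOG = """\
006 (1299.000.000) 02/17 13:57:23 Image size of job updated: 1
1 - MemoryUsage of job (MB)
128 - ResidentSetSize of job (KB)
...
006 (1299.000.000) 02/17 14:02:25 Image size of job updated: 2582400
2522 - MemoryUsage of job (MB)
2582400 - ResidentSetSize of job (KB)
...
006 (1299.000.000) 02/17 14:07:27 Image size of job updated: 3298708
3222 - MemoryUsage of job (MB)
3298708 - ResidentSetSize of job (KB)
...
006 (1299.000.000) 02/17 14:12:27 Image size of job updated: 3446956
3367 - MemoryUsage of job (MB)
3446956 - ResidentSetSize of job (KB)
...
006 (1299.000.000) 02/17 14:17:26 Image size of job updated: 3446972
3367 - MemoryUsage of job (MB)
3446972 - ResidentSetSize of job (KB)
...
006 (1299.000.000) 02/17 14:37:32 Image size of job updated: 3446980
3367 - MemoryUsage of job (MB)
3446980 - ResidentSetSize of job (KB)
...
"""

# Hypothetical patterns matching the line layout seen in this excerpt.
EVENT = re.compile(r"^006 \(\S+\) (\d\d/\d\d \d\d:\d\d:\d\d) Image size of job updated: \d+")
MEM = re.compile(r"^\s*(\d+)\s+-\s+MemoryUsage of job \(MB\)")
RSS = re.compile(r"^\s*(\d+)\s+-\s+ResidentSetSize of job \(KB\)")


def parse_updates(text):
    """Return one (timestamp, memory_mb, rss_kb) tuple per 006 event."""
    updates = []
    stamp = mem = None
    for line in text.splitlines():
        m = EVENT.match(line)
        if m:
            stamp, mem = m.group(1), None
            continue
        m = MEM.match(line)
        if m and stamp is not None:
            mem = int(m.group(1))
            continue
        m = RSS.match(line)
        if m and stamp is not None and mem is not None:
            updates.append((stamp, mem, int(m.group(1))))
            stamp = mem = None
    return updates


updates = parse_updates(LOG)
for stamp, mem_mb, rss_kb in updates:
    # Last column: resident set size in MB, rounded up.
    print(stamp, mem_mb, rss_kb, math.ceil(rss_kb / 1024))
```

Read this way, the resident set size climbs quickly in the first fifteen minutes and then stays almost flat at about 3367 MB from 14:12 onwards, on a machine with 4GB of RAM.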