[Condor-users] CondorLoadAvg and memory information on RHEL3 versus FC4
- Date: Wed, 3 May 2006 07:27:24 -0700 (PDT)
- From: Jeff Mausolf <jeff_mausolf@xxxxxxxxx>
- Subject: [Condor-users] CondorLoadAvg and memory information on RHEL3 versus FC4
We have a mix of RHEL 3 and FC4 machines in our pool
and are experiencing problems with jobs being suspended
and then evicted on the Fedora Core 4 machines, which run
the latest stable release, 6.6.11. The CondorLoadAvg and
memory information reported there do not appear to be
accurate:
VirtualMemory = 1073741824
Memory = 3
TotalVirtualMemory = 2147483647
...
CondorLoadAvg = 0.000000
The inaccurate memory-related data may be due to meminfo
format differences between the platforms. We've tested
this with the development release 6.7.18 for FC4, and
there the meminfo data accurately represents what is on
the machines:
VirtualMemory = 3144718
Memory = 1009
TotalVirtualMemory = 6289436
TotalMemory = 2019
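To cross-check values like these against what the kernel itself reports, a minimal sketch follows. It assumes the `Key: value kB` line layout of /proc/meminfo and skips any multi-column summary rows, such as those some older kernels print at the top of the file. The function names are hypothetical, the sample figure in the usage note is made up to match the 2019 MB above, and this is not a description of how Condor reads the file.

```python
def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integer kB values."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # no colon, so not a key/value line
        fields = rest.split()
        # Keep only lines with one numeric value and an optional unit.
        # Header rows and multi-column summary rows are skipped.
        if not fields or len(fields) > 2 or not fields[0].isdigit():
            continue
        info[key.strip()] = int(fields[0])
    return info


def memory_mb(info):
    """Total physical memory in whole megabytes."""
    return info["MemTotal"] // 1024


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

For example, `memory_mb(parse_meminfo("MemTotal: 2067456 kB"))` gives 2019, which can then be compared with the TotalMemory value in the machine ad.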
The CondorLoadAvg appears to work intermittently on
some of the FC4 machines. Here is the load from our
jobs:
vm1@hoeplx144  LINUX  INTEL  Claimed  Busy  1.000  1009  0+00:14:33
vm2@hoeplx144  LINUX  INTEL  Claimed  Busy  1.020  1009  0+00:14:43
vm1@hoeplx144  LINUX  INTEL  Claimed  Busy  1.000  1009  0+00:14:16
vm2@hoeplx144  LINUX  INTEL  Claimed  Busy  1.200  1009  0+00:14:05
vm1@hoeplx144  LINUX  INTEL  Claimed  Busy  1.000  1009  0+00:14:21
vm2@hoeplx144  LINUX  INTEL  Claimed  Busy  1.400  1009  0+00:14:09
vm1@hoeplx144  LINUX  INTEL  Claimed  Busy  1.050  1009  0+00:10:22
vm2@hoeplx144  LINUX  INTEL  Claimed  Busy  1.030  1009  0+00:14:32
vm1@hoeplx145  LINUX  INTEL  Claimed  Busy  1.150  1009  0+00:14:32
vm2@hoeplx145  LINUX  INTEL  Claimed  Busy  1.140  1009  0+00:14:18
vm1@hoeplx145  LINUX  INTEL  Claimed  Busy  1.050  1009  0+00:10:29
vm2@hoeplx145  LINUX  INTEL  Claimed  Busy  1.050  1009  0+00:14:14
vm1@hoeplx145  LINUX  INTEL  Claimed  Busy  1.000  1009  0+00:14:34
vm2@hoeplx145  LINUX  INTEL  Claimed  Busy  1.630  1009  0+00:14:29
vm1@hoeplx145  LINUX  INTEL  Claimed  Busy  1.000  1009  0+00:14:23
vm2@hoeplx145  LINUX  INTEL  Claimed  Busy  1.540  1009  0+00:14:07
Here are the CondorLoadAvg and TotalCondorLoadAvg values
for the same timeframe. These are dedicated machines that
run only our jobs.
TotalCondorLoadAvg = 0.010000
CondorLoadAvg = 0.000000
TotalCondorLoadAvg = 0.010000
CondorLoadAvg = 0.000000
TotalCondorLoadAvg = 0.010000
CondorLoadAvg = 0.000000
TotalCondorLoadAvg = 0.010000
CondorLoadAvg = 0.000000
TotalCondorLoadAvg = 0.010000
CondorLoadAvg = 0.000000
TotalCondorLoadAvg = 0.010000
CondorLoadAvg = 1.030000
TotalCondorLoadAvg = 2.070000
CondorLoadAvg = 1.030000
TotalCondorLoadAvg = 2.070000
CondorLoadAvg = 1.140000
TotalCondorLoadAvg = 2.280000
CondorLoadAvg = 1.140000
TotalCondorLoadAvg = 2.280000
CondorLoadAvg = 1.040000
TotalCondorLoadAvg = 2.090000
CondorLoadAvg = 1.050000
TotalCondorLoadAvg = 2.090000
CondorLoadAvg = 0.000000
TotalCondorLoadAvg = 0.010000
CondorLoadAvg = 0.000000
TotalCondorLoadAvg = 0.010000
CondorLoadAvg = 0.000000
TotalCondorLoadAvg = 0.010000
CondorLoadAvg = 0.000000
TotalCondorLoadAvg = 0.010000
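The mismatch between the two listings above can be checked mechanically. The sketch below parses long-format machine ads, i.e. `Attr = value` lines with a blank line between ads (as printed by `condor_status -l`), and flags slots that are Claimed/Busy while almost none of their LoadAvg is credited to CondorLoadAvg. The function names and the 0.5 threshold are illustrative assumptions, not Condor defaults taken from this pool's configuration.

```python
def parse_ads(text):
    """Split long-format ClassAd text into a list of attribute dicts."""
    ads, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current ad.
            if current:
                ads.append(current)
                current = {}
            continue
        name, sep, value = line.partition(" = ")
        if sep:
            current[name] = value.strip().strip('"')
    if current:
        ads.append(current)
    return ads


def non_condor_load(ad):
    """Load that the ad does not attribute to Condor jobs."""
    return float(ad.get("LoadAvg", 0)) - float(ad.get("CondorLoadAvg", 0))


def inconsistent_slots(ads, threshold=0.5):
    """Names of busy slots whose load is not credited to Condor.

    The threshold is an assumed illustrative value.
    """
    flagged = []
    for ad in ads:
        busy = ad.get("State") == "Claimed" and ad.get("Activity") == "Busy"
        if busy and non_condor_load(ad) >= threshold:
            flagged.append(ad.get("Name", "?"))
    return flagged
```

On a dedicated machine, a flagged slot means the load generated by the job is being counted as non-Condor load, which is relevant if the suspension policy keys off that difference.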
Are there any suggestions on how to resolve the
inconsistency we are seeing with CondorLoadAvg? Or is
there something else we should investigate that could be
causing our jobs to be suspended and then evicted?
Thanks, Jeff