
Re: [Condor-users] POOL_HISTORY_MAX_STORAGE



Hi—

In earlier Condor versions the viewhist files used to grow much larger than that (I’ve seen one reach 2 GB). But Greg is right: Condor 7.6 and later does limit the size of every viewhist file, even though some of them never grow to the full size.

It seems that someone has changed the definition of POOL_HISTORY_MAX_STORAGE: it used to be a size in kilobytes, and now it is a size in bytes, as with most other Condor variables.

 

I have 6000 cores in my pool and my condor_stats still goes back a full year, with POOL_HISTORY_MAX_STORAGE currently set to 500000000. I would think that Greg should be able to boost the value further and get more data.

 

If I had actually read the release notes correctly, would I have seen these changes mentioned?

 

Steve

 

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Greg.Hitchen@xxxxxxxx
Sent: Tuesday, July 03, 2012 9:10 PM
To: condor-users@xxxxxxxxxxx
Subject: Re: [Condor-users] POOL_HISTORY_MAX_STORAGE

 

Any chance this can get looked at so that we can store stats for > 1 month?

 

It appears that POOL_HISTORY_MAX_STORAGE applies to the whole directory and assumes that there will be 27 viewhist* files, so if all files reach their maximum size then no single file can be > 66.7 MB?
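
The arithmetic behind that guess can be checked with a short sketch. It assumes, as suggested above, that the limit is split evenly across a fixed number of files; `per_file_cap` is a hypothetical helper, not part of Condor. With the 2000000000 setting from the earlier post, 27 files would give about 74.1 MB each, whereas 30 files (the 27 listed plus the three viewhist4.*.old files that do not exist yet) gives about 66.7 MB, which is closer to the rotated sizes in the listing. This is an inference from the numbers only, not something confirmed from the Condor source.

```python
def per_file_cap(max_storage_bytes: int, n_files: int) -> int:
    """Even share of the total limit per file (assumed model, whole bytes)."""
    return max_storage_bytes // n_files

max_storage = 2_000_000_000  # setting quoted in the earlier post

for n_files in (27, 30):
    cap = per_file_cap(max_storage, n_files)
    print(f"{n_files} files -> {cap} bytes (~{cap / 1e6:.1f} MB) per file")

# Same assumed model applied to the default of 10000000 from the log message
print(per_file_cap(10_000_000, 30))
```

Under the same assumption, the default of 10000000 works out to 333333 bytes per file, which is close to the ~333 KB .old files dated 2006 in the listing.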

 

Thanks

 

Cheers

 

Greg

 

P.S. The silence was deafening after my previous post :) (below), so should I be sending this to condor-admin instead?

 

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Greg.Hitchen@xxxxxxxx
Sent: Wednesday, 4 April 2012 4:09 PM
To: condor-users@xxxxxxxxxxx
Subject: [ExternalEmail] [Condor-users] POOL_HISTORY_MAX_STORAGE

 

We keep pool history info in our Condor setup on our Condor ViewServer machine.

This is a standalone machine that collects info from 5 separate Central Managers.

 

For historical reasons we had forced Condor to represent each machine as one “resource”, i.e. NUM_CPUS=1.

 

We have recently enabled core detection and now have a total of ~10,000 cores across all 5 pools. While recently using condor_stats to produce some monthly stats, it became obvious that information is being lost, i.e. it appears we don’t have info going back far enough (> 1 month).

 

Just increasing POOL_HISTORY_MAX_STORAGE doesn’t work (it was set at 2000000000 = 2 GB and we increased it to 20000000000 = 20 GB), as we get the following error message in CollectorLog.

 

04/04/12 13:21:05 ERROR "POOL_HISTORY_MAX_STORAGE in the condor configuration is out of bounds for an integer (20000000000).  Please set it to an integer in the range -2147483648 to 2147483647 (default 10000000)." at line 1693 in file /home/condor/execute/dir_30458/userdir/src/condor_utils/condor_config.cpp
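
The bounds in that message are those of a signed 32-bit integer, so any value above 2147483647 is rejected whatever the units. A quick way to check a candidate value before editing the config (a sketch only; `fits_int32` is a hypothetical helper, not a Condor tool):

```python
INT32_MIN = -2**31      # -2147483648, lower bound quoted in the log
INT32_MAX = 2**31 - 1   #  2147483647, upper bound quoted in the log

def fits_int32(value: int) -> bool:
    """True if value lies within the range the error message reports."""
    return INT32_MIN <= value <= INT32_MAX

for candidate in (2_000_000_000, 20_000_000_000):
    print(candidate, "ok" if fits_int32(candidate) else "out of bounds")
```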

 

From what I can see our viewhistory directory is only 737 MB in size (see below). There does seem to be some forced file rotation, though, at individual file sizes of ~66.7 MB. Can anyone confirm what’s meant to happen with these storage limits, file sizes and rotations?

 

Thanks

 

Cheers

 

Greg

 

>ll
total 736780
-rw-r--r-- 1 condor condor 31434648 Apr  4 15:49 viewhist0.0.new
-rw-r--r-- 1 condor condor 66666695 Aug  9  2011 viewhist0.0.old
-rw-r--r-- 1 condor condor 41337118 Apr  4 15:37 viewhist0.1.new
-rw-r--r-- 1 condor condor   333359 Apr  6  2006 viewhist0.1.old
-rw-r--r-- 1 condor condor 10537682 Apr  4 15:21 viewhist0.2.new
-rw-r--r-- 1 condor condor   333380 Jan 19  2006 viewhist0.2.old
-rw-r--r-- 1 condor condor 42825820 Apr  4 15:49 viewhist1.0.new
-rw-r--r-- 1 condor condor 67153008 Apr  4 09:36 viewhist1.0.old
-rw-r--r-- 1 condor condor 27489661 Apr  4 15:37 viewhist1.1.new
-rw-r--r-- 1 condor condor 66981236 Apr  3 23:24 viewhist1.1.old
-rw-r--r-- 1 condor condor 35099274 Apr  4 15:21 viewhist1.2.new
-rw-r--r-- 1 condor condor 66884552 Mar 31 17:43 viewhist1.2.old
-rw-r--r-- 1 condor condor  1208195 Apr  4 15:49 viewhist2.0.new
-rw-r--r-- 1 condor condor 66666869 Mar 27 18:51 viewhist2.0.old
-rw-r--r-- 1 condor condor  1889444 Apr  4 15:37 viewhist2.1.new
-rw-r--r-- 1 condor condor 66666889 Feb 15 05:28 viewhist2.1.old
-rw-r--r-- 1 condor condor 17508889 Apr  4 15:21 viewhist2.2.new
-rw-r--r-- 1 condor condor   333437 Mar  7  2006 viewhist2.2.old
-rw-r--r-- 1 condor condor 41038505 Apr  4 15:49 viewhist3.0.new
-rw-r--r-- 1 condor condor 66666970 Nov  4  2010 viewhist3.0.old
-rw-r--r-- 1 condor condor 27010376 Apr  4 15:37 viewhist3.1.new
-rw-r--r-- 1 condor condor   333372 Mar 15  2006 viewhist3.1.old
-rw-r--r-- 1 condor condor  6818397 Apr  4 15:21 viewhist3.2.new
-rw-r--r-- 1 condor condor   333371 Mar 13  2006 viewhist3.2.old
-rw-r--r-- 1 condor condor        0 Sep 12  2005 viewhist4.0.new
-rw-r--r-- 1 condor condor        0 Sep 12  2005 viewhist4.1.new
-rw-r--r-- 1 condor condor        0 Sep 12  2005 viewhist4.2.new
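
To total the directory and see at what size each .old file was rotated, the listing can be parsed with a few lines of Python. This is only a sketch: `parse_listing` is a hypothetical helper, it assumes the `ls -l` column layout shown above (size in the fifth field, name in the last), and the sample includes just four of the lines.

```python
SAMPLE = """\
-rw-r--r-- 1 condor condor 31434648 Apr  4 15:49 viewhist0.0.new
-rw-r--r-- 1 condor condor 66666695 Aug  9  2011 viewhist0.0.old
-rw-r--r-- 1 condor condor   333359 Apr  6  2006 viewhist0.1.old
-rw-r--r-- 1 condor condor        0 Sep 12  2005 viewhist4.0.new
"""

def parse_listing(text: str) -> dict:
    """Map file name -> size in bytes for each `ls -l` style line."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the "total" line and anything that is not a regular file
        if len(fields) < 9 or not fields[0].startswith("-"):
            continue
        sizes[fields[-1]] = int(fields[4])
    return sizes

sizes = parse_listing(SAMPLE)
print("total bytes:", sum(sizes.values()))
for name, size in sorted(sizes.items()):
    if name.endswith(".old"):
        print(f"{name}: rotated at {size} bytes")
```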