Re: [Condor-users] MRTG/HotSaNIC monitoring of Condor pools
- Date: Thu, 26 Oct 2006 15:30:42 +0200
- From: "Alexandru IOSUP" <A.Iosup@xxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] MRTG/HotSaNIC monitoring of Condor pools
Hello,
The Condor Team has developed the Condor View tool [
http://condor-view.cs.wisc.edu/condor-view-applet/ ], which displays
statistics about:
+ machine utilization/status, for the last hour/day/week/month, and per
month for the last year
+ number of idle/running jobs per user, and peaks of this metric, for the
last hour/day/week/month, and per month for the last year
The Condor View set of tools should come with the Condor distribution; if
not, you can ask the Condor Team for help.
We (Parallel Systems Group/TU Delft) have developed a set of Python scripts
that analyze various logs, including Condor Schedd logs, and output the
following statistics:
+ system utilization over time
+ job arrival rate during hourly intervals
+ CDFs of the most important job characteristics (run time, wait time,
memory consumption, etc.)
+ number of submitted jobs and consumed CPU time per user/group, total and
per week
+ number of running/waiting jobs during hourly intervals
+ system throughput during hourly intervals
+ overall system metrics: A(W)WT, A(W)RT, A(W)SD, average (wait) time
deviation A(W)TD
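
To make a few of the statistics above concrete, here is a minimal Python sketch. It is not taken from our scripts: the record layout (submit, start, and end times as Unix timestamps), the sample values, and all function names are assumptions for illustration only. It computes the job arrival count per hourly interval, an empirical CDF of a job characteristic, and the plain (unweighted) average wait time.

```python
from collections import Counter

# Hypothetical job records: (submit_time, start_time, end_time), Unix seconds.
# A real trace would be parsed from log files; this layout is an assumption.
JOBS = [
    (0, 30, 630),
    (600, 900, 1500),
    (3700, 3700, 7300),
    (4000, 4600, 5200),
]

HOUR = 3600


def hourly_arrivals(jobs):
    """Count submitted jobs per hourly interval, keyed by the interval start."""
    counts = Counter((submit // HOUR) * HOUR for submit, _start, _end in jobs)
    return dict(sorted(counts.items()))


def empirical_cdf(values):
    """Return (value, fraction of samples <= value) pairs in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    cdf = {}
    for rank, value in enumerate(ordered, start=1):
        # Later duplicates overwrite earlier ones, so each value keeps
        # the fraction that includes all samples equal to it.
        cdf[value] = rank / n
    return list(cdf.items())


def average_wait_time(jobs):
    """Mean time between submission and start, in seconds (unweighted)."""
    return sum(start - submit for submit, start, _end in jobs) / len(jobs)


wait_times = [start - submit for submit, start, _end in JOBS]
run_times = [end - start for _submit, start, end in JOBS]

print("arrivals per hour:", hourly_arrivals(JOBS))
print("wait time CDF:", empirical_cdf(wait_times))
print("run time CDF:", empirical_cdf(run_times))
print("average wait time:", average_wait_time(JOBS))
```

The same `empirical_cdf` helper can be reused for any per-job characteristic (run time, wait time, memory consumption) once it has been extracted into a list of numbers.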
An article describing the application of these scripts to (full/partial)
traces from Grid3, LCG, TeraGrid, DAS-2 can be found at [
http://pds.twi.tudelft.nl/reports/2006/PDS-2006-003/PDS-2006-003.pdf ]: A.
Iosup, C. Dumitrescu, D.H.J. Epema, H. Li, L. Wolters, How are Real Grids
Used? The Analysis of Four Grid Traces and Its Implications, The 7th
IEEE/ACM International Conference on Grid Computing (Grid), Barcelona,
September 28-29, 2006.
We have also applied our tools to Condor Schedd logs from the GLOW
environment [ http://www.cs.wisc.edu/condor/glow/ ], and a technical report
is currently under way.
All our scripts are available upon request.
You may also check the available Condor Tools [
http://www.cs.wisc.edu/condor/tools/ ].
Best regards,
Alexandru
----- Original Message -----
From: "Heiko Reese" <Heiko.Reese@xxxxxxxxxxxxxxxxxxx>
To: "Condor-Users Mail List" <condor-users@xxxxxxxxxxx>
Sent: Thursday, October 26, 2006 3:02 PM
Subject: Re: [Condor-users] MRTG/HotSaNIC monitoring of Condor pools
Fabricio Chalub Barbosa do Rosário wrote:
Hello,
I am thinking about writing some HotSaNIC and MRTG scripts to monitor our
Condor pool; gather statistics such as how many machines are owned,
unclaimed, etc. If anyone here has anything already done, I would love
to take a look!
I recently started to monitor our Condor pools with munin
(http://munin.sf.net). I wrote three scripts to visualize
- the queue (by state and by universe)
- age of the submitted jobs and
- state of the pool or part of it (selectable by regexp).
If somebody is interested, I could put the scripts into a "releasable
state" (add code comments and documentation) and post them here.
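
For readers unfamiliar with munin plugins, a queue-by-state plugin could look roughly like the Python sketch below. This is not one of the scripts mentioned above, and every function name in it is hypothetical. It assumes the job states have already been collected as integer JobStatus codes (for example, one per line from something like `condor_q -format "%d\n" JobStatus`, which is itself an assumption about the local setup). A munin plugin prints graph metadata when called with the `config` argument and `<field>.value <number>` lines otherwise; the two functions below produce those two outputs.

```python
from collections import Counter

# JobStatus codes as used in Condor job ClassAds; anything else is
# reported under "other".
JOB_STATES = {1: "idle", 2: "running", 3: "removed", 4: "completed", 5: "held"}
FIELDS = list(JOB_STATES.values()) + ["other"]


def count_states(status_codes):
    """Map a list of integer JobStatus codes to a count per state name."""
    counts = Counter(JOB_STATES.get(code, "other") for code in status_codes)
    # Always report every field so the munin graph keeps a stable set of lines.
    return {name: counts.get(name, 0) for name in FIELDS}


def munin_config():
    """Text a munin plugin prints when invoked with the 'config' argument."""
    lines = [
        "graph_title Condor queue by state",
        "graph_vlabel jobs",
        "graph_category condor",
    ]
    lines += [f"{name}.label {name}" for name in FIELDS]
    return "\n".join(lines)


def munin_values(status_codes):
    """Text a munin plugin prints on a normal (non-config) invocation."""
    counts = count_states(status_codes)
    return "\n".join(f"{name}.value {n}" for name, n in counts.items())


# Hypothetical sample: two idle jobs, one running job, one held job.
sample_codes = [1, 1, 2, 5]
print(munin_config())
print(munin_values(sample_codes))
```

In a real plugin, the script would check `sys.argv` for `config` and obtain the status codes by running the queue query itself instead of using a hard-coded sample.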
Cheers,
Heiko