[HTCondor-users] htcondor-sysview
As shown at HTCondor Week 2013, the UW CHTC and mwt2.org present:
htcondor-sysview
https://github.com/DHTC-Tools/htcondor-sysview
-Nate
SYSVIEW README
htcondor-sysview is an efficiency monitor for HTCondor pools and jobs.
05.01.2013: 1.13 release. Originally written as Mosaic Sysview by Charles
Waldman, Sarah Williams, and Rob Gardner of MWT2.org. Modified by
Rebekah Gietzel (bgietzel@xxxxxxxxxxx) to work with the UW-Madison
CHTC pool and HTCondor features including partitionable slots, multiple
pools, and submitters. Packaged by Nate Yehle (nyehle@xxxxxxxxxxx).
This 1.13 release should work with most HTCondor pool configs, including
static and partitionable slots.
The program draws a grid of the CPUs in HTCondor pools. Each CPU (core) is
one square on the grid. Nodes produce squares based on the number of CPUs
listed in their names in the nodes.list file. Jobs are displayed on a slot
basis and map 1:1 by default. When partitionable slots are used, each
square represents a slot. The color of each square indicates the status of
that core and/or node, as described below.
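
As a rough illustration of how the grid could be built from nodes.list,
the sketch below assumes a hypothetical format in which each line gives a
hostname followed by its CPU count; the actual nodes.list layout used by
htcondor-sysview (where the count may be encoded in the node name) may
differ.

    # Minimal sketch: expand a nodes.list file into one grid square per core.
    # Assumes a hypothetical "hostname ncpus" line format; the real
    # nodes.list layout may differ.
    def load_squares(path="nodes.list"):
        squares = []
        with open(path) as fh:
            for line in fh:
                fields = line.split()
                if len(fields) < 2:
                    continue  # skip blank or malformed lines
                host, ncpus = fields[0], int(fields[1])
                # one square per core, keyed by a slot-style name
                squares.extend(f"slot{i}@{host}" for i in range(1, ncpus + 1))
        return squares

    if __name__ == "__main__":
        for name in load_squares():
            print(name)
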
Red squares are slots where sysview detects that an HTCondor startd is not
running correctly.
Efficiency is computed as cputime/walltime of the job running on a slot.
Green squares are slots where efficient jobs are running.
Blue squares are slots where inefficient jobs are running.
Lighter green or blue squares are new jobs trending efficient or
inefficient, respectively. As the jobs age and the cputime/walltime ratio
stabilizes, the colors darken.
Other multicolored squares are jobs using more than 100% efficiency, such
as multicore jobs. Each is represented by only one square, showing how one
multicore job prevents other jobs from using the total number of slots.
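
The sketch below shows the efficiency calculation described above (cputime
divided by walltime) and one possible way to map it onto the colors; the
efficiency cutoff and the "young job" age used here are illustrative
values, not the ones sysview actually uses.

    # Sketch of the efficiency calculation: cputime / walltime for the job
    # on a slot. The color threshold and age cutoff are illustrative only.
    def efficiency(cputime, walltime):
        return cputime / walltime if walltime > 0 else 0.0

    def square_color(cputime, walltime, job_age,
                     efficient_cutoff=0.8, young_seconds=3600):
        eff = efficiency(cputime, walltime)
        base = "green" if eff >= efficient_cutoff else "blue"
        # young jobs get a lighter shade until the cpu/wall ratio stabilizes
        return f"light{base}" if job_age < young_seconds else base

    # Example: a job 30 minutes old that has used 25 minutes of CPU time
    print(square_color(cputime=1500, walltime=1800, job_age=1800))  # lightgreen
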
Once you have a mosaic generated from information about your cluster, try
dragging the mouse across a slot on the mosaic.
Mousing over a square shows the slot name, user, online/down status,
RSS/VM memory, CPU time, and efficiency for the current job on that slot.
We use this as an easy way to spot down nodes or low-efficiency jobs that
are wasting slots. Clicking a slot takes you to the full dump of
condor_q -l for the job running on that core.
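
For reference, the per-job details shown on mouseover and click can be
pulled from the same long-form condor_q -l output. The sketch below uses
standard job ClassAd attributes (Owner, RemoteUserCpu, RemoteWallClockTime,
ResidentSetSize) with deliberately simple parsing; it is not the code
sysview itself uses, and the job id is hypothetical.

    # Sketch: fetch a job's ClassAd via condor_q -l and print a few of the
    # attributes that sysview surfaces on mouseover.
    import subprocess

    def job_ad(job_id):
        out = subprocess.run(["condor_q", "-l", job_id],
                             capture_output=True, text=True, check=True).stdout
        ad = {}
        for line in out.splitlines():
            if " = " in line:
                key, value = line.split(" = ", 1)
                ad[key.strip()] = value.strip().strip('"')
        return ad

    if __name__ == "__main__":
        ad = job_ad("1234.0")  # hypothetical cluster.proc id
        for attr in ("Owner", "RemoteUserCpu", "RemoteWallClockTime",
                     "ResidentSetSize"):
            print(attr, "=", ad.get(attr, "n/a"))
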