HTCondor Project List Archives




Re: [Condor-devel] XML/JSON formatting for condor_q output



On Friday, 25 May, 2012 at 10:50 AM, Samir Cury wrote:
" -- schedd1
  <XML>
  -- schedd2
  <XML>
  -- schedd3
  <XML>
"
You can get similar output by querying the collector for schedds and then asking each schedd for its status. It's equivalent to what condor_q does, and this way you don't have to parse the non-XML separators between schedds in the output.

condor_status -xml -schedd

is what you want. 
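
A rough sketch of that two-step approach in Python, in case it helps. It assumes each schedd ad carries its name in a string attribute called "Name", that `condor_q -name <schedd> -xml` is an acceptable way to ask one schedd for its queue, and that the output is a complete XML document (older versions may need the start/end tags added first). The function names are made up for illustration.

```python
import subprocess
import xml.etree.ElementTree as ET


def schedd_names(status_xml):
    """Return the value of the "Name" attribute from every ad in the XML."""
    names = []
    for ad in ET.fromstring(status_xml).iter("c"):
        for attr in ad.findall("a"):
            if attr.get("n") == "Name":
                value = attr.find("s")
                if value is not None and value.text:
                    names.append(value.text)
    return names


def queue_xml_per_schedd():
    """Ask the collector for schedds, then query each schedd on its own."""
    status = subprocess.run(
        ["condor_status", "-schedd", "-xml"],
        capture_output=True, text=True, check=True,
    ).stdout
    queues = {}
    for name in schedd_names(status):
        # One XML document per schedd, so there are no separators to strip.
        queues[name] = subprocess.run(
            ["condor_q", "-name", name, "-xml"],
            capture_output=True, text=True, check=True,
        ).stdout
    return queues
```

Only queue_xml_per_schedd() needs a working pool; schedd_names() can be tried on any saved condor_status output.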
* Within the XML itself, the schema looks very unusual, which made it hard to parse with standard libs:

"
<classads><c> <a n="MyType"><s>Job</s></a>
"
To be more compatible with everything (more specifically XML::Simple), I ended up having to reformat the output to:

"
<classads> <c> <MyType> Job </MyType> <TargetType> Machine </TargetType>
"
In the end, I'm going to provide Parse::CondorQueue through CPAN, which can take the raw output of condor_q -global -l -xml and give people job data as reformatted XML, JSON, or a Perl data structure (hashref).
Why do you have to do this? I've been using that XML schema for…a long time and it's never given any parser I've passed it to any trouble (aside from the missing document start/end tags).

<classads></classads> -- a collection of ads
<c></c> -- a single ad, possibly in a collection
<a n="name" /> -- an attribute with a non-optional name
<s /> -- a string value
<i /> -- an integer value
<b /> -- a boolean
…etc. for the different value types
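
Because the tag set is fixed, a generic parser stays short. Here is a minimal Python sketch using only the standard library. The sample ad is invented, and the boolean form <b v="t"/> is my assumption about how the value is carried; only the string, integer, and boolean tags listed above are converted, and any other value type is passed through as raw text.

```python
import xml.etree.ElementTree as ET

# Invented sample ad; the attribute names are only examples.
SAMPLE = """<classads>
<c>
  <a n="MyType"><s>Job</s></a>
  <a n="ClusterId"><i>42</i></a>
  <a n="OnExitRemove"><b v="t"/></a>
</c>
</classads>"""


def convert(value):
    """Turn one typed value element into a native Python value."""
    if value.tag == "s":
        return value.text or ""
    if value.tag == "i":
        return int(value.text)
    if value.tag == "b":
        # Assumption: booleans are written as <b v="t"/> or <b v="f"/>.
        return value.get("v") == "t"
    return value.text  # other value types are left as raw text here


def parse_ads(xml_text):
    """Return one dict per <c> element, keyed by attribute name."""
    ads = []
    for c in ET.fromstring(xml_text).iter("c"):
        ad = {}
        for a in c.findall("a"):
            children = list(a)
            if children:
                ad[a.get("n")] = convert(children[0])
        ads.append(ad)
    return ads
```

Note that the code never mentions MyType or ClusterId: it works for any ad without knowing in advance which attributes it contains.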

Your rework isn't valid (or schema-fi-able [is that a word?]) XML because the names of the tags depend on the attributes in the ad, and that is a non-deterministic characteristic of ClassAds. You don't know a priori which attributes will be in any ad, so you can't write a schema for it. The schema that the output from condor_q uses can be described a priori, without any knowledge of the *contents* of the ad, because none of the tags depend on the contents of the specific ads being represented. Only the attributes on the tags depend on the contents of the ad.

In addition to now having tag names that are completely unknown and depend on the contents of the ad, you've also dropped the type casting tags (<s>, <i>, etc.) that tell parsers what kind of data the attribute holds. In less-forgiving languages than Perl, where not everything is a string, that's pretty important.
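
A small Python illustration of what goes missing once the type tags are dropped. The attribute names and values are invented for the example.

```python
import json

# The same ad with types kept, and with every value flattened to text.
typed_ad = {"ClusterId": 42, "OnExitRemove": True}
flat_ad = {"ClusterId": "42", "OnExitRemove": "true"}

typed_json = json.dumps(typed_ad)  # numbers and booleans survive as such
flat_json = json.dumps(flat_ad)    # everything becomes a JSON string

# Ordering is one place where the difference shows up right away.
ids_as_numbers = sorted([9, 10, 100])
ids_as_text = sorted(["9", "10", "100"])  # compared character by character
```

With the types intact the ids sort as 9, 10, 100; as text they sort as "10", "100", "9", and a consumer of the JSON has to guess which strings were meant to be numbers.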
What I would like to ask here is your opinion about the XML formatting. I'm not asking to change the original one; I'm more interested in the motivation for that schema.
Maybe check out the ClassAd primer doc. It will give you background on the datastore used in Condor and the motivation for why the XML schema is the way it is.

http://research.cs.wisc.edu/condor/classad/
 
Regards,
- Ian

---
Ian Chesal

Cycle Computing, LLC
Leader in Open Compute Solutions for Clouds, Servers, and Desktops
Enterprise Condor Support and Management Tools

http://www.cyclecomputing.com
http://www.cyclecloud.com
http://twitter.com/cyclecomputing