[Condor-users] Malformed ClassAd
- Date: Wed, 13 Jul 2005 13:48:02 -0500 (CDT)
- From: Steven Timm <timm@xxxxxxxx>
- Subject: [Condor-users] Malformed ClassAd
I am trying to build a ClassAd for Condor-G based matching.
Machine A is the machine that receives the ClassAds from N different
clusters B1...BN.
The idea is that you submit a Condor-G job on Machine A, which does
matching based on the ClassAds and forwards the job to the matching cluster.
I copied a shell script that someone else had and modified
it to produce what I thought was a good Condor ClassAd:
Contents of the ClassAd for cluster B1:
[timm@fermigrid1 ~]$ condor_status -long fngp-osg.fnal.gov:2119/jobmanager-condor
MyType = "Machine"
TargetType = "Job"
Name = "fngp-osg.fnal.gov:2119/jobmanager-condor"
gatekeeper_url = "fngp-osg.fnal.gov:2119/jobmanager-condor"
Requirements = TRUE
Rank = 0.000000
CurrentRank = 0.000000
WantAdRevaluate = TRUE
CurMatches = 0
UpdateSequenceNumber = 1121277498
gluehostapplicationsoftwareruntimeenvironment = "VO-atlas-release-9.0.3 VO-atlas-lcg-release-0.0.2"
glueceinfohostname = "fnal.gov"
gluesubclustername = "fnal.gov"
gluecestatestatus = "Production"
gluecepolicymaxcputime = 2880
gluecepolicymaxwallclocktime = 2880
glueceaccesscontrolbaserule = "VO:*"
GlueCEStateTotalCPUs = 80
gluecestatefreecpus = 0
GlueCEStateRunningJobs = 26
GlueCEStateWaitingJobs = 0
gluecestateestimatedresponsetime = 0
MyAddress = "<131.225.167.42:0>"
LastHeardFrom = 1121277499
UpdatesTotal = 1
UpdatesSequenced = 0
UpdatesLost = 0
UpdatesHistory = "0x00000000000000000000000000000000"
The ClassAd is sent to Machine A via condor_advertise; the listing above
is the output of condor_status -long.
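A minimal sketch of how an ad like this could be rendered and handed to condor_advertise, written in Python rather than the original shell script (which is not reproduced here). The helper names render_classad and advertise are hypothetical, and UPDATE_STARTD_AD is an assumed choice of update command, picked because the ad declares MyType = "Machine".

```python
import subprocess
import tempfile


def render_classad(attrs):
    """Render a dict as old-style ClassAd text, one 'Name = value' per line."""
    lines = []
    for name, value in attrs.items():
        if isinstance(value, bool):
            # Booleans are written as bare TRUE/FALSE, as in the ad above.
            rendered = "TRUE" if value else "FALSE"
        elif isinstance(value, str):
            # Strings are double-quoted; embedded quotes are backslash-escaped.
            rendered = '"' + value.replace('"', '\\"') + '"'
        else:
            rendered = str(value)
        lines.append(f"{name} = {rendered}")
    return "\n".join(lines) + "\n"


def advertise(ad_text, update_command="UPDATE_STARTD_AD"):
    """Write the ad to a temporary file and pass it to condor_advertise."""
    with tempfile.NamedTemporaryFile("w", suffix=".ad", delete=False) as f:
        f.write(ad_text)
        path = f.name
    # Usage form: condor_advertise update-command classad-filename
    subprocess.run(["condor_advertise", update_command, path], check=True)


# A few attributes copied from the ad above (not the complete ad).
example_ad = {
    "MyType": "Machine",
    "TargetType": "Job",
    "Name": "fngp-osg.fnal.gov:2119/jobmanager-condor",
    "Requirements": True,
    "GlueCEStateTotalCPUs": 80,
}

if __name__ == "__main__":
    # Print the ad text; call advertise() only where condor_advertise is installed.
    print(render_classad(example_ad), end="")
```

Keeping the rendering separate from the call to condor_advertise makes it possible to inspect the ad text before anything is sent to the collector.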
Machine A sees the ClassAd but reports it as malformed:
[timm@fermigrid1 ~]$ condor_status
Name          OpSys  Arch   State    Activity  LoadAv  Mem  ActvtyTime
fngp-osg.fnal [?????????] [????] [????????] [???] [??] [Unknown]
vm1@fermigrid LINUX  INTEL  Claimed  Busy      1.170   997  0+02:30:58
vm2@fermigrid LINUX  INTEL  Claimed  Busy      1.420   997  0+01:22:47
vm3@fermigrid LINUX  INTEL  Claimed  Busy      1.170   997  0+16:14:03
vm4@fermigrid LINUX  INTEL  Claimed  Busy      1.170   997  0+16:14:03
             Machines  Owner  Claimed  Unclaimed  Matched  Preempting
INTEL/LINUX         4      0        4          0        0           0
      Total         4      0        4          0        0           0
(Omitted 1 malformed ads in computed attribute totals)
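The bracketed question marks fall under columns that condor_status fills from attributes of a machine ad. A rough sketch for listing which of those attributes a given ad lacks follows. The attribute list is an assumption about what the default display reads, not something confirmed in this thread, and parse_long_ad and missing_attrs are hypothetical helpers.

```python
# Attributes assumed (not confirmed here) to feed the default
# condor_status columns: OpSys, Arch, State, Activity, LoadAv, Mem, ActvtyTime.
DISPLAY_ATTRS = [
    "OpSys", "Arch", "State", "Activity",
    "LoadAvg", "Memory", "EnteredCurrentActivity",
]


def parse_long_ad(text):
    """Parse 'Name = value' lines into a dict keyed by lower-cased name."""
    attrs = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        # Lines without '=' (for example wrapped values) are skipped.
        if sep and name.strip():
            attrs[name.strip().lower()] = value.strip()
    return attrs


def missing_attrs(ad, wanted=DISPLAY_ATTRS):
    """Return the wanted attribute names that are absent from the ad."""
    # ClassAd attribute names are case-insensitive, so compare lower-cased.
    return [name for name in wanted if name.lower() not in ad]


# A few lines copied from the ad above.
sample = '''MyType = "Machine"
TargetType = "Job"
Name = "fngp-osg.fnal.gov:2119/jobmanager-condor"
Requirements = TRUE
GlueCEStateTotalCPUs = 80
'''

if __name__ == "__main__":
    print(missing_attrs(parse_long_ad(sample)))
```

If the assumption holds, comparing the missing names against the [???] columns would show whether the "malformed" label reflects absent attributes or a syntax problem in the ad itself.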
So, three questions:
1) Is it legal and/or advisable to have both job execution slots from a
startd and a pool ad in the same Condor pool, as I have above? That is,
condor_status shows 1 remote cluster and 4 CPUs on this machine.
2) What is malformed about the ClassAd shown above?
3) Is there a shortcut mechanism to have Condor itself create
the ClassAd for Condor-G type matching?
Steve
--------------------------------------------------------------------
Steven C. Timm, Ph.D (630) 840-8525 timm@xxxxxxxx http://home.fnal.gov/~timm/
Fermilab Computing Div/Core Support Services Dept./Scientific Computing Section
Assistant Group Leader, Farms and Clustered Systems Group
Lead of Computing Farms Team