
Re: [HTCondor-users] Computing On Demand (COD) job diagnostics



I can confirm that the Schedd is bypassed by COD, so there is no equivalent of condor_q for COD jobs.

 

condor_status is the only tool that can tell you anything about the COD jobs.  I had a look at condor_status -cod.

 

The RemoteUser column is ??? because condor_status fails to ask for that attribute, so that is a bug in condor_status.

 

condor_status *does* ask for the ClaimState attribute, so if that is showing ???, it means that attribute doesn’t exist in the Machine ad in the Collector.  I don’t know the COD code at all, so I don’t know if that is normal for completed jobs.

 

-tj

 

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Bockelman, Brian
Sent: Wednesday, August 7, 2019 8:00 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Computing On Demand (COD) job diagnostics

 

Hi Ivan,

 

Hm - I suspect it's been quite some time since someone looked too closely into COD.

 

From my fuzzy memory (the last time I touched it was when implementing the python bindings for it: https://htcondor.readthedocs.io/en/latest/apis/python-bindings/api/htcondor.html#htcondor.Claim), COD is strictly between the client and the startd.  There's no schedd involved, so there's no equivalent of "condor_q".

 

So, in order to monitor, you need to use condor_status.  I wouldn't be surprised if the attribute names have changed since "condor_status -cod" was done -- could simply be a rendering issue.

 

If you do "condor_status -l" on the particular job slot, do you see the attributes you are looking for?
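As a rough illustration of that check, here is a minimal Python sketch that parses the "Attr = value" lines printed by "condor_status -l" and reports which of the attributes in question are absent from the ad. The helper names (parse_long_ad, missing_attributes) and the sample ad text are hypothetical and only for illustration; the sample is not real output from the pool discussed here.

```python
# Hypothetical sample of `condor_status -l` output (not real data from this thread).
SAMPLE = """\
Name = "slot1@example-host"
State = "Unclaimed"
Activity = "Idle"
"""


def parse_long_ad(text):
    """Parse 'Attr = value' lines into a dict of raw (unevaluated) value strings."""
    ad = {}
    for line in text.splitlines():
        # Split on the first '='; attribute names never contain '='.
        name, sep, value = line.partition("=")
        if sep and name.strip():
            ad[name.strip()] = value.strip()
    return ad


def missing_attributes(ad, wanted=("ClaimState", "RemoteUser")):
    """Return the wanted attribute names that are not present in the ad."""
    # ClassAd attribute names are case-insensitive, so compare in lower case.
    present = {name.lower() for name in ad}
    return [name for name in wanted if name.lower() not in present]


print(missing_attributes(parse_long_ad(SAMPLE)))
# prints ['ClaimState', 'RemoteUser']
```

In practice the text would come from the real "condor_status -l" output for the slot rather than from SAMPLE. An attribute that shows up as missing here is simply not in the ad, which would point away from a pure rendering problem.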

 

Brian



On Aug 6, 2019, at 12:52 PM, don_vanchos <hozblok@xxxxxxxxx> wrote:

 

Hello,

I am using the `condor_cod request -name 748ddf6a9537.htcondor` and `condor_cod activate -id "..." -jobad cod.txt` commands, and they work as expected. But what commands or tools can I use to ask HTCondor about the status of the COD job (and about all COD claims)?

In addition, why does the following diagnostic output (from the `condor_status -cod` command) show question marks instead of `ClaimState` and `RemoteUser`? I can see that the job was successful, because the output file appeared on disk, but this output does not change:

root@e255e610a51e:/# condor_status -cod
Name          ID    ClaimState  TimeInState  RemoteUser   JobId  Keyword      
slot1@e255e61 COD4  [????????] 18114+16:15:2 [?????????]                      

                 Total  Idle  Running  Suspended  Vacating  Killing
  X86_64/LINUX       1     0        0          0         0        0
         Total       1     0        0          0         0        0


Thanks in advance for your reply!

My condor_version is:
$CondorVersion: 8.9.2 Jun 04 2019 BuildID: Debian-8.9.2-1 PackageID: 8.9.2-1 Debian-8.9.2-1 $
$CondorPlatform: X86_64-Ubuntu_18.04 $

 

--

Sincerely yours,
Ivan Ergunov                                                 mailto:hozblok@xxxxxxxxx
