
Re: [HTCondor-users] Computing On Demand (COD) job diagnostics



Hi Brian,

Thank you very much for the link to the Python API. This is what I was looking for.
However, `htcondor.Claim` takes an `ad` argument describing the startd slot, and the documentation does not say which attributes it needs. When I pass the full slot ad, it works, but that is more than 80 attributes in my case. When I pass a ClassAd containing only the correct `Name` attribute, I get the error "E    ValueError: No contact string in ClassAd". Is there a limited list of attributes that I can pass instead?
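To make the question concrete, here is a minimal sketch of what I would like to do: fetch only a few attributes of the slot ad and build the claim from those. The error text suggests the constructor needs the startd's contact address; that this lives in `MyAddress` is my assumption, not something the documentation states. The names `ASSUMED_CLAIM_ATTRS`, `trim_ad` and `open_claim` are hypothetical helpers of mine.

```python
from typing import Any, Iterable, Mapping

# Assumption: "MyAddress" carries the contact string that htcondor.Claim
# complains about; "Name" is kept only so the ad stays identifiable.
ASSUMED_CLAIM_ATTRS = ("MyAddress", "Name")


def trim_ad(full_ad: Mapping[str, Any],
            keep: Iterable[str] = ASSUMED_CLAIM_ATTRS) -> dict:
    """Return a copy of full_ad restricted to the attributes in keep."""
    keep = tuple(keep)
    missing = [attr for attr in keep if attr not in full_ad]
    if missing:
        raise KeyError("slot ad lacks attributes: " + ", ".join(missing))
    return {attr: full_ad[attr] for attr in keep}


def open_claim(slot_name: str):
    """Hypothetical helper: look up one slot and build a Claim from a trimmed ad."""
    # Imported lazily so trim_ad can be used without the bindings installed.
    import classad
    import htcondor

    collector = htcondor.Collector()
    # Positional arguments: ad type, constraint, projection (attributes to fetch).
    ads = collector.query(
        htcondor.AdTypes.Startd,
        "Name == " + classad.quote(slot_name),
        list(ASSUMED_CLAIM_ATTRS),
    )
    if not ads:
        raise LookupError("no startd ad found for " + slot_name)
    return htcondor.Claim(classad.ClassAd(trim_ad(ads[0])))
```

I would then call `open_claim(...)` with the full slot name and use the returned object for the COD request. If `MyAddress` turns out not to be enough, the tuple at the top is the only thing that should need changing.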

I tried `condor_status -l`, but so far I don't see how this can tell me the status of the job.
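Since the long listing is large, this is a rough sketch of how I am filtering it. It parses the `Attribute = value` lines and keeps only the names containing `COD`. That the COD-related attributes have `COD` in their names is my guess, based on the `COD4` ID shown by `condor_status -cod`. The function names are hypothetical.

```python
import subprocess


def parse_long_ad(text: str) -> dict:
    """Parse 'Attribute = value' lines of one long-format ad into a dict of strings."""
    attrs = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if sep and name.strip():
            attrs[name.strip()] = value.strip()
    return attrs


def cod_attributes(attrs: dict, needle: str = "COD") -> dict:
    """Keep attributes whose name contains needle (case-sensitive).

    Assumption: COD claim attributes carry 'COD' in their names.
    """
    return {name: value for name, value in attrs.items() if needle in name}


def slot_cod_attributes(slot_name: str) -> dict:
    """Run `condor_status -l <slot>` and return the COD-looking attributes."""
    result = subprocess.run(
        ["condor_status", "-l", slot_name],
        check=True,
        capture_output=True,
        text=True,
    )
    return cod_attributes(parse_long_ad(result.stdout))
```

The match is case-sensitive on purpose, so that unrelated names such as `ExitCode` are not picked up.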

I found `RequestCpus`, `RequestDisk`, and `RequestMemory` in the list of `COD Application Attributes`. Can I also specify a requirement for a manually defined resource (e.g. one set via `_CONDOR_MACHINE_RESOURCE_X=1`) for the COD job?

On Wed, Aug 7, 2019 at 16:00, Bockelman, Brian <BBockelman@xxxxxxxxxxxxx> wrote:
Hi Ivan,

Hm - I suspect it's been quite some time since someone looked too closely into COD.

From my fuzzy memory (last I touched it was when implementing the python bindings for it: https://htcondor.readthedocs.io/en/latest/apis/python-bindings/api/htcondor.html#htcondor.Claim), COD is strictly between the client and the startd. There's no schedd involved, so there's no equivalent of "condor_q".

So, in order to monitor, you need to use condor_status. I wouldn't be surprised if the attribute names have changed since "condor_status -cod" was done -- could simply be a rendering issue.

If you do "condor_status -l" on the particular job slot, do you see the attributes you are looking for?

Brian

On Aug 6, 2019, at 12:52 PM, don_vanchos <hozblok@xxxxxxxxx> wrote:

Hello,

I am using the `condor_cod request -name 748ddf6a9537.htcondor` and `condor_cod activate -id "..." -jobad cod.txt` commands, and they work as expected. But what commands or tools can I use to ask HTCondor about the status of the COD job (and about all COD claims)?

An additional question: why does the `condor_status -cod` output below show question marks instead of the `ClaimState` and `RemoteUser` values? I can tell that the job succeeded, because the output file appeared on disk, yet this output does not change:

root@e255e610a51e:/# condor_status -cod

Name          ID   ClaimState TimeInState   RemoteUser  JobId Keyword

slot1@e255e61 COD4 [????????] 18114+16:15:2 [?????????]

               Total  Idle  Running  Suspended  Vacating  Killing

 X86_64/LINUX      1     0        0          0         0        0

        Total      1     0        0          0         0        0


Thanks in advance for your reply!

My condor_version is:
$CondorVersion: 8.9.2 Jun 04 2019 BuildID: Debian-8.9.2-1 PackageID: 8.9.2-1 Debian-8.9.2-1 $
$CondorPlatform: X86_64-Ubuntu_18.04 $

--
Sincerely yours,
Ivan Ergunov                         mailto:hozblok@xxxxxxxxx
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


--
Sincerely yours,
Ivan Ergunov                         mailto:hozblok@xxxxxxxxx