Re: [HTCondor-users] Some way to automatically add a resource monitoring tool (like collect) to every job in a DAG?



Cole,

   Thanks for this! Glad to know that at least I wasn’t missing something obvious. Perhaps this is something for the condor team to think about?

 

   I believe that the suggestion in your second paragraph is not accessible to me because all my jobs are actually grid universe jobs that are scheduled by SGE. We used to be a condor-only shop, and back then I was able to get much more information. Currently I don’t believe I have access to this kind of information. If you think I might be wrong about this and am missing a way to get this info, please do let me know.

 

Thanks Again!

 

John

 

John Calley, Ph.D.

Exec. Director - Biology

Genomics and Bioinformatics,

Statistics – Discovery & Development

Eli Lilly and Company

Lilly Corporate Center, Indianapolis, IN 46285 USA

317.433-3399 (office)
calley_john_n@xxxxxxxxx | www.lilly.com 

CONFIDENTIALITY NOTICE:  This e-mail message (including all attachments) is for the sole use of the intended recipient(s) and may contain confidential and privileged information.  Any unauthorized review, use, disclosure, copying or distribution is strictly prohibited.  If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.

 

 

From: Cole Bollig <cabollig@xxxxxxxx>
Date: Tuesday, January 17, 2023 at 3:46 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Some way to automatically add a resource monitoring tool (like collect) to every job in a DAG?

Hi John,

 

Coming at this problem from the HTCondor view, there does not seem to be a way to send a tool alongside a condor job to monitor resource usage. You could in theory make each job in the DAG a wrapper job that runs both the monitoring tool and the original payload job. However, that is a lot of work and involves changing the DAG as opposed to just using an existing DAG.
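A minimal sketch of what such a wrapper could look like, assuming Python is available on the execute node. The script name wrapper.py, the function run_with_monitor, and the "--" separator convention are all hypothetical and not part of HTCondor or collectl. It starts the monitoring command in the background, runs the payload, stops the monitor, and passes the payload's exit code through.

```python
import subprocess
import sys


def run_with_monitor(monitor_cmd, payload_cmd):
    """Run payload_cmd while monitor_cmd runs in the background.

    Both arguments are argument lists (e.g. ["collectl", ...]).
    Returns the payload's exit code so the scheduler sees the real result.
    """
    monitor = subprocess.Popen(monitor_cmd)
    try:
        return subprocess.call(payload_cmd)
    finally:
        # Stop the monitor whether the payload succeeded or failed.
        monitor.terminate()
        try:
            monitor.wait(timeout=10)
        except subprocess.TimeoutExpired:
            monitor.kill()
            monitor.wait()


# Hypothetical command-line convention:
#   wrapper.py <monitor command ...> -- <payload command ...>
if __name__ == "__main__" and "--" in sys.argv[1:]:
    args = sys.argv[1:]
    split = args.index("--")
    sys.exit(run_with_monitor(args[:split], args[split + 1:]))
```

Each submit file would then name the wrapper as its executable and pass the monitor command (with whatever options your site uses) and the original executable and arguments on either side of the "--". That per-job change is exactly the extra work mentioned above.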

 

On a slightly different note, HTCondor does record a good amount of information in its various ClassAds, specifically the Job Ad and Machine Ad. Some of the Machine Ad attributes are copied into the Job Ad based on the configuration knob SYSTEM_JOB_MACHINE_ATTRS. With that in mind, you could add a service node to the DAG, run as a local universe job, that periodically queries the job queue for data about the jobs, gets the final values from condor_history, and records the data in a file.
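As a rough sketch of what the service node's job might run, assuming Python and the condor_q / condor_history command-line tools are available where it executes. The attribute list, the function names, the CSV layout, and the polling parameters are illustrative assumptions; check the flags against your installed HTCondor version, and note that which attributes are populated depends on the universe and configuration.

```python
import csv
import json
import subprocess
import time

# Job Ad attributes to record (an illustrative selection, adjust as needed).
ATTRS = ["ClusterId", "ProcId", "JobStatus",
         "MemoryUsage", "RemoteUserCpu", "DiskUsage"]


def parse_ads(json_text):
    """Turn the JSON array printed by the tools into rows ordered like ATTRS."""
    text = json_text.strip()
    ads = json.loads(text) if text else []  # no matching jobs: empty output
    return [[ad.get(attr, "") for attr in ATTRS] for ad in ads]


def query(tool, constraint):
    """Run condor_q or condor_history with a constraint and parse the result."""
    cmd = [tool, "-constraint", constraint,
           "-attributes", ",".join(ATTRS), "-json"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_ads(out)


def poll(constraint, out_path, interval=60, rounds=10):
    """Append queue snapshots to a CSV file, then the final history values."""
    with open(out_path, "a", newline="") as fh:
        writer = csv.writer(fh)
        for _ in range(rounds):
            now = int(time.time())
            for row in query("condor_q", constraint):
                writer.writerow([now, "queue"] + row)
            fh.flush()
            time.sleep(interval)
        for row in query("condor_history", constraint):
            writer.writerow([int(time.time()), "history"] + row)
```

A call such as poll("DAGManJobId == 1234", "usage.csv") would restrict the queries to the jobs of one DAG, where 1234 stands in for the cluster ID of the DAGMan job.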

 

Hope this helps some,

Cole Bollig


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of John N Calley via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Friday, January 13, 2023 3:49 PM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Cc: John N Calley <calley_john_n@xxxxxxxxx>
Subject: [HTCondor-users] Some way to automatically add a resource monitoring tool (like collect) to every job in a DAG?

 

Hi,

   I’d like to be able to take an existing DAG and somehow arrange to run collectl (https://collectl.sourceforge.net), or a similar resource monitoring tool, along with every job. All my jobs are scheduled through SGE (not by condor directly), so I need this to be independent of condor facilities. I was wondering if anyone else has done anything like this or might have thoughts on how best to approach it?

 

Thank You,

 

John

 
