
Re: [HTCondor-users] Using htcondor.EventLog to study Condor File Transfer



Thanks Cole,
str(event) is what I was missing. It will work just fine for my analysis.
Best,
Joe


On 4/11/25 06:00, Cole Bollig via HTCondor-users wrote:
Hi Joe,

Unfortunately, the file transfer event only contains the string saying what type of transfer occurred and what happened (i.e. Input started/finished, and output started/finished). The good news is you can process the message string from the event. While it is sad that it has to be done this way, it does work. I actually recently updated condor_watch_q to check the file transfer events to inform users of jobs doing input and output transfer. See the code snippet below (pulled from the source code) to see what we are currently doing:

if event.type == htcondor.JobEventType.FILE_TRANSFER:
    new_status = None
    msg = str(event).lower()
    if "started" in msg:
        if "input" in msg:
            new_status = JobStatus.TRANSFERRING_INPUT
        elif "output" in msg:
            new_status = JobStatus.TRANSFERRING_OUTPUT

Note that for this tool we only care about when the respective transfer starts because other events will transition our counters to different and appropriate states.
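For an analysis that also needs the "finished" side, the same string check can be extended. The following is only a minimal sketch, not code from condor_watch_q: the helper name classify_transfer is hypothetical, and it assumes the message text contains the words "Started"/"Finished" and "input"/"output" exactly as in the log excerpt quoted in this thread. It works on plain strings so it can be tried without the bindings installed.

```python
from typing import Optional, Tuple


def classify_transfer(message: str) -> Optional[Tuple[str, str]]:
    """Return (direction, phase) for a file transfer message, or None.

    Hypothetical helper: it only looks for the words seen in the log
    excerpt in this thread, so other wording is reported as None.
    """
    msg = message.lower()

    # Direction of the transfer: input or output files.
    if "input" in msg:
        direction = "input"
    elif "output" in msg:
        direction = "output"
    else:
        return None

    # Phase of the transfer: started or finished.
    if "started" in msg:
        phase = "started"
    elif "finished" in msg:
        phase = "finished"
    else:
        return None

    return direction, phase
```

In real use the argument would be str(event) for an event whose type is htcondor.JobEventType.FILE_TRANSFER, as in the snippet above. Because the check is a plain substring match over the whole message, any extra text in the message (such as a host name) that happens to contain one of these words could cause a misclassification.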

Hope this helps,
Cole Bollig

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Joseph Areeda <newsreply@xxxxxxxxxx>
Sent: Thursday, April 10, 2025 6:16 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Using htcondor.EventLog to study Condor File Transfer

I am trying to analyze job logs that use Condor File Transfer for containers. I am specifically interested in timing and error detection.

The log entries I want to look at with the htcondor2.JobEvent class:

000 (45036388.002.000) 2025-04-10 10:16:18 Job submitted from host: <10.14.0.39:9618...
...
040 (45036388.002.000) 2025-04-10 10:17:15 Started transferring input files
        Transferring to host: <10.14.9.111:9618?addrs=10.14.9.111-9618&alias=node2111.cluster.ldas.cit&noUDP&sock=slot1_6_12567>
...
040 (45036388.002.000) 2025-04-10 10:17:19 Finished transferring input files

My question: for event type 40, how do I tell the difference between Started and Finished? I cannot find anything in the event object.
Maybe it is possible to get the text of the line that created the event?

How are errors and retries reported?

Any suggestions will be appreciated
Joe

