HTCondor Project List Archives




[Condor-devel] DRMAA : wait_job never returns



Hello guys,

After submitting a quick test job (it runs for about 2 seconds) via DRMAA, I see the following:

1- First configuration:
I call wait_job() with a timeout of 10 seconds and it returns as follows:

DEBUG: -> wait_job(my_job_id)
DEBUG: Sleeping for a moment, timeout      0 / 10 seconds
DEBUG: Sleeping for a moment, timeout      1 / 10 seconds
DEBUG: Sleeping for a moment, timeout      2 / 10 seconds
DEBUG: Sleeping for a moment, timeout      3 / 10 seconds
DEBUG: Sleeping for a moment, timeout      4 / 10 seconds
DEBUG: Sleeping for a moment, timeout      5 / 10 seconds ** mail for job done received at this moment **   
DEBUG: Sleeping for a moment, timeout      6 / 10 seconds
DEBUG: Sleeping for a moment, timeout      7 / 10 seconds
DEBUG: Sleeping for a moment, timeout      8 / 10 seconds
DEBUG: Sleeping for a moment, timeout      9 / 10 seconds
DEBUG: Wait timeout detected after scanning file for my_job_id
DEBUG: Unreferencing job my_job_id
DEBUG: Not removing job my_job_id yet (ref_count: 1 -> 0)
DEBUG: <- wait_job(my_job_id)
"timeout? :("

After reading auxDrmaa.c (version 7.4.4), line 1687:
if (cur->ref_count >= 0)
=> Is ">=" the right test? I expected to see "ref_count > 0".
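
To make the question concrete, here is a minimal standalone sketch of the two tests. It is not the code from auxDrmaa.c: the job_entry struct and the three helper functions are hypothetical names I made up for illustration, and I am assuming the test decides whether the job entry is still considered referenced after the decrement shown in the log (ref_count: 1 -> 0).

```c
#include <assert.h>

/* Hypothetical stand-in for a tracked job entry (not the real Condor struct). */
typedef struct {
    const char *id;
    int ref_count;
} job_entry;

/* Drop one reference, like the "ref_count: 1 -> 0" step in the log. */
static void unref(job_entry *job)
{
    assert(job->ref_count > 0);   /* the sketch assumes the caller held a reference */
    job->ref_count--;
}

/* The test as I read it in the source: a count of 0 still passes. */
static int still_referenced_ge(const job_entry *job)
{
    return job->ref_count >= 0;
}

/* The test I expected: a count of 0 means nobody holds the job any more. */
static int still_referenced_gt(const job_entry *job)
{
    return job->ref_count > 0;
}
```

With a job that starts at ref_count 1, after one unref() the ">=" version still treats it as referenced, while the ">" version does not. That would match the "Not removing job my_job_id yet" line above, if my reading of the code is right.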


2- Second configuration:
I call wait_job() with no timeout (-1) and it never returns:
DEBUG: -> wait_job(minerve.dosigray.com.5.0)

DEBUG: Sleeping for a momentDEBUG: Sleeping for a momentDEBUG: Sleeping for a moment ......
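
For comparison, this is the behaviour I would expect from a polling wait, written as a standalone sketch. It is not the Condor implementation: wait_for_job, the callback types and the fake_job helpers are all hypothetical, and the fake job just counts polling steps instead of really sleeping. The point is only that a timeout of -1 should disable the timeout check, not the "job finished" check.

```c
#include <assert.h>
#include <stddef.h>

#define WAIT_FOREVER (-1L)   /* mirrors the -1 I pass as timeout */

enum wait_result { WAIT_DONE = 0, WAIT_TIMED_OUT = 1 };

/* Hypothetical callbacks so the sketch does not depend on any Condor code. */
typedef int  (*job_done_fn)(void *ctx);   /* nonzero once the job has finished */
typedef void (*pause_fn)(void *ctx);      /* wait one polling interval */

static enum wait_result wait_for_job(job_done_fn done, pause_fn pause_once,
                                     void *ctx, long timeout_steps)
{
    long elapsed = 0;

    assert(done != NULL && pause_once != NULL);
    for (;;) {
        if (done(ctx))                    /* checked first, whatever the timeout is */
            return WAIT_DONE;
        if (timeout_steps != WAIT_FOREVER && elapsed >= timeout_steps)
            return WAIT_TIMED_OUT;
        pause_once(ctx);
        elapsed++;
    }
}

/* Fake job for trying the loop: it "finishes" after a fixed number of polls. */
struct fake_job {
    int finishes_after;
    int polls;
};

static int fake_done(void *ctx)
{
    const struct fake_job *job = ctx;
    return job->polls >= job->finishes_after;
}

static void fake_pause(void *ctx)
{
    struct fake_job *job = ctx;
    job->polls++;                         /* stands in for sleeping one interval */
}
```

With a fake job that finishes after 5 polls, this loop returns WAIT_DONE both with a timeout of 10 and with WAIT_FOREVER, and it only returns WAIT_TIMED_OUT when the job takes longer than the timeout. In my two runs above the job finished around second 5, so I expected wait_job() to return at that point in both configurations.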


Can you tell me whether my rm_job observation (the ref_count test) is correct, and whether this wait_job behaviour is expected?

Thanks