
Re: [HTCondor-users] Condor_qsub condor_q is not working in "sentinel scripts"



Dear all,

just an update in case anyone else runs into the same problem. I found a solution, which is to add the following to the task that is being executed (and that could not run the condor_q command before): cd /opt/local/bin (which is where condor_qsub is on my computer), and then on the next line ./condor_qsub. So it seems the problem was that condor_qsub was not recognised as a command inside the job in the same way that it is from the terminal.
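A minimal sketch of the same workaround without the cd, assuming the cause is that /opt/local/bin (the MacPorts install location on my machine) is missing from the PATH the job sees. The CONDOR_BIN_DIR variable is a made-up name for illustration; the script only checks whether condor_qsub can be found and does not run it:

```shell
#!/bin/bash
# Directory where condor_qsub lives on this machine (taken from the message above).
# CONDOR_BIN_DIR is a hypothetical variable name, not an HTCondor setting.
CONDOR_BIN_DIR="${CONDOR_BIN_DIR:-/opt/local/bin}"

# Prepend the directory to PATH only if it is not already listed.
case ":$PATH:" in
  *":$CONDOR_BIN_DIR:"*) ;;
  *) PATH="$CONDOR_BIN_DIR:$PATH" ;;
esac
export PATH

# Report whether the command can now be found; it is not executed here.
if command -v condor_qsub >/dev/null 2>&1; then
  echo "condor_qsub found at: $(command -v condor_qsub)"
else
  echo "condor_qsub not found in PATH" >&2
fi
```

If this guess is right, calling /opt/local/bin/condor_qsub by its full path inside the task should work equally well.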

I don't understand much at all about this kind of thing; could anyone explain to me how this could be? Or is it maybe to do with having condor installed via macports?

Many thanks
Jacquie

On Thu, Jan 29, 2015 at 10:09 PM, Jacqueline Scholl <jacqueline.scholl@xxxxxxxxxxxx> wrote:
Hi Brian,

thanks for your quick reply!
It says 'hold on user request' - and when I manually release it, it works; which is why I thought in the first place to check whether the 'sentinel' script could be the culprit. From the 'condor_qsub' manual, it says 'condor_qsub permits the submission of dependent jobs without the need to specify the full dependency graph at submission time. Doing things this way is neither as efficient as HTCondor's DAGMan, nor as functional as SGE's qsub or qalter.' I think there is nothing I can do about this, given that I can't really program. Also, this script seems to run for other people, so I'm wondering whether it's something about the mac environment that might make a difference to whether commands are run from within condor or from the terminal directly?

Many thanks
Jacquie

On Thu, Jan 29, 2015 at 7:56 PM, Brian Candler <b.candler@xxxxxxxxx> wrote:
On 29/01/2015 16:29, Jacqueline Scholl wrote:
-- Submitter: users-mac-pro-2.local : <127.0.0.1:51199> : users-mac-pro-2.local
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
4157.0   jacquelinescho  1/29 15:59   0+00:01:00 R  0   1.2  bash /usr/local/fs
4158.0   jacquelinescho  1/29 15:59   0+00:00:00 H  0   1.2  bash /usr/local/fs
Try doing the following on any job which is in the "H" (hold) state, for more info on why they are held:

condor_q -analyze 4158.0

However, this idea of 'sentinel files' doesn't sound like a good way of sequencing jobs. The normal way to do this in condor would be with dagman, where you can declare all your submit files and the parent/child relationships between them.
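As a rough sketch of the DAGMan approach, assuming two hypothetical submit files named first.sub and second.sub (stand-ins for your real jobs), a DAG file that runs the second job only after the first has finished could be written like this. The file name pipeline.dag is also just an example:

```shell
#!/bin/bash
# Write a minimal DAG description. first.sub and second.sub are hypothetical
# submit files; replace them with the real ones.
cat > pipeline.dag <<'EOF'
JOB first first.sub
JOB second second.sub
PARENT first CHILD second
EOF

# Submit the whole graph. Left commented out so the sketch runs
# on a machine without HTCondor installed:
# condor_submit_dag pipeline.dag
```

With this layout DAGMan takes care of the ordering, so no job has to sit in the hold state waiting to be released.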

HTH,

Brian.


