
Re: [condor-users] Condor and MPI



No, though it is very similar to this. The idea is that the dedicated
scheduler should have full control over the nodes once it has claimed
them. For this reason the SUSPEND and PREEMPT expressions should be
changed to evaluate to FALSE when the job has the DedicatedScheduler
parameter. Mind that the KILL expression should not be changed, as it
is what allows Condor to get rid of the job. If you make it FALSE, then
when you condor_rm a job that fails to exit gracefully, Condor will
fail to remove the job at all (in our experience, at least).
On the other hand, the START expression should evaluate to TRUE when
DedicatedScheduler is present, since otherwise the scheduler is not
really dedicated. This effectively means that your statement is correct:
Condor ignores any resource owner's activity for such jobs. For regular
jobs, however, START can still be FALSE.
The bottom line is that if most of your jobs are MPI, it's not a good
idea to use non-dedicated resources (especially ones with nervous
owners) for MPI.
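For reference, here is a minimal sketch of how the policy described
above might look in the local configuration of a dedicated node. The
host name is a placeholder, and the use of the job's Scheduler attribute
follows my reading of the dedicated scheduling section of the Condor
manual, so please check it against the manual for your version. It also
assumes START, SUSPEND and PREEMPT are already defined earlier in your
configuration.

```
# Placeholder host name: substitute the machine running your
# dedicated scheduler.
DedicatedScheduler = "DedicatedScheduler@submit.example.org"
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler

# Always start jobs from the dedicated scheduler; regular jobs
# still go through the existing START policy.
START   = (Scheduler =?= $(DedicatedScheduler)) || ($(START))

# Never suspend or preempt jobs from the dedicated scheduler;
# regular jobs keep the existing SUSPEND and PREEMPT policies.
SUSPEND = (Scheduler =!= $(DedicatedScheduler)) && ($(SUSPEND))
PREEMPT = (Scheduler =!= $(DedicatedScheduler)) && ($(PREEMPT))

# KILL is deliberately left unchanged so that condor_rm can still
# remove a job that does not exit gracefully.
```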
On Mon, 2003-10-20 at 10:11, Thomas Bauer wrote:
> >We did use MPI on Windows. By the way, MPICH 1.2.5 does not work with
> >Condor (reported to the team), for unknown reasons, while MPICH 1.2.4
> >worked. However, I should admit that this was extremely unstable, and
> >we finally gave up.
> >Anyway, try it, you might be luckier than we were.
> 
> Ok, thanks for that hint about version 1.2.5. By the way, did I get it
> right that Condor and MPI work only if I make some machines available
> for executing jobs whether or not those machines are being used by
> their owners?
> 
> Thomas
> 
> 
> Condor Support Information:
> http://www.cs.wisc.edu/condor/condor-support/
> To Unsubscribe, send mail to majordomo@xxxxxxxxxxx with
> unsubscribe condor-users <your_email_address>
