Re: [Condor-users] about setting up dedicated resources in Condor Windows cluster
- Date: Thu, 22 Jul 2010 20:23:46 -0500
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [Condor-users] about setting up dedicated resources in Condor Windows cluster
fly zebra wrote:
Hi All,
This may be a newbie question about how to set up dedicated
resources in a Condor (condor-7.5.2-winnt50-x86) Windows cluster; please
offer help if you can.
I read the section "3.13.10.1 Selecting and Setting Up a
Dedicated Scheduler" in the Condor manual, but I still cannot make a
parallel job work after trying several times.
A couple of pointers:
1. You do not need to set up a Condor "dedicated scheduler" to use
dedicated resources. The only reason to set up a dedicated scheduler
in Condor is if you must submit parallel universe jobs, i.e. jobs that
require multiple machines at the same time. Typical examples of this
are MPI or PVM jobs, usually on Unix; folks using MPI on Windows are
fairly rare. If you do not need the parallel universe, e.g. you just
want to submit loads of vanilla universe jobs, you do not need to set
this up. Maybe the "dedicated scheduler" in Condor would be better
named "the parallel scheduler" :).
2. If you truly need to submit parallel universe jobs, you need to
customize more settings than you mentioned below. The easiest way to do
this is to consult the example file:
c:\condor\etc\condor_config.local.dedicated.resource
for the rest of the settings you need. It is well-commented.
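As a rough illustration of the kind of settings point 2 refers to, here
is a minimal sketch of a local configuration for an execute machine that
serves only a dedicated scheduler. It is modeled on the dedicated-resource
policy example in the Condor manual, not copied from the example file
named above, so treat that file as the authority. The host name is a
placeholder, not a value from this thread. Note that Condor configuration
macros are referenced with parentheses, as in $(STARTD_EXPRS).

```
# Sketch only; the host name below is a placeholder.
DedicatedScheduler = "DedicatedScheduler@<full.host.name.of.scheduler>"
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler

# Accept only jobs from the dedicated scheduler, and never
# suspend, preempt, or kill them once they are running.
START = Scheduler =?= $(DedicatedScheduler)
SUSPEND = False
CONTINUE = True
PREEMPT = False
KILL = False
WANT_SUSPEND = False
WANT_VACATE = False
RANK = Scheduler =?= $(DedicatedScheduler)
```

A policy like this reserves the machine for parallel universe jobs; a
machine that should also run vanilla jobs would need a less restrictive
START expression.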
Hope the above pointers help.
Todd
My Condor Windows cluster consists of one central manager
(headnode.condor.org) and two execution machines
(c01.condor.org, c02.condor.org).
I added the following configuration settings to the condor_config file
on all three machines:
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxxxxxxx
STARTD_EXPRS = ${STARTD_EXPRS}, DedicatedScheduler
The following is the job description file (jdf.sub):
universe = parallel
environment = path=c:\winnt\system32
executable = simpleCounter.exe
output = simpleCounter.out
error = simpleCounter.err
log = simpleCounter.log
machine_count = 1
arguments = 1 100
queue
After submitting the job with "condor_submit jdf.sub",
there is no error, but the job never runs (it just stays in the idle state).
Any suggestions?
Thanks in advance,
Kimaru
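For comparison with point 1 of the reply, a minimal sketch of the same
job submitted to the vanilla universe, which does not involve the
dedicated scheduler at all. This is an illustrative variant of jdf.sub,
not a file from the thread; it assumes simpleCounter.exe runs on a single
machine, so the parallel-only machine_count line is dropped.

```
# Hypothetical vanilla universe variant of jdf.sub
universe = vanilla
environment = path=c:\winnt\system32
executable = simpleCounter.exe
output = simpleCounter.out
error = simpleCounter.err
log = simpleCounter.log
arguments = 1 100
queue
```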
--
Todd Tannenbaum University of Wisconsin-Madison
Center for High Throughput Computing Department of Computer Sciences
tannenba@xxxxxxxxxxx 1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132 Madison, WI 53706-1685