[Condor-users] dagman: Possible to run a "PRE" script on same node as the main program will run?
- Date: Thu, 22 Jul 2010 12:44:33 +0200
- From: Carsten Aulbert <carsten.aulbert@xxxxxxxxxx>
- Subject: [Condor-users] dagman: Possible to run a "PRE" script on same node as the main program will run?
Hi all,
This may be a tricky question, or a stupid one:
Quite a few of our users need to read data from our data servers and of course
would not like to thrash those with too many jobs hitting the servers at the
same time.
Initially I thought, hey, dagman supports PRE scripts, and together with
-maxpre this should solve the problem. However, the manual states
"Scripts are optional for each job, and any scripts are executed on the
machine from which the DAG is submitted; this is not necessarily the same
machine upon which the node's Condor or Stork job is run. Further, a single
cluster of Condor jobs may be spread across several machines."
If I understand this correctly, it is not guaranteed that the PRE script is
run on the same compute/worker node as the main task; it runs on the
submit host instead - is that correct?
If so, how should I model this:
I have 20k jobs, each needing to read in, say, 5 GB of data from central file
servers. Each job will run for a couple of hours on this data. To spare the
servers from too much load, I want to limit the number of nodes reading from
the data servers at any one time.
Thus, I would like to run a "PRE"-like script which copies data from the data
servers to the local worker's disk, where the main task can then read it.
That in itself will not reduce the load on the data servers. However, if I
could limit the number of "PRE"-like jobs so that only, say, 50 are running
at any time, the load would stay bounded while many more compute tasks could
be running simultaneously.
Any idea how to do that[1]? Is the problem clear enough?
Cheers
Carsten
[1] Other than starting to write "lock" files into a "lock" directory to keep
track of who is allowed to read data at the moment...
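
For reference, the lock-file fallback from [1] could look roughly like the
sketch below. It assumes a lock directory on a file system shared by all
workers. The names (try_acquire_slot, transfer_slot, slot-N.lock) are made up
for illustration and are not anything Condor provides. Stale locks left by
crashed jobs are not handled, and exclusive file creation may not be atomic
on some older NFS setups.

```python
import os
import socket
import time
from contextlib import contextmanager


def try_acquire_slot(lock_dir, max_slots):
    """Try to claim one of max_slots lock files; return its path or None."""
    os.makedirs(lock_dir, exist_ok=True)
    for i in range(max_slots):
        path = os.path.join(lock_dir, f"slot-{i}.lock")
        try:
            # O_CREAT | O_EXCL fails if the file already exists,
            # so only one claimant can win a given slot.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue
        with os.fdopen(fd, "w") as f:
            # Record the holder to make manual cleanup easier.
            f.write(f"{socket.gethostname()} {os.getpid()}\n")
        return path
    return None


@contextmanager
def transfer_slot(lock_dir, max_slots=50, poll_seconds=30.0):
    """Block until a slot is free, hold it for the with-body, then release it."""
    while True:
        path = try_acquire_slot(lock_dir, max_slots)
        if path is not None:
            break
        time.sleep(poll_seconds)
    try:
        yield path
    finally:
        os.remove(path)
```

A job wrapper would then do the copy inside "with transfer_slot(lock_dir):"
and start the main task only after the block exits, so at most max_slots
workers read from the data servers at once.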