[Condor-users] Running "batch" jobs on different platforms
- Date: Mon, 16 Aug 2004 10:29:40 -0500
- From: Mike Frederick <Mike@xxxxxxxxxx>
- Subject: [Condor-users] Running "batch" jobs on different platforms
All,
OK, I've decided that as usual I am asking the wrong questions. Let me tell
you what I want to do and you guys tell me how to do it...
I want to be able to run pre-built "batch" jobs on different machines in a
Condor pool. By "pre-built", I mean that each machine will hold a file of
commands that performs a specific task on that machine. If I log onto the
machine directly and execute the file (as a shell script on Red Hat, or a
batch file on Windows), the commands are executed and the task is completed.
In a Windows-only environment we have a set of batch files sitting on the box
which we can fire off. The problem is that we now need to integrate a Red Hat
(and eventually a Sun) box into this environment, so that we issue one command
and the Windows box runs its batch files, the Red Hat box runs its set of
commands, and the Sun box runs its set of commands. It appeared to me that
Condor (with the possible future addition of DAGMan) could perform the control
function of this process.
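As a rough sketch of that control function, assuming one Condor submit file per platform already exists, a DAGMan input file could name one node per platform. The file name platforms.dag, the node names, and the three .sub file names are all hypothetical placeholders, not files from this pool:

```
# platforms.dag: hypothetical name. Every node name and submit-file
# name below is a placeholder for illustration only.
JOB windows windows.sub
JOB redhat  redhat.sub
JOB sun     sun.sub
# No PARENT/CHILD lines are given, so the three nodes are independent:
# DAGMan submits them all and finishes once every node has completed.
```

A file like this would be handed to condor_submit_dag (for example, condor_submit_dag platforms.dag), which is the "issue a command" step.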
I thought I could build, debug, and test each system's independent "batch"
procedure, then build a Condor submit file for each procedure, configured so
that each submitted job runs only on its appropriate system and runs the local
"batch" file, and then wait for all jobs to complete.
Is this doable? The Windows portion works fine: I have a small batch file
defined for 2 Windows boxes in the pool, I built a submit file to direct each
system to run one copy, and it all works. But what about Unixes? I thought
that if I built a small shell script and submitted it to run on the Red Hat
box in the pool, it would work. But it doesn't; I get the log file you see
below. I'll include all the appropriate files:
====================================================
The shell script to be executed on Red Hat:
#!/bin/csh
echo "Howdy!"
echo "Here is the output from 'hostname':"
hostname -v -i
echo ""
echo "Output from 'ls' command:"
ls -la
echo ""
echo "Output from a 'ping' command:"
ping -c 4 stargate.nuview.com
echo "That's all folks!"
=====================================================
The submit file for Red Hat:
universe = vanilla
requirements = OpSys == "LINUX"
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
executable = linux.bat
output = linux.out
error = linux.err
log = linux.log
queue
======================================================
The resultant log file:
001 (012.000.000) 08/16 10:24:17 Job executing on host: <192.168.1.222:1620>
...
007 (012.000.000) 08/16 10:24:17 Shadow exception!
    Error from starter on Mike_RH.nuview.com: Failed to execute '/opt/condor-6.6.6/home/execute/dir_20253/condor_exec.exe condor_exec.exe': No such file or directory
    0 - Run Bytes Sent By Job
    246 - Run Bytes Received By Job
...
======================================================
So am I just going at this all wrong? Am I using Condor in a way that was not
intended? Are there software solutions other than Condor that I should
consider? Any help appreciated!
--
Mike Frederick