Re: [Condor-users] Condor-G - job submission problem - updated
- Date: Thu, 1 Sep 2005 11:12:28 -0500
- From: Jaime Frey <jfrey@xxxxxxxxxxx>
- Subject: Re: [Condor-users] Condor-G - job submission problem - updated
On Sep 1, 2005, at 9:27 AM, duane waktu wrote:
Thanks for your email. I just had a chance to get back to playing
around with condor again today.
Btw, I followed your suggestion to simply use '/tmp' directory and
it 'seemed' to work fine. I said it 'seemed' because I was not sure
if everything went smoothly. The GridmanagerLog file still shows
that I got SIGTERM at the end, although I seemed to get the
expected output and the condor queue shows no more jobs to be
submitted.
The SIGTERM is normal. When the gridmanager has no more jobs to
manage, it exits by sending itself a SIGTERM. A little unusual, but
that's how it works.
Another question is what do I need to do so that Condor transfers
the jobs to other machines too?
Before I was able to do that when I was testing the simple
Hello.java example from the Condor guide, which used the JAVA
universe. With that example, I could see the job go to the Condor
Manager and execute there instead of on the local machine.
However, once I changed to the GRID universe, the job seems to be
executed only on the local machine, i.e. it didn't get transferred
to other machines. Is this because my 'globusscheduler' is set to my
local machine? What do I need to do in order to have my jobs
transferred to other machines?
Below is my submit file:
=============================================
...
globusscheduler = https://<my_local_machine>:8443
jobmanager_type = Fork
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
queue
=============================================
First, the machine you want to submit to has to have GT4 installed.
Then you have to change globusscheduler in your submit file to point
at that machine.
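As a rough sketch of that change, the snippet below rewrites the
globusscheduler line of a submit file with sed. The host name
remote-gt4.example.org and the temporary file are placeholders made up
for illustration, and the sample file contains only the lines quoted
above, so a real submit file will have more in it. Port 8443 is simply
carried over from the original submit file.

```shell
# Placeholder values: replace with your real submit file and remote GT4 host.
submit_file="$(mktemp)"
remote_host="remote-gt4.example.org"

# Sample submit file containing only the lines quoted in this thread.
cat > "$submit_file" <<'EOF'
globusscheduler = https://<my_local_machine>:8443
jobmanager_type = Fork
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
queue
EOF

# Point globusscheduler at the remote machine, keeping a .bak backup.
sed -i.bak "s|^globusscheduler *=.*|globusscheduler = https://${remote_host}:8443|" "$submit_file"

# Show the line that will be used on the next submission.
grep '^globusscheduler' "$submit_file"
```

After the edit, the file can be submitted as before with condor_submit.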
If you want Condor to pick the machine using match-making (like it
does for the other universes), you have to make the machine advertise
itself to your central manager. There are instructions for how to do
this in the Condor manual:
http://www.cs.wisc.edu/condor/manual/v6.7/5_3Grid_Universe.html#SECTION00634000000000000000
Jaime, back to the permission problem, I hope you don't mind me
asking simple questions as follows:
1. What is the sticky bit?
% ls -ld /tmp
drwxrwxrwt 7 root root 8192 Sep 1 04:26 /tmp/
The 't' at the end of the permissions means the sticky bit is set. It
means that although all users can write files in the directory, they
can rename and remove only their own files. If the sticky bit isn't
set, user A can delete user B's files (even if he's not allowed to
read them).
2. What do I need to do to set a sticky bit on a directory?
You can set the sticky bit on a directory like so:
chmod +t /scratch/grid-jobs
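
To see the effect, here is a small sketch that can be run without
touching a real scratch area. It assumes a throwaway directory created
with mktemp; the grid-jobs name just mirrors the example above.

```shell
# Throwaway directory for the demonstration (hypothetical location).
demo_dir="$(mktemp -d)/grid-jobs"
mkdir -p "$demo_dir"

# World-writable without the sticky bit (mode 0777).
chmod 777 "$demo_dir"
ls -ld "$demo_dir"    # mode string should begin with drwxrwxrwx

# Add the sticky bit; for this directory it is the same as chmod 1777.
chmod +t "$demo_dir"
ls -ld "$demo_dir"    # mode string should now begin with drwxrwxrwt

# test -k succeeds only when the sticky bit is set.
if [ -k "$demo_dir" ]; then
    echo "sticky bit is set"
fi
```

The directory has to be world-writable as well (as /tmp is) for the
sticky bit to matter; on its own it does not grant write access.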
+----------------------------------+---------------------------------+
| Jaime Frey | Public Split on Whether |
| jfrey@xxxxxxxxxxx | Bush Is a Divider |
| http://www.cs.wisc.edu/~jfrey/ | -- CNN Scrolling Banner |
+----------------------------------+---------------------------------+