[Condor-users] Problem to submit example jobs
- Date: Wed, 12 Oct 2005 15:12:35 +0200
- From: Nicolas GUIOT <nicolas.guiot@xxxxxxx>
- Subject: [Condor-users] Problem to submit example jobs
Hi all
I started with the example jobs, just to check whether my Condor installation was fine, and it is not, or not entirely:
If I run condor_submit as user "condor", everything is fine, but when I try to submit jobs under my own username, it just doesn't work:
guiot@chagall:~/tmp/TestCondor$ condor_submit env.cmd
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 40.
WARNING: File /ibpc/chagall/guiot/tmp/TestCondor/env.out is not writable by condor.
WARNING: File /ibpc/chagall/guiot/tmp/TestCondor/env.err is not writable by condor.
guiot@chagall:~/tmp/TestCondor$
I checked the permissions, and they seem fine, since both user guiot and user condor (and everyone else) can write to this directory. This listing was taken before the submit:
guiot@chagall:~/tmp/TestCondor$ ll
-rw-r--r-- 1 guiot users 816 Oct 12 11:53 Makefile
drwxr-xr-x 2 guiot users 4096 Oct 12 11:54 PVM
-rw-r--r-- 1 guiot users 13190 Oct 12 11:53 README
drwxr-xr-x 2 guiot users 4096 Oct 12 11:54 dagman
-rw-r--r-- 1 guiot users 3210 Oct 12 11:53 env.C
-rw-r--r-- 1 guiot users 296 Oct 12 12:00 env.cmd
-rwxr-xr-x 1 guiot users 12422015 Oct 12 12:10 env.remote
-rwxr-xr-x 1 guiot users 205 Oct 12 11:54 submit
-rw-r--r-- 1 condor users 16384 Oct 12 14:34 tmp
guiot@chagall:~/tmp/TestCondor$ ll ../
drwxrwxrwx 4 guiot users 4096 Oct 12 14:55 TestCondor
The weird thing is that it _does_ create the .out and .err files. This listing was taken just after the submit:
guiot@chagall:~/tmp/TestCondor$ ll
-rw-r--r-- 1 guiot users 816 Oct 12 11:53 Makefile
drwxr-xr-x 2 guiot users 4096 Oct 12 11:54 PVM
-rw-r--r-- 1 guiot users 13190 Oct 12 11:53 README
drwxr-xr-x 2 guiot users 4096 Oct 12 11:54 dagman
-rw-r--r-- 1 guiot users 3210 Oct 12 11:53 env.C
-rw-r--r-- 1 guiot users 296 Oct 12 12:00 env.cmd
-rw-r--r-- 1 guiot users 0 Oct 12 14:58 env.err
-rw-r--r-- 1 guiot users 83 Oct 12 14:58 env.log
-rw-r--r-- 1 guiot users 0 Oct 12 14:58 env.out
-rwxr-xr-x 1 guiot users 12422015 Oct 12 12:10 env.remote
-rwxr-xr-x 1 guiot users 205 Oct 12 11:54 submit
-rw-r--r-- 1 condor users 16384 Oct 12 14:34 tmp
guiot@chagall:~/tmp/TestCondor$
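One detail in the listing above: env.out and env.err are owned by guiot with mode -rw-r--r--, so under plain Unix permission bits another account would not get write access to the files themselves, even though the directory is world-writable. A minimal sketch of that rule follows, in Python. The helper name can_write is made up for illustration, the membership of "condor" in group "users" is an assumption, and this is not necessarily the check that condor_submit performs. ACLs, root privileges and NFS effects are ignored.

```python
# Sketch: evaluate classic Unix permission bits from an `ls -l` mode string.
# Hypothetical helper for illustration; ignores ACLs, root, and NFS effects.

def can_write(mode, owner, group, user, user_groups):
    """Return True if `user` gets write access under owner/group/other rules."""
    # mode looks like "-rw-r--r--": a type character, then the owner,
    # group and other triplets.
    if len(mode) != 10:
        raise ValueError("expected a 10-character mode string")
    if user == owner:
        return mode[2] == "w"   # owner class applies
    if group in user_groups:
        return mode[5] == "w"   # group class applies
    return mode[8] == "w"       # everyone else


# Mode, owner and group are copied from the listing above. That "condor"
# belongs to group "users" is an assumption made for this example.
env_out_mode = "-rw-r--r--"
print(can_write(env_out_mode, "guiot", "users", "guiot", {"users"}))
print(can_write(env_out_mode, "guiot", "users", "condor", {"users"}))
print(can_write("drwxrwxrwx", "guiot", "users", "condor", {"users"}))
```

The first call is the file owner, the second is user condor on the same file, and the third is user condor on the TestCondor directory.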
Here is my ScheddLog, starting from the moment I ran condor_submit on the job. The break is where I ran condor_rm.
10/12 14:51:16 (pid:4829) DaemonCore: Command received via UDP from host <193.49.27.24:49460>
10/12 14:51:16 (pid:4829) DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)
10/12 14:51:16 (pid:4829) Sent ad to central manager for guiot@xxxxxxxxxxxxxx
10/12 14:51:16 (pid:4829) Sent ad to 1 collectors for guiot@xxxxxxxxxxxxxx
10/12 14:51:16 (pid:4829) Called reschedule_negotiator()
10/12 14:51:16 (pid:4829) Activity on stashed negotiator socket
10/12 14:51:16 (pid:4829) Negotiating for owner: guiot@xxxxxxxxxxxxxx
10/12 14:51:16 (pid:4829) Checking consistency running and runnable jobs
10/12 14:51:16 (pid:4829) Tables are consistent
10/12 14:51:16 (pid:4829) Out of jobs - 1 jobs matched, 0 jobs idle, flock level = 0
10/12 14:51:20 (pid:4829) Starting add_shadow_birthdate(40.0)
10/12 14:51:20 (pid:4829) Started shadow for job 40.0 on "<193.49.27.11:33430>", (shadow pid = 11193)
10/12 14:51:20 (pid:4829) Shadow pid 11193 for job 40.0 exited with status 4
10/12 14:51:20 (pid:4829) ERROR: Shadow exited with job exception code!
10/12 14:51:21 (pid:4829) Sent ad to central manager for guiot@xxxxxxxxxxxxxx
10/12 14:51:21 (pid:4829) Sent ad to 1 collectors for guiot@xxxxxxxxxxxxxx
10/12 14:51:23 (pid:4829) Starting add_shadow_birthdate(40.0)
10/12 14:51:24 (pid:4829) Started shadow for job 40.0 on "<193.49.27.11:33430>", (shadow pid = 11194)
10/12 14:51:24 (pid:4829) Shadow pid 11194 for job 40.0 exited with status 4
10/12 14:51:24 (pid:4829) ERROR: Shadow exited with job exception code!
10/12 14:51:26 (pid:4829) Sent ad to central manager for guiot@xxxxxxxxxxxxxx
10/12 14:51:26 (pid:4829) Sent ad to 1 collectors for guiot@xxxxxxxxxxxxxx
10/12 14:51:26 (pid:4829) Starting add_shadow_birthdate(40.0)
10/12 14:51:26 (pid:4829) Started shadow for job 40.0 on "<193.49.27.11:33430>", (shadow pid = 11196)
10/12 14:51:26 (pid:4829) Shadow pid 11196 for job 40.0 exited with status 4
10/12 14:51:26 (pid:4829) ERROR: Shadow exited with job exception code!
10/12 14:51:28 (pid:4829) Starting add_shadow_birthdate(40.0)
10/12 14:51:28 (pid:4829) Started shadow for job 40.0 on "<193.49.27.11:33430>", (shadow pid = 11197)
10/12 14:51:28 (pid:4829) Shadow pid 11197 for job 40.0 exited with status 4
10/12 14:51:28 (pid:4829) ERROR: Shadow exited with job exception code!
10/12 14:51:30 (pid:4829) Starting add_shadow_birthdate(40.0)
10/12 14:51:30 (pid:4829) Started shadow for job 40.0 on "<193.49.27.11:33430>", (shadow pid = 11200)
10/12 14:51:30 (pid:4829) Shadow pid 11200 for job 40.0 exited with status 4
10/12 14:51:30 (pid:4829) ERROR: Shadow exited with job exception code!
10/12 14:51:30 (pid:4829) Match for cluster 40 has had 5 shadow exceptions, relinquishing.
10/12 14:51:30 (pid:4829) Sent RELEASE_CLAIM to startd on <193.49.27.11:33430>
10/12 14:51:30 (pid:4829) Match record (<193.49.27.11:33430>, 40, 0) deleted
10/12 14:51:30 (pid:4829) DaemonCore: Command received via TCP from host <193.49.27.11:33607>
10/12 14:51:30 (pid:4829) DaemonCore: received command 443 (VACATE_SERVICE), calling handler (vacate_service)
10/12 14:51:30 (pid:4829) Got VACATE_SERVICE from <193.49.27.11:33607>
10/12 14:51:31 (pid:4829) Sent ad to central manager for guiot@xxxxxxxxxxxxxx
10/12 14:51:31 (pid:4829) Sent ad to 1 collectors for guiot@xxxxxxxxxxxxxx
10/12 14:51:50 (pid:4829) DaemonCore: Command received via TCP from host <193.49.27.24:51440>
10/12 14:51:50 (pid:4829) DaemonCore: received command 478 (ACT_ON_JOBS), calling handler (actOnJobs)
10/12 14:51:50 (pid:4829) UserLog::initialize: open("/ibpc/chagall/guiot/tmp/TestCondor/env.log") failed - errno 13 (Permission denied)
10/12 14:51:50 (pid:4829) WARNING: Invalid user log file specified: /ibpc/chagall/guiot/tmp/TestCondor/env.log
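To summarize a log like the one above, the repeated shadow exits can be counted per job and exit status. This is only a rough sketch in Python: the regular expression is written against the exact wording of the lines shown here, the function name count_shadow_exits is made up for illustration, and other Condor versions may format these messages differently.

```python
import re
from collections import Counter

# Sample lines copied from the ScheddLog excerpt above.
LOG = """\
10/12 14:51:20 (pid:4829) Shadow pid 11193 for job 40.0 exited with status 4
10/12 14:51:24 (pid:4829) Shadow pid 11194 for job 40.0 exited with status 4
10/12 14:51:26 (pid:4829) Shadow pid 11196 for job 40.0 exited with status 4
10/12 14:51:28 (pid:4829) Shadow pid 11197 for job 40.0 exited with status 4
10/12 14:51:30 (pid:4829) Shadow pid 11200 for job 40.0 exited with status 4
10/12 14:51:30 (pid:4829) Match for cluster 40 has had 5 shadow exceptions, relinquishing.
"""

# Pattern matches the wording seen in this log only (an assumption about format).
SHADOW_EXIT = re.compile(r"Shadow pid (\d+) for job (\S+) exited with status (\d+)")


def count_shadow_exits(text):
    """Count shadow exits, keyed by (job id, exit status)."""
    counts = Counter()
    for line in text.splitlines():
        match = SHADOW_EXIT.search(line)
        if match:
            job_id = match.group(2)
            status = int(match.group(3))
            counts[(job_id, status)] += 1
    return counts


for (job_id, status), n in count_shadow_exits(LOG).items():
    print(f"job {job_id}: {n} shadow exits with status {status}")
```

For the excerpt above this gives five exits with status 4 for job 40.0, which matches the "5 shadow exceptions" line in the log.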
Any help would be greatly appreciated.
Nicolas GUIOT
-----------------------------------------------
CNRS - UPR 9080 : Laboratoire de Biochimie Theorique
Institut de Biologie Physico-Chimique
13 rue Pierre et Marie Curie
75005 PARIS - FRANCE
Tel : +33 158 41 51 70
Fax : +33 158 41 50 26
------------------------------------------------