Darryl,
Oops. I overlooked your statement that you saw no errors in the grid
logs. For anyone to help you, I think you need to elaborate on, "The
job runs but gets rejected for some unknown reason." How are you seeing
that the job is rejected? Do you see messages in the jobmanager logs
indicating the starting and eventual exiting of the job?
--Dan Bradley
Dan Bradley wrote:
Darryl,
Your latest test indicates that Condor's "standard universe" is
working fine, but it does not indicate that Condor-G submissions to
your gatekeeper's fork jobmanager are working. As Erik suggested, I'd
recommend looking in the gridmanager log file and the jobmanager logfile.
--Dan Bradley
Darryl Cook wrote:
Thanks for the input, Erik... I don't see anything in the logs that
looks like an error. So I decided to try to find a different
example to submit and found one on the Condor site for Fibonacci
numbers. It's basically a C program which gets compiled with:
condor_compile cc fibonacci.c -o fibonacci
and then submitted with:
##############
# fibonacci.sdf - Fibonacci demo for condor - submit description file
##############
Executable = fibonacci
Output = fib.out
Log = foo.log
Queue 1
This works like a charm! So I guess the question is: why wouldn't an
executable of /bin/ls work? If the fib example works, would you conclude
that Condor and Globus are set up correctly?
Sorry for my ignorance, but I am simply an admin given the job of getting
this up and working, and I don't know much about either Globus or Condor
except for what I have been reading. Thanks for everyone's help.
darryl
Erik Paulson wrote:
On Thu, Feb 10, 2005 at 02:10:42PM -0500, Darryl Cook wrote:
Ok, I have re-installed Condor on machines that I have and am
*really* close to getting this thing to work now....but still
having a couple of problems.
I am using two machines: grid0 and node1.
I installed the central manager on node1.
I submit a job on grid0 to node1 with the following:
executable=/bin/ls
transfer_executable = false
globusscheduler = node1.cs.appstate.edu/jobmanager-fork
universe=globus
output=test1.out
log=test1.log
error=test1.error
requirements=true
queue
The job runs but gets rejected for some unknown reason. If I do a
condor_q -analyze I get the following:
1 reject your job because of their own requirements.
There is no matchmaking going on in this case, so you don't need to
have a 'requirements', and there is nothing that condor_q -analyze can
tell you.
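As a sketch only, the submit file above with the 'requirements' line
dropped would look like this. Every other line is unchanged from the
original, and removing the line is not by itself expected to explain or
fix the failure:

```
executable=/bin/ls
transfer_executable = false
globusscheduler = node1.cs.appstate.edu/jobmanager-fork
universe=globus
output=test1.out
log=test1.log
error=test1.error
queue
```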
<...>
PID: 2911 -- Notice: 5: Authenticated globus user: /O=Grid/OU=GlobusTest/OU=simpleCA-grid0.cs.appstate.edu/OU=cs.appstate.edu/CN=Darryl Cook
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 0: GRID_SECURITY_HTTP_BODY_FD=6
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 5: Requested service: jobmanager-fork
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 5: Authorized as local user: dlc
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 5: Authorized as local uid: 500
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 5: and local gid: 100
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 0: executing /usr/local/globus/libexec/globus-job-manager
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 0: GRID_SECURITY_CONTEXT_FD=9
TIME: Thu Feb 10 14:06:05 2005
So it sees my request but it gets terminated.....
The next place to look is in:
1. The Gridmanager logfile (probably /tmp/GridmanagerLog.<username>)
on your submit machine
2. The job-manager logfile on node1.cs.appstate.edu, in your home
directory.
You may need to change the
$(GLOBUS_LOCATION)/etc/globus-job-manager.conf
file; you probably want the -save-logfile option set to "always".
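For illustration, the relevant line in that file might read as follows.
This is a sketch: it assumes the file lists one job-manager option per
line, and any other lines already in your file should be left as they
are.

```
-save-logfile always
```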
-Erik
_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users