Re: [Condor-users] Condor problems


Date: Mon, 14 Feb 2005 11:03:59 -0600
From: Dan Bradley <dan@xxxxxxxxxxxx>
Subject: Re: [Condor-users] Condor problems
Darryl,

Your latest test indicates that Condor's "standard universe" is working fine, but it does not indicate that Condor-G submissions to your gatekeeper's fork jobmanager are working. As Erik suggested, I'd recommend looking in the gridmanager log file and the jobmanager logfile.

--Dan Bradley

Darryl Cook wrote:

Thanks for the input, Erik. I don't see anything in the logs that looks like an error, so I decided to find a different example to submit and found one on the Condor site for Fibonacci numbers. It's basically a C program which gets compiled with:

condor_compile cc fibonacci.c -o fibonacci
and then submitted with:
##############
# fibonacci.sdf - Fibonacci demo for condor - submit description file
##############
Executable = fibonacci

Output = fib.out

Log = foo.log

Queue 1


This works like a charm! So I guess the question is: why wouldn't an executable of
/bin/ls work? If the fib example works, would you conclude that Condor and Globus are set up correctly?


Sorry for my ignorance, but I am simply an admin given the job of getting this up and working, and I know little about either Globus or Condor beyond what I have been reading. Thanks for everyone's help.

darryl

Erik Paulson wrote:

On Thu, Feb 10, 2005 at 02:10:42PM -0500, Darryl Cook wrote:


Ok, I have re-installed Condor on the machines that I have and am *really* close to getting this to work now, but I am still having a couple of problems.

I am using two machines:   grid0  and node1.
I installed the central manager on node1.

I submit a job on grid0 to node1 with the following:

executable=/bin/ls
transfer_executable = false
globusscheduler = node1.cs.appstate.edu/jobmanager-fork
universe=globus
output=test1.out
log=test1.log
error=test1.error
requirements=true
queue

The job runs but gets rejected for some unknown reason. If I run condor_q -analyze I get the following:

1 reject your job because of their own requirements.



There is no matchmaking going on in this case, so you don't need to have a 'requirements', and there is nothing that condor_q -analyze can
tell you.
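
As a sketch of that advice, the same submit description without the requirements line could be written out and submitted as below. The file name test1.sub is an assumption made for illustration; every other value is copied from the submit file quoted above.

```shell
# Write the Condor-G submit description without a 'requirements' line.
# test1.sub is a hypothetical file name; the settings are those quoted above.
cat > test1.sub <<'EOF'
executable = /bin/ls
transfer_executable = false
globusscheduler = node1.cs.appstate.edu/jobmanager-fork
universe = globus
output = test1.out
log = test1.log
error = test1.error
queue
EOF

# Submit only if the Condor tools are on the PATH.
if command -v condor_submit >/dev/null 2>&1; then
    condor_submit test1.sub
fi
```

Dropping the line should not change how the job is routed, since the gatekeeper is named explicitly by globusscheduler; it only removes a setting that has no effect here.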



<...>


PID: 2911 -- Notice: 5: Authenticated globus user: /O=Grid/OU=GlobusTest/OU=simpleCA-grid0.cs.appstate.edu/OU=cs.appstate.edu/CN=Darryl Cook
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 0: GRID_SECURITY_HTTP_BODY_FD=6
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 5: Requested service: jobmanager-fork
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 5: Authorized as local user: dlc
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 5: Authorized as local uid: 500
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 5: and local gid: 100
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 0: executing /usr/local/globus/libexec/globus-job-manager
TIME: Thu Feb 10 14:06:05 2005
PID: 2911 -- Notice: 0: GRID_SECURITY_CONTEXT_FD=9
TIME: Thu Feb 10 14:06:05 2005



So it sees my request but it gets terminated.....


The next place to look is in:

1. The Gridmanager logfile (probably /tmp/GridmanagerLog.<username>) on your
submit machine.
2. The job-manager logfile on node1.cs.appstate.edu, in your home directory.
You may need to change the $(GLOBUS_LOCATION)/etc/globus-job-manager.conf
file - you probably want the -save-logfile option to be set to "always".
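
A rough way to check both places from a shell is sketched below. The helper show_log is hypothetical, and the gram_job_mgr_*.log pattern is an assumption about how the job manager names its logs in your home directory; adjust it to whatever files actually appear there.

```shell
# show_log is a hypothetical helper: print the last 50 lines of a log
# file if it is readable, otherwise report that it is missing.
show_log() {
    if [ -r "$1" ]; then
        tail -n 50 "$1"
    else
        echo "not found: $1"
    fi
}

# 1. On the submit machine: the gridmanager log for the current user.
show_log "/tmp/GridmanagerLog.$(id -un)"

# 2. On node1: job manager logs in the home directory.
#    The file name pattern is an assumption; adjust as needed.
for f in "$HOME"/gram_job_mgr_*.log; do
    show_log "$f"
done
```

Run the first check on the submit machine and the second on node1, since each log lives on a different host.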


-Erik

_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users




