Condor gurus and the rest of us,

I was able to run the job successfully by adding the line GETENV = TRUE to the description file. That was the key to solving this problem, and it shows that my Condor nodes and central manager are configured correctly. I am now moving on to running production batch files to finalize my testing phase. I hit a new error, but I will create a new post to troubleshoot that issue.

Thank you to those who gave their input here,
Alex
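
Editorial note: in a Condor submit description file, getenv = True copies the submitter's environment into the job's environment, which is why it can fix jobs that work locally but fail under Condor. One way to see what the job is missing without it is to compare the output of `set` run locally with the output of `set` run as a Condor job (the myjob.bat test Ian describes further down). The following is only a minimal Python sketch of that comparison; the function names and the sample dumps are made up for illustration and are not part of Condor.

```python
def parse_set_output(text):
    """Parse the NAME=VALUE lines printed by the Windows `set` command."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:
            # Windows variable names are case-insensitive, so normalize them.
            env[name.strip().upper()] = value
    return env


def missing_in_job(local_text, job_text):
    """Return the variable names defined locally but absent in the job."""
    local_env = parse_set_output(local_text)
    job_env = parse_set_output(job_text)
    return sorted(set(local_env) - set(job_env))


if __name__ == "__main__":
    # Hypothetical sample dumps; real ones would be read from the saved
    # output of `set` on the submit machine and from the job's output file.
    local_dump = "Path=C:\\WINDOWS\\system32\nSystemRoot=C:\\WINDOWS\nTEMP=C:\\Temp\n"
    job_dump = "TEMP=C:\\condor\\execute\\dir_1\n"
    print(missing_in_job(local_dump, job_dump))
```

Any variable this reports is a candidate for exporting explicitly, if copying the whole environment with getenv is more than you want.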

From:
condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Ian Chesal

In general, the environment your job runs with via Condor is considerably reduced. Condor sets up the scantest environment possible for your job before executing it. Try running a job that does:

    myjob.bat:
    set

to dump the remote environment for the job. You might need to export pieces of your submitting environment to get a specific command to work.

On Windows, Condor also runs your job in a fairly restricted desktop. The account under which it runs the job, assuming you haven’t set up Condor’s credential daemon, is a temporary local user account on the machine with limited rights. The systeminfo.exe command may require higher privileges than the account under which the remote job is executing has.

Hope that helps in debugging the problem.

- Ian

From:
condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Alas, Alex [FEDI]

Dave,

I tried that. When you run the batch file or the systeminfo executable directly, it runs without any problems; it is only when you try to run it through Condor that you see the error message. I will download the depends.exe tool to track down the problem.

So the logs I sent you were not useful at all?

Thanks for your help.
Alex

From:
condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of David Watrous

Hi Alex,

- Dave

On Nov 17, 2008, at 3:32 PM, Alas, Alex [FEDI] wrote:

Dave,

Thanks for your input. I am kind of a newbie here, so I interpreted the execute node as the computer where I ran condor_submit and the scheduler node as the one where the job is intended to run. Am I right or wrong? Please correct me if I am wrong.

This is what the StarterLog on the execute node says:

11/17 13:30:47 ******************************************************
11/17 13:30:47 ** condor_starter (CONDOR_STARTER) STARTING UP
11/17 13:30:47 ** C:\condor\bin\condor_starter.exe
11/17 13:30:47 ** $CondorVersion: 6.8.7 Nov 29 2007 $
11/17 13:30:47 ** $CondorPlatform: INTEL-WINNT50 $
11/17 13:30:47 ** PID = 2504
11/17 13:30:47 ** Log last touched 11/17 13:22:58
11/17 13:30:47 ******************************************************
11/17 13:30:47 Using config source: C:\condor\condor_config
11/17 13:30:47 Using local config sources:
11/17 13:30:47 C:\condor/condor_config.local
11/17 13:30:47 DaemonCore: Command Socket at <1x.xx.xx.x9:3354>
11/17 13:30:47 Setting resource limits not implemented!
11/17 13:30:47 Communicating with shadow <1x.xx.xx.x4:2784>
11/17 13:30:47 Submitting machine is "theisman.domain.com"
11/17 13:30:47 File transfer completed successfully.
11/17 13:30:48 Starting a VANILLA universe job with ID: 23.0
11/17 13:30:48 IWD: C:\condor/execute\dir_2504
11/17 13:30:48 Output file: C:\condor/execute\dir_2504\Batch4testv3.out.0
11/17 13:30:48 Error file: C:\condor/execute\dir_2504\Batch4testv3.err.0
11/17 13:30:48 Renice expr "10" evaluated to 10
11/17 13:30:48 About to exec C:\WINDOWS\system32\cmd.exe /Q /C condor_exec.bat
11/17 13:30:48 Create_Process succeeded, pid=4080
11/17 13:30:48 Process exited, pid=4080, status=-1073741515
11/17 13:30:48 Got SIGQUIT. Performing fast shutdown.
11/17 13:30:48 ShutdownFast all jobs.
11/17 13:30:48 **** condor_starter (condor_STARTER) EXITING WITH STATUS 0
11/17 14:34:32 *************************************************

The scheduler node’s ShadowLog is the following:

11/17 13:30:38 Using config source: C:\Condor\condor_config
11/17 13:30:38 Using local config sources:
11/17 13:30:38 C:\Condor/condor_config.local
11/17 13:30:38 DaemonCore: Command Socket at <1x.xx.xx.x4:2784>
11/17 13:30:38 Initializing a VANILLA shadow for job 23.0
11/17 13:30:38 (23.0) (1012): Request to run on <1x.xx.xx.9:1104> was ACCEPTED
11/17 13:30:40 (23.0) (1012): ZKM: setting default map to (null)
11/17 13:30:40 (23.0) (1012): Job 23.0 terminated: exited with status -1073741515
11/17 13:30:40 (23.0) (1012): **** condor_shadow (condor_SHADOW) EXITING WITH STATUS 100

Thanks for your input, and I hope these logs can help,
Alex

From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of David Watrous

Alex,

What do your StarterLog on the execute node and the ShadowLog on the scheduler say about these jobs as they run? Do you get an "ERROR: Provider load failure" message in the output/error file from systeminfo?

If you don't see anything interesting in those logs, I'm assuming that your scheduler and execute nodes all have the referenced directories, so what happens when your batch file just does an 'echo "Hello World"'?

I hope this helps! Good luck,

- Dave

--
Cycle Computing, LLC
Leader in Condor Grid Solutions
Enterprise Condor Support and Management Tools

On Nov 17, 2008, at 2:34 PM, Alas,
Alex [FEDI] wrote:

I am trying to run a job, but it fails, exiting with the code -1073741515. I know that if the job succeeds it will exit with code 0 and that any non-zero code means it failed, but I don’t know whether this code means anything specific or is a generic error code.

My description file is the following:

#########################################################################################
# Description file for Batch File for TESTING purposes
# Prepared by Alex Alas
##########################################################################################
universe = vanilla
requirements = (Arch == "INTEL" && OpSys == "WINNT51")
initialdir = c:\condor\execute_bk
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = c:\windows\system32\systeminfo.exe
run_as_owner = true
executable = Batch4testv2.bat
output = Batch4testv3.out.$(Process)
error = Batch4testv3.err.$(Process)
log = Batch4testv3.log
queue 1

The batch file I am running is as follows:

> systeminfo.exe

Any input is much appreciated.

Respectfully,
Alex Alas
Systems Administrator

--
David Watrous
main: 888.292.5320
Cycle Computing, LLC
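
Editorial note on the exit status asked about in the original question: Windows reports a process exit code as a 32-bit value, and the StarterLog and ShadowLog above print it as a signed integer. Reinterpreting -1073741515 as unsigned gives 0xC0000135, which is the NTSTATUS value STATUS_DLL_NOT_FOUND. That would be consistent with the reduced job environment described in the replies, although the thread does not say which DLL was missing. The conversion can be sketched in Python as below; the function name and the small lookup table are illustrative assumptions, and the table is deliberately incomplete.

```python
# A tiny, incomplete table of well-known NTSTATUS values (illustrative only).
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
}


def describe_exit_status(exit_status):
    """Show a signed 32-bit exit status as unsigned hex, with a name if known."""
    code = exit_status & 0xFFFFFFFF  # reinterpret as unsigned 32-bit
    name = KNOWN_NTSTATUS.get(code, "not in this table")
    return "0x{:08X} ({})".format(code, name)


if __name__ == "__main__":
    # Value taken from the StarterLog and ShadowLog in this thread.
    print(describe_exit_status(-1073741515))
```

Small positive exit codes usually come from the program itself; values of the form 0xCxxxxxxx like this one generally indicate that Windows terminated or failed to start the process.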