
Re: [Condor-users] my_popen and condor_shadow.std.exe don't exist



Hi,

>> When I try to run an MPI job, the job appears to remain idle in the
>> queue. An error message appears in the ScheddLog on the central
>> manager, stating that "Shadow" exited with status 4. Does anyone know
>> what "status 4" means?
>
>Your shadow is exiting with an exception.  We cannot tell more 
>about what is going on without at least your submit file. 
>Having the shadow log may help too.

My submit file (called "run_condor_MPICH1.no_files") is:

Universe                = parallel
run_as_owner = true
Executable              = run_condor_MPICH1.no_files
arguments               = cpilog_minimal.exe
machine_count           = 1 
should_transfer_files   = yes
when_to_transfer_output = on_exit
transfer_input_files    = \\indplly1\userdirs\JeffreySJ\Condor_Jobs\cpilog_minimal.exe
Queue

The MPI executable, "cpilog_minimal.exe", is on a Samba share that is
accessible from both the central manager and the execute machine. The
executable does not read any input files or write any output files.
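
A minimal sketch of how the accessibility claim above could be checked on
each machine, assuming Python is installed there. The helper name
check_readable is hypothetical. It only tests the account that runs the
script, which may differ from the account the job runs under.

```python
import os

def check_readable(path):
    """Return True if path is a regular file that can be opened for reading."""
    if not os.path.isfile(path):
        return False
    try:
        with open(path, "rb") as f:
            f.read(1)
        return True
    except OSError:
        return False

if __name__ == "__main__":
    # UNC path copied from the submit file above
    exe = r"\\indplly1\userdirs\JeffreySJ\Condor_Jobs\cpilog_minimal.exe"
    print(exe, "readable:", check_readable(exe))
```

Running it on both the central manager and the execute machine shows
whether each one can actually open the file.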

The shadow log on the central manager contains:
3/20 11:16:40 Using config source: D:\condor-6.8.4\condor_config
3/20 11:16:40 Using local config sources: 
3/20 11:16:40    D:\condor-6.8.4/condor_config.local
3/20 11:16:40 DaemonCore: Command Socket at <131.242.63.124:1298>
3/20 11:16:40 Initializing a PARALLEL shadow for job 12.0
3/20 11:16:41 (12.0) (3084): Request to run on <131.242.63.162:3789> was ACCEPTED
3/20 11:16:42 (12.0) (3084): ERROR "Error from starter on nes15300.lands.resnet.qg: Create_Process(D:\condor-6.8.4\execute\dir_1824\condor_exec.exe,cpilog_minimal.exe, ...) failed" at line 643 in file ..\src\condor_shadow.V6.1\pseudo_ops.C
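
A minimal sketch for pulling the ERROR entries out of a shadow log like the
one above. It assumes each entry begins with an "M/D HH:MM:SS" timestamp and
treats any line without one as a continuation of the previous entry, as can
happen when a log is pasted into mail. The function names are hypothetical,
and continuation lines are joined with a single space, so a wrap in the
middle of a path would leave a stray space.

```python
import re

# Assumed entry prefix, e.g. "3/20 11:16:42 "
TIMESTAMP = re.compile(r"^\d{1,2}/\d{1,2} \d{2}:\d{2}:\d{2} ")

def join_wrapped(lines):
    """Merge continuation lines into the timestamped entry that precedes them."""
    entries = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        if TIMESTAMP.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line.strip()
    return entries

def error_entries(lines):
    """Return only the joined entries that contain the word ERROR."""
    return [entry for entry in join_wrapped(lines) if "ERROR" in entry]
```

For example, error_entries(open(path)) would list each ERROR entry on one
line, where path is wherever the shadow log is stored on that machine.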

The shadow log on the execute machine contains the following (I think this
is from when Condor was started on that machine, not from when the MPI job
tried to start):
3/20 10:45:53 ******************************************************
3/20 10:45:53 ** condor_shadow (CONDOR_SHADOW) STARTING UP
3/20 10:45:53 ** D:\condor-6.8.4\bin\condor_shadow.exe
3/20 10:45:53 ** $CondorVersion: 6.8.4 Feb 1 2007 $
3/20 10:45:53 ** $CondorPlatform: INTEL-WINNT50 $
3/20 10:45:53 ** PID = 1328
3/20 10:45:53 ** Log last touched 3/19 14:39:11
3/20 10:45:53 ******************************************************
3/20 10:45:53 Using config source: D:\condor-6.8.4\condor_config
3/20 10:45:53 Using local config sources: 
3/20 10:45:53 D:\condor-6.8.4/condor_config.local
3/20 10:45:53 DaemonCore: Command Socket at <131.242.63.162:3621>
3/20 10:45:53 ERROR: missing command-line arguments!
3/20 10:45:53 Usage: condor_shadow cluster.proc schedd_addr file_name
3/20 10:45:53 argv[0] = condor_shadow

Cheers
steve
