Hello list,
I get a "set_user_priv() failed!" error in versions 6.6.4 and 6.7 of the WINNT
release when running an extremely simple, stripped-down dagman test. In
the archives there’s a message to the effect that this is a known bug
that should have been fixed in release 6.2. Is anyone else running dagman on
WINNT with POST scripts? Is it working for you?
My test runs fine if I comment out the POST script.
Furthermore, the script itself runs fine as a PRE script. The gory
details follow.
First, my dagman script:
job A2B submit_a2b.txt
job B2CD submit_b2cd.txt
script pre A2B c:\windows\system32\cmd.exe /c c:\tmp\condor\echo.bat pre $JOB
script post A2B c:\windows\system32\cmd.exe /c c:\tmp\condor\echo.bat post $JOB
parent A2B child B2CD
Next, my submit scripts:
universe = vanilla
executable = a2b.exe
arguments = data
output = out_a2b.txt
error = err_a2b.txt
log = log.txt
requirements = UidDomain == "naughtydog.com" && FileSystemDomain == "naughtydog.com" && OpSys == "WINNT51" && Disk >= 10 && (Memory * 1024) > 10
transfer_input_files = data.a
Queue 1
---
universe = vanilla
executable = b2cd.exe
arguments = data_0000
output = out_b2cd.txt
error = err_b2cd.txt
log = log.txt
requirements = UidDomain == "naughtydog.com" && FileSystemDomain == "naughtydog.com" && OpSys == "WINNT51" && Disk >= 10 && (Memory * 1024) > 10
transfer_input_files = data_0000.b
Queue 1
Finally, here’s my output file with the error message:
4/30 16:32:18 ******************************************************
4/30 16:32:18 ** condor_scheduniv_exec.188.0 (CONDOR_DAGMAN) STARTING UP
4/30 16:32:18 ** $CondorVersion: 6.7.0 Apr 27 2004 $
4/30 16:32:18 ** $CondorPlatform: INTEL-WINNT40 $
4/30 16:32:18 ** PID = 2136
4/30 16:32:18 ******************************************************
4/30 16:32:18 Using config file: C:\Condor\condor_config
4/30 16:32:18 Using local config files: C:\Condor/condor_config.local
4/30 16:32:18 DaemonCore: Command Socket at <10.0.0.56:3914>
4/30 16:32:18 argv[0] == "condor_scheduniv_exec.188.0"
4/30 16:32:18 argv[1] == "-Debug"
4/30 16:32:18 argv[2] == "3"
4/30 16:32:18 argv[3] == "-Lockfile"
4/30 16:32:18 argv[4] == "dagman.txt.lock"
4/30 16:32:18 argv[5] == "-Condorlog"
4/30 16:32:18 argv[6] == "log.txt"
4/30 16:32:18 argv[7] == "-Dag"
4/30 16:32:18 argv[8] == "dagman.txt"
4/30 16:32:18 argv[9] == "-Rescue"
4/30 16:32:18 argv[10] == "dagman.txt.rescue"
4/30 16:32:18 DAG Lockfile will be written to dagman.txt.lock
4/30 16:32:18 DAG Input file is dagman.txt
4/30 16:32:18 Rescue DAG will be written to dagman.txt.rescue
4/30 16:32:18 Condor log will be written to log.txt, etc.
4/30 16:32:18 Parsing dagman.txt ...
4/30 16:32:18 jobName: A2B
4/30 16:32:18 jobName: A2B
4/30 16:32:18 Dag contains 2 total jobs
4/30 16:32:18 Deleting any older versions of log files...
4/30 16:32:18 Deleting older version of log.txt
4/30 16:32:18 Bootstrapping...
4/30 16:32:18 Number of pre-completed jobs: 0
4/30 16:32:18 Running PRE script of Job A2B...
4/30 16:32:18 Registering condor_event_timer...
4/30 16:32:18 PRE Script of Job A2B completed successfully.
4/30 16:32:19 Submitting Condor Job A2B ...
4/30 16:32:19 submitting: condor_submit -a "dag_node_name = A2B" -a "+DAGManJobID = 188.0" -a "submit_event_notes = DAG Node: $(dag_node_name)" submit_a2b.txt
4/30 16:32:20 assigned Condor ID (189.0)
4/30 16:32:20 Just submitted 1 job this cycle...
4/30 16:32:20 Event: ULOG_SUBMIT for Condor Job A2B (189.0)
4/30 16:32:20 Of 2 nodes total:
4/30 16:32:20  Done     Pre   Queued    Post   Ready   Un-Ready   Failed
4/30 16:32:20   ===     ===      ===     ===     ===        ===      ===
4/30 16:32:20     0       0        1       0       0          1        0
4/30 16:32:45 Event: ULOG_EXECUTE for Condor Job A2B (189.0)
4/30 16:32:45 Event: ULOG_JOB_TERMINATED for Condor Job A2B (189.0)
4/30 16:32:45 Job A2B completed successfully.
4/30 16:32:45 Running POST script of Job A2B...
4/30 16:32:45 Of 2 nodes total:
4/30 16:32:45  Done     Pre   Queued    Post   Ready   Un-Ready   Failed
4/30 16:32:45   ===     ===      ===     ===     ===        ===      ===
4/30 16:32:45     0       0        0       1       0          1        0
4/30 16:32:45 ERROR "set_user_priv() failed!" at line 352 in file ..\src\condor_c++_util\uids.C
Thanks mucho for any help or advice,
Mike