Hello,
I think I'm almost there!
I'm trying to run a simple script:
echo "It Works" >> /tmp/thisshouldwork.txt
What happens:
- it goes to the queue
- it is removed from the queue
- it does not run (the log files are empty and the file in /tmp is
not created); it seems to fall into some black hole... :(
The only place I could find that shows any error is the
ShadowLog file, which gives me:
12/01/17 18:51:25 (?.?) (74915):******* Standard Shadow
starting up *******
12/01/17 18:51:25 (?.?) (74915):** $CondorVersion: 8.4.12 Jul
06 2017 BuildID: 409562 $
12/01/17 18:51:25 (?.?) (74915):** $CondorPlatform:
x86_64_Ubuntu14 $
12/01/17 18:51:25 (?.?)
(74915):*******************************************
12/01/17 18:51:25 (?.?) (74915):uid=0, euid=122, gid=0,
egid=131
12/01/17 18:51:25 (?.?) (74915):Hostname =
"<xxx.xxx.xxx.xxx:17345?addrs=xxx.xxx.xxx.xxx-17345>",
Job = 37.0
12/01/17 18:51:25 (37.0) (74915):Requesting Primary Starter
12/01/17 18:51:25 (37.0) (74915):Shadow: Request to run a job
was ACCEPTED
12/01/17 18:51:25 (37.0) (74915):Shadow: RSC_SOCK connected,
fd = 17
12/01/17 18:51:25 (37.0) (74915):Shadow: CLIENT_LOG connected,
fd = 18
12/01/17 18:51:25 (37.0) (74915):My_Filesystem_Domain = "my
domain"
12/01/17 18:51:25 (37.0) (74915):My_UID_Domain = "my domain"
12/01/17 18:51:25 (37.0) (74915):Can't get address for
checkpoint server host (NULL): No such file or directory
12/01/17 18:51:25 (37.0) (74915):    Entering
pseudo_get_file_stream
12/01/17 18:51:25 (37.0) (74915):    file =
"/var/lib/condor/spool/37/cluster37.ickpt.subproc0"
12/01/17 18:51:25 (37.0) (74915):Created TCP listen socket
<xxx.xxx.xxx.xxx:41412>
12/01/17 18:51:25 (37.0) (74915):Shadow: Job 37.0 exited,
termsig = 0, coredump = 128, retcode = 0
12/01/17 18:51:25 (37.0) (74915):user_time = 1 ticks
12/01/17 18:51:25 (37.0) (74915):sys_time = 0 ticks
12/01/17 18:51:25 (37.0) (74915):Shadow: Cannot notify
user( Condor Job 37.0, tavares, w )
12/01/17 18:51:25 (37.0) (74915):Static Policy: removing job
because OnExitRemove has become true
12/01/17 18:51:25 (37.0) (74915):********** Shadow
Exiting(102) **********
Just to keep it simple, I'd rather avoid using the
checkpoint server. Is that possible?
I'm a little clueless now... can you give me any help with
this?
Thank you!!!!
Roberto