Re: [HTCondor-users] submitted jobs are not running
- Date: Thu, 10 Mar 2016 19:49:35 +0100
- From: Labounek René <xlabou01@xxxxxxxxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] submitted jobs are not running
Quoting Labounek René <xlabou01@xxxxxxxxxxxxxxxxxx>:
Dear condor users,
I have submitted jobs but they are still held and not running.
The condor_status output looks OK:
labounek@emperor:~$ condor_status
Name               OpSys  Arch    State      Activity  LoadAv  Mem   ActvtyTime
slot10@xxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      1.000   2682  0+00:00:23
slot11@xxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      1.000   2682  0+00:00:24
slot12@xxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      6.320   2682  0+00:00:25
slot1@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      1.000   2682  0+00:00:04
slot2@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      1.000   2682  0+00:00:23
slot3@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      1.000   2682  0+00:00:24
slot4@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      1.000   2682  0+00:00:25
slot5@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      1.000   2682  0+00:00:26
slot6@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      1.000   2682  0+00:00:27
slot7@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      1.000   2682  0+00:00:28
slot8@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      1.000   2682  0+00:00:21
slot9@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      1.000   2682  0+00:00:22

              Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill
X86_64/LINUX     12      0        0         12        0           0         0
       Total     12      0        0         12        0           0         0
labounek@emperor:~$
The condor_submit command looked like this:
condor_submit slice_0007.condor
The file contains this text:
Executable = /home/labounek/test/dti.bedpostX/condor_logs/slice_0007.sh
Universe = vanilla
output = /home/labounek/test/dti.bedpostX/condor_logs/slice_0007.out
error = /home/labounek/test/dti.bedpostX/condor_logs/slice_0007.error
Log = /home/labounek/test/dti.bedpostX/condor_logs/slice_0007.log
Queue
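Since the queue holds 40 slices that differ only in their four-digit index, the per-slice submit description could be rendered from one template. This is a minimal sketch, assuming every slice follows the same pattern as slice_0007; the helper name submit_description and the LOG_DIR constant are hypothetical and only illustrate the file layout shown above.

```python
from pathlib import PurePosixPath

# Directory taken from the submit file above; assumed to be the same for every slice.
LOG_DIR = PurePosixPath("/home/labounek/test/dti.bedpostX/condor_logs")


def submit_description(slice_index: int) -> str:
    """Render the submit description for one slice (hypothetical helper)."""
    stem = f"slice_{slice_index:04d}"  # e.g. 7 -> "slice_0007"
    lines = [
        f"Executable = {LOG_DIR / (stem + '.sh')}",
        "Universe = vanilla",
        f"output = {LOG_DIR / (stem + '.out')}",
        f"error = {LOG_DIR / (stem + '.error')}",
        f"Log = {LOG_DIR / (stem + '.log')}",
        "Queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the description for slice 7 so it can be compared with the file above.
    print(submit_description(7), end="")
```

Printing the result for index 7 makes it easy to compare against slice_0007.condor line by line.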
The file slice_0007.sh contains one command:
/usr/share/fsl/5.0/bin/bedpostx_single_slice.sh /home/labounek/test/dti 7 --nf=3 --fudge=1 --bi=1000 --nj=1250 --se=25 --model=2 --cnonlinear
I think everything should be OK, but the jobs are stuck. Here is the
condor_q output:
Sorry, here it is:
labounek@emperor:~$ condor_q
-- Schedd: emperor.fnol.loc : <172.19.37.11:34081?...
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
1.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0000.sh
2.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0001.sh
3.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0002.sh
4.0 labounek 3/10 19:24 0+00:00:01 H 0 0.0 slice_0003.sh
5.0 labounek 3/10 19:24 0+00:00:01 H 0 0.0 slice_0004.sh
6.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0005.sh
7.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0006.sh
8.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0007.sh
9.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0008.sh
10.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0009.sh
11.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0010.sh
12.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0011.sh
13.0 labounek 3/10 19:24 0+00:00:01 H 0 0.0 slice_0012.sh
14.0 labounek 3/10 19:24 0+00:00:01 H 0 0.0 slice_0013.sh
15.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0014.sh
16.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0015.sh
17.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0016.sh
18.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0017.sh
19.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0018.sh
20.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0019.sh
21.0 labounek 3/10 19:24 0+00:00:01 H 0 0.0 slice_0020.sh
22.0 labounek 3/10 19:24 0+00:00:01 H 0 0.0 slice_0021.sh
23.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0022.sh
24.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0023.sh
25.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0024.sh
26.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0025.sh
27.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0026.sh
28.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0027.sh
29.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0028.sh
30.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0029.sh
31.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0030.sh
32.0 labounek 3/10 19:24 0+00:00:01 H 0 0.0 slice_0031.sh
33.0 labounek 3/10 19:24 0+00:00:01 H 0 0.0 slice_0032.sh
34.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0033.sh
35.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0034.sh
36.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0035.sh
37.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0036.sh
38.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0037.sh
39.0 labounek 3/10 19:24 0+00:00:01 H 0 0.0 slice_0038.sh
40.0 labounek 3/10 19:24 0+00:00:01 H 0 0.0 slice_0039.sh
40 jobs; 0 completed, 0 removed, 0 idle, 0 running, 40 held, 0 suspended
labounek@emperor:~$
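The ST column in the listing above can be tallied with a few lines of Python when the queue is too long to scan by eye. This is only a sketch: the function name count_job_states is hypothetical, and it assumes the column layout shown here, where ST is the sixth whitespace-separated field of each job row. The listing does not include the reason for the hold; condor_q -hold is the usual way to display it.

```python
from collections import Counter


def count_job_states(condor_q_text: str) -> Counter:
    """Tally the ST column of condor_q output (hypothetical helper)."""
    states = Counter()
    for line in condor_q_text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue
        # Job rows start with a "cluster.proc" ID such as "8.0";
        # the header, the Schedd banner and the summary line do not.
        cluster, dot, proc = fields[0].partition(".")
        if not (dot and cluster.isdigit() and proc.isdigit()):
            continue
        # Fields: ID OWNER DATE TIME RUN_TIME ST PRI SIZE CMD
        states[fields[5]] += 1
    return states


# Two rows copied from the listing above, plus its header and summary line.
SAMPLE = """\
 ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
 8.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0007.sh
 9.0 labounek 3/10 19:24 0+00:00:00 H 0 0.0 slice_0008.sh
40 jobs; 0 completed, 0 removed, 0 idle, 0 running, 40 held, 0 suspended
"""

if __name__ == "__main__":
    print(count_job_states(SAMPLE))
```

A count under H that matches the "held" figure in the summary line confirms that none of the jobs ever left the held state.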
Regards,
Rene Labounek