
Re: [Condor-users] SCHEDD not running right on upgraded CE with Condor 7.6.6



On Wed, 4 Apr 2012, Steven Lo wrote:

On 04/03/2012 07:33 PM, Alain Roy wrote:
On Apr 3, 2012, at 9:19 PM, Steven Lo wrote:
We use Rocks to install Condor RPM.
There's a ROCKS roll to install Condor, isn't there? I love the Condor RPM, but take the easy route if you can. :)
We are not using the Rocks roll.  We basically put the Condor RPM in the
distribution and have Rocks install it for us as a package, just like the
others.

We have the following line in /etc/sysconfig/condor to point to the system wide
configuration file:
CONDOR_CONFIG="/share/apps/condor/etc/condor_config_7.6.6"
And it's also in the environment when you run condor_q? The daemons and the tools have to read the same configuration files. If they don't, condor_q and the other tools will fail in the way that you're seeing.
That's it.  It was the environment variable issue.  Setting CONDOR_CONFIG
correctly fixed the problem.
You could also make /etc/condor/condor_config be a symlink
to the above location and that would work too.

Steve
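
To illustrate the point above about the tools and the daemons needing to read the same configuration file, here is a minimal Python sketch. It is not part of Condor. The function names are hypothetical, and the fallback search order (CONDOR_CONFIG first, then /etc/condor/condor_config, then /usr/local/etc/condor_config) is an assumption based on the order the manual describes for Unix; check it against your version. The ~condor/condor_config fallback is left out for brevity.

```python
import os

# Assumed fallback locations, tried only when CONDOR_CONFIG is not set.
DEFAULT_LOCATIONS = [
    "/etc/condor/condor_config",
    "/usr/local/etc/condor_config",
]


def sysconfig_condor_config(text):
    """Extract the CONDOR_CONFIG value from the text of /etc/sysconfig/condor."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONDOR_CONFIG="):
            # Drop the variable name and any surrounding quotes.
            return line.split("=", 1)[1].strip().strip("\"'")
    return None


def resolve_condor_config(environ, exists=os.path.exists):
    """Guess which configuration file a tool started from this environment would read."""
    from_env = environ.get("CONDOR_CONFIG")
    if from_env:
        return from_env
    for candidate in DEFAULT_LOCATIONS:
        if exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    sysconfig_path = "/etc/sysconfig/condor"
    daemon_side = None
    if os.path.exists(sysconfig_path):
        with open(sysconfig_path) as handle:
            daemon_side = sysconfig_condor_config(handle.read())
    tool_side = resolve_condor_config(os.environ)
    print("daemons (sysconfig):", daemon_side)
    print("tools (this shell): ", tool_side)
    if daemon_side and daemon_side != tool_side:
        print("Mismatch: export CONDOR_CONFIG or symlink /etc/condor/condor_config")
```

Run from the same shell in which condor_q is failing, a mismatch between the two printed paths would point at the environment variable problem described above.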

We are now able to get output from the condor_q and condor_status commands.

Thanks.


The next problem:

We see that a bunch of jobs have been scheduled, but they are not executing:

# cd /wntmp/home
# ls
alice        uscms0179  uscms0604  uscms1029  uscms1454  uscms1879  uscms2304
cdf          uscms0180  uscms0605  uscms1030  uscms1455  uscms1880
   .
   .
   .


# condor_status

Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime
slot1@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     1.000   982  0+04:05:04
slot2@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.760   982  0+04:05:05
slot3@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.000   982  0+04:05:06
slot4@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.000   982  0+04:05:07
slot5@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.000   982  0+04:05:08
slot6@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.000   982  0+04:05:09
slot7@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.000   982  0+04:05:10
slot8@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     0.000   982  0+04:05:03
slot1@compute-10-5 LINUX      X86_64 Unclaimed Idle     0.000  1963  6+07:14:53
slot2@compute-10-5 LINUX      X86_64 Unclaimed Idle     0.000  1963  6+07:15:19
slot3@compute-10-5 LINUX      X86_64 Unclaimed Idle     0.000  1963  6+07:15:20
slot4@compute-10-5 LINUX      X86_64 Unclaimed Idle     0.000  1963  6+07:15:21
slot10@compute-20- LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:07
slot11@compute-20- LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:08
slot12@compute-20- LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:09
slot1@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.420  4024  0+07:44:43
slot2@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:07
slot3@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:08
slot4@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:09
slot5@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:10
slot6@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:11
slot7@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:12
slot8@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:05
slot9@compute-20-3 LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:06

                    Total Owner Claimed Unclaimed Matched Preempting Backfill
       X86_64/LINUX    24     8       0        16       0          0        0
              Total    24     8       0        16       0          0        0

# condor_q


-- Submitter: cithep252.ultralight.org : <10.3.255.253:48116> : cithep252.ultralight.org
ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD

0 jobs; 0 idle, 0 running, 0 held
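
For longer pools, the slot states in a condor_status listing like the one above can be tallied with a few lines of Python. This is only a sketch: the function name is hypothetical, the sample rows are copied from the listing, and it assumes the default column layout in which State is the fourth whitespace-separated field.

```python
from collections import Counter


def tally_states(status_text):
    """Count slots per State in default-format condor_status output."""
    counts = Counter()
    for line in status_text.splitlines():
        fields = line.split()
        # Slot rows begin with "slotN@host"; header, totals, and wrapped
        # continuation lines do not, so they are skipped.
        if len(fields) >= 4 and fields[0].startswith("slot") and "@" in fields[0]:
            counts[fields[3]] += 1
    return counts


sample = """\
slot1@xxxxxxxxxxxx LINUX      X86_64 Owner     Idle     1.000   982  0+04:05:04
slot1@compute-10-5 LINUX      X86_64 Unclaimed Idle     0.000  1963  6+07:14:53
slot10@compute-20- LINUX      X86_64 Unclaimed Idle     0.000  4024  0+07:45:07
"""

for state, count in sorted(tally_states(sample).items()):
    print(state, count)
```

Because only lines starting with a slot name are counted, the tally also works on output where the mail client has wrapped the ActvtyTime column onto its own line.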


Thanks.

Steven.


-alain

_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/

------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Group Leader.
Lead of FermiCloud project.