Newbie alert!
I have installed Condor 6.6.6 on a Windows 2000 Professional box as the master. I have run a few example jobs through the Condor interface and they were scheduled and ran fine.
Now I have built a Red Hat 9 box and am trying to add it to the pool. The condor_status command shows:
C:\Condor>condor_status
Name           OpSys    Arch   State      Activity  LoadAv  Mem  ActvtyTime
Mike_RH.nuvie  LINUX    INTEL  Owner      Idle      0.000    91  0+01:17:58
elpin.nuview.  WINNT50  INTEL  Unclaimed  Idle      0.040   384  0+00:28:58

               Machines  Owner  Claimed  Unclaimed  Matched  Preempting
  INTEL/LINUX         1      1        0          0        0           0
INTEL/WINNT50         1      0        0          1        0           0
        Total         2      1        0          1        0           0
and jobs still run correctly for Windows. But my job that I have built for Linux queues itself but does not run:
C:\Condor>condor_q
-- Submitter: elpin.nuview.com : <192.168.1.218:4789> : elpin.nuview.com
 ID   OWNER  SUBMITTED   RUN_TIME    ST  PRI  SIZE  CMD
 3.0  mike   8/3 14:28   0+00:00:00  I   0    0.0   linux.bat
 4.0  mike   8/3 14:35   0+00:00:00  I   0    0.0   linux.bat
2 jobs; 2 idle, 0 running, 0 held
here is the output from an analyze:
C:\Condor\examples\printname>condor_q -analyze 3.0
-- Submitter: elpin.nuview.com : <192.168.1.218:4789> : elpin.nuview.com
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
---
003.000: Run analysis summary. Of 2 machines,
1 are rejected by your job's requirements
1 reject your job because of their own requirements
0 match, but are serving users with a better priority in the pool
0 match, but reject the job for unknown reasons
0 match, but will not currently preempt their existing job
0 are available to run your job
No successful match recorded.
Last failed match: Tue Aug 03 15:00:04 2004
Reason for last match failure: no match found
and I notice the following in the output of condor_q -long:
Requirements = (OpSys == "LINUX" && Arch == "INTEL") && (Disk >= DiskUsage) && ((Memory * 1024) >= ImageSize) && (HasFileTransfer)
which were not the requirements that I specified (I only had Requirements = (OpSys == "LINUX" && Arch == "INTEL"))...
And the final piece of the puzzle seems to be:
C:\Condor>condor_status -available
Name           OpSys    Arch   State      Activity  LoadAv  Mem  ActvtyTime
elpin.nuview.  WINNT50  INTEL  Unclaimed  Idle      0.040   384  0+00:38:58

               Machines  Owner  Claimed  Unclaimed  Matched  Preempting
INTEL/WINNT50         1      0        0          1        0           0
        Total         1      0        0          1        0           0
which implies to me that the Linux box is not available for running jobs at all!
So, if you wouldn't mind helping a rank beginner (in both Condor AND Linux):
1) Any guesses or things to look for on why the Linux box won't play?
2) Are the jobs not running because the Linux box is not participating, or is there more wrong?
3) Where are the other "requirements" coming from? I didn't put them in the submit file!
4) The ultimate goal of this experiment is to determine if Condor and DAG will give us the job submission/synch control that we need to kick off a "job" which causes other jobs to run (across multiple platforms at once) with a final job requiring that all previous jobs completed normally.
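To be clear about the behaviour I am after in 4), here is a small Python sketch. It is not DAG syntax and says nothing about how Condor does it; the job names are hypothetical. One kickoff job fans out to per-platform jobs, and the final job becomes runnable only once every job before it has exited with status 0.

```python
# Hypothetical job names; each job maps to the jobs it depends on.
parents = {
    "kickoff": [],
    "win_job": ["kickoff"],
    "linux_job": ["kickoff"],
    "final": ["win_job", "linux_job"],
}


def runnable(parents, exit_codes):
    """Jobs that have not run yet and whose parents all exited with status 0."""
    return sorted(
        job for job, deps in parents.items()
        if job not in exit_codes
        and all(exit_codes.get(dep) == 0 for dep in deps)
    )


print(runnable(parents, {}))
print(runnable(parents, {"kickoff": 0}))
print(runnable(parents, {"kickoff": 0, "win_job": 0, "linux_job": 1}))
print(runnable(parents, {"kickoff": 0, "win_job": 0, "linux_job": 0}))
```

The third call is the case I care about most: one platform job failed, so the final job must not start.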
Any help would be appreciated. TIA!