RE: [Condor-users] job requirements
- Date: Tue, 10 Aug 2004 09:38:10 -0500
- From: "Oakley, David" <David.L.Oakley@xxxxxxxxxxx>
- Subject: RE: [Condor-users] job requirements
Hello All,
I am having a similar problem. I submitted a batch of jobs yesterday, and
they started running on several different machines. When I came in this
morning, only my machine was running jobs, even though the other machines
were idle. My machine is running 6.6.6, so I decided to update the other
machines from 6.4.5 to 6.6.6. The idle jobs then found processors and
started running. About an hour later I checked the status again. Once again
only my machine was running jobs, and the other processors were sitting
unclaimed.
Has this been seen before? Could it have something to do with the power
options set in the Control Panel (by the way, I am running on Windows 2K
machines)? Can someone help?
Thanks for your time,
David
David L. Oakley
Aerospace Engineer
US Army RDECOM
AMSRD-AMR-SS-MD
Redstone Arsenal, AL 35898-5252
Email : david.l.oakley@xxxxxxxxxxx
Phone : 256-876-0539 (DSN: 746-0539)
Secure: 256-876-0649 (DSN: 746-0649)
Fax : 256-842-0808 (DSN: 788-0808)
-----Original Message-----
From: Alain Roy [mailto:roy@xxxxxxxxxxx]
Sent: Monday, August 09, 2004 7:52 PM
To: condor-users@xxxxxxxxxxx
Subject: Re: [Condor-users] job requirements
Fernando Rannou wrote:
>Requirements = (Arch == "INTEL") && (OpSys == "LINUX") && (Disk >=
>DiskUsage) && ((Memory * 1024) >= ImageSize) && (TARGET.FileSystemDomain
>== MY.FileSystemDomain)
>-----
>
>but the condor submit file has no Requirements!!!
>This seems to be happening for one specific user.
condor_submit generates reasonable default requirements for a job, even
when the submit file does not specify any.
Pick one job that has this problem. Pretend it's job 10.0.
Pick one computer that doesn't match. Pretend it's node.example.com.
Run these two commands, substituting the correct identifiers:
    condor_q -l 10.0
    condor_status -l node.example.com
Walk through the requirements and see what can't be true. For instance, if
the job has DiskUsage of 1000000 and the computer has Disk of 1000, then
Requirements will be false.
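As a rough illustration of that walk-through, here is a minimal Python sketch that checks each clause of the Requirements expression quoted above against a job ad and a machine ad. It is not part of Condor: parse_classad and failing_clauses are hypothetical helpers, the sample attribute values are made up, and it assumes the long-format output is a list of "Name = value" lines. Real ClassAd evaluation (undefined attributes, MY./TARGET. scoping) is more involved than this.

```python
# Hypothetical helper: turn "Name = value" lines into a dict.
def parse_classad(text):
    ad = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if not sep:
            continue  # skip blank or malformed lines
        name, value = name.strip(), value.strip()
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            ad[name] = value[1:-1]      # quoted string
        else:
            try:
                ad[name] = int(value)   # integer attribute
            except ValueError:
                ad[name] = value        # leave expressions as raw text
    return ad


# Hypothetical helper: list the clauses that are false for this pair.
def failing_clauses(job, machine):
    checks = [
        ('Arch == "INTEL"', machine.get("Arch") == "INTEL"),
        ('OpSys == "LINUX"', machine.get("OpSys") == "LINUX"),
        ("Disk >= DiskUsage",
         machine.get("Disk", 0) >= job.get("DiskUsage", 0)),
        ("(Memory * 1024) >= ImageSize",
         machine.get("Memory", 0) * 1024 >= job.get("ImageSize", 0)),
        ("TARGET.FileSystemDomain == MY.FileSystemDomain",
         machine.get("FileSystemDomain") == job.get("FileSystemDomain")),
    ]
    return [label for label, ok in checks if not ok]


# Made-up sample values standing in for the output of the two commands.
JOB_TEXT = '''DiskUsage = 1000000
ImageSize = 5000
FileSystemDomain = "submit.example.com"
'''

MACHINE_TEXT = '''Arch = "INTEL"
OpSys = "LINUX"
Disk = 1000
Memory = 512
FileSystemDomain = "node.example.com"
'''

job = parse_classad(JOB_TEXT)
machine = parse_classad(MACHINE_TEXT)

for clause in failing_clauses(job, machine):
    print("false:", clause)
```

With these sample values, the disk clause and the FileSystemDomain clause are the two reported as false. In practice you would paste in the saved output of the two commands in place of the sample text.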
My bet is that it will be the FileSystemDomain that is causing your
problem. If so, there are two possibilities:
* If you are submitting the job from a shared disk (like NFS) then the
computers should indicate that they have the same FileSystemDomain.
Here at the UW, we set FileSystemDomain to be cs.wisc.edu instead of
the $(FULL_HOSTNAME). For more information, see:
http://www.cs.wisc.edu/condor/manual/v6.6/2_5Submitting_Job.html#SECTION00354000000000000000
* If you are submitting the job from a computer that doesn't
have shared disks, then you'll need to transfer files, so this
requirement doesn't show up. For more information, see:
http://www.cs.wisc.edu/condor/manual/v6.6/2_5Submitting_Job.html#SECTION00354000000000000000
I hope this helps.
-alain