Steven Timm wrote:
> On Wed, 7 Jun 2006, Michael Thomas wrote:
>
>
>>I have a cluster of 50 nodes, 4 vms per node.  On all but one node I
>>have a certain directory mounted via read-only nfs.  On the remaining
>>node the directory is mounted read-write.
>>
>>Every user coming into the system only needs read-only access to the
>>certain directory.  But one special user always needs read-write access.
>>
>>How can I guarantee that this special user always gets sent to the one
>>node that has read-write access to this directory?  Note that I don't
>>mind if other users also get sent to this read-write node.
>
>
> First, define the one node to have an extra attribute in its machine
> classad
>
> [root@fnpcsrv1 root]# grep IO /opt/condor/local/condor_config.local
> MachineClass = "IO"
> Class = "IO"
> START = JobClass =!= UNDEFINED && JobClass == "IO"
>
> On a non-grid job, then the user should just add
> +JobClass = "IO"
> requirements = (MachineClass =!= UNDEFINED && MachineClass == "IO")
>
> to his condor submit file.
>
> You can force a inbound grid job for that user to do that
> by hacking condor.pm to add these extra two lines to the
> submit script file it writes.
>
> Steve

Thanks for the tip, Steve.  It almost works...

I hacked condor.pm to add the +JobClass and requirements.  The job
submit script on the CE shows that they get added:

...
Executable = /home/uscms01/.globus/.gass_cache/local/md5/4a/67571a70a8ae3d2291019518204cc1/md5/81/2e7051cca30e7ea792099078f56ae3/data
+JobClass = "IO"
Requirements = OpSys == "LINUX" && Arch == "INTEL" && (MachineClass =!= UNDEFINED && MachineClass == "IO")
X509UserProxy = /home/uscms01/.globus/job/citgrid3.cacr.caltech.edu/29347.1150304652/x509_up
...

condor_config.local on the compute node also has the machineclass and
class configuration:

MachineClass = "IO"
Class = "IO"
START = JobClass =!= UNDEFINED && JobClass == "IO"

But it seems that the job's requirements prevent it from running
anywhere.  When I submit the job and run condor_q -better-analyze[1], it
shows that the machineclass requirement is causing it to fail.

How can I query the remote machine to verify that it's loading the
condor_config.local settings as expected?

--Mike

[1]
59349.000:  Run analysis summary.  Of 8 machines,
      8 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
        No successful match recorded.
        Last failed match: Wed Jun 14 10:17:29 2006
        Reason for last match failure: no match found

WARNING:  Be advised:
   No resources matched request's constraints

The Requirements expression for your job is:

( target.OpSys == "LINUX" && target.Arch == "INTEL" &&
( target.MachineClass isnt undefined && target.MachineClass == "IO" ) ) &&
( target.Disk >= DiskUsage ) && ( ( target.Memory * 1024 ) >= ImageSize ) &&
( target.HasFileTransfer )

    Condition                         Machines Matched    Suggestion
    ---------                         ----------------    ----------
1   ( target.MachineClass isnt undefined && target.MachineClass == "IO" )
                                      0                   REMOVE
2   target.OpSys == "LINUX"           8
3   target.Arch == "INTEL"            8
4   ( target.Disk >= 76 )             8
5   ( ( 1024 * target.Memory ) >= 1 ) 8
6   ( target.HasFileTransfer )        8
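
One way to check what the node is really advertising, as a rough sketch
only: condor_status -long <hostname> prints each machine ClassAd as
"Attr = value" lines, one ad per VM, so the output can be scanned for
MachineClass.  The Python below is illustrative and not part of Condor.
The helper names, the sample ad text, and the node01.example hostnames are
made up for the example, and fetch_machine_ads assumes condor_status is on
PATH and can reach the collector.

```python
import subprocess

# Hypothetical sample of `condor_status -long` output for two VMs.
# Only the second ad advertises MachineClass.
SAMPLE = '''\
Name = "vm1@node01.example"
OpSys = "LINUX"
Arch = "INTEL"

Name = "vm2@node01.example"
OpSys = "LINUX"
Arch = "INTEL"
MachineClass = "IO"
'''


def parse_long_ads(text):
    """Split long-format output into one dict per ad (blank-line separated)."""
    ads, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            if current:
                ads.append(current)
                current = {}
            continue
        # Split on the first '=' only, so values such as
        # `JobClass =!= UNDEFINED` stay intact.
        name, sep, value = line.partition("=")
        if sep:
            # ClassAd attribute names are case-insensitive.
            current[name.strip().lower()] = value.strip()
    if current:
        ads.append(current)
    return ads


def advertises(ad, attr, expected):
    """True if the ad sets `attr` to the quoted string `expected`."""
    return ad.get(attr.lower()) == '"%s"' % expected


def fetch_machine_ads(hostname):
    """Run `condor_status -long <hostname>` and parse the result.

    Assumption: condor_status is installed and the pool is reachable.
    """
    out = subprocess.run(
        ["condor_status", "-long", hostname],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_long_ads(out)


def report(ads, attr="MachineClass", expected="IO"):
    """Return (ad name, advertises-the-attribute) pairs."""
    return [(ad.get("name", "?"), advertises(ad, attr, expected)) for ad in ads]
```

Calling report(fetch_machine_ads("<hostname>")) on the submit host would
list which VM ads carry MachineClass = "IO".  If the attribute is missing
from every ad even though condor_config.local sets it, one thing worth
checking is whether it is listed in STARTD_EXPRS (STARTD_ATTRS in later
Condor releases), since a setting in the config file is not published in
the machine ClassAd unless it is named there.  The value the daemon itself
sees can probably also be queried with
condor_config_val -name <hostname> -startd MachineClass.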