OK - I just noticed that also for my own trace jobs the actual Condor job
gets various Glidein job ads and more elaborate requirement ads. I had
assumed from the glideins that it would be a CMS feature.

Sorry for the noise
Thomas

On 05/08/2020 10.45, Thomas Hartmann wrote:
> Hi again,
>
> I just stumbled over a (CMS) job that looks somewhat odd [1]
> regarding its requirements.
> For one, the memory requirement seems not really limited, with
> RequestMemory derived from MemoryUsage - i.e., the a priori limit
> depends on the later memory usage?
>
> For the core requirements, I wonder why the values change between the
> CondorCE view and the Condor view of the same job (especially since
> condor_ce_history is just a wrapper around condor_history - I guess
> there is some transformation happening somewhere, or?)
>
> In the [condorce] view the job comes with a CPU request of 1 - but the
> [condor] view of the same job has morphed to 8 cores AFAIS? Glidein??
> At the moment I do not see how RequestCpus_ce = 1 becomes
> OriginalCpus = 8 (which then gets fed into RequestCpus_batch).
>
> tbh I would prefer to strip off such dynamic behaviour in favour of a
> one-to-one matching of resources.
>
> Cheers,
> Thomas
>
>
> [1]
> RequestMemory = ifthenelse(MemoryUsage =!= undefined,MemoryUsage,(ImageSize + 1023) / 1024)
> RequestCpus = 1
> RequestDisk = DiskUsage
> MemoryUsage = ((ResidentSetSize + 1023) / 1024)
>
> [condorce] >> grep Cpus
> RequestCpus = 1
>
> [condor] >> grep Cpus
> CpusProvisioned = 8
> GlideinCpusIsGood = !isUndefined(MATCH_EXP_JOB_GLIDEIN_Cpus) && (int(MATCH_EXP_JOB_GLIDEIN_Cpus) =!= error)
> JOB_GLIDEIN_Cpus = "$$(ifThenElse(WantWholeNode is true, !isUndefined(TotalCpus) ? TotalCpus : JobCpus, OriginalCpus))"
> JobCpus = JobIsRunning ? int(MATCH_EXP_JOB_GLIDEIN_Cpus) : OriginalCpus
> JobIsRunning = (JobStatus =!= 1) && (JobStatus =!= 5) && GlideinCpusIsGood
> OriginalCpus = 8
> RequestCpus = ifThenElse(WantWholeNode =?= true, !isUndefined(TotalCpus) ? TotalCpus : JobCpus, OriginalCpus)
> orig_RequestCpus = 1
>
>
> On 04/08/2020 02.44, Antonio Perez-Calero Yzquierdo wrote:
>> Hi Thomas,
>>
>> See my comment below:
>>
>> On Mon, Aug 3, 2020 at 10:50 AM Thomas Hartmann
>> <thomas.hartmann@xxxxxxx> wrote:
>>
>> Hi Brian,
>>
>> yes, from the technical view you are absolutely right.
>>
>> My worries just go in the 'political direction' ;)
>>
>> So far, if a VO wants to run highmem jobs, i.e., core/mem < 1/2GB, they
>> have to scale by cores.
>> With cores and memory decoupled, I might worry that we could become
>> more attractive for VOs to run their highmem jobs - and in the end we
>> starve there and have cores idling that are not accounted (and cause
>> discussions later on...)
>> Probably the primary 'issue' is that AFAIS cores are somewhat the base
>> currency - in the end the 'relevant' pie charts are just about the
>> delivered core-scaled walltime :-/
>>
>> We have discussed in CMS several times the option of updating the
>> "currency", as you named it, from CPU cores to the number of "unit cells"
>> occupied by each job, where each "cell" is a multidimensional unit, e.g.
>> in 2D, CPU x memory, the unit cell being 1 CPU core x 2 GB. So each user
>> would be charged on the basis of the max between the number of CPU cores
>> and the number of 2 GB quanta employed. In Condor terms (correct me if
>> I'm wrong), that is managed by the slot weight, which can take such an
>> expression as a formula.
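>>
>> As a rough sketch (just to illustrate, not a tested recipe): assuming
>> Memory is reported in MB on the startd and the nominal cell is
>> 1 core x 2 GB, such a weight could look like
>>
>>   # account by whichever is larger: cores or 2 GB memory quanta (2048 MB per cell)
>>   SLOT_WEIGHT = ifThenElse(ceiling(Memory / 2048.0) > Cpus, ceiling(Memory / 2048.0), Cpus)
>>
>> Since SlotWeight only enters accounting and fair-share, the matchmaking
>> itself would stay untouched.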
>>
>> In fact, what we had in mind was somehow charging the "extra cost" to
>> the user requesting more memory, to discourage such requests (= CPU is
>> consumed faster => lower priority), but still keep the CPU core
>> available for potential matchmaking, as Brian explained, to improve the
>> overall utilization of the resources.
>>
>> Despite discussions, we have not (yet) taken the steps to put this into
>> effect, as in the end the cases where jobs do require higher than
>> standard memory/core are generally marginal. If they became more
>> frequent, we'd look into this possibility.
>>
>> I somehow feel the political side of things as you described it would
>> still be complicated ;-)
>>
>> Cheers,
>> Antonio.
>>
>>
>> Cheers,
>> Thomas
>>
>> On 31/07/2020 20.58, Bockelman, Brian wrote:
>>> Hi Thomas,
>>>
>>> We do not normalize incoming requirements.
>>>
>>> In your example, I'm not sure if I'm following the benefit. You
>>> are suggesting changing:
>>>
>>> 1 core / 8 GB -> 4 core / 8 GB
>>>
>>> Right? To me, in that case, you now have 3 idle cores inside the
>>> job - guaranteed to not be used - rather than 3 idle cores in condor
>>> which possibly are not used unless another VO comes in with odd
>>> requirements.
>>>
>>> Now, some sites *do* charge for jobs according to both memory and
>>> CPU. So, in your case of 1 core / 2 GB being nominal, they would
>>> charge the user's fairshare for 4 units if the user submitted a
>>> 1 core / 8 GB job.
>>>
>>> Or am I looking at this from the wrong direction?
>>>
>>> Brian
>>>
>>>> On Jul 31, 2020, at 5:02 AM, Thomas Hartmann
>>>> <thomas.hartmann@xxxxxxx> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> on your CondorCEs, do you normalize incoming jobs for their core/memory
>>>> requirements?
>>>>
>>>> Thing is that we normally assume a ratio of ~1 core / 2 GB memory.
>>>> Now let's say a user/VO submits jobs with a skewed ratio like
>>>> 1 core / 8 GB, which would probably lead to draining for memory and
>>>> leave a few cores idle.
>>>> So I had been thinking whether it might make sense to rescale a job's
>>>> core or memory requirements in a transform to get the job close to the
>>>> implicitly assumed core/mem ratio.
>>>>
>>>> Does that make sense?
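>>>>
>>>> Just as a sketch of the kind of transform I mean (assuming a recent
>>>> HTCondor-CE where the routes are built from transforms; the name
>>>> "ScaleCpus" is arbitrary, and the 2048 assumes RequestMemory in MB with
>>>> the nominal 1 core / 2 GB ratio):
>>>>
>>>>   JOB_ROUTER_TRANSFORM_ScaleCpus @=end
>>>>      # raise the core request so that cores >= ceil(memory / 2 GB)
>>>>      EVALSET RequestCpus ifThenElse(ceiling(RequestMemory / 2048.0) > RequestCpus, ceiling(RequestMemory / 2048.0), RequestCpus)
>>>>   @end
>>>>   JOB_ROUTER_PRE_ROUTE_TRANSFORM_NAMES = $(JOB_ROUTER_PRE_ROUTE_TRANSFORM_NAMES) ScaleCpus
>>>>
>>>> (With the older ClassAd-style routes, the same thing should be doable
>>>> via an eval_set_RequestCpus entry in the route.)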
>>>>
>>>> Cheers,
>>>> Thomas
>>
>> --
>> Antonio Perez-Calero Yzquierdo, PhD
>> CIEMAT & Port d'Informació Científica, PIC.
>> Campus Universitat Autonoma de Barcelona, Edifici D, E-08193 Bellaterra,
>> Barcelona, Spain.
>> Phone: +34 93 170 27 21
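PS: For the record, plugging the ads from [1] into the batch-side expression
also shows where the 8 comes from - assuming WantWholeNode is unset for this
job, since it does not show up in the ad:

  RequestCpus = ifThenElse(WantWholeNode =?= true,
                           !isUndefined(TotalCpus) ? TotalCpus : JobCpus,
                           OriginalCpus)
              # WantWholeNode undefined -> the condition is false
              = OriginalCpus
              = 8

So the jump from 1 to 8 happens wherever OriginalCpus gets set on the routed
job (presumably by the same route defaults that add the Glidein* attributes),
while the CE-side request only survives as orig_RequestCpus = 1.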