
Re: [HTCondor-users] Slot weights



At Fermilab we have used a SLOT_WEIGHT expression for several years, but we are not using it to steer high-memory jobs to high-memory nodes.
We are basically using it to penalize high-memory 1-core jobs by raising their priority value, which makes their priority worse. Also note that in the era of partitionable slots, which most people are running these days, the behavior of RANK (both job rank and slot rank) is not necessarily predictable, and it usually won't do what you would like it to do. As a result, we often see some nodes run out of memory while other nodes run out of cores, and there is not much HTCondor can do about it. But the default distribution is certainly less than optimal.
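
To make the penalty idea concrete, here is a minimal Python sketch of one way a weight could charge a slot for memory as well as cores. It is only an illustration: the function name slot_weight, the 2048 MB-per-core ratio, and the "larger of the two" rule are all assumptions, not the expression actually used at Fermilab and not HTCondor code.

```python
def slot_weight(cpus: int, memory_mb: int, mb_per_core: int = 2048) -> float:
    """Hypothetical weight: the larger of the core count and the memory
    expressed in core-equivalents (memory_mb / mb_per_core).

    mb_per_core is an assumed ratio, chosen only for this example.
    """
    return max(float(cpus), memory_mb / mb_per_core)


if __name__ == "__main__":
    # A 1-core job with a modest memory request is charged for one core.
    print(slot_weight(cpus=1, memory_mb=2048))
    # A 1-core job requesting 16 GB is charged like an 8-core job.
    print(slot_weight(cpus=1, memory_mb=16384))
    # An 8-core job requesting 16 GB is charged for its 8 cores.
    print(slot_weight(cpus=8, memory_mb=16384))
```

Under these assumptions, a 1-core job that asks for 16 GB accrues usage at the same rate as an 8-core job, which is the sense in which a high-memory 1-core job gets penalized.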

Steve


From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Beyer, Christoph <christoph.beyer@xxxxxxx>
Sent: Sunday, December 21, 2025 1:50 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Slot weights
 

Hi Jeff,

I think for steering, a rank expression in the job would be the appropriate means ...


best
christoph

--
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx

----- Original Message -----
From: "Jeff Templon" <templon@xxxxxxxxx>
To: "HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx>
Sent: Sunday, December 21, 2025 16:55:51
Subject: [HTCondor-users] Slot weights

Hi,

I am looking to use SLOT_WEIGHT both to steer high-memory jobs to high-memory machines and to weight the usage of high-memory jobs more heavily than that of low-memory jobs, to make the distribution of slots to users fairer.

HOWEVER:

Enable use of the condor_negotiator-side resource consumption policy, allocating the job-requested number of cores to the dynamic slot, and use SLOT_WEIGHT to assess the user usage that will affect user priority by the number of cores allocated. Note that the only attributes valid within the SLOT_WEIGHT expression are Cpus, Memory, and disk. This must be set to the same value on all machines in the pool.

Really: the same value on all machines?  Why?  Or do I read this wrong?

JT


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe

The archives can be found at: https://urldefense.proofpoint.com/v2/url?u=https-3A__www-2Dauth.cs.wisc.edu_lists_htcondor-2Dusers_&d=DwIGaQ&c=gRgGjJ3BkIsb5y6s49QqsA&r=10BCTK25QMgkMYibLRbpYg&m=Zza0YXSDmNRjGLdGwu6YJtwwezyYj3-D_Io5v4cmuVQXOcuLrh0yiwDGAHcWYV-t&s=spqmO-gI5FS7_A-gMVvDFeMQJwlQfWKtWYmCM2qLfNg&e=
