On 23/03/21 12:04, Jeff Templon wrote:
Hi
Hello Jeff,
I am looking into setting up accounting and plotting on our condor
setup. We've traditionally done this by unix groups, see this
plot:
https://www.nikhef.nl/grid/stats/stbc/grisview-week
For how the plots look now under the torque batch system - on the
top plot, right hand side, is a list of unix user groups and how much
of the system was used by each during the past 7 days, giving also
the color code legend for the plot, which is a stacked histogram of
the number of jobs running by each of those unix groups at each
sample point.
Condor does not have, AFAICT, this concept of accounting by unix
groups - what I read is that the user needs to specify an accounting
group (completely unrelated to unix groups). How can I have this
automatically set to the unix group, except for cases where the user
overrides it?
I went through similar steps as you seem to be going now. With
HTCondor our current solution is to define a text mapfile filled with
lines like this:
* <username> <group>,<group>
Example:
* pilatlas011 atlas,atlas
In our case "atlas" is the main gid for the pilatlas011 user.
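
A mapfile like this can be generated from the unix account database instead of being maintained by hand. The following is only a minimal sketch, assuming the submit node sees the same passwd/group data as the users; the function names and the min_uid cutoff are hypothetical, and only the line format comes from the example above.

```python
def format_map_line(user: str, group: str) -> str:
    """Render one mapfile line in the '* <username> <group>,<group>' form."""
    return f"* {user} {group},{group}"


def primary_group_entries(min_uid: int = 1000):
    """Yield (user, primary group name) pairs from the local passwd database.

    min_uid is a hypothetical cutoff used here to skip system accounts.
    """
    import grp  # Unix-only standard-library modules
    import pwd

    for entry in pwd.getpwall():
        if entry.pw_uid < min_uid:
            continue
        try:
            group = grp.getgrgid(entry.pw_gid).gr_name
        except KeyError:
            continue  # primary gid has no name; skip rather than guess
        yield entry.pw_name, group


def render_mapfile(entries) -> str:
    """Join the rendered lines, sorted by user name for stable diffs."""
    return "\n".join(format_map_line(u, g) for u, g in sorted(entries)) + "\n"


if __name__ == "__main__":
    print(render_mapfile(primary_group_entries()), end="")
```

Run from cron and redirected into the mapfile, something along these lines would keep the mapping in step with the unix groups.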
Then, in the configuration of the SCHEDD (aka Submit Node)
#[1]
use FEATURE:AssignAccountingGroup($(T1_SHARED_SCRIPT_DIR)/Hgroups.txt)
#[2]
JOB_TRANSFORM_NAMES = $(JOB_TRANSFORM_NAMES) SetAccountingGroup
JOB_TRANSFORM_SetAccountingGroup @=end
[
eval_set_AcctGroup=usermap("AssignAccountingGroup",AcctGroupUser,AcctGroup,defaultGroup);
eval_set_AccountingGroup=join(".",usermap("AssignAccountingGroup",AcctGroupUser,AcctGroup,defaultGroup),AcctGroupUser);
]
@end
Notes:
[1] This should be all you need. [2] has been added here because of
an unexpected behaviour of [1] in some particular cases, described below.
Having set that, running jobs should have the following Classad set:
AccountingGroup = "atlas.atlasprd011"
AcctGroup = "atlas"
AcctGroupUser = "atlasprd011"
If a user defines his own AcctGroup in the submit file, this should be
moved to "RequestedAcctGroup", and AcctGroup should be set by [1].
We set [2] because in this particular case the AcctGroup remains
unexpectedly unset.
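
To make the intended outcome concrete, here is a plain-Python model of the attributes described above. It is not HTCondor code and does not reproduce the transform itself: the function name is hypothetical, the mapping is a simple user-to-group dict, the fallback name is borrowed from the defaultGroup argument in [2], and it assumes each user maps to a single group, as in the example.

```python
def assign_accounting(owner, mapping, requested=None, default_group="defaultGroup"):
    """Return the accounting attributes a job should end up with.

    mapping: dict of user name -> group (a stand-in for the mapfile).
    requested: group the user put in the submit file, if any.
    """
    attrs = {"AcctGroupUser": owner}
    if requested is not None:
        # A user-supplied group is kept for reference, not used for accounting.
        attrs["RequestedAcctGroup"] = requested
    # The group always comes from the map (or the fallback), never from the user.
    group = mapping.get(owner, default_group)
    attrs["AcctGroup"] = group
    attrs["AccountingGroup"] = ".".join([group, owner])
    return attrs


if __name__ == "__main__":
    print(assign_accounting("atlasprd011", {"atlasprd011": "atlas"}, requested="cms"))
```

In this model a user asking for another group still ends up accounted under the mapped one, which is the behaviour that avoids having to police the system.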
####
With this in place you can configure fairshare; mine is set as follows:
In the Central Manager:
PRIORITY_HALFLIFE = 26000
# Accept surplus and regroup
GROUP_ACCEPT_SURPLUS = true
#GROUP_AUTOREGROUP = false
DEFAULT_PRIO_FACTOR = 100000.0
include ifexist : /usr/share/htc/prod/conf/htc_shares.conf
htc_shares.conf is script-generated and contains:
GROUP_NAMES = \
    atlas, \
    alice, \
    belle, \
[...]
    lhcb
GROUP_QUOTA_DYNAMIC_belle = 0.041403
[...]
GROUP_QUOTA_DYNAMIC_cms = 0.154328
The total sum yields 1.0 and these numbers are the "share" for that
group.
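
The generating script is site-specific, but the idea can be sketched as below. This is only an illustration, assuming the input is a dict of raw per-group weights (for example pledges); the function name and the weights in the usage line are hypothetical. The weights are normalised so the shares sum to 1.0, up to rounding at six decimals.

```python
def render_shares(weights):
    """Render GROUP_NAMES and GROUP_QUOTA_DYNAMIC_* lines from raw weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    names = list(weights)
    lines = ["GROUP_NAMES = \\"]
    for i, name in enumerate(names):
        # Every entry but the last gets a comma and a line continuation.
        tail = ", \\" if i < len(names) - 1 else ""
        lines.append(f"    {name}{tail}")
    for name in names:
        # Each share is the group's weight as a fraction of the total.
        lines.append(f"GROUP_QUOTA_DYNAMIC_{name} = {weights[name] / total:.6f}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical weights, for illustration only.
    print(render_shares({"atlas": 3, "cms": 1}), end="")
```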
I hope these notes and a bit of "Read That Fantastic Manual" help.
Stefano
Also : we ultimately need to consider doing fair share on these same
unix groups - are the numbers going into the fair share calculations
the same set going into accounting? I would like to avoid setting
up parallel infrastructures for things that are identical.
Also : does the user have complete freedom to put any group they
want? I hope not; I would not want to have to police the system.
Not all groups have the same allocation here, and users are quite
opportunistic when they've found shortcuts to getting their jobs
running.
Thanks,
JT