Hi Jaime, all,
Thanks, your guess was spot on. These jobs got into that state on HTC 24; HTC 25 does not clean them up but seems to prevent them from reoccurring. We're 24h in on HTC 25 now and the issue did not reoccur.
If anyone else wants to do the update on a live system: the jobs can still be edited on HTC 24. So just run qedit on both queues [0] first, then install the update. If you are unlucky, you might afterwards need to rm [1] any jobs that arrived while the update was being installed. (Or simply prevent new submissions while doing the update.)
Cheers, Max
[0]
condor_ce_qedit -constraint 'OsUser=!=undefined && OsUser =!= Owner' 'OsUser=Owner'
condor_qedit -constraint 'OsUser=!=undefined && OsUser =!= Owner' 'OsUser=Owner'
[1]
condor_ce_rm -constraint 'OsUser=!=undefined && OsUser =!= Owner'
condor_rm -constraint 'OsUser=!=undefined && OsUser =!= Owner'
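For anyone who prefers to script this, the qedit step from [0] could be wrapped roughly as follows. This is only a sketch under a few assumptions: `run`, `fix_queue` and the `DRY_RUN` variable are hypothetical helper names introduced here, not part of HTCondor, and the `condor_ce_q` / `condor_q` listing step (using `-allusers` and `-af`) is an addition for checking which jobs match before they are edited. The constraint and the qedit arguments are copied verbatim from [0].

```shell
#!/bin/sh
# Sketch: list, then repair, jobs whose OsUser differs from Owner.
# DRY_RUN, run() and fix_queue() are hypothetical helpers, not HTCondor tools.
CONSTRAINT='OsUser=!=undefined && OsUser =!= Owner'

run() {
    # Print the command; execute it only when DRY_RUN is set to something other than 1.
    printf '+'
    printf ' %s' "$@"
    printf '\n'
    if [ "${DRY_RUN:-1}" != "1" ]; then
        "$@"
    fi
}

fix_queue() {
    # $1 is the tool prefix: "condor_ce" for the CE queue, "condor" for the local queue.
    run "${1}_q" -allusers -constraint "$CONSTRAINT" -af ClusterId ProcId Owner OsUser
    run "${1}_qedit" -constraint "$CONSTRAINT" 'OsUser=Owner'
}

fix_queue condor_ce
fix_queue condor
```

By default the script only prints the commands it would run; setting `DRY_RUN=0` in the environment executes them. The removal commands from [1] are deliberately left out so that nothing is deleted by accident.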
On 27. Oct 2025, at 20:29, Jaime Frey via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
If the affected jobs were in the queue when the upgrade occurred, then manually removing them should fix the problem.
- Jaime
On Oct 27, 2025, at 12:14 PM, Jaime Frey via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
Last question for now:
Were the affected jobs in the CE's job queue before the upgrade to 25.0?
- Jaime
On Oct 27, 2025, at 11:30 AM, Jaime Frey via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
What version of HTCondor did you upgrade from and did you downgrade to the same version?
Can you try running the following command and post the output?
CONDOR_CONFIG=/etc/condor-ce/condor_config condor_qusers gfactory
- Jaime
On Oct 27, 2025, at 10:57 AM, Jaime Frey via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
I tried to reproduce this behavior on my own test system and failed. In my test, the OSUser attribute of the new job in the CE is derived from the User attribute, despite the submitter trying to set a different value.
Can you verify that the User attribute of these problem jobs is the expected mapped owner, and not "gfactory@XXX"?
- Jaime
On Oct 27, 2025, at 10:00 AM, Jaime Frey via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
The OSUser attribute is a recent addition to the job ad, which represents the OS account to use for running the job and job file ownership. Without it, the User attribute (derived from authenticating the submitting client) must match an existing OS account. It looks like we have a bug where the submitter to the HTCondor-CE is forwarding the value of OSUser from its local job ad to the CE's job ad when the CE schedd should be setting the appropriate value based on the local system.
We will work on a fix right away. You can try setting Remote_OSUser=Undefined on the APs submitting jobs to the CE.
- Jaime
On Oct 23, 2025, at 3:40 AM, Kühn, Max (SCC) <max.fischer@xxxxxxx> wrote:
Hi all,
We recently updated our HTCondor-CEs to HTC 25.0.2 / HTC-CE 25.0.1 but had to roll back since the JobRouter kept dying [0] whenever it tried to transform a CMS job. For some reason, the JobRouter tried to run as (?) a user that simply does not exist on our system, possibly the submit user on the remote machine (I assume "gfactory" stands for GlideIn factory).
I've traced this ghost user account back to the Job attribute "OsUser". Apparently many jobs have it, but only for CMS it's different from the Owner that we map jobs to. I'm not familiar with this attribute, and it's not documented; all I could find was a commit for HTC 25.
Why does the JobRouter try to access this user? What does the OSUser attribute do? And critically, can we overwrite it to fix this?
Cheers,
Max
[0]
10/23/25 08:54:40 passwd_cache::cache_uid(): getpwnam("gfactory") failed: user not found
10/23/25 08:54:40 gfactory not in passwd file
10/23/25 08:54:40 Failed in init_user_ids(gfactory,(null))
10/23/25 08:54:40 WriteUserLog::initialize: init_user_ids(pcms02) failed!
10/23/25 08:54:40 passwd_cache::cache_uid(): getpwnam("gfactory") failed: user not found
10/23/25 08:54:40 gfactory not in passwd file
10/23/25 08:54:40 Failed in init_user_ids(gfactory,(null))
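The log above boils down to a failed account lookup: the name taken from the job ad has no entry in the local passwd database. A rough way to spot such names ahead of time is sketched below. The function name `missing_accounts` is hypothetical, and the sketch assumes that OsUser holds a plain local account name and that `condor_ce_q -af` prints `undefined` for jobs without the attribute.

```shell
#!/bin/sh
# Sketch: read account names from stdin (one per line) and print those
# that cannot be resolved on this host. missing_accounts is a hypothetical
# helper name, not an HTCondor tool.
missing_accounts() {
    sort -u | while IFS= read -r name; do
        # Skip blank lines and jobs that do not carry the attribute at all.
        [ -n "$name" ] || continue
        [ "$name" = "undefined" ] && continue
        # id -u fails for unknown accounts, much like getpwnam() in the log.
        id -u "$name" >/dev/null 2>&1 || printf '%s\n' "$name"
    done
}

# Example use against the CE queue (needs a running CE):
# condor_ce_q -allusers -af OsUser | missing_accounts
```

Any name printed by the function would trigger the same "not in passwd file" failure if a daemon tried to switch to it.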
[1]
https://github.com/htcondor/htcondor/commit/34ca97e2960306ed2d75deae6710d1e77e9ef097
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
The archives can be found at: https://www-auth.cs.wisc.edu/lists/htcondor-users/