The reason is at the bottom of the log snippet:
09/16/24 14:01:31 (pid:1560370) Error: MakeUserRec with illegal identifiers: user=condor@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,
owner=condor, ntdomain=(null)
09/16/24 14:01:31 (pid:1560370) NewCluster(): failed to create new User record for condor@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
You cannot submit a job as the condor user. We used to allow this, but that was a bug. Submitting a job as condor is equivalent to submitting a job as root: it gives the job the power to run arbitrary code at high privilege.
We need to figure out a way to make the error message better. Tokens make that hard because only the AP knows the real reason, and it doesn't have a way to tell condor_submit what that reason is.
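Until the client-side message improves, one way to spot this case is to scan the ScheddLog on the AP for the "MakeUserRec with illegal identifiers" line quoted at the top of this message. What follows is only a rough sketch, not anything shipped with HTCondor: the function name find_rejected_users is made up for illustration, the regular expression is inferred from that single log line (assumed to sit on one line in the real log), and example.org stands in for the redacted domain.

```python
import re

# Pattern inferred from the one log line quoted in this thread; the real
# format may differ between HTCondor versions (assumption).
ILLEGAL_USER = re.compile(
    r"^(?P<stamp>\S+ \S+) \(pid:(?P<pid>\d+)\) "
    r"Error: MakeUserRec with illegal identifiers: "
    r"user=(?P<user>[^,]+), owner=(?P<owner>[^,]+), ntdomain=(?P<ntdomain>\S+)"
)


def find_rejected_users(log_lines):
    """Return (timestamp, user, owner) for each rejected user record."""
    hits = []
    for line in log_lines:
        m = ILLEGAL_USER.match(line.strip())
        if m:
            hits.append((m["stamp"], m["user"], m["owner"]))
    return hits


if __name__ == "__main__":
    # Sample lines modelled on the log excerpt; example.org is a placeholder.
    sample = [
        "09/16/24 14:01:31 (pid:1560370) Received a superuser command",
        "09/16/24 14:01:31 (pid:1560370) Error: MakeUserRec with illegal "
        "identifiers: user=condor@example.org, owner=condor, ntdomain=(null)",
    ]
    for stamp, user, owner in find_rejected_users(sample):
        print(f"{stamp}: schedd refused a user record for {user} (owner={owner})")
```

If the owner reported is condor (or root), the fix is to submit as an ordinary user account instead.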
-tj
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Ben Tovar via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Monday, September 16, 2024 1:11 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: Ben Tovar <btovar@xxxxxx>
Subject: [HTCondor-users] schedd error: cannot allocate a cluster id

Hi all,
We are trying to install a new condor pool from scratch using get_htcondor. We set up two machines, using --central-manager and --submit respectively. When we try to submit a job with condor_q, we get:
Submitting job(s)
ERROR: Cannot allocate a cluster id

The idtokens seem ok (with condor_token_list), and the condor config files have the default use statements for role:get_htcondor_central_manager and role:get_htcondor_submit respectively.
I tried adding a UID_DOMAIN line to this default configuration, but it didn't seem to have any effect.
Any help is appreciated,
Ben
condor_version
$CondorVersion: 23.9.6 2024-08-08 BuildID: 748275 PackageID: 23.9.6-1 GitSHA: dfdd9eaa $
$CondorPlatform: x86_64_AlmaLinux8 $

Possible relevant lines from ScheddLog to follow:
09/16/24 14:01:29 (pid:1560370) Received a superuser command
09/16/24 14:01:29 (pid:1560370) Number of Active Workers 0
09/16/24 14:01:31 (pid:1560370) Received a superuser command
09/16/24 14:01:31 (pid:1560370) Owner condor@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx has no JobQueueUserRec
09/16/24 14:01:31 (pid:1560370) (bt:cbe7:13) Creating pending JobQueueUserRec for owner condor@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Backtrace bt:cbe7:13 is
    condor_schedd(_ZN9Scheduler16insert_ownerinfoEPKc+0x149) [0x55eb3edba509]
    condor_schedd(_Z10NewClusterP11CondorError+0x3fb) [0x55eb3ed5dbcb]
    condor_schedd(_Z12do_Q_requestR9QmgmtPeer+0x2913) [0x55eb3ed85413]
    condor_schedd(_Z8handle_qiP6Stream+0x8c) [0x55eb3ed5333c]
    /lib64/libcondor_utils_23_9_6.so(_ZN10DaemonCore18CallCommandHandlerEiP6Streambbff+0x298) [0x7f24984330c8]
    /lib64/libcondor_utils_23_9_6.so(_ZN10DaemonCore21HandleReqPayloadReadyEP6Stream+0x11c) [0x7f249843343c]
    /lib64/libcondor_utils_23_9_6.so(_ZN10DaemonCore24CallSocketHandler_workerEibP6Stream+0x1e0) [0x7f249842a500]
    /lib64/libcondor_utils_23_9_6.so(_ZN10DaemonCore35CallSocketHandler_worker_demarshallEPv+0x21) [0x7f249842a741]
    /lib64/libcondor_utils_23_9_6.so(_ZN13CondorThreads8pool_addEPFvPvES0_PiPKc+0x3c) [0x7f24981e3bfc]
    /lib64/libcondor_utils_23_9_6.so(_ZN10DaemonCore6DriverEv+0xdce) [0x7f249842e4ce]
    /lib64/libcondor_utils_23_9_6.so(_Z7dc_mainiPPc+0x17ff) [0x7f249844cb3f]
    /lib64/libc.so.6(__libc_start_main+0xe5) [0x7f24960e87e5]
    condor_schedd(_start+0x2e) [0x55eb3ed1fdfe]
09/16/24 14:01:31 (pid:1560370) Error: MakeUserRec with illegal identifiers: user=condor@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, owner=condor, ntdomain=(null)
09/16/24 14:01:31 (pid:1560370) NewCluster(): failed to create new User record for condor@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx