
Re: [HTCondor-users] schedd error: cannot allocate a cluster id



Please disregard my previous email. I think our mistake was trying to submit as root; submitting as a regular user works. Sorry for the noise.
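
In case it helps anyone who hits the same error, here is a minimal sketch of a wrapper that refuses to submit when run as root. It assumes the cause really is root submission, as I suspect above. The names refuse_root and submit are made up for illustration; condor_submit is the only HTCondor piece involved.

```python
import os
import subprocess
import sys


def refuse_root(euid=None):
    """Return an error message if the effective UID is root, else None.

    euid is a hypothetical override so the check can be exercised
    without actually being root; by default the real effective UID is used.
    """
    if euid is None:
        euid = os.geteuid()
    if euid == 0:
        return "refusing to submit as root; rerun as a regular user"
    return None


def submit(submit_file):
    """Run condor_submit on submit_file unless we are root."""
    problem = refuse_root()
    if problem:
        sys.exit(problem)
    # Assumes condor_submit is on PATH; check=True raises on a non-zero exit.
    return subprocess.run(["condor_submit", submit_file], check=True)
```

Calling submit("job.sub") as a regular user just runs condor_submit; as root it exits with the message instead of contacting the schedd.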

Ben

On Mon, Sep 16, 2024 at 2:11 PM Ben Tovar <btovar@xxxxxx> wrote:
Hi all,

We are trying to install a new condor pool from scratch using get_htcondor. We set up two machines, one with --central-manager and one with --submit. When we try to submit a job with condor_submit, we get:

Submitting job(s)
ERROR: Cannot allocate a cluster id

The IDTOKENS seem OK (checked with condor_token_list), and the condor config files have the default use statements for role:get_htcondor_central_manager and role:get_htcondor_submit respectively. I tried adding a UID_DOMAIN line to this default configuration, but it didn't seem to have any effect.

Any help is appreciated,

Ben

condor_version
$CondorVersion: 23.9.6 2024-08-08 BuildID: 748275 PackageID: 23.9.6-1 GitSHA: dfdd9eaa $
$CondorPlatform: x86_64_AlmaLinux8 $

Possibly relevant lines from ScheddLog follow:

09/16/24 14:01:29 (pid:1560370) Received a superuser command
09/16/24 14:01:29 (pid:1560370) Number of Active Workers 0
09/16/24 14:01:31 (pid:1560370) Received a superuser command
09/16/24 14:01:31 (pid:1560370) Owner condor@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx has no JobQueueUserRec
09/16/24 14:01:31 (pid:1560370) (bt:cbe7:13) Creating pending JobQueueUserRec for owner condor@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    Backtrace bt:cbe7:13 is
    condor_schedd(_ZN9Scheduler16insert_ownerinfoEPKc+0x149) [0x55eb3edba509]
    condor_schedd(_Z10NewClusterP11CondorError+0x3fb) [0x55eb3ed5dbcb]
    condor_schedd(_Z12do_Q_requestR9QmgmtPeer+0x2913) [0x55eb3ed85413]
    condor_schedd(_Z8handle_qiP6Stream+0x8c) [0x55eb3ed5333c]
    /lib64/libcondor_utils_23_9_6.so(_ZN10DaemonCore18CallCommandHandlerEiP6Streambbff+0x298) [0x7f24984330c8]
    /lib64/libcondor_utils_23_9_6.so(_ZN10DaemonCore21HandleReqPayloadReadyEP6Stream+0x11c) [0x7f249843343c]
    /lib64/libcondor_utils_23_9_6.so(_ZN10DaemonCore24CallSocketHandler_workerEibP6Stream+0x1e0) [0x7f249842a500]
    /lib64/libcondor_utils_23_9_6.so(_ZN10DaemonCore35CallSocketHandler_worker_demarshallEPv+0x21) [0x7f249842a741]
    /lib64/libcondor_utils_23_9_6.so(_ZN13CondorThreads8pool_addEPFvPvES0_PiPKc+0x3c) [0x7f24981e3bfc]
    /lib64/libcondor_utils_23_9_6.so(_ZN10DaemonCore6DriverEv+0xdce) [0x7f249842e4ce]
    /lib64/libcondor_utils_23_9_6.so(_Z7dc_mainiPPc+0x17ff) [0x7f249844cb3f]
    /lib64/libc.so.6(__libc_start_main+0xe5) [0x7f24960e87e5]
    condor_schedd(_start+0x2e) [0x55eb3ed1fdfe]
09/16/24 14:01:31 (pid:1560370) Error: MakeUserRec with illegal identifiers: user=condor@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx, owner=condor, ntdomain=(null)
09/16/24 14:01:31 (pid:1560370) NewCluster(): failed to create new User record for condor@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
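
To spot this failure quickly in a long ScheddLog, a small filter along these lines might help. It is only a sketch: the regular expression is derived from the single "MakeUserRec with illegal identifiers" line shown above, so other HTCondor versions may word the message differently, and find_rejected_users is a made-up helper name.

```python
import re

# Pattern modeled on the one log line in this thread; treat it as an assumption.
PATTERN = re.compile(
    r"MakeUserRec with illegal identifiers: "
    r"user=(?P<user>[^,]+), owner=(?P<owner>[^,]+), ntdomain=(?P<ntdomain>\S+)"
)


def find_rejected_users(log_lines):
    """Return one dict (user, owner, ntdomain) per matching log line."""
    hits = []
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            hits.append(match.groupdict())
    return hits
```

It takes any iterable of lines, so an open file object works. The log path is given by condor_config_val SCHEDD_LOG.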