
Re: [HTCondor-users] New Machine not running jobs



Hi David,

you could try switching on full debug output on the collector/negotiator [1]. With luck, the negotiator log will then say more about why the machines are not being brokered for the jobs.

Cheers,
  Thomas

[1]
> cat 99_debugoutput.conf
ALL_DEBUG = D_FULLDEBUG
MAX_DEFAULT_LOG = 5000000000
SCHEDD_DEBUG = $(SCHEDD_DEBUG) D_CAT D_SECURITY:2


On 08/07/2025 16.59, David Cohen wrote:
Hi,
The problem isn't that a specific job won't run on these machines, but that the machines are not getting any jobs. Today some of them started running jobs, after a long time, so now I'm more confused. I'll check whether they still get jobs when the load is lower, to understand whether for some reason those machines are considered last for jobs.

Thanks,
David


On Mon, Jul 7, 2025 at 9:42 AM Beyer, Christoph <christoph.beyer@xxxxxxx <mailto:christoph.beyer@xxxxxxx>> wrote:

    Hi,

    try analyzing one of the jobs that is not running, using:

    condor_q <jobid> -better-analyze -reverse -machine <FQDN of the
    machine in question>

    This should give you an idea :)
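
When several new machines are affected, the same check can be repeated per machine. The sketch below only wraps the condor_q invocation quoted above. The function names are hypothetical, and the job ID and hostnames in the note after it are placeholders, not values from this thread.

```python
import subprocess
from typing import List


def analyze_cmd(job_id: str, machine: str) -> List[str]:
    """Build the condor_q call quoted above for one job and one machine."""
    return ["condor_q", job_id, "-better-analyze", "-reverse", "-machine", machine]


def run_analysis(job_id: str, machines: List[str]) -> None:
    """Run the analysis once per machine and print each report.

    Assumes condor_q is on PATH and that the job is visible from the
    host where this runs (typically the submit node).
    """
    for machine in machines:
        result = subprocess.run(
            analyze_cmd(job_id, machine),
            capture_output=True,
            text=True,
            check=False,  # keep going even if one call fails
        )
        print(f"=== {machine} ===")
        print(result.stdout or result.stderr)
```

For example, run_analysis("123.0", ["node01.example.org", "node02.example.org"]) would print one report per machine, which makes it easier to compare a machine that gets jobs with one that does not.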

    Best
    christoph


    --
    Christoph Beyer
    DESY Hamburg
    IT-Department

    Notkestr. 85
    Building 02b, Room 009
    22607 Hamburg

    phone:+49-(0)40-8998-2317
    mail: christoph.beyer@xxxxxxx <mailto:christoph.beyer@xxxxxxx>

    ------------------------------------------------------------------------
    *From: *"David Cohen" <cdavid@xxxxxxxxxxxxxxxxxxxxxx
    <mailto:cdavid@xxxxxxxxxxxxxxxxxxxxxx>>
    *To: *"HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx
    <mailto:htcondor-users@xxxxxxxxxxx>>
    *Sent: *Monday, 7 July 2025 08:36:55
    *Subject: *[HTCondor-users] New Machine not running jobs

    Hi,
    A new machine, installed with the same version and configuration as
    all the other execute nodes, is not getting matched for running
    jobs, although there are queued jobs.

    Specifically requesting that machine as a requirement gets the job
    running.
    Any ideas?

    David


    _______________________________________________
    HTCondor-users mailing list
    To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
    <mailto:htcondor-users-request@xxxxxxxxxxx> with a
    subject: Unsubscribe

    The archives can be found at:
    https://www-auth.cs.wisc.edu/lists/htcondor-users/


