
[HTCondor-users] Load Balancer for AP(submitter)



Hi all,

I have written a load balancer for multiple APs so that I can pick the least loaded AP and distribute load evenly.
It takes the total job count and RecentDaemonCoreDutyCycle into consideration; I get both from the condor_status -schedd command.

I am facing two major problems:

1. I submit jobs with max_idle 300 and a maximum of 2000 jobs per cluster. The jobs still in the factory are not visible in condor_status, so I end up oversubscribing an AP, which makes it slow. Is there any way to get the total number of jobs (including those in the factory) without condor_q? condor_q hangs a lot, and I am removing my dependency on it to reduce load.

2. If an AP has a high RecentDaemonCoreDutyCycle, I am not able to debug the reason. How can I find out why this is happening?

Are there any other factors I should consider for the load balancer?

Thanks and Regards
Raman