
Re: [HTCondor-users] Maximum SERVICE's run in local universe



I forgot to report back on this. It worked perfectly! I have noticed,
though, that sometimes the service node doesn't end when all of the
work associated with it completes. Instead, the service job separates
from the parent DAG and sits in the running state until it is removed
manually. At first I thought the job started and finished so quickly
that the service didn't start until after the job had completed, but
it's happening with jobs that take the better part of 15 hours to
complete, and I've confirmed that the service started well before
anyone picked up the work. Have you seen this before? Other than
writing logic into the service to check regularly for any remaining
work, is there another way to force the service to end gracefully
when the rest of its DAG is done?
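
For reference, the polling fallback I mean would look roughly like the
sketch below. It is only an illustration: work_remaining is a
hypothetical placeholder for whatever check the service would use to
see if work is left, and nothing here is an HTCondor API. The service
exits only after several idle polls in a row, so a brief gap between
jobs doesn't shut it down early.

```python
import time
from typing import Callable


def run_until_idle(
    work_remaining: Callable[[], bool],
    poll_seconds: float = 60.0,
    idle_polls_needed: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until work_remaining() is False for several polls in a row.

    work_remaining is a hypothetical hook supplied by the service; it is
    an assumption of this sketch, not part of HTCondor.
    Returns the total number of polls performed.
    """
    idle_streak = 0
    polls = 0
    while idle_streak < idle_polls_needed:
        polls += 1
        if work_remaining():
            idle_streak = 0  # work showed up again, so reset the countdown
        else:
            idle_streak += 1
        if idle_streak < idle_polls_needed:
            sleep(poll_seconds)  # wait before checking again
    return polls
```

The service's main thread would call run_until_idle(...) and then shut
down cleanly once it returns.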

Also, I forgot to mention last time that I'm running 23.0.3.

On Tue, Feb 6, 2024 at 2:29 PM Cole Bollig via HTCondor-users
<htcondor-users@xxxxxxxxxxx> wrote:
>
> Hi Christopher,
>
> Assuming this relates to the DAGMan setup I helped with recently, the change to this would have to be in the Schedd configuration. You just have to set START_LOCAL_UNIVERSE in the AP configuration (host that the Schedd/DAGMan is running on). This defaults to TotalLocalJobsRunning < 200 so something like:
>
> START_LOCAL_UNIVERSE = TotalLocalJobsRunning < n
>
> where n is the desired cap on local universe jobs that can run at once on the host. Don't forget to reconfigure HTCondor afterwards (i.e., run condor_reconfig).
>
> Cheers,
> Cole Bollig
> ________________________________
> From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Christopher Phipps <hawtdogflvrwtr@xxxxxxxxx>
> Sent: Tuesday, February 6, 2024 11:35 AM
> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> Subject: [HTCondor-users] Maximum SERVICE's run in local universe
>
> Is there a way to increase the number of SERVICE jobs that can be
> running at the same time in the local universe? It appears to be
> limited by default to 200 and I'd like to increase it slightly.
>
> Thanks,
> Chris
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/



-- 
It will be happened; it shall be going to be happening; it will be was
an event that could will have been taken place in the future. Simple
as that. ~ Arnold Rimmer