Hi Stuart,
I am glad that removing that single line stopped the repeated Schedd segmentation faults. It looks like the condor cron code doesn't know how to handle a STEP value that is not attached to a range (x-y) or an asterisk, e.g. "1/10". Because of this invalid STEP value, matchFields() appears to recurse until a segmentation fault occurs. Sorry you had to stumble across this.
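
To illustrate the rule described above (a STEP is only meaningful after a range or an asterisk), here is a minimal Python sketch of a pre-submit check for a single cron field. It is not HTCondor's CronTab code; the function name check_cron_field and the regular expression are hypothetical, and the check ignores per-field bounds such as 0-59 for minutes.

```python
import re

# Hypothetical helper, NOT part of HTCondor: one comma-separated element is
# an asterisk, a number, or a range (x-y), optionally followed by /STEP.
_ELEMENT = re.compile(r"^(\*|\d+(?:-\d+)?)(?:/(\d+))?$")


def check_cron_field(field):
    """Return a list of problems found in one cron field; empty means none found."""
    problems = []
    for element in field.split(","):
        element = element.strip()
        match = _ELEMENT.match(element)
        if match is None:
            problems.append(f"{element!r}: not a number, range, or asterisk")
            continue
        base, step = match.group(1), match.group(2)
        if step is None:
            continue
        # A step attached to a bare number (e.g. "1/10") is the case from this thread.
        if base != "*" and "-" not in base:
            problems.append(f"{element!r}: step needs a range (x-y) or asterisk")
        if int(step) == 0:
            problems.append(f"{element!r}: step must be positive")
    return problems


if __name__ == "__main__":
    for value in ("1/10", "*/10", "1-59/10"):
        print(value, check_cron_field(value))
```

Under these assumptions "1/10" is reported as a problem, while "*/10" and "1-59/10" pass.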
-Cole Bollig
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Anderson, Stuart B. <sba@xxxxxxxxxxx>
Sent: Thursday, March 2, 2023 7:55 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Remove job without condor_rm

> On Mar 2, 2023, at 5:38 PM, Anderson, Stuart B. <sba@xxxxxxxxxxx> wrote:
>
> Does anyone know how to remove a job from a schedd queue while condor_schedd is not running?
>
> I have tracked down a crondor job that is segfaulting condor_schedd at startup with a 64k deep stack trace of calls to CronTab::matchFields, so condor_rm is not an option. I have a pretty good guess of the offending jobid, but I am not sure if I should just manually edit job_queue.log to remove any line where column 2 contains the suspect job id before restarting condor, or if there is additional state to manually update?

I decided to not try and manually remove an entire job without confirmation from an expert, but removing the following line from job_queue.log has allowed condor_schedd version 9.0.7 to run again on an EL8 system:

103 0115248510.-1 CronMinute "1/10"

Thanks.

--
Stuart Anderson
sba@xxxxxxxxxxx
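
For anyone hunting for a similar entry, the following Python sketch only reports candidate lines; it does not modify anything. It assumes that every relevant record looks like the single line quoted in this thread (the literal "103", a job key, an attribute name, then the value, separated by spaces). The function name find_bare_step_lines is hypothetical, and the rest of the job_queue.log format is not covered here.

```python
import re

# Cron-related job attributes to inspect.
CRON_ATTRS = {"CronMinute", "CronHour", "CronDayOfMonth", "CronMonth", "CronDayOfWeek"}

# A step attached to a bare number, such as 1/10 (no range, no asterisk).
BARE_STEP = re.compile(r"^\d+/\d+$")


def find_bare_step_lines(lines):
    """Return (line number, job key, attribute, value) for suspect cron lines.

    Assumption: records of interest have the form shown in this thread,
    i.e. '103 <key> <attribute> <value>'.
    """
    hits = []
    for lineno, line in enumerate(lines, start=1):
        parts = line.rstrip("\n").split(" ", 3)
        if len(parts) != 4 or parts[0] != "103":
            continue
        _, key, attr, value = parts
        if attr not in CRON_ATTRS:
            continue
        elements = value.strip().strip('"').split(",")
        if any(BARE_STEP.match(e.strip()) for e in elements):
            hits.append((lineno, key, attr, value.strip()))
    return hits


if __name__ == "__main__":
    sample = ['103 0115248510.-1 CronMinute "1/10"\n']
    for hit in find_bare_step_lines(sample):
        print(hit)
```

It would be safest to run something like this against a copy of the file, with condor_schedd stopped, and to decide by hand what to do with any line it reports.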