
Re: [Condor-users] Condor SOAP hanging schedd.

Has anyone else ever seen this problem? Is there any more information I can provide?

On 16-Jun-10, at 11:41 AM, Patrick Armstrong wrote:

I've been having trouble with Condor SOAP queries hanging my schedd. I have Condor 7.5.2 installed, with a pool of about 200 workers and about 10000 jobs in the queue. Every ten minutes or so, a script of mine queries the schedd through the SOAP interface. Normally this takes about two minutes and looks like this in the log:

06/16/10 10:39:51 Received HTTP POST connection from <>
06/16/10 10:39:51 Current Socket bufsize=85k
06/16/10 10:39:51 Current Socket bufsize=49k
06/16/10 10:39:51 About to serve HTTP request...
06/16/10 10:39:51 SOAP entered getJobAds(), transaction: 0
06/16/10 10:39:53 SOAP leaving getJobAds() result=0
06/16/10 10:41:20 Completed servicing HTTP request

However, I'll occasionally see the schedd get stuck, and not do anything until I send it SIGKILL. The log looks like this:

[root@canfarpool ~]# tail /var/log/condor/SchedLog
06/16/10 10:56:20 Received UDP command 60008 (DC_CHILDALIVE) from <>, access level DAEMON
06/16/10 10:56:20 Received UDP command 60008 (DC_CHILDALIVE) from <>, access level DAEMON
06/16/10 10:56:20 Received UDP command 60008 (DC_CHILDALIVE) from <>, access level DAEMON
06/16/10 10:56:20 Received UDP command 60008 (DC_CHILDALIVE) from <>, access level DAEMON
06/16/10 10:58:11 Received HTTP POST connection from <>
06/16/10 10:58:11 Current Socket bufsize=85k
06/16/10 10:58:11 Current Socket bufsize=49k
06/16/10 10:58:11 About to serve HTTP request...
06/16/10 10:58:11 SOAP entered getJobAds(), transaction: 0
06/16/10 10:58:14 SOAP leaving getJobAds() result=0
[root@canfarpool ~]# date
Wed Jun 16 11:39:57 PDT 2010

As you can see, it's been stuck for about 40 minutes.
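
For anyone who wants to spot this condition automatically, here is a minimal sketch of a log check, not a fix for the hang itself. It assumes the SchedLog timestamp format shown above (MM/DD/YY HH:MM:SS) and that a healthy request always ends with a "Completed servicing HTTP request" line; it also assumes requests do not overlap, which is how the excerpts above look. The names parse_line and pending_request_age are made up for this example and are not part of Condor.

```python
from datetime import datetime

# Timestamp layout assumed from the SchedLog excerpts in this message.
TS_FORMAT = "%m/%d/%y %H:%M:%S"
START = "About to serve HTTP request"
DONE = "Completed servicing HTTP request"


def parse_line(line):
    """Return (timestamp, message), or None if the line has no leading timestamp."""
    parts = line.strip().split(" ", 2)
    if len(parts) < 3:
        return None
    try:
        stamp = datetime.strptime(parts[0] + " " + parts[1], TS_FORMAT)
    except ValueError:
        # Shell prompts, `date` output and similar lines end up here.
        return None
    return stamp, parts[2]


def pending_request_age(lines, now):
    """Seconds since the last HTTP request started without completing, else None."""
    started = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        stamp, message = parsed
        if message.startswith(START):
            started = stamp
        elif message.startswith(DONE):
            started = None
    if started is None:
        return None
    return (now - started).total_seconds()


if __name__ == "__main__":
    # Path taken from the tail command above; adjust for your installation.
    with open("/var/log/condor/SchedLog") as log:
        age = pending_request_age(log, datetime.now())
    if age is not None:
        print("HTTP request has been pending for %.0f seconds" % age)
```

Run from cron against the schedd log, something like this could raise an alert once a request has been pending much longer than the usual two minutes. The threshold is left to the reader.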

Has anyone else run into this?

