HTCondor Project List Archives




Re: [Condor-devel] ProcAPI Messages



Hello All,

I modified the procapi.h file so that the age, creation_time, and sample_time variables all use an unsigned long type. That seems to have fixed the ProcAPI errors we were seeing.

But now we are seeing the following in SchedLog:
5/11 14:53:38 (fd:7) (pid:57011) In DaemonCore::Create_Process(/usr/local/condor/sbin/condor_procd,...)
5/11 14:53:38 (fd:7) (pid:57011) PRIV_CONDOR --> PRIV_ROOT at daemon_core.cpp:6852
5/11 14:53:38 (fd:7) (pid:57011) PRIV_ROOT --> PRIV_CONDOR at daemon_core.cpp:6885
5/11 14:53:38 (fd:11) (pid:57011) Create Process: fork() failed: Resource temporarily unavailable (35)
5/11 14:53:38 (fd:7) (pid:57011) start_procd: unable to execute the procd
5/11 14:53:38 (fd:5) (pid:57011) Close_Pipe(pipe_end=65536) succeeded
5/11 14:53:38 (fd:5) (pid:57011) Close_Pipe(pipe_end=65537) succeeded
5/11 14:53:38 (fd:5) (pid:57011) ERROR "unable to start the ProcD" at line 620 in file proc_family_proxy.cpp

I am not sure what to do at this point.

Ideas / Suggestions?

TIA





Jim Summers wrote:
Hello All,

We are trying to get condor 7.2.2 running on an Apple Mac Pro ($CondorPlatform: I386-OSX_10_4 $). It is running an updated Leopard.

We keep seeing the following in the MasterLog:

4/29 18:31:24 The STARTD (pid 32380) exited with status 0
4/29 18:31:24 ProcAPI sanity failure on pid 32380, age = -1980177591
4/29 18:31:24 ProcAPI sanity failure on pid 32383, age = -1980177591
4/29 18:31:24 ProcAPI sanity failure on pid 32386, age = -1980177591
4/29 18:31:24 restarting /usr/local/condor/sbin/condor_startd in 10 seconds
4/29 18:31:33 The SCHEDD (pid 32391) exited with status 4
4/29 18:31:33 ProcAPI sanity failure on pid 32391, age = -1980177582
4/29 18:31:33 Sending obituary for "/usr/local/condor/sbin/condor_schedd"
4/29 18:31:33 restarting /usr/local/condor/sbin/condor_schedd in 11 seconds
4/29 18:31:34 Started DaemonCore process "/usr/local/condor/sbin/condor_startd", pid and pgroup = 32396

I saw some references saying this was fixed in the 6.x condor series, but I am pretty sure that was for the Linux versions.

We are pretty sure this is keeping condor from running.

Ideas / Suggestions? I am not sure whether this is a parameter setting or a bug in the code.

TIA

--
Jim Summers
School of Computer Science-University of Oklahoma
-------------------------------------------------