> <mailto:gquinn@xxxxxxxxxxx>> wrote:
>
> Fernando,
>
> The "watchdog" pipe is created by the ProcD when it starts up, and is
> only ever deleted by Condor when the ProcD shuts down.
>
> Is it possible that something outside of Condor is deleting the pipe? We
> have seen problems like this before with programs like tmpwatch
> (although I guess it's doubtful that tmpwatch is running over your
> /home/condor/hosts/wolf10/log/ directory).
>
> Come to think of it, /home/condor/hosts/wolf10/log sounds like it could
> be on NFS. It's perfectly fine to have your LOG directory on NFS, but in
> that case you must have a separate local LOCK directory (where things
> like the ProcD's pipes are stored). Please make sure that your LOCK
> setting refers to a local directory.
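
One rough way to check both points above (is the directory on a network filesystem, and are the ProcD pipes actually there) is sketched below. This is only an illustration, not part of Condor: it assumes Linux with a readable /proc/mounts, the function names and the NETWORK_FS_TYPES set are made up for this example, and the pipe names are simply the ones shown in the log and listing in this thread. The directory to pass in is whatever `condor_config_val LOCK` reports on the affected node.

```python
import os
import stat

# Filesystem types treated as network-mounted. This list is an assumption
# made for this sketch; extend it to match your site.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smbfs", "afs", "lustre", "gpfs"}


def is_fifo(path):
    """Return True if path exists and is a named pipe (FIFO)."""
    try:
        return stat.S_ISFIFO(os.stat(path).st_mode)
    except OSError:
        return False


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point containing path.

    mounts_text is the content of /proc/mounts. The path should already be
    absolute with symlinks resolved. Mount points containing spaces (escaped
    as \\040 in /proc/mounts) are not handled in this sketch.
    """
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        # Compare with trailing slashes so /home does not match /homework.
        prefix = mount_point.rstrip("/") + "/"
        if (path.rstrip("/") + "/").startswith(prefix):
            if len(mount_point) > best_len:
                best_len, best_type = len(mount_point), fs_type
    return best_type


def report(lock_dir, daemon="STARTD"):
    """Print the filesystem type of lock_dir and whether the pipes exist."""
    with open("/proc/mounts") as f:
        fs_type = fs_type_for(os.path.realpath(lock_dir), f.read())
    if fs_type is None:
        kind = "unknown"
    elif fs_type in NETWORK_FS_TYPES:
        kind = "network"
    else:
        kind = "local"
    print(f"{lock_dir}: filesystem type {fs_type} ({kind})")
    for name in (f"procd_pipe.{daemon}", f"procd_pipe.{daemon}.watchdog"):
        state = "FIFO present" if is_fifo(os.path.join(lock_dir, name)) else "missing"
        print(f"  {name}: {state}")
```

Calling `report("/home/condor/hosts/wolf10/log")` on a failing node and on a healthy one should show whether the two differ in filesystem type or only in the missing pipes.
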
>
> Thanks,
>
> Greg Quinn
> Condor Team
>
> Fernando Rannou wrote:
> > Hello,
> > I'm getting the following error in one of the StarterLog files:
> > ------------------------
> > 1/28 11:20:04 About to exec /home/mpetct/sampproc --universal
> > 1/28 11:20:04 error opening watchdog pipe
> > /home/condor/hosts/wolf10/log/procd_pipe.STARTD.watchdog: No such file
> > or directory (2)
> > 1/28 11:20:04 ProcFamilyClient: error initializing LocalClient
> > 1/28 11:20:04 ProcFamilyProxy: error initializing ProcFamilyClient
> > 1/28 11:20:04 ERROR "ProcD has failed" at line 599 in file
> > proc_family_proxy.C
> > 1/28 11:20:04 ShutdownFast all jobs.
> > --------------------------
> > Clearly the "pipe" files are not there. What should I do?
> > We restarted Condor on all nodes, but the files did not appear.
> >
> > This has happened on a couple of nodes. All other nodes do have the
> > watchdog file:
> >
> > prw-rw---- 1 root isl 0 Nov 4 16:08 procd_pipe.STARTD
> > prw-rw---- 1 root isl 0 Nov 4 16:08 procd_pipe.STARTD.watchdog
> > -
> > Thanks
> >
> > Fernando
> _______________________________________________
> Condor-users mailing list
> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx