Re: [Condor-users] Problems with version 7.4.2
- Date: 29 Jun 2010 14:07:00 -0500
- From: "Todd Tannenbaum" <tannenba@xxxxxxxxxxx>
- Subject: Re: [Condor-users] Problems with version 7.4.2
>Unless you have a need for globus, I highly recommend "going native" on
>fedora (11, 12, or 13). It is a "proper" build linking against the
>system distro'd libs.
>
>yum list condor
>yum install condor
>
Agreed, but note that, in addition to lacking grid universe/Globus support, I think the Fedora yum package currently also lacks the standard universe and perhaps a couple of other grid types.
Regards,
Todd
>Cheers,
>Tim
>
>On Tue, 2010-06-29 at 11:57 +0100, Alan wrote:
> Sounds like a similar issue reported here:
>
>
> http://www.escience.cam.ac.uk/projects/camgrid/upgrade.html
>
>
> Alan
>
> On Tue, Jun 29, 2010 at 10:47, Diana Lousa <dlousa@xxxxxxxxxxx> wrote:
> Hello,
>
> We have installed Condor version 7.4.2 on a cluster composed
> of machines running Fedora and Ubuntu 10.04. Our installation
> is in shared directories, and we have different binaries for
> Fedora and Ubuntu
> (condor-7.4.2-linux-x86-rhel3-dynamic and
> condor-7.4.2-linux-x86-debian50-dynamic, respectively). We
> also have the home dir of condor and the configuration files
> in a shared directory. The local dir of our central
> manager/dedicated sched is in a local directory, and for all
> the other machines it is in a shared directory. We have been
> experiencing some serious problems:
>
> 1. The condor_submit command hangs:
> Sometimes when I submit jobs, condor_submit gets stuck.
> Although the job enters the queue, the command doesn't exit and
> I have to kill it with Ctrl+C.
>
> 2. Jobs return to Idle state and can't be removed:
> One of our users has jobs that return to the Idle state after
> they terminate or die. He then tries to remove these jobs from
> the queue, but that action causes Condor to go crazy: condor_q
> stops responding and shows the message:
> -- Failed to fetch ads from: <192.168.127.3:39790> :
> zyon.itqb.unl.pt
> and then all the jobs die.
>
> It is worth pointing out that everything works fine when we
> use an older version of Condor (6.8.4) on our central
> manager/dedicated sched. However, we only have Fedora binaries
> for this version, which means that we cannot run it on a
> machine with Ubuntu (due to library incompatibility), and our
> goal is to have a machine with Ubuntu 10.04 as central
> manager/dedicated sched.
>
> Can anyone help?
>
>
>