Re: [Condor-users] Condor Problems on Linux (Fedora Core 3)
- Date: Thu, 25 Aug 2005 15:53:20 -0400
- From: Avi Flamholz <flamholz@xxxxxxxxx>
- Subject: Re: [Condor-users] Condor Problems on Linux (Fedora Core 3)
I figured out the problem: it was the Linux 2.6 kernel bug with
Condor 6.6.10. Condor thought that there was not enough memory on the
target machines to execute the commands. To fix this, I switched to
Condor 6.7.10.
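
For anyone hitting the same symptom, the giveaway in the condor_status output quoted below is the Mem column reading 1 for every slot. Here is a minimal Python sketch that flags such rows. It assumes the column layout shown in that output; the function name low_memory_slots and the 16 MB threshold are hypothetical choices for illustration, not anything Condor defines.

```python
import re

# Matches a slot row laid out like the condor_status output in this thread:
#   vm1@machine2 LINUX INTEL Unclaimed Idle 0.000 1 0+00:15:04
# Header and summary rows do not match because LoadAv must look like "0.000".
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<opsys>\S+)\s+(?P<arch>\S+)\s+"
    r"(?P<state>\S+)\s+(?P<activity>\S+)\s+(?P<loadav>\d+\.\d+)\s+"
    r"(?P<mem>\d+)\s+(?P<actvtytime>\S+)$"
)


def low_memory_slots(status_text, min_mb=16):
    """Return (slot name, Mem) pairs for rows reporting less than min_mb.

    min_mb is an arbitrary illustrative threshold, not a Condor setting.
    """
    flagged = []
    for line in status_text.splitlines():
        match = ROW.match(line.strip())
        if match and int(match.group("mem")) < min_mb:
            flagged.append((match.group("name"), int(match.group("mem"))))
    return flagged


sample = """\
Name OpSys Arch State Activity LoadAv Mem ActvtyTime

vm1@machine2 LINUX INTEL Unclaimed Idle 0.000 1 0+00:15:04
vm2@machine2 LINUX INTEL Unclaimed Idle 0.000 1 0+00:20:05
"""

print(low_memory_slots(sample))
```

A slot that advertises this little memory is unlikely to match any job, which fits the jobs sitting idle in the queue.
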
Which macros would I have to set to make 6.6.10 work on these boxes?
I know MEMORY is one of them, but which macros set VirtualMemory and
TotalVirtualMemory? I tried VIRTUAL_MEMORY and TOTAL_VIRTUAL_MEMORY,
but they did not work.
Thanks
-Avi
On 8/25/05, Jaime Frey <jfrey@xxxxxxxxxxx> wrote:
> On Aug 24, 2005, at 10:39 AM, Avi Flamholz wrote:
>
> > I am working on installing Condor on a server farm of Linux machines.
> > As a test, I took two of them and I am trying to install there first,
> > as this is my first time dealing with Condor. I have already completed
> > a test install on a pool of 3 Solaris machines, which worked fine, but
> > with the Fedora machines I encounter problems.
> >
> > The problem manifests itself as follows:
> >
> > I have machines 1&2. 1 is configured as the central manager and a
> > submit machine, 2 is configured as an execute and submit machine. Each
> > machine has multiple processors. When I run condor_status I get:
> >
> > ----------------------------------------------------------------------
> > $ condor_status
> >
> > Name          OpSys  Arch   State      Activity  LoadAv  Mem  ActvtyTime
> >
> > vm1@machine2  LINUX  INTEL  Unclaimed  Idle      0.000     1  0+00:15:04
> > vm2@machine2  LINUX  INTEL  Unclaimed  Idle      0.000     1  0+00:20:05
> > vm3@machine2  LINUX  INTEL  Unclaimed  Idle      0.000     1  0+00:20:06
> > vm4@machine2  LINUX  INTEL  Unclaimed  Idle      0.000     1  0+00:20:07
> >
> >              Machines  Owner  Claimed  Unclaimed  Matched  Preempting
> >
> > INTEL/LINUX         4      0        0          4        0           0
> >
> >       Total         4      0        0          4        0           0
> > ----------------------------------------------------------------------
> >
> > However, when I run condor_findhost, I get:
> > ----------------------------------------------------------------------
> > $ condor_findhost
> > Warning: Found no submitters
> >
> > ERROR: 1 machines not available
> > ----------------------------------------------------------------------
> >
> > And when I run condor_submit, the job waits in the queue indefinitely.
> >
> > Does anyone know what the issue might be?
>
> Try running condor_q -analyze on your queued jobs.
>
> +----------------------------------+---------------------------------+
> | Jaime Frey | Public Split on Whether |
> | jfrey@xxxxxxxxxxx | Bush Is a Divider |
> | http://www.cs.wisc.edu/~jfrey/ | -- CNN Scrolling Banner |
> +----------------------------------+---------------------------------+
>