
[Condor-users] blank condor_status and Central manager that executes



OK, I have started over after many more failures. Before I add any machines to this small pool, I am trying to get the central manager up and running. It will also be a submit and execute machine. I cannot get the machine to show up in condor_status, even though it is supposed to be an execute machine. I have been following the advice given to users with a similar problem.

 

I run

[root@checkpoint condor-7.2.4]# ./condor_install --prefix=~condor --local-dir=/root/Desktop/condor --type=manager,submit,execute --overwrite
Setting up Condor in /root/Desktop/condor-7.2.4

Condor has been installed into:
    /root/Desktop/condor-7.2.4

In order for Condor to work properly you must set your CONDOR_CONFIG
environment variable to point to your Condor configuration file:
/root/Desktop/condor-7.2.4/etc/condor_config before running Condor
commands/daemons.
Created scripts which can be sourced by users to setup their
Condor environment variables.  These are:
   sh: /root/Desktop/condor-7.2.4/condor.sh
  csh: /root/Desktop/condor-7.2.4/condor.csh

[root@checkpoint condor-7.2.4]# export CONDOR_CONFIG=/root/Desktop/condor-7.2.4/etc/condor_config
[root@checkpoint condor-7.2.4]# export PATH=/root/Desktop/condor-7.2.4/sbin:${PATH}
[root@checkpoint condor-7.2.4]# export PATH=/root/Desktop/condor-7.2.4/bin:${PATH}
[root@checkpoint condor-7.2.4]# echo $CONDOR_CONFIG /root/Desktop/condor-7.2.4/etc/condor_config
/root/Desktop/condor-7.2.4/etc/condor_config /root/Desktop/condor-7.2.4/etc/condor_config

[root@checkpoint condor-7.2.4]# cd sbin
[root@checkpoint sbin]# ./condor_restart
Sent "Restart" command to local master
[root@checkpoint sbin]# cd ..
[root@checkpoint condor-7.2.4]# condor_status

[root@checkpoint condor-7.2.4]# condor_status -any

MyType               TargetType           Name

Scheduler            None                 checkpoint.bioinformatics.ualr
DaemonMaster         None                 checkpoint.bioinformatics.ualr
Negotiator           None                 checkpoint.bioinformatics.ualr

 

I have followed the advice that was given to the user who had a very similar question: I changed the variables CONDOR_HOST, HOSTALLOW_READ, and HOSTALLOW_WRITE in the config files, as was recommended to that user. I have attached the local and global config files that I configured (only the first two parts of the global file, since those are the only parts I changed). I am stuck, don’t know what else to do, and need advice from those with experience.

 

The Condor install, which contains the global config file, is in Desktop, and so is the created condor directory, which contains the local config file. The machine OS is Red Hat Enterprise Linux 5.1.

 

checkpoint.bioinformatics.ualr.edu (144.167.99.210) is the name of the central manager. I used the IP address of the machine when defining CONDOR_HOST. I will provide more info if needed.


Thanks for any help. I am really at a loss for what else to try.

[root@checkpoint condor]# vi condor_config.local 

## NETWORK_INTERFACE was added manually

NETWORK_INTERFACE = 144.167.99.210


##  What machine is your central manager?

CONDOR_HOST = 144.167.99.210


##  Pathnames:
##  Where have you installed the bin, sbin and lib condor directories?

RELEASE_DIR = /root/Desktop/condor-7.2.4


##  Where is the local condor directory for each host?
##  This is where the local config file(s), logs and
##  spool/execute directories are located

LOCAL_DIR = /root/Desktop/condor


##  Mail parameters:
##  When something goes wrong with condor at your site, who should get
##  the email?

CONDOR_ADMIN = root@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


##  Full path to a mail delivery program that understands that "-s"
##  means you want to specify a subject:

MAIL = /bin/mailx


##  Network domain parameters:
##  Internet domain of machines sharing a common UID space.  If your
##  machines don't share a common UID space, set it to
##  UID_DOMAIN = $(FULL_HOSTNAME)
##  to specify that each machine has its own UID space.

UID_DOMAIN = $(FULL_HOSTNAME)


##  Internet domain of machines sharing a common file system.
##  If your machines don't use a network file system, set it to
##  FILESYSTEM_DOMAIN = $(FULL_HOSTNAME)
##  to specify that each machine has its own file system.

FILESYSTEM_DOMAIN = $(FULL_HOSTNAME)


##  The user/group ID <uid>.<gid> of the "Condor" user.
##  (this can also be specified in the environment)
##  Note: the CONDOR_IDS setting is ignored on Win32 platforms

CONDOR_IDS = 501.501


##  Condor needs to create a few lock files to synchronize access to
##  various log files.  Because of problems we've had with network
##  filesystems and file locking over the years, we HIGHLY recommend
##  that you put these lock files on a local partition on each
##  machine.  If you don't have your LOCAL_DIR on a local partition,
##  be sure to change this entry.  Whatever user (or group) condor is
##  running as needs to have write access to this directory.  If
##  you're not running as root, this is whatever user you started up
##  the condor_master as.  If you are running as root, and there's a
##  condor account, it's probably condor.  Otherwise, it's whatever
##  you've set in the CONDOR_IDS environment variable.  See the Admin
##  manual for details on this.

LOCK = /tmp/condor-lock.$(HOSTNAME)0.824360165120201

DAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD


##  Java parameters:
##  If you would like this machine to be able to run Java jobs,
##  then set JAVA to the path of your JVM binary.  If you are not
##  interested in Java, there is no harm in leaving this entry
##  empty or incorrect.

JAVA = /usr/bin/java


##  Some JVMs need to be told the maximum amount of heap memory
##  to offer to the process.  If your JVM supports this, give
##  the argument here, and Condor will fill in the memory amount.
##  If left blank, your JVM will choose some default value,
##  typically 64 MB.  The default (-Xmx) works with the Sun JVM.

JAVA_MAXHEAP_ARGUMENT = -Xmx1024m

######################################################################
##
##  condor_config
##
##  This is the global configuration file for condor.  Any settings
##  made here may potentially be overridden in the local configuration
##  file.  KEEP THAT IN MIND!  To double-check that a variable is
##  getting set from the configuration file that you expect, use
##  condor_config_val -v <variable name>
##
##  The file is divided into four main parts:
##  Part 1:  Settings you MUST customize
##  Part 2:  Settings you may want to customize
##  Part 3:  Settings that control the policy of when condor will
##           start and stop jobs on your machines
##  Part 4:  Settings you should probably leave alone (unless you
##  know what you're doing)
##
##  Please read the INSTALL file (or the Install chapter in the
##  Condor Administrator's Manual) for detailed explanations of the
##  various settings in here and possible ways to configure your
##  pool.
##
##  Unless otherwise specified, settings that are commented out show
##  the defaults that are used if you don't define a value.  Settings
##  that are defined here MUST BE DEFINED since they have no default
##  value.
##
##  Unless otherwise indicated, all settings which specify a time are
##  defined in seconds.
##
######################################################################

######################################################################
######################################################################
##
##  ######                                     #
##  #     #    ##    #####    #####           ##
##  #     #   #  #   #    #     #            # #
##  ######   #    #  #    #     #              #
##  #        ######  #####      #              #
##  #        #    #  #   #      #              #
##  #        #    #  #    #     #            #####
##
##  Part 1:  Settings you must customize:
######################################################################
######################################################################

##  What machine is your central manager?
CONDOR_HOST = 144.167.99.210

##--------------------------------------------------------------------
##  Pathnames:
##--------------------------------------------------------------------
##  Where have you installed the bin, sbin and lib condor directories?
RELEASE_DIR             = /root/Desktop/condor-7.2.4

##  Where is the local condor directory for each host?
##  This is where the local config file(s), logs and
##  spool/execute directories are located
LOCAL_DIR               = /root/Desktop/condor
#LOCAL_DIR              = $(RELEASE_DIR)/hosts/$(HOSTNAME)

##  Where is the machine-specific local config file for each host?
LOCAL_CONFIG_FILE = /root/Desktop/condor/condor_config.local

## If the local config file is not present, is it an error?
## WARNING: This is a potential security issue.
## If not specificed, the default is True
#REQUIRE_LOCAL_CONFIG_FILE = TRUE

##--------------------------------------------------------------------
##  Mail parameters:
##--------------------------------------------------------------------
##  When something goes wrong with condor at your site, who should get
##  the email?
CONDOR_ADMIN            = condor-admin@xxxxxxxxxxx

##  Full path to a mail delivery program that understands that "-s"
##  means you want to specify a subject:
MAIL                    = /usr/bin/mail

##--------------------------------------------------------------------
##  Network domain parameters:
##--------------------------------------------------------------------
##  Internet domain of machines sharing a common UID space.  If your
##  machines don't share a common UID space, set it to
##  UID_DOMAIN = $(FULL_HOSTNAME)
##  to specify that each machine has its own UID space.
UID_DOMAIN              = $(FULL_HOSTNAME)

##  Internet domain of machines sharing a common file system.
##  If your machines don't use a network file system, set it to
##  FILESYSTEM_DOMAIN = $(FULL_HOSTNAME)
##  to specify that each machine has its own file system.
FILESYSTEM_DOMAIN       = $(FULL_HOSTNAME)

##  This macro is used to specify a short description of your pool.
##  It should be about 20 characters long. For example, the name of
##  the UW-Madison Computer Science Condor Pool is ``UW-Madison CS''.
COLLECTOR_NAME          = My Pool

######################################################################
######################################################################
##
##  ######                                   #####
##  #     #    ##    #####    #####         #     #
##  #     #   #  #   #    #     #                 #
##  ######   #    #  #    #     #            #####
##  #        ######  #####      #           #
##  #        #    #  #   #      #           #
##  #        #    #  #    #     #           #######
##
##  Part 2:  Settings you may want to customize:
##  (it is generally safe to leave these untouched)
######################################################################
######################################################################

##
##  The user/group ID <uid>.<gid> of the "Condor" user.
##  (this can also be specified in the environment)
##  Note: the CONDOR_IDS setting is ignored on Win32 platforms
#CONDOR_IDS=x.x

##--------------------------------------------------------------------
##  Flocking: Submitting jobs to more than one pool
##--------------------------------------------------------------------
##  Flocking allows you to run your jobs in other pools, or lets
##  others run jobs in your pool.
##
##  To let others flock to you, define FLOCK_FROM.
##
##  To flock to others, define FLOCK_TO.

##  FLOCK_FROM defines the machines where you would like to grant
##  people access to your pool via flocking. (i.e. you are granting
##  access to these machines to join your pool).
FLOCK_FROM =
##  An example of this is:
#FLOCK_FROM = somehost.friendly.domain, anotherhost.friendly.domain

##  FLOCK_TO defines the central managers of the pools that you want
##  to flock to. (i.e. you are specifying the machines that you
##  want your jobs to be negotiated at -- thereby specifying the
##  pools they will run in.)
FLOCK_TO =
##  An example of this is:
#FLOCK_TO = central_manager.friendly.domain, condor.cs.wisc.edu

##  FLOCK_COLLECTOR_HOSTS should almost always be the same as
##  FLOCK_NEGOTIATOR_HOSTS (as shown below).  The only reason it would be
##  different is if the collector and negotiator in the pool that you are
##  flocking too are running on different machines (not recommended).
##  The collectors must be specified in the same corresponding order as
##  the FLOCK_NEGOTIATOR_HOSTS list.
FLOCK_NEGOTIATOR_HOSTS = $(FLOCK_TO)
FLOCK_COLLECTOR_HOSTS = $(FLOCK_TO)
## An example of having the negotiator and the collector on different
## machines is:
#FLOCK_NEGOTIATOR_HOSTS = condor.cs.wisc.edu, condor-negotiator.friendly.domain
#FLOCK_COLLECTOR_HOSTS =  condor.cs.wisc.edu, condor-collector.friendly.domain

##--------------------------------------------------------------------
##  Host/IP access levels
##--------------------------------------------------------------------
##  Please see the administrator's manual for details on these
##  settings, what they're for, and how to use them.

##  What machines have administrative rights for your pool?  This
##  defaults to your central manager.  You should set it to the
##  machine(s) where whoever is the condor administrator(s) works
##  (assuming you trust all the users who log into that/those
##  machine(s), since this is machine-wide access you're granting).
HOSTALLOW_ADMINISTRATOR = $(CONDOR_HOST)

##  If there are no machines that should have administrative access
##  to your pool (for example, there's no machine where only trusted
##  users have accounts), you can uncomment this setting.
##  Unfortunately, this will mean that administering your pool will
##  be more difficult.
#HOSTDENY_ADMINISTRATOR = *

##  What machines should have "owner" access to your machines, meaning
##  they can issue commands that a machine owner should be able to
##  issue to their own machine (like condor_vacate).  This defaults to
##  machines with administrator access, and the local machine.  This
##  is probably what you want.
HOSTALLOW_OWNER = $(FULL_HOSTNAME), $(HOSTALLOW_ADMINISTRATOR)

##  Read access.  Machines listed as allow (and/or not listed as deny)
##  can view the status of your pool, but cannot join your pool
##  or run jobs.
##  NOTE: By default, without these entries customized, you
##  are granting read access to the whole world.  You may want to
##  restrict that to hosts in your domain.  If possible, please also
##  grant read access to "*.cs.wisc.edu", so the Condor developers
##  will be able to view the status of your pool and more easily help
##  you install, configure or debug your Condor installation.
##  It is important to have this defined.
HOSTALLOW_READ = *
#HOSTALLOW_READ = *.your.domain, *.cs.wisc.edu
#HOSTDENY_READ = *.bad.subnet, bad-machine.your.domain, 144.77.88.*

##  Write access.  Machines listed here can join your pool, submit
##  jobs, etc.  Note: Any machine which has WRITE access must
##  also be granted READ access.  Granting WRITE access below does
##  not also automatically grant READ access; you must change
##  HOSTALLOW_READ above as well.
##
##  You must set this to something else before Condor will run.
##  This most simple option is:
##    HOSTALLOW_WRITE = *
##  but note that this will allow anyone to submit jobs or add
##  machines to your pool and is serious security risk.
HOSTALLOW_WRITE = *
#HOSTALLOW_WRITE = *.your.domain, your-friend's-machine.other.domain
#HOSTDENY_WRITE = bad-machine.your.domain

##  Negotiator access.  Machines listed here are trusted central
##  managers.  You should normally not have to change this.
HOSTALLOW_NEGOTIATOR = $(CONDOR_HOST)
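
One thing that could be checked here, offered as a sketch and not as a confirmed diagnosis: condor_status with no options lists the machine ads published by condor_startd, and condor_master only starts the daemons named in DAEMON_LIST. The local config file above sets DAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD, and the condor_status -any output shows only Scheduler, DaemonMaster and Negotiator entries. The shell function below reports whether a given daemon is named in the DAEMON_LIST line of a single config file. The name daemon_listed is a hypothetical helper, not a Condor tool. It assumes a plain one-line assignment with upper-case daemon names, and it does not expand macros or merge the global and local files the way Condor does.

```shell
#!/bin/sh
# Hypothetical helper: succeed if DAEMON is named in the last uncommented
# "DAEMON_LIST = ..." line of CONFIG_FILE. One file only, no macro expansion.
daemon_listed() {
    config_file=$1
    daemon=$2
    # Last plain DAEMON_LIST assignment in the file; keep the text after "=".
    list=$(grep -E '^[[:space:]]*DAEMON_LIST[[:space:]]*=' "$config_file" \
        | tail -n 1 | cut -d= -f2-)
    # Turn "A, B, C" into one name per line, then look for an exact match.
    printf '%s\n' "$list" | tr ',' '\n' | tr -d ' \t' | grep -qxF "$daemon"
}
```

For example, daemon_listed /root/Desktop/condor/condor_config.local STARTD would test the local file shown above, returning success only if STARTD is in the list. On a running installation, condor_config_val -v DAEMON_LIST, the check that the header of the global config file recommends, reports the value Condor actually uses.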