It looks like the condor user is not available on the destination host.
Is this a NIS-based setup? If so, make sure the condor user is available
on all hosts and that UID_DOMAIN is set to your NIS domain.
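
One quick way to check the first point is to ask the system's account database for the user directly, which is the same lookup that the getpwnam("condor") lines in the logs further down report as failing. The following is only a minimal sketch in Python; the helper name lookup_user is made up for illustration and is not part of Condor.

```python
import pwd

def lookup_user(name):
    """Return (uid, gid) for `name`, or None if the account does not resolve.

    pwd.getpwnam consults the same account database the daemons use
    (local files, NIS, ... depending on the host's name service setup).
    """
    try:
        entry = pwd.getpwnam(name)
    except KeyError:
        return None
    return entry.pw_uid, entry.pw_gid

if __name__ == "__main__":
    result = lookup_user("condor")
    if result is None:
        print("no 'condor' account is visible on this host")
    else:
        print("condor uid=%d gid=%d" % result)
```

Running it on each host in the pool should show whether the account resolves everywhere.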
Regards,
Nitin
Lizhe.Wang@xxxxxxxxxxxx wrote:
Thanks for your kind reply.
I have set START=TRUE in the local configuration file
and run the condor_reschedule command.
The problem still exists.
To provide more information,
I include the log file contents.
Thanks,
Lizhe
StartLog:
-----------------
9/7 13:56:51 Connect failed for 10 seconds; returning FALSE
9/7 13:56:51 ERROR:
SECMAN:2004:Failed to start a session with TCP
SECMAN:2003:TCP connection to <194.199.22.87:34884> failed
9/7 13:56:51 condor_write(): Socket closed when trying to write buffer
9/7 13:56:51 Buf::write(): condor_write() failed
9/7 13:56:51 SECMAN: Error sending response classad!
9/7 13:56:51 Our parent process (pid 17196) went away; shutting down
9/7 13:56:51 Can't connect to <194.199.22.87:9618>:0, errno = 111
9/7 13:56:51 Will keep trying for 10 seconds...
9/7 13:57:01 Connect failed for 10 seconds; returning FALSE
9/7 13:57:01 ERROR:
SECMAN:2003:TCP connection to <194.199.22.87:9618> failed
9/7 13:57:01 Error sending update to the collector HEAVEN.inrialpes.fr
<194.199.22.87:9618>: Failed to send UDP update command to collector
9/7 13:57:01 Error sending update to collector(s)
9/7 13:57:01 Got SIGTERM. Performing graceful shutdown.
9/7 13:57:01 shutdown graceful
9/7 13:57:01 Deleting Cronmgr
9/7 13:57:01 Can't connect to <194.199.22.87:9618>:0, errno = 111
9/7 13:57:01 Will keep trying for 10 seconds...
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 ******************************************************
9/7 13:57:14 ** condor_startd (CONDOR_STARTD) STARTING UP
9/7 13:57:14 ** /home/lwang/condor/install/sbin/condor_startd
9/7 13:57:14 ** $CondorVersion: 6.6.10 Jun 13 2005 $
9/7 13:57:14 ** $CondorPlatform: I386-LINUX_RH9 $
9/7 13:57:14 ** PID = 17728
9/7 13:57:14 ******************************************************
9/7 13:57:14 Using config file: /home/lwang/condor/install/etc/condor_config
9/7 13:57:14 Using local config
files: /home/lwang/condor/install/hosts/HEAVEN/condor_config.local
9/7 13:57:14 DaemonCore: Command Socket at <194.199.22.87:35036>
9/7 13:57:20 New machine resource allocated
9/7 13:57:20 About to run initial benchmarks.
9/7 13:57:26 Completed initial benchmarks.
9/7 13:57:26 State change: IS_OWNER is false
9/7 13:57:26 Changing state: Owner -> Unclaimed
---------------------------------------------------------
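
The "errno = 111" in the connect failures above is a raw system error number. On Linux, 111 normally corresponds to ECONNREFUSED, meaning nothing was accepting connections on that port at that moment. Below is a small Python sketch for decoding such lines; the function name explain_errno is an assumption made for this example, and the number-to-name mapping is platform-specific, so it should be run on the same kind of system that wrote the log.

```python
import errno
import os
import re

# Matches the "errno = N" fragment that appears in the connect failures.
ERRNO_RE = re.compile(r"errno = (\d+)")

def explain_errno(log_line):
    """Return (number, symbolic name, message) for a line containing
    'errno = N', or None if the line has no such fragment."""
    match = ERRNO_RE.search(log_line)
    if match is None:
        return None
    number = int(match.group(1))
    # errno numbering differs between platforms; fall back to "UNKNOWN".
    name = errno.errorcode.get(number, "UNKNOWN")
    return number, name, os.strerror(number)

line = "9/7 13:56:51 Can't connect to <194.199.22.87:9618>:0, errno = 111"
print(explain_errno(line))
```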
ScheduleLog:
-------------------
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 ******************************************************
9/7 13:57:14 ** condor_schedd (CONDOR_SCHEDD) STARTING UP
9/7 13:57:14 ** /home/lwang/condor/install/sbin/condor_schedd
9/7 13:57:14 ** $CondorVersion: 6.6.10 Jun 13 2005 $
9/7 13:57:14 ** $CondorPlatform: I386-LINUX_RH9 $
9/7 13:57:14 ** PID = 17729
9/7 13:57:14 ******************************************************
9/7 13:57:14 Using config file: /home/lwang/condor/install/etc/condor_config
9/7 13:57:14 Using local config
files: /home/lwang/condor/install/hosts/HEAVEN/condor_config.local
9/7 13:57:14 DaemonCore: Command Socket at <194.199.22.87:35037>
-------------------------
NegotiatorLog:
----------------------
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 ******************************************************
9/7 13:57:14 ** condor_negotiator (CONDOR_NEGOTIATOR) STARTING UP
9/7 13:57:14 ** /home/lwang/condor/install/sbin/condor_negotiator
9/7 13:57:14 ** $CondorVersion: 6.6.10 Jun 13 2005 $
9/7 13:57:14 ** $CondorPlatform: I386-LINUX_RH9 $
9/7 13:57:14 ** PID = 17727
9/7 13:57:14 ******************************************************
9/7 13:57:14 Using config file: /home/lwang/condor/install/etc/condor_config
9/7 13:57:14 Using local config
files: /home/lwang/condor/install/hosts/HEAVEN/condor_config.local
9/7 13:57:14 DaemonCore: Command Socket at <194.199.22.87:9614>
9/7 13:57:14 ACCOUNTANT_HOST = None (local)
9/7 13:57:14 NEGOTIATOR_INTERVAL = 300 sec
9/7 13:57:14 NEGOTIATOR_TIMEOUT = 30 sec
9/7 13:57:14 PREEMPTION_REQUIREMENTS = (CurrentTime - EnteredCurrentState) >
(1 * (60 * 60)) && RemoteUserPrio > SubmittorPrio * 1.2
9/7 13:57:14 PREEMPTION_RANK = (RemoteUserPrio * 1000000) - TARGET.ImageSize
9/7 13:57:14 ---------- Started Negotiation Cycle ----------
9/7 13:57:14 Phase 1: Obtaining ads from collector ...
9/7 13:57:14 Getting all public ads ...
9/7 13:57:14 Sorting 0 ads ...
9/7 13:57:14 Getting startd private ads ...
9/7 13:57:14 Got ads: 0 public and 0 private
9/7 13:57:14 Public ads include 0 submitter, 0 startd
9/7 13:57:14 Phase 2: Performing accounting ...
9/7 13:57:14 Phase 3: Sorting submitter ads by priority ...
9/7 13:57:14 Phase 4.1: Negotiating with schedds ...
9/7 13:57:14 ---------- Finished Negotiation Cycle ----------
9/7 14:02:14 ---------- Started Negotiation Cycle ----------
9/7 14:02:14 Phase 1: Obtaining ads from collector ...
9/7 14:02:14 Getting all public ads ...
9/7 14:02:14 Sorting 3 ads ...
9/7 14:02:14 Getting startd private ads ...
9/7 14:02:14 Got ads: 3 public and 1 private
9/7 14:02:14 Public ads include 0 submitter, 1 startd
9/7 14:02:14 Phase 2: Performing accounting ...
9/7 14:02:14 Phase 3: Sorting submitter ads by priority ...
9/7 14:02:14 Phase 4.1: Negotiating with schedds ...
9/7 14:02:14 ---------- Finished Negotiation Cycle ----------
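
In the two negotiation cycles above, the startd count goes from 0 to 1 while the submitter count stays at 0. For a longer log it can help to pull those counts out per cycle. This is a rough Python sketch that assumes the "Public ads include ..." wording shown above; ad_counts is a hypothetical helper name.

```python
import re

# Matches the per-cycle summary line printed by the negotiator.
ADS_RE = re.compile(r"Public ads include (\d+) submitter, (\d+) startd")

def ad_counts(log_text):
    """Return a list of (submitters, startds), one pair per summary line."""
    return [(int(sub), int(std)) for sub, std in ADS_RE.findall(log_text)]

sample = """\
9/7 13:57:14 Public ads include 0 submitter, 0 startd
9/7 14:02:14 Public ads include 0 submitter, 1 startd
"""
print(ad_counts(sample))  # [(0, 0), (0, 1)]
```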
------------------------------------
MasterLog
-------------------
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 ******************************************************
9/7 13:57:14 ** condor_master (CONDOR_MASTER) STARTING UP
9/7 13:57:14 ** /home/lwang/condor/install/sbin/condor_master
9/7 13:57:14 ** $CondorVersion: 6.6.10 Jun 13 2005 $
9/7 13:57:14 ** $CondorPlatform: I386-LINUX_RH9 $
9/7 13:57:14 ** PID = 17725
9/7 13:57:14 ******************************************************
9/7 13:57:14 Using config file: /home/lwang/condor/install/etc/condor_config
9/7 13:57:14 Using local config
files: /home/lwang/condor/install/hosts/HEAVEN/condor_config.local
9/7 13:57:14 DaemonCore: Command Socket at <194.199.22.87:35035>
9/7 13:57:14 Started DaemonCore process
"/home/lwang/condor/install/sbin/condor_collector", pid and pgroup = 17726
9/7 13:57:14 Started DaemonCore process
"/home/lwang/condor/install/sbin/condor_negotiator", pid and pgroup = 17727
9/7 13:57:14 Started DaemonCore process
"/home/lwang/condor/install/sbin/condor_startd", pid and pgroup = 17728
9/7 13:57:14 Started DaemonCore process
"/home/lwang/condor/install/sbin/condor_schedd", pid and pgroup = 17729
-------------------------------------------------------
CollectorLog:
------------
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 ******************************************************
9/7 13:57:14 ** condor_collector (CONDOR_COLLECTOR) STARTING UP
9/7 13:57:14 ** /home/lwang/condor/install/sbin/condor_collector
9/7 13:57:14 ** $CondorVersion: 6.6.10 Jun 13 2005 $
9/7 13:57:14 ** $CondorPlatform: I386-LINUX_RH9 $
9/7 13:57:14 ** PID = 17726
9/7 13:57:14 ******************************************************
9/7 13:57:14 Using config file: /home/lwang/condor/install/etc/condor_config
9/7 13:57:14 Using local config
files: /home/lwang/condor/install/hosts/HEAVEN/condor_config.local
9/7 13:57:14 DaemonCore: Command Socket at <194.199.22.87:9618>
9/7 13:57:14 In ViewServer::Init()
9/7 13:57:14 In CollectorDaemon::Init()
9/7 13:57:14 In ViewServer::Config()
9/7 13:57:14 In CollectorDaemon::Config()
9/7 13:57:14 enable: Creating stats hash table
9/7 13:57:14 (Sent 0 ads in response to query)
9/7 13:57:14 Got QUERY_STARTD_PVT_ADS
9/7 13:57:14 (Sent 0 ads in response to query)
9/7 13:57:14 WARNING: No master ad for < HEAVEN.inrialpes.fr >
9/7 13:57:14 ScheddAd : Inserting ** "< HEAVEN.inrialpes.fr ,
194.199.22.87 >"
9/7 13:57:14 stats: Inserting new hashent for
'Schedd':'HEAVEN.inrialpes.fr':'194.199.22.87'
9/7 13:57:19 ** Master < HEAVEN.inrialpes.fr > rejuvenated from recently down
9/7 13:57:19 stats: Inserting new hashent for
'Master':'HEAVEN.inrialpes.fr':'194.199.22.87'
9/7 13:57:30 StartdAd : Inserting ** "< HEAVEN.inrialpes.fr ,
194.199.22.87 >"
9/7 13:57:30 stats: Inserting new hashent for
'Start':'HEAVEN.inrialpes.fr':'194.199.22.87'
9/7 13:57:30 StartdPvtAd : Inserting ** "< HEAVEN.inrialpes.fr ,
194.199.22.87 >"
9/7 13:57:30 stats: Inserting new hashent for
'StartdPvt':'HEAVEN.inrialpes.fr':'194.199.22.87'
9/7 13:58:59 Got QUERY_STARTD_ADS
9/7 13:58:59 (Sent 1 ads in response to query)
9/7 14:02:14 (Sent 3 ads in response to query)
9/7 14:02:14 Got QUERY_STARTD_PVT_ADS
9/7 14:02:14 (Sent 1 ads in response to query)
9/7 14:07:14 (Sent 3 ads in response to query)
9/7 14:07:14 Got QUERY_STARTD_PVT_ADS
9/7 14:07:14 (Sent 1 ads in response to query)
------------------------------------
Quoting Prashant Lal <lalp@xxxxxxxxxxx>:
Do a condor_reschedule on that machine and see.
LAL
-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx on behalf of abhi
Sent: Wed 9/7/2005 4:19 PM
To: Condor-Users Mail List
Subject: RE: [Condor-users] idle jobs
Execute the following commands:
echo "START=TRUE" >> /path of machine's local file/condor_config.local
condor_reconfig
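
The same two steps can be scripted. The sketch below is only an illustration in Python: append_setting and reconfig are made-up helper names, and the path in the usage comment is a placeholder to be replaced with the machine's real local configuration file. The only external command it runs is condor_reconfig, as above.

```python
import subprocess
from pathlib import Path

def append_setting(config_path, name, value):
    """Append 'name = value' to a local config file on its own line."""
    path = Path(config_path)
    existing = path.read_text() if path.exists() else ""
    # Avoid gluing the new setting onto an unterminated last line.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    with path.open("a") as handle:
        handle.write(f"{prefix}{name} = {value}\n")

def reconfig():
    """Ask the running daemons to re-read their configuration."""
    subprocess.run(["condor_reconfig"], check=True)

# Usage (placeholder path, adjust before running):
#   append_setting("/path/to/condor_config.local", "START", "TRUE")
#   reconfig()
```

Since a later definition in the file normally takes precedence, appending at the end is enough and earlier START lines do not need to be removed.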
-----Original Message-----
From: lizhe.wang@xxxxxxxxxxxx
Sent: Wed, 7 Sep 2005 11:30:39 +0200
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] idle jobs
Dear all:
I am new to Condor and am testing it on a single machine.
This machine takes the roles of submit, execute and manager.
I installed Condor and submitted an example submission file like:
Executable = /bin/date
Log = /tmp/logr
Output = /tmp/logr.out
Queue
When I run condor_q -analyze, the output is:
0 are rejected by your job's requirements
0 reject your job because of their own requirements
0 match, but are serving users with a better priority in the pool
1 match, match, but reject the job for unknown reasons
0 match, but will not currently preempt their existing job
0 are available to run your job
I want any job to be able to run on this machine regardless of the
status of the machine.
I have set START = True in the configuration file, but it does not seem
to work.
How should I configure the file?
Any hints?
Thanks,
Lizhe
_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users