[Condor-users] A question about the Condor cluster: I cannot determine whether the submit machine is connected to the central manager
- Date: Mon, 30 Jun 2008 11:52:18 +0800
- From: "张家贞" <zhangjiazhen@xxxxxxxxxxxxxxxxxx>
- Subject: [Condor-users] A question about the Condor cluster: I cannot determine whether the submit machine is connected to the central manager
Hi condor-users,
First, thanks for reading my question. I installed Condor on one machine (cngrid219) as the central manager, in both the manager and execute roles.
It is running as follows:
[root@cngrid219 condor]# ps -ef| egrep condor
root 2720 1 0 Jun29 ? 00:00:10 condor_master
root 2721 2720 0 Jun29 ? 00:00:01 condor_collector -f
root 2722 2720 0 Jun29 ? 00:00:00 condor_negotiator -f
root 2723 2720 0 Jun29 ? 00:00:19 condor_startd -f
root 3483 3309 0 11:09 pts/0 00:00:00 grep -E condor
[root@cngrid219 condor]# condor_status
Name             OpSys  Arch   State      Activity  LoadAv  Mem   ActvtyTime
slot1@cngrid219  LINUX  INTEL  Owner      Idle      0.000   1007  0+00:05:04
slot2@cngrid219  LINUX  INTEL  Unclaimed  Idle      0.000   1007  0+01:00:09

             Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill
INTEL/LINUX      2      1        0          1        0           0         0
      Total      2      1        0          1        0           0         0
Then I installed Condor on a second machine (cngrid239) in the submit role.
It is running as follows:
[root@cngrid239 ~]# ps -ef | grep condor
condor 4550 1 3 11:53 ? 00:01:09 ./condor_master
condor 4551 4550 3 11:53 ? 00:00:56 condor_schedd -f
root 4552 4551 0 11:53 ? 00:00:00 condor_procd -A /tmp/condor-lock.cngrid2390.791864523737789/procd_pipe.SCHEDD -S 60 -C 501
When I submit 10 jobs, condor_q shows:
-- Submitter: cngrid239.localdomain : <127.0.0.1:32869> : cngrid239.localdomain
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
2.0 condor 6/30 12:08 0+00:00:00 I 0 0.0 nodejob.exe
2.1 condor 6/30 12:08 0+00:00:00 I 0 0.0 nodejob.exe
2.2 condor 6/30 12:08 0+00:00:00 I 0 0.0 nodejob.exe
2.3 condor 6/30 12:08 0+00:00:00 I 0 0.0 nodejob.exe
2.4 condor 6/30 12:08 0+00:00:00 I 0 0.0 nodejob.exe
2.5 condor 6/30 12:08 0+00:00:00 I 0 0.0 nodejob.exe
2.6 condor 6/30 12:08 0+00:00:00 I 0 0.0 nodejob.exe
2.7 condor 6/30 12:08 0+00:00:00 I 0 0.0 nodejob.exe
2.8 condor 6/30 12:08 0+00:00:00 I 0 0.0 nodejob.exe
2.9 condor 6/30 12:08 0+00:00:00 I 0 0.0 nodejob.exe
10 jobs; 10 idle, 0 running, 0 held
All of the jobs stay idle. Why? My job is very simple: it just prints some output.
Is my submit machine cngrid239 connected to the central manager cngrid219?
I installed Condor on cngrid239 using: # condor_configure --install --type=submit --local-dir=/home/condor --central-manager=cngird219.xxxx
I have pinged cngird219.xxxx, and it responds.
Can anyone tell me why the jobs are idle and not running?
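A rough sketch of checks that might narrow this down; it is not a confirmed diagnosis. Ping only shows that the host is reachable, not that the schedd has registered with the collector. The "-- Submitter:" line in the condor_q output shows the schedd at <127.0.0.1:32869>, which is a loopback address that other machines cannot connect to. The `is_loopback` helper is a hypothetical name made up for illustration, and the commented commands assume the standard Condor tools are in PATH.

```shell
#!/bin/sh
# Sketch only: test whether an address in Condor's "<ip:port>" form is loopback.

# is_loopback ADDR: succeeds (exit 0) if ADDR starts with "<127."
is_loopback() {
    case "$1" in
        "<127."*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Address copied from the "-- Submitter:" line of the condor_q output.
schedd_addr="<127.0.0.1:32869>"

if is_loopback "$schedd_addr"; then
    echo "schedd advertises a loopback address"
else
    echo "schedd advertises a non-loopback address"
fi

# On a live pool (assumes the Condor tools are installed and in PATH):
#   condor_config_val COLLECTOR_HOST   # central manager this host is configured to use
#   condor_status -schedd              # schedds currently known to the collector
#   condor_q -analyze 2.0              # reasons job 2.0 has not been matched
```

If `condor_status -schedd` does not list cngrid239, or lists it with a 127.0.0.1 address, the way the hostname cngrid239.localdomain resolves on the submit machine would be the first thing I would look at.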
Thanks!
Regards,
Jiazhen Zhang
zhangjiazhen@xxxxxxxxxxxxxxxxxx
2008-06-30