Re: [Condor-users] How to solve problem between condor and globus?
- Date: Tue, 20 Dec 2005 18:05:35 +0800
- From: "Fu-Ming Tsai" <sary357@xxxxxxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] How to solve problem between condor and globus?
Sorry, all,
After many failed attempts, I gave up and switched to NFS.
However, I still cannot submit a Globus job to Condor,
so I collected some debug information.
[sary357@osgc01 job]$ condor_q -analyze
---
4206.000: Run analysis summary. Of 4 machines,
3 are rejected by your job's requirements
0 reject your job because of their own requirements
0 match but are serving users with a better priority in the pool
1 match but reject the job for unknown reasons
0 match but will not currently preempt their existing job
0 are available to run your job
WARNING: Analysis is only meaningful for Globus universe jobs using matchmaking.
---
4207.000: Run analysis summary. Of 4 machines,
0 are rejected by your job's requirements
3 reject your job because of their own requirements
0 match but are serving users with a better priority in the pool
1 match but reject the job for unknown reasons
0 match but will not currently preempt their existing job
0 are available to run your job
Last successful match: Tue Dec 20 09:45:24 2005
Last failed match: Tue Dec 20 09:55:31 2005
Reason for last match failure: no match found
== StarterLog.vm2 ==
12/20 17:35:36 Shadow version: $CondorVersion: 6.7.7 Apr 27 2005 $
12/20 17:35:36 Submitting machine is "osgc01.grid.sinica.edu.tw"
12/20 17:35:36 ShouldTransferFiles is "NO", NOT transfering files
12/20 17:35:36 Submit UidDomain: "grid.sinica.edu.tw"
12/20 17:35:36 Local UidDomain: "grid.sinica.edu.tw"
12/20 17:35:36 Initialized user_priv as "sary357"
12/20 17:35:36 Done moving to directory "/opt/osg/osgs01/execute/dir_6591"
12/20 17:35:36 JICShadow::initIOProxy(): Job does not define WantIOProxy
12/20 17:35:36 No StarterUserLog found in job ClassAd
12/20 17:35:36 Starter will not write a local UserLog
12/20 17:35:36 Starting a VANILLA universe job with ID: 4207.0
12/20 17:35:36 In OsProc::OsProc()
12/20 17:35:36 Main job KillSignal: 15 (SIGTERM)
12/20 17:35:36 Main job RmKillSignal: 15 (SIGTERM)
12/20 17:35:36 Main job HoldKillSignal: 15 (SIGTERM)
12/20 17:35:36 in VanillaProc::StartJob()
12/20 17:35:36 in OsProc::StartJob()
12/20 17:35:36 IWD: /home/sary357/gram_scratch_tUb21E3Wqv
12/20 17:35:36 Input file: /dev/null
12/20 17:35:36 Failed to open '/home/sary357/.globus/job/osgc01.grid.sinica.edu.tw/17186.1135070994/stdout' as standard output: No such file or directory (errno 2)
12/20 17:35:36 Failed to open '/home/sary357/.globus/job/osgc01.grid.sinica.edu.tw/17186.1135070994/stderr' as standard error: No such file or directory (errno 2)
12/20 17:35:36 Failed to open some/all of the std files...
12/20 17:35:36 Aborting OsProc::StartJob.
12/20 17:35:36 Failed to start job, exiting
12/20 17:35:36 ShutdownFast all jobs.
12/20 17:35:36 Got ShutdownFast when no jobs running.
12/20 17:35:36 Removing /opt/osg/osgs01/execute/dir_6591
12/20 17:35:36 Attempting to remove /opt/osg/osgs01/execute/dir_6591 as SuperUser (root)
=========================
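To pull the failing paths out of a longer StarterLog, something like the following could be used. It is only a sketch: the function name `failed_std_files` is made up for illustration and is not part of Condor, and the pattern assumes the message wording shown in the excerpt above (including the case where the mailer wrapped the path across two lines).

```python
import re

# Matches starter log entries of the form
#   Failed to open '<path>' as standard output: <message> (errno N)
# re.S lets the quoted path span a line break introduced by mail wrapping.
FAILED_OPEN = re.compile(
    r"Failed to\s+open\s+'([^']+)'\s+as standard (output|error):"
    r"\s*(.*?)\s*\(errno (\d+)\)",
    re.S,
)


def failed_std_files(log_text):
    """Return a (stream, path, errno) tuple for each std file that failed to open."""
    results = []
    for match in FAILED_OPEN.finditer(log_text):
        raw_path, stream, _message, errno_text = match.groups()
        # Undo any line wrapping that landed inside the quoted path.
        path = re.sub(r"\s*\n\s*", "", raw_path)
        results.append((stream, path, int(errno_text)))
    return results
```

Feeding it the contents of StarterLog.vm2 should list one entry for standard output and one for standard error, each with errno 2.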
[sary357@osgc01 job]$ condor_q -better-analyze 4206
-- Submitter: osgc01.grid.sinica.edu.tw : <140.109.98.41:41846> : osgc01.grid.sinica.edu.tw
---
4206.000: Run analysis summary. Of 4 machines,
3 are rejected by your job's requirements
0 reject your job because of their own requirements
0 match but are serving users with a better priority in the pool
1 match but reject the job for unknown reasons
0 match but will not currently preempt their existing job
0 are available to run your job
The Requirements expression for your job is:
( ( target.Name == "vm2@xxxxxxxxxxxxxxxxxxxxxxxxx" ) )
    Condition                                               Machines Matched    Suggestion
    ---------                                               ----------------    ----------
1   ( ( target.Name == "vm2@xxxxxxxxxxxxxxxxxxxxxxxxx" ) )  1
WARNING: Analysis is only meaningful for Globus universe jobs using matchmaking.
[sary357@osgc01 job]$ condor_q -better-analyze 4207
-- Submitter: osgc01.grid.sinica.edu.tw : <140.109.98.41:41846> : osgc01.grid.sinica.edu.tw
Segmentation fault
I'm sure the FileDomain is the same on both machines.
According to the StarterLog, the job's stdout and stderr files cannot be opened on the
execute machine ("No such file or directory"). Does anyone know why?
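Since errno 2 (ENOENT) on opening an output file generally means that some directory in the path does not exist, one check would be to run something like the following on the execute machine as the job's user. This is a rough sketch, not a Condor or Globus tool; the helper names `deepest_existing_dir` and `describe` are hypothetical.

```python
import os


def deepest_existing_dir(path):
    """Walk up from the parent of `path` until an existing directory is found."""
    current = os.path.dirname(os.path.abspath(path))
    while not os.path.isdir(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return current


def describe(path):
    """Report whether the directory that should hold `path` exists here."""
    parent = os.path.dirname(os.path.abspath(path))
    existing = deepest_existing_dir(path)
    if existing == parent:
        writable = os.access(parent, os.W_OK)
        return f"{parent} exists (writable: {writable})"
    return f"{parent} is missing; deepest existing ancestor is {existing}"
```

Calling `describe()` with the stdout path from the StarterLog would show how far down the directory tree the execute machine can actually see, which helps tell a missing NFS mount from a missing job directory.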
BR
----------------------------------------------------------------------
"Gravitation is not responsible for people falling in love."
Fu-Ming Tsai
Academia Sinica Computing Centre, Academia Sinica
sary357@xxxxxxxxxxxxxxxxxx
------------------------------------------------------------------------