Hi Fu-Ming (I suppose this is your first name),

I can see in the condor_config that osgc01.grid.sinica.edu.tw is the central manager and that you are submitting the job to this host. I suppose you have Globus with Condor as the scheduler on it. So why do you use the Globus universe and not Vanilla?

More important are the requirements in the condor_config. Try replacing UWCS_* with TESTINGMODE_*. You can see the settings after Part 3, in this section:

#####################################################################
## This where you choose the configuration that you would like to
## use. It has no defaults so it must be defined. We start this
## file off with the UWCS_* policy.
######################################################################

I am sending you one of my config files as an example.

Pedro

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Fu-Ming Tsai
Sent: Thursday, 22 December 2005 09:44
To: Condor-Users Mail List
Subject: Re: [Condor-users] How to solve problem between condor and globus?

Hello, Pedro,

Please refer to the attached file. It's my global condor_config file. The following is my submit file.

[root@osgc01 root]# more /home/sary357/job/job4.jdl
Universe = globus
globusscheduler = osgc01.grid.sinica.edu.tw/jobmanager-condor
Executable = job4.sh
Output = job4.out
Error = job4.err
Log = job4.log
Requirements = (Name=="vm2@xxxxxxxxxxxxxxxxxxxxxxxxx")
should_transer_file = IF_NEEDED
when_to_transfer_output = ON_EXIT
Queue

[root@osgc01 root]# more /home/sary357/job/job4.sh
#!/bin/bash
/bin/hostname

Thank you for your attention!!

BR

On Wed, 21 Dec 2005 19:12:40 +0100, Pedro R. Br輍ger Taboada wrote
> I see many problems: staging, universe and expression. I need to see
> the submit file and the condor_config file. Perhaps then I can solve
> your problem.
>
> Pedro
>
> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Fu-Ming Tsai
> Sent: Tuesday, 20 December 2005 11:06
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] How to solve problem between condor and globus?
>
> Sorry, all,
> After trying so many times, I gave up and used NFS.
> However, I still cannot submit a globus job to condor,
> so I tried to get some debug information.
>
> [sary357@osgc01 job]$ condor_q -analyze
> ---
> 4206.000: Run analysis summary. Of 4 machines,
>       3 are rejected by your job's requirements
>       0 reject your job because of their own requirements
>       0 match but are serving users with a better priority in the pool
>       1 match but reject the job for unknown reasons
>       0 match but will not currently preempt their existing job
>       0 are available to run your job
>
> WARNING: Analysis is only meaningful for Globus universe jobs using
> matchmaking.
> ---
> 4207.000: Run analysis summary.
> Of 4 machines,
>       0 are rejected by your job's requirements
>       3 reject your job because of their own requirements
>       0 match but are serving users with a better priority in the pool
>       1 match but reject the job for unknown reasons
>       0 match but will not currently preempt their existing job
>       0 are available to run your job
>       Last successful match: Tue Dec 20 09:45:24 2005
>       Last failed match: Tue Dec 20 09:55:31 2005
>       Reason for last match failure: no match found
>
> == StarterLog.vm2==
> 12/20 17:35:36 Shadow version: $CondorVersion: 6.7.7 Apr 27 2005 $
> 12/20 17:35:36 Submitting machine is "osgc01.grid.sinica.edu.tw"
> 12/20 17:35:36 ShouldTransferFiles is "NO", NOT transfering files
> 12/20 17:35:36 Submit UidDomain: "grid.sinica.edu.tw"
> 12/20 17:35:36 Local UidDomain: "grid.sinica.edu.tw"
> 12/20 17:35:36 Initialized user_priv as "sary357"
>
> 12/20 17:35:36 Done moving to directory "/opt/osg/osgs01/execute/dir_6591"
>
> 12/20 17:35:36 JICShadow::initIOProxy(): Job does not define WantIOProxy
> 12/20 17:35:36 No StarterUserLog found in job ClassAd
> 12/20 17:35:36 Starter will not write a local UserLog
> 12/20 17:35:36 Starting a VANILLA universe job with ID: 4207.0
> 12/20 17:35:36 In OsProc::OsProc()
> 12/20 17:35:36 Main job KillSignal: 15 (SIGTERM)
> 12/20 17:35:36 Main job RmKillSignal: 15 (SIGTERM)
> 12/20 17:35:36 Main job HoldKillSignal: 15 (SIGTERM)
> 12/20 17:35:36 in VanillaProc::StartJob()
> 12/20 17:35:36 in OsProc::StartJob()
> 12/20 17:35:36 IWD: /home/sary357/gram_scratch_tUb21E3Wqv
> 12/20 17:35:36 Input file: /dev/null
> 12/20 17:35:36 Failed to open '/home/sary357/.globus/job/osgc01.grid.sinica.edu.tw/17186.1135070994/stdout' as standard output: No such file or directory (errno 2)
> 12/20 17:35:36 Failed to open '/home/sary357/.globus/job/osgc01.grid.sinica.edu.tw/17186.1135070994/stderr' as standard error: No such file or directory (errno 2)
> 12/20 17:35:36 Failed to open some/all of the std files...
> 12/20 17:35:36 Aborting OsProc::StartJob.
> 12/20 17:35:36 Failed to start job, exiting
> 12/20 17:35:36 ShutdownFast all jobs.
> 12/20 17:35:36 Got ShutdownFast when no jobs running.
> 12/20 17:35:36 Removing /opt/osg/osgs01/execute/dir_6591
>
> 12/20 17:35:36 Attempting to remove /opt/osg/osgs01/execute/dir_6591 as SuperUser (root)
> =========================
>
> [sary357@osgc01 job]$ condor_q -better-analyze 4206
>
> -- Submitter: osgc01.grid.sinica.edu.tw : <140.109.98.41:41846> : osgc01.grid.sinica.edu.tw
> ---
> 4206.000: Run analysis summary. Of 4 machines,
>       3 are rejected by your job's requirements
>       0 reject your job because of their own requirements
>       0 match but are serving users with a better priority in the pool
>       1 match but reject the job for unknown reasons
>       0 match but will not currently preempt their existing job
>       0 are available to run your job
>
> The Requirements expression for your job is:
>
> ( ( target.Name == "vm2@xxxxxxxxxxxxxxxxxxxxxxxxx" ) )
>
>     Condition                                                Machines Matched  Suggestion
>     ---------                                                ----------------  ----------
> 1   ( ( target.Name == "vm2@xxxxxxxxxxxxxxxxxxxxxxxxx" ) )   1
>
> WARNING: Analysis is only meaningful for Globus universe jobs using
> matchmaking.
> [sary357@osgc01 job]$ condor_q -better-analyze 4207
>
> -- Submitter: osgc01.grid.sinica.edu.tw : <140.109.98.41:41846> : osgc01.grid.sinica.edu.tw
> Segmentation fault
>
> I'm sure the FileDomain on those 2 machines is the same.
> It looks like the output file and error file cannot be created. Does
> anyone know why?
>
> BR
>
> ----------------------------------------------------------------------
> "Gravitation is not responsible for people falling in love."
>
> Fu-Ming Tsai
> Academia Sinica Computing Centre, Academia Sinica
> sary357@xxxxxxxxxxxxxxxxxx
> ------------------------------------------------------------------------
>
> _______________________________________________
> Condor-users mailing list
> Condor-users@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users

----------------------------------------------------------------------
"Gravitation is not responsible for people falling in love."

Fu-Ming Tsai
Academia Sinica Computing Centre, Academia Sinica
sary357@xxxxxxxxxxxxxxxxxx
------------------------------------------------------------------------
Attachment: condor_config
Description: Binary data