Re: [Condor-users] job submission using condor-G to gt4
- Date: Fri, 16 Dec 2005 11:21:22 -0600
- From: Dan Bradley <dan@xxxxxxxxxxxx>
- Subject: Re: [Condor-users] job submission using condor-G to gt4
Vinodh,
Comments are inline below:
Vinodh wrote:
Hi,
I am trying to submit a job using Condor-G.
The command I gave was condor_submit hi, where hi is
executable = /bin/ls
transfer_executable=false
arguments = -l
universe = grid
grid_type = gt4
globusscheduler = advaitha:8443
jobmanager_type = Fork
output = inspiral.out
error = inspiral.err
log = inspiral.log
notification = error
queue 1
This works fine, and the log is
000 (185.000.000) 12/16 12:21:59 Job submitted from host: <172.25.243.135:57464>
017 (185.000.000) 12/16 12:22:15 Job submitted to Globus
    RM-Contact: advaitha:8443
    JM-Contact: https://172.25.243.135:8443/wsrf/services/ManagedExecutableJobService?77c697d0-6e00-11da-9d7a-da23fb7f3afa
    Can-Restart-JM: 0
...
001 (185.000.000) 12/16 12:22:23 Job executing on host: gt4 advaitha:8443 Fork
...
005 (185.000.000) 12/16 12:22:31 Job terminated.
    (1) Normal termination (return value 0)
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
    0 - Run Bytes Sent By Job
    0 - Run Bytes Received By Job
    0 - Total Bytes Sent By Job
    0 - Total Bytes Received By Job
Then, in the file hi, I changed jobmanager_type to Condor, and now it
does not work.
After my submission, condor_q -ana gave this output:
186.000: Run analysis summary. Of 13 machines,
    0 are rejected by your job's requirements
    3 reject your job because of their own requirements
    0 match but are serving users with a better priority in the pool
    10 match but reject the job for unknown reasons
    0 match but will not currently preempt their existing job
    0 are available to run your job
WARNING: Analysis is only meaningful for Globus universe jobs using matchmaking.
You are not using matchmaking, since you are submitting to a specific
globusscheduler; therefore, the above analysis is not meaningful.
Then the command condor_q -ana gives
186.000: Run analysis summary. Of 13 machines,
    0 are rejected by your job's requirements
    3 reject your job because of their own requirements
    0 match but are serving users with a better priority in the pool
    10 match but reject the job for unknown reasons
    0 match but will not currently preempt their existing job
    0 are available to run your job
WARNING: Analysis is only meaningful for Globus universe jobs using matchmaking.
---
187.000: Run analysis summary. Of 13 machines,
    13 are rejected by your job's requirements
    0 reject your job because of their own requirements
    0 match but are serving users with a better priority in the pool
    0 match but reject the job for unknown reasons
    0 match but will not currently preempt their existing job
    0 are available to run your job
No successful match recorded.
Last failed match: Fri Dec 16 12:26:34 2005
Reason for last match failure: no match found
WARNING: Be advised:
No resources matched request's constraints
Check the Requirements expression below:
Requirements = (OpSys == "LINUX" && Arch == "INTEL") && (Disk >= DiskUsage) && ((Memory * 1024) >= ImageSize) && (TARGET.FileSystemDomain == MY.FileSystemDomain)
OK, this last bit _is_ meaningful, because the second job (187) is the
plain vanilla universe job that the Condor jobmanager for Globus
submitted when it received the job that Condor-G sent through the
Globus protocols.
The problem is that the requirements expression for this new job is not
matching any machines in your Condor pool. My guess is that
FileSystemDomain is responsible. Check the FileSystemDomain in the job
(with 'condor_q -l') and in the machines in your pool.
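The comparison suggested above could be scripted roughly as follows. This is only a sketch: it assumes that condor_q -l and condor_status -l print one "Attribute = value" pair per line, the helper name fsdomain is made up for illustration, and 187.0 is the vanilla job from the analysis above.

```shell
# fsdomain (hypothetical helper): read ClassAd text on stdin and print the
# value of every FileSystemDomain attribute, with surrounding quotes removed.
fsdomain() {
    awk -F' = ' 'tolower($1) == "filesystemdomain" { gsub(/"/, "", $2); print $2 }'
}

# Usage on a host with Condor installed (left as comments so that the
# snippet can be sourced anywhere):
#   condor_q -l 187.0 | fsdomain                    # value carried by the job
#   condor_status -l | fsdomain | sort | uniq -c    # values advertised by the pool
```

If the job's value does not appear among the values advertised by the pool, no machine can satisfy the TARGET.FileSystemDomain == MY.FileSystemDomain clause.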
If they are different, then this explains the problem. To solve that,
you would need to understand which filesystems the Globus job needs to
access (usually at least the filesystem containing the GASS cache where
the stdin/stdout files are). If all of these required filesystems are
accessible from the machines in your pool, then you should configure
FILESYSTEM_DOMAIN to be the same in the Condor configuration on the
gatekeeper and the machines. If the filesystems are _not_ accessible
from the machines in your pool, then there are ways of modifying the
Condor jobmanager to enable file-transfer mode, which will enable some
types of jobs to run.
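As an illustration of the first option, the setting might look like this in the local Condor configuration file on the gatekeeper and on each pool machine. The value example.domain is a placeholder, not a name from this thread; any string should do as long as it is identical everywhere and the machines really do share the filesystems in question.

```
# Same line on the gatekeeper and on every execute machine
# (example.domain is a hypothetical placeholder)
FILESYSTEM_DOMAIN = example.domain
```

After editing, the daemons need to re-read their configuration (condor_reconfig is the usual way). As far as I know, jobs already in the queue keep the FileSystemDomain they were given at submit time, so it is probably simplest to remove them and resubmit.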
The machine itself then submits another job. Both of these
jobs stay idle forever.
The log file is
000 (186.000.000) 12/16 12:26:13 Job submitted from host: <172.25.243.135:57464>
...
017 (186.000.000) 12/16 12:26:26 Job submitted to Globus
    RM-Contact: advaitha:8443
    JM-Contact: https://172.25.243.135:8443/wsrf/services/ManagedExecutableJobService?0deb5480-6e01-11da-9d7a-da23fb7f3afa
    Can-Restart-JM: 0
The output of condor_q -globus is
ID     OWNER   STATUS   MANAGER  HOST      EXECUTABLE
186.0  vinodh  PENDING  Condor   advaitha  /bin/ls
Regards,
Vinodh Kumar. G