Re: [Condor-users] job submission using condor-G to gt4
- Date: Sat, 17 Dec 2005 01:40:51 -0800 (PST)
- From: Vinodh <gvinodh1980@xxxxxxxxxxx>
- Subject: Re: [Condor-users] job submission using condor-G to gt4
Hi Dan,

You are right. The problem was with the FileSystemDomain: it is different on every node.

I have one more question. I added one more line to the job description file, "requirements = Arch == Linux", and submitted it using condor_submit. Condor-G submits this job fine.

The output of condor_q -ana is:
204.000:  Run analysis summary.  Of 13 machines,
     13 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job

WARNING:  Be advised:
    No resources matched request's constraints
    Check the Requirements expression below:

Requirements = (Arch == Linux)
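
(A minimal sketch of a corrected line, assuming the machines advertise the string values shown in the default expression further below; Arch and OpSys are string-valued ClassAd attributes, so an unquoted Linux is read as an attribute name rather than a value:)

# hypothetical corrected submit-file line; the quoted values are assumed
# to match what the pool's machines actually advertise
requirements = (Arch == "INTEL") && (OpSys == "LINUX")
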
The Globus job is submitted as a vanilla universe job on another machine.

The output of condor_q -ana on that machine is:
011.000:  Run analysis summary.  Of 13 machines,
     13 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
        No successful match recorded.
        Last failed match: Sat Dec 17 15:07:14 2005
        Reason for last match failure: no match found

WARNING:  Be advised:
    No resources matched request's constraints
    Check the Requirements expression below:

Requirements = (OpSys == "LINUX" && Arch == "INTEL") && (Disk >= DiskUsage) && ((Memory * 1024) >= ImageSize) && (TARGET.FileSystemDomain == MY.FileSystemDomain)
My question is: how do I change these requirements, and where does Condor pick these values up from?
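
(For what it's worth, a sketch of how the two sides of that expression can be inspected: attributes such as Disk, Memory, Arch, OpSys and TARGET.FileSystemDomain come from the machine ClassAds, while DiskUsage, ImageSize and MY.FileSystemDomain come from the job's own ClassAd. The job id 11.0 below is taken from the analysis above:)

# full ClassAd of the vanilla job, including Requirements and FileSystemDomain
condor_q -l 11.0

# ClassAds advertised by the machines in the pool (Arch, OpSys, FileSystemDomain, ...)
condor_status -l
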
Regards,
Vinodh Kumar. G
--- Dan Bradley <dan@xxxxxxxxxxxx> wrote:
> Vinodh,
>
> Comments are inline below:
>
> Vinodh wrote:
>
> >Hi,
> >
> >I am trying to submit a job using Condor-G.
> >
> >The command I gave was "condor_submit hi", where hi is:
> >
> >executable = /bin/ls
> >transfer_executable=false
> >arguments = -l
> >universe = grid
> >grid_type = gt4
> >globusscheduler = advaitha:8443
> >jobmanager_type = Fork
> >output = inspiral.out
> >error = inspiral.err
> >log = inspiral.log
> >notification = error
> >queue 1
> >
> >
> >This is working fine, and the log is:
> >
> >000 (185.000.000) 12/16 12:21:59 Job submitted from host: <172.25.243.135:57464>
> >017 (185.000.000) 12/16 12:22:15 Job submitted to Globus
> >    RM-Contact: advaitha:8443
> >    JM-Contact: https://172.25.243.135:8443/wsrf/services/ManagedExecutableJobService?77c697d0-6e00-11da-9d7a-da23fb7f3afa
> >    Can-Restart-JM: 0
> >...
> >001 (185.000.000) 12/16 12:22:23 Job executing on host: gt4 advaitha:8443 Fork
> >...
> >005 (185.000.000) 12/16 12:22:31 Job terminated.
> >    (1) Normal termination (return value 0)
> >        Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
> >        Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
> >        Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
> >        Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
> >    0  -  Run Bytes Sent By Job
> >    0  -  Run Bytes Received By Job
> >    0  -  Total Bytes Sent By Job
> >    0  -  Total Bytes Received By Job
> >
> >Then, in the file hi, I changed jobmanager_type to
> >Condor, and now it does not work.
> >
> >After my submission, condor_q -ana gave this output:
> >
> >186.000:  Run analysis summary.  Of 13 machines,
> >      0 are rejected by your job's requirements
> >      3 reject your job because of their own requirements
> >      0 match but are serving users with a better priority in the pool
> >     10 match but reject the job for unknown reasons
> >      0 match but will not currently preempt their existing job
> >      0 are available to run your job
> >
> >WARNING: Analysis is only meaningful for Globus universe jobs using matchmaking.
> >
>
> You are not using matchmaking, since you are submitting to a specific
> globusscheduler; therefore, the above analysis is not meaningful.
>
> >Then the command condor_q -ana gives:
> >
> >186.000:  Run analysis summary.  Of 13 machines,
> >      0 are rejected by your job's requirements
> >      3 reject your job because of their own requirements
> >      0 match but are serving users with a better priority in the pool
> >     10 match but reject the job for unknown reasons
> >      0 match but will not currently preempt their existing job
> >      0 are available to run your job
> >
> >WARNING: Analysis is only meaningful for Globus universe jobs using matchmaking.
> >---
> >187.000:  Run analysis summary.  Of 13 machines,
> >     13 are rejected by your job's requirements
> >      0 reject your job because of their own requirements
> >      0 match but are serving users with a better priority in the pool
> >      0 match but reject the job for unknown reasons
> >      0 match but will not currently preempt their existing job
> >      0 are available to run your job
> >        No successful match recorded.
> >        Last failed match: Fri Dec 16 12:26:34 2005
> >        Reason for last match failure: no match found
> >
> >WARNING: Be advised:
> >    No resources matched request's constraints
> >    Check the Requirements expression below:
> >
> >Requirements = (OpSys == "LINUX" && Arch == "INTEL") && (Disk >= DiskUsage) && ((Memory * 1024) >= ImageSize) && (TARGET.FileSystemDomain == MY.FileSystemDomain)
> >
>
> OK, this last bit _is_ meaningful, because the second job is the plain
> vanilla universe job that was submitted by the Condor jobmanager for
> Globus when it received the job that Condor-G submitted through the
> Globus protocols.
>
> The problem is that the requirements expression for this new job is not
> matching any machines in your Condor pool. My guess is that
> FileSystemDomain is responsible. Check the FileSystemDomain in the job
> (with 'condor_q -l') and in the machines in your pool.
>
> If they are different, then this explains the problem. To solve that,
> you would need to understand which filesystems the Globus job needs to
> access (usually at least the filesystem containing the GASS cache where
> the stdin/stdout files are). If all of these required filesystems are
> accessible from the machines in your pool, then you should configure
> FILESYSTEM_DOMAIN to be the same in the Condor configuration on the
> gatekeeper and the machines. If the filesystems are _not_ accessible
> from the machines in your pool, then there are ways of modifying the
> Condor jobmanager to enable file-transfer mode, which will enable some
> types of jobs to run.
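
(A rough sketch of the check and the configuration change described above; the job id is taken from the 187.000 analysis quoted earlier, and the domain name is only a placeholder:)

# compare the job's FileSystemDomain with what the machines advertise
condor_q -l 187.0 | grep FileSystemDomain
condor_status -l | grep FileSystemDomain

# if the needed filesystems are visible from all machines, set the same
# value in condor_config on the gatekeeper and on the execute machines
# (placeholder value)
FILESYSTEM_DOMAIN = your.filesystem.domain
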
>
>
> >
> >The machine itself submits another job. Both of these jobs stay idle forever.
> >
> >The log file is:
> >
> >000 (186.000.000) 12/16 12:26:13 Job submitted from host: <172.25.243.135:57464>
> >...
> >017 (186.000.000) 12/16 12:26:26 Job submitted to Globus
> >    RM-Contact: advaitha:8443
> >    JM-Contact: https://172.25.243.135:8443/wsrf/services/ManagedExecutableJobService?0deb5480-6e01-11da-9d7a-da23fb7f3afa
> >    Can-Restart-JM: 0
> >
> >
> >The output of condor_q -globus is:
> >
> > ID      OWNER      STATUS    MANAGER  HOST       EXECUTABLE
> >186.0    vinodh     PENDING   Condor   advaitha   /bin/ls
> >
> >Regards,
> >Vinodh Kumar. G
> >
=== message truncated ===