More to add on this troubleshooting:

I intentionally mistyped the submission file, because I was unable to run
condor_q -better, in order to see all the requirements of my job. I got the
message below. As you can see, I never stipulated any requirement about the
amount of memory in my description file. Where are these settings coming from?

Any input will be much appreciated.

Alex

Please see below:

Submitting job(s)
ERROR: Parse error in expression:
    Requirements = (((Arch == "INTEL" && OpSys == "WINNT51") ||
                     (Arch == " INTEL" && OpSys == "WINNT52"))) &&
                   (Disk >= DiskUsage) &&
                   ( (Memory * 1024) >= ImageSize ) &&
                   (HasFileTransfer) &&
                   (HasWindowsRunAsOwner && (LocalCredd =?= "centralmanager.domain.com:9620"))
    ^^^
Error in submit file

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Alas, Alex [FEDI]

Again, hello to all of you,

In addition to my previous
e-mail I ran the condor_q -analyze and the results are:

    084.049:  Run analysis summary.  Of 20 machines,
         19 are rejected by your job's requirements
          0 reject your job because of their own requirements
          1 match but are serving users with a better priority in the pool
          0 match but reject the job for unknown reasons
          0 match but will not currently preempt their existing job
          0 are available to run your job

When I run the condor_status I
have the following results:

C:\WINDOWS\system32>condor_status

Name                    OpSys    Arch   State      Activity  LoadAv  Mem   ActvtyTime

Computer1.domain.com    WINNT51  INTEL  Unclaimed  Idle      0.060   1022  0+00:45:03
Computer2.domain.com    WINNT51  INTEL  Unclaimed  Idle      0.230   1022  0+00:00:49
slot1@xxxxxxxxxxxxxxxx  WINNT51  INTEL  Unclaimed  Idle      0.000   1022  5+22:33:03
slot2@xxxxxxxxxxxxxxxx  WINNT51  INTEL  Unclaimed  Idle      0.030   1022  0+02:30:05
slot1@xxxxxxxxxxxxxxxx  WINNT52  INTEL  Unclaimed  Idle      0.000   511   2+20:21:17
slot2@xxxxxxxxxxxxxxxx  WINNT52  INTEL  Unclaimed  Idle      0.000   511   0+00:20:05
slot3@xxxxxxxxxxxxxxxx  WINNT52  INTEL  Unclaimed  Idle      0.000   511   2+20:21:19
slot4@xxxxxxxxxxxxxxxx  WINNT52  INTEL  Unclaimed  Idle      0.000   511   2+20:21:20
slot1@xxxxxxxxxxxxxxxx  WINNT52  INTEL  Unclaimed  Idle      0.000   511   2+21:24:31
slot2@xxxxxxxxxxxxxxxx  WINNT52  INTEL  Unclaimed  Idle      0.000   511   2+21:28:45
slot3@xxxxxxxxxxxxxxxx  WINNT52  INTEL  Unclaimed  Idle      0.000   511   0+02:30:06
slot4@xxxxxxxxxxxxxxxx  WINNT52  INTEL  Unclaimed  Idle      0.000   511   2+21:33:45
slot1@xxxxxxxxxxxxx     WINNT52  INTEL  Unclaimed  Idle      0.000   511   2+20:26:28
slot2@xxxxxxxxxxxxx     WINNT52  INTEL  Unclaimed  Idle      0.000   511   0+00:25:05
slot3@xxxxxxxxxxxxx     WINNT52  INTEL  Unclaimed  Idle      0.000   511   2+20:26:30
slot4@xxxxxxxxxxxxx     WINNT52  INTEL  Unclaimed  Idle      0.000   511   2+20:26:31
slot1@xxxxxxxxxxxxxxxx  WINNT52  INTEL  Unclaimed  Idle      0.000   511   0+03:35:41
slot2@xxxxxxxxxxxxxxxx  WINNT52  INTEL  Unclaimed  Idle      0.000   511   0+03:35:42
slot3@xxxxxxxxxxxxxxxx  WINNT52  INTEL  Unclaimed  Idle      0.050   511   0+03:35:43
slot4@xxxxxxxxxxxxxxxx  WINNT52  INTEL  Unclaimed  Idle      0.000   511   0+00:25:07

               Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill

INTEL/WINNT51      4      0        0          4        0           0         0
INTEL/WINNT52     16      0        0         16        0           0         0

        Total     20      0        0         20        0           0         0

Unfortunately, I am not a condor
expert enough to fully understand what this message is trying to tell me or
how best to interpret it. Also, when I tried to run condor_q -better, I got
the following message:

    Sorry, the -better-analyze option is not available on this platform.

From the analysis, I now know there is something wrong with my job's
requirements that is preventing the job from matching other nodes, but I don't
know what. If anyone has experienced a similar issue and knows more or less
how to get it to work, I would really appreciate your input.

Alex

From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Alas, Alex [FEDI]

Hello to all of you,

I have a little issue with a type of job I am trying to submit. I have a
condor pool of 20 nodes. I initially upgraded the whole pool to version 7.05,
but after reading about the issues that version was having with pre-empting
jobs, I decided to downgrade the central manager to version 7.01. The
description file is as follows:

#########################################################################################
# Description file for Batch File for TESTING purposes
#########################################################################################
universe = vanilla
requirements = (Arch == "INTEL" && OpSys == "WINNT51") || \
               (Arch == "INTEL" && OpSys == "WINNT52")
getenv = True
notify_user=usename@xxxxxxxxxx
initialdir = c:\condor\execute_bk
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
Transfer_input_files = c:\windows\system32\systeminfo.exe
run_as_owner = true
executable = Batch4testv2.bat
output = Batch4testv3.out.$(Process)
error = Batch4testv3.err.$(Process)
log = Batch4testv3.log
queue 10
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

If the job is submitted like that, it will only run on one
machine. If I omit the run_as_owner line, it runs fine on all the different
nodes, so that job is not a problem once the line is removed. But this condor
project was originally implemented to run jobs over network shares. For that,
I configured the pool to have a credd_host (which is the central manager) and
then created a condoruser account with read access and limited rights to run
those jobs. I stored the condor_pool and condoruser credentials/passwords on
all the computers set up as execute machines. When I run
condor_store_cred query -c and condor_store_cred query -u condoruser, all the
computers come back saying: A credential is stored and is valid.

The description file is attached next. When I try to run this type of job, it
only runs on one computer, the same computer as the other jobs. If I remove
the RUN_AS_OWNER line, the central manager tries to match the job with all the
pool's nodes, but the job errors out with: Logon failure: unknown user name or
bad password.

Does anyone have any idea which log I should look into to find
answers? Any suggestions to solve this issue are more than welcome.

Thanks in advance for your input,

Alex

###################################################
## DESCRIPTION FILE FOR CONDOR JOBS
## PREPARED BY ALEX ALAS
###################################################
UNIVERSE = VANILLA
REQUIREMENTS = (Arch == "INTEL" && OpSys == "WINNT51") || \
               (Arch == "INTEL" && OpSys == "WINNT52")
GETENV = TRUE
NOTIFY_USER = username@xxxxxxxxxx
INITIALDIR = c:\condor\execute_bk
SHOULD_TRANSFER_FILES = YES
WHEN_TO_TRANSFER_OUTPUT = ON_EXIT
TRANSFER_INPUT_FILES = \\fileserver\Sharedfolder1\Sharedfolder2\Sharedfolder3\lasEnvelop.exe
RUN_AS_OWNER = TRUE
EXECUTABLE = \\fileserver\Sharedfolder1\Sharedfolder2\Sharedfolder3\Batchfile_lasEnvelop1.bat
OUTPUT = Batchfile_lasEnvelop1.out.$(Process)
ERROR = Batchfile_lasEnvelop1.err.$(Process)
LOG = Batchfile_lasEnvelop1.log
QUEUE 25

Respectfully,

Alex Alas
Systems Administrator
Tel. 301-948-8550 x219
Fax 301-963-2064
E-mail: aalas@xxxxxxxxxxxxx
Website: http://www.fugroearthdata.com
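
One way to see where the extra settings in the parse error come from is to break the expanded Requirements expression into its top-level clauses and compare them with the requirements line in the description file. What follows is a minimal Python sketch and not part of the original messages. The helper name split_top_level is hypothetical, the expression is copied from the error output quoted in this thread (whitespace normalized), and labelling the first clause as the user's own is an assumption based on its resemblance to the description file. condor_submit appends default clauses to a job's Requirements at submit time, which would account for the Disk, Memory and HasFileTransfer terms. That the HasWindowsRunAsOwner/LocalCredd clause is tied to run_as_owner is an inference from the behaviour described in the messages, not something they confirm.

```python
def split_top_level(expr, op="&&"):
    """Split expr at every op that sits outside parentheses and string literals."""
    parts = []
    depth = 0        # current parenthesis nesting level
    in_str = False   # True while inside a double-quoted string
    start = 0        # index where the current clause begins
    i = 0
    while i < len(expr):
        ch = expr[i]
        if ch == '"':
            in_str = not in_str
        elif not in_str:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif depth == 0 and expr.startswith(op, i):
                parts.append(expr[start:i].strip())
                i += len(op)
                start = i
                continue
        i += 1
    parts.append(expr[start:].strip())
    return parts


# Expression as printed in the parse error (whitespace normalized).
EXPANDED = (
    '(((Arch == "INTEL" && OpSys == "WINNT51") || '
    '(Arch == " INTEL" && OpSys == "WINNT52"))) && '
    '(Disk >= DiskUsage) && ((Memory * 1024) >= ImageSize) && '
    '(HasFileTransfer) && '
    '(HasWindowsRunAsOwner && (LocalCredd =?= "centralmanager.domain.com:9620"))'
)

clauses = split_top_level(EXPANDED)
for n, clause in enumerate(clauses):
    # Assumption: only the first clause corresponds to the description file.
    origin = "matches the description file" if n == 0 else "not in the description file"
    print(f"{n}: {clause}  <- {origin}")
```

Each clause that is not in the description file could then be tested on its own against the pool, for example with condor_status -constraint, to find which one only a single machine satisfies.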