
Re: [HTCondor-users] Unable to run a standard universe job.



Hi Michael,

From the analyze output, it looks like that machine is rejecting your job. I would either check the START expression on that machine directly (1) or run a reverse analysis with condor_q (2) to find out why.

1: condor_config_val -name bane.hq.ierustech.com -v START
2: condor_q 183.0 --better-analyze -reverse -machine bane.hq.ierustech.com
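
If it is easier to run both checks together, they could be wrapped in a small script along these lines. This is only a sketch: it assumes the HTCondor command-line tools are on your PATH, it reuses the hostname and job ID from your message, and the function names are just illustrative. It uses the single-dash -better-analyze spelling of the option.

```shell
#!/bin/sh
# Sketch only: hostname and job ID are copied from this thread; adjust as needed.
MACHINE="bane.hq.ierustech.com"
JOB="183.0"

check_start() {
    # Print the START expression as configured on the execute machine.
    condor_config_val -name "$MACHINE" -v START
}

reverse_analyze() {
    # Analyze the match from the machine's point of view for this job.
    condor_q "$JOB" -better-analyze -reverse -machine "$MACHINE"
}

# Only call the tools if HTCondor is actually installed on this host.
if command -v condor_q >/dev/null 2>&1; then
    check_start
    reverse_analyze
else
    echo "HTCondor tools not found on PATH" >&2
fi
```

Neither command modifies the job or the pool configuration.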

Best,
Collin

On Fri, Jun 14, 2019 at 6:43 AM Michael Murphy <Michael.Murphy@xxxxxxxxxxxxx> wrote:
Greetings,

I am trying to run a standard universe job in our condor pool. However, I cannot get a test job to execute. The matchmaker is not finding a match even though my requirement only specifies a hostname. I have never run a standard universe job in our pool before, so I am not sure the pool is configured properly for it. Here's my submit script:

universe = standard
executable = ./Cicero_CC_12750
should_transfer_files = YES
Requirements = machine == "bane.hq.ierustech.com"
when_to_transfer_output = ON_EXIT_OR_EVICT
log = $(Cluster).log

input = test_run.inp
output = test_run.out
error = test_run.err
transfer_input_files = test_run.inp
queue

The executable is compiled FORTRAN code relinked with condor_compile.

When I check the status and try to determine why it's not matched to the execute host I use 'condor_q -analyze -better <JOB ID>' with the following output:

[michael.murphy@banzai Condor_checkpoint_test]$ condor_q -better -analyze 183.0 
-- Schedd: banzai.hq.ierustech.com : <192.168.6.67:9618?...
The Requirements expression for job 183.000 is

    ( machine == "bane.hq.ierustech.com" ) && ( TARGET.Arch == "X86_64" ) && ( TARGET.OpSys == "LINUX" ) && ( ( CkptArch == TARGET.Arch ) || ( CkptArch is undefined ) ) && ( ( CkptOpSys == TARGET.OpSys ) ||
      ( CkptOpSys is undefined ) ) && ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory )

Job 183.000 defines the following attributes:

    DiskUsage = 3750
    ImageSize = 3500
    RequestDisk = DiskUsage
    RequestMemory = ifthenelse(MemoryUsage =!= undefined,MemoryUsage,( ImageSize + 1023 ) / 1024)

The Requirements expression for job 183.000 reduces to these conditions:

         Slots
Step    Matched  Condition
-----  --------  ---------
[0]           2  machine == "bane.hq.ierustech.com"
[6]         560  CkptArch is undefined
[10]        560  CkptOpSys is undefined

No successful match recorded.
Last failed match: Fri Jun 14 08:24:48 2019

Reason for last match failure: no match found 

183.000:  Run analysis summary ignoring user priority.  Of 560 machines,
    544 are rejected by your job's requirements 
      2 reject your job because of their own requirements 
     14 are exhausted partitionable slots 
      0 match and are already running your jobs 
      0 match but are serving other users 
      0 are available to run your job

WARNING:  Be advised:
   Job did not match any machines's constraints
   To see why, pick a machine that you think should match and add
     -reverse -machine <name>
   to your query.


The submitting machine's name is "banzai.hq.ierustech.com" and the execution machine is called "bane.hq.ierustech.com".

Have I forgotten to specify some macros to enable standard universe jobs? Thanks for your time.

-- 
Michael McInerny Murphy
IERUS Technologies, Inc.
2904 Westcorp Blvd., Suite 210
Huntsville, AL 35805
(O): (256) 319-2026 ext 107
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/


--
Collin Mehring | PE-JoSE - Software Engineer