
Re: [HTCondor-users] 10.0.5 install



On 6/26/2023 4:13 PM, Weatherby,Gerard wrote:

Trying to use the quick start https://htcondor.readthedocs.io/en/latest/getting-htcondor/admin-quick-start.html

I ran this on a new central manager.

#!/bin/bash
ORIGIN=$(dirname $(readlink -f $0))

 

PW=$(cat .pw)

curl -fsSL https://get.htcondor.org | sudo GET_HTCONDOR_PASSWORD="$PW" /bin/bash -s -- --no-dry-run --central-manager condorcentralmanager.nmrbox.org

I ran this on a new submit node.

#!/bin/bash

ORIGIN=$(dirname $(readlink -f $0))

PW=$(cat .pw)

curl -fsSL https://get.htcondor.org | sudo GET_HTCONDOR_PASSWORD="$PW" /bin/bash -s -- --no-dry-run --submit condorcentralmanager.nmrbox.org

I ran this on a new execute node.
#!/bin/bash

ORIGIN=$(dirname $(readlink -f $0))

PW=$(cat .pw)

curl -fsSL https://get.htcondor.org | sudo GET_HTCONDOR_PASSWORD="$PW" /bin/bash -s -- --no-dry-run --execute condorcentralmanager.nmrbox.org

Then I submitted this job:

Executable   = /bin/ls
Output       = listing.txt
Log          = listing.log
Queue

The job remains idle. The NegotiatorLog says:


Hi Gerard,

Sorry to see you are having trouble!  

I just tried the above on my laptop using Docker and all worked fine -- see details below for how I did this test.
  
What distro and version of Linux are you using?  Are all three nodes on the same network?  Do they all have hostname entries in either DNS or /etc/hosts?  Your email subject line says "10.0.5 install", but the commands you ran pull from the Feature channel and thus currently install HTCondor v10.5.0 (perhaps that is what you meant).

Were the contents of the ".pw" file the same on all three nodes?  Does "condor_status" show available slots from your execute node?  Given the "skipped because submitterCeiling" message below, what is the output from

    condor_status -af name cpus slotweight

?
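To collect the answers to the questions above in one pass, something like the following could be run on each of the three nodes. This is only a rough sketch: the function names (resolves, run_if_present, diagnose) are made up for illustration, and it assumes getent is available and that the central manager's host name is passed as the first argument.

```shell
#!/bin/sh
# Hypothetical diagnostic helper, not part of HTCondor.

# Succeed if the given host name resolves via DNS or /etc/hosts.
resolves() {
    getent hosts "$1" > /dev/null 2>&1
}

# Run a command only if it is installed, so the script still works on a
# node where the HTCondor tools are missing or not on the PATH.
run_if_present() {
    if command -v "$1" > /dev/null 2>&1; then
        "$@"
    else
        echo "skipped: $1 not found"
    fi
}

# Print name resolution, installed version, configured central manager,
# and the slot attributes asked about above.
diagnose() {
    if resolves "$1"; then
        echo "resolves: $1"
    else
        echo "does NOT resolve: $1"
    fi
    run_if_present condor_version
    run_if_present condor_config_val CONDOR_HOST
    run_if_present condor_status -af name cpus slotweight
}

# Run the checks only when a central manager name is given.
if [ "$#" -gt 0 ]; then
    diagnose "$1"
fi
```

For example, saved as diagnose.sh (an arbitrary name), it would be invoked as "sh diagnose.sh condorcentralmanager.nmrbox.org" on the central manager, the submit node, and the execute node in turn.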

FWIW, here is how I tested:

On a machine with Docker installed, open up three terminal windows.

In window #1, create a virtual network (so all three containers we create are on the same network), set up a blank new machine, and install your central manager:

$ docker network create tvang9-testing
$ docker run -it --rm --network tvang9-testing --hostname cm.tvang.org almalinux
# curl -fsSL https://get.htcondor.org | GET_HTCONDOR_PASSWORD="some_secret_password" /bin/bash -s -- --no-dry-run --central-manager cm.tvang.org

In window #2, set up a blank new machine and install the execution point:

$ docker run -it --rm --network tvang9-testing --hostname ep.tvang.org almalinux
# curl -fsSL https://get.htcondor.org | GET_HTCONDOR_PASSWORD="some_secret_password" /bin/bash -s -- --no-dry-run --execute cm.tvang.org

In window #3, set up a blank new machine and install the access point where users can submit jobs.  Then you can create a regular (non-root) user to try some commands and test some jobs!

$ docker run -it --rm --network tvang9-testing --hostname ap.tvang.org almalinux
# curl -fsSL https://get.htcondor.org | GET_HTCONDOR_PASSWORD="some_secret_password" /bin/bash -s -- --no-dry-run --submit cm.tvang.org
# useradd tvang
# su - tvang
$ condor_q
$ condor_status
$ condor_submit executable="/bin/echo" arguments="Hello, world!" log=hello.log output=hello.out -queue 1
$ condor_wait hello.log
$ more hello.out
$ more hello.log
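The one-line condor_submit above can also be written as a submit description file, in the same style as the /bin/ls job from the original message. A minimal sketch, assuming it is run as the non-root user on the access point; the file name hello.sub is arbitrary and chosen only for this example:

```shell
# Write a submit description file equivalent to the one-line submit above.
# "hello.sub" is a hypothetical file name used only for this sketch.
cat > hello.sub <<'EOF'
executable = /bin/echo
arguments  = "Hello, world!"
log        = hello.log
output     = hello.out
queue 1
EOF

# Submit and wait for the job only where the HTCondor tools are installed.
if command -v condor_submit > /dev/null 2>&1; then
    condor_submit hello.sub
    condor_wait hello.log
    cat hello.out
fi
```

Keeping the job in a file makes it easier to add further lines later (for example resource requests) without retyping the whole command.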

Hope the above helps,
Todd




06/26/23 17:11:46 Starting prefetch negotiation for gweatherby@xxxxxxxxxxxxxxxxxxxxxxx.

06/26/23 17:11:46     Got NO_MORE_JOBS;  schedd has no more requests

06/26/23 17:11:46 Prefetch summary: 1 attempted, 1 successful.

06/26/23 17:11:46 Phase 4.1:  Negotiating with schedds ...

06/26/23 17:11:46   Negotiating with gweatherby@xxxxxxxxxxxxxxxxxxxxxxx at <155.37.253.166:9618?addrs=155.37.253.166-9618&alias=test-condor1.nmrbox.org&noUDP&sock=schedd_1356_ae83>

06/26/23 17:11:46 0 seconds so far for this submitter

06/26/23 17:11:46 0 seconds so far for this schedd

06/26/23 17:11:46   Negotiation with gweatherby@xxxxxxxxxxxxxxxxxxxxxxx skipped because submitterCeiling remaining is 2147483647

06/26/23 17:11:46  negotiateWithGroup resources used submitterAds length 0

06/26/23 17:11:46 ---------- Finished Negotiation Cycle ----------




-- 
Todd Tannenbaum <tannenba@xxxxxxxxxxx>  University of Wisconsin-Madison
Center for High Throughput Computing    Department of Computer Sciences
Calendar: https://tinyurl.com/yd55mtgd  1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                   Madison, WI 53706-1685