
Re: [HTCondor-users] self-contained grid access point/submitter to HTCondorCE



In addition to what Jaime said, I'd like to caution against using a CE just to enable remote access. CEs are designed with glideins/pilots in mind and will do things like aggressively remove held jobs from the queue under the assumption that another glidein/pilot will just shortly take its place.

- Brian

On 1/27/23 15:58, Jaime Frey via HTCondor-users wrote:
In your submit file, I'm assuming that grid-htcondorce0.desy.de is the HTCondor-CE. You should be using this line in your submit files:

grid_resource = condor grid-htcondorce0.desy.de grid-htcondorce0.desy.de:9619

The last field is the collector that should be contacted to query the CE schedd's address.

Since X.509 credentials are not supported for authentication in HTCondor 10.0 and beyond, a Sci/WLCG token is the standard way to authenticate with a CE for job submission.
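
As a rough sketch of how the two points above might combine, the test submit file from [1] could be adapted along these lines. The token path is a placeholder, and the use of the scitokens_file submit command for this setup is an assumption that should be checked against the condor_submit documentation for the installed version. The +remote_* attributes from [1] are left out here only for brevity.

```
# Sketch only: adapted from the test job in [1], not a verified configuration.
universe = grid

# Second field: name of the CE schedd.
# Third field: the CE collector (port 9619) that is queried for the schedd's address.
grid_resource = condor grid-htcondorce0.desy.de grid-htcondorce0.desy.de:9619

# Assumption: a Sci/WLCG token replaces the X.509 proxy lines.
# /path/to/token is a hypothetical placeholder.
scitokens_file = /path/to/token

executable = x.sh
output = stdout
error = stderr
log = logs
ShouldTransferFiles = YES
WhenToTransferOutput = ON_EXIT
queue
```

Before submitting, it may help to confirm from the submit node that the CE collector answers at all, for example with `condor_status -schedd -pool grid-htcondorce0.desy.de:9619`.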

  - Jaime

On Jan 26, 2023, at 9:56 AM, Thomas Hartmann <thomas.hartmann@xxxxxxx> wrote:

Hi all,

we are looking into how to best set up a somewhat self-contained grid access point. I.e., a node from which a user can submit their jobs to a CondorCE and retrieve their job outputs 'easily'.

We prepared a node (aiming for a containerized environment for users) with master+collector+scheduler daemons running locally plus a gridmanager/GAHP daemon(?). The idea would be that one could submit grid jobs to the local collector, which relays them via the GAHP to a CondorCE as the remote access point.

Unfortunately, test jobs [1] do not progress beyond the local queue. The jobs are picked up by the grid helper [2] - however, the GAHP helper only ever sees the remote AP as down [3] :-/
(actually, I have not seen any IPv4/IPv6 traffic towards the CE with tcpdump, nor the submit node or its IPs in the CE logs)

Maybe there is a puzzle piece missing?

In the longer run, would Sci/WLCG token submission for users work with the GAHP helpers? I.e., instead of x509*, export/include BEARER_TOKEN_FILE as a user in the submit file?

Cheers,
  Thomas

[1]
universe = grid
grid_resource = condor grid-htcondorce0.desy.de localhost
use_x509userproxy = true
X509UserProxy=$ENV(X509_USER_PROXY)
executable = x.sh
output = stdout
error = stderr
log = logs
ShouldTransferFiles = YES
WhenToTransferOutput = ON_EXIT
+remote_jobuniverse = 5
+remote_requirements = True
+remote_ShouldTransferFiles = "YES"
+remote_WhenToTransferOutput = "ON_EXIT"
queue

[2]
   CGroup: /system.slice/condor.service
           ├─12562 /usr/sbin/condor_master -f
           ├─12603 condor_procd -A /var/run/condor/procd_pipe -L /var/log/condor/ProcLog -R 1000000 -S 60 -C 25411
           ├─12605 condor_shared_port
           ├─12606 condor_collector
           ├─12607 condor_schedd
           ├─12638 condor_gridmanager -f -C (Owner=?="grid"&&JobUniverse==9) -o grid -S /tmp/condor_g_scratch.0x55f349f65100.12607
           ├─12643 /usr/sbin/condor_c-gahp -f -s grid-htcondorce0.desy.de -P localhost
           ├─12645 /usr/sbin/condor_c-gahp_worker_thread -f -s grid-htcondorce0.desy.de -P localhost
           └─12646 /usr/sbin/condor_c-gahp_worker_thread -f -s grid-htcondorce0.desy.de -P localhost

[3]
01/26/23 16:35:15 [12638] Found job 9.0 --- inserting
01/26/23 16:35:15 [12638] Found job 8.0 --- inserting
01/26/23 16:35:15 [12638] Found job 11.0 --- inserting
01/26/23 16:35:15 [12638] Found job 7.0 --- inserting
01/26/23 16:35:15 [12638] Found job 10.0 --- inserting
01/26/23 16:35:15 [12638] (9.0) doEvaluateState called: gmState GM_INIT, remoteState 0
01/26/23 16:35:15 [12638] (8.0) doEvaluateState called: gmState GM_INIT, remoteState 0
01/26/23 16:35:15 [12638] (11.0) doEvaluateState called: gmState GM_INIT, remoteState 0
01/26/23 16:35:15 [12638] (7.0) doEvaluateState called: gmState GM_INIT, remoteState 0
01/26/23 16:35:15 [12638] (10.0) doEvaluateState called: gmState GM_INIT, remoteState 0
01/26/23 16:38:25 [12638] resource grid-htcondorce0.desy.de is still down


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/

