
Re: [HTCondor-users] self-contained grid access point/submitter to HTCondorCE



In your submit file, I'm assuming that grid-htcondorce0.desy.de is the HTCondor-CE. You should be using this line in your submit files:

grid_resource = condor grid-htcondorce0.desy.de grid-htcondorce0.desy.de:9619

The last field is the collector that should be contacted to query the CE schedd's address.
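
For comparison, the test submit file quoted below names localhost in that last field, so the gridmanager presumably asks the local collector for the CE schedd and never finds it. A sketch of the change, with the field meanings given as comments (the angle-bracket field names are descriptive labels, not HTCondor keywords):

```
# grid_resource = condor <remote schedd name> <remote collector host[:port]>
# Before: the local collector is asked for the CE schedd's address
# grid_resource = condor grid-htcondorce0.desy.de localhost
# After: the CE's own collector on port 9619 is asked instead
grid_resource = condor grid-htcondorce0.desy.de grid-htcondorce0.desy.de:9619
```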

Since X.509 credentials are not supported for authentication in HTCondor 10.0 and beyond, a Sci/WLCG token is the standard way to authenticate with a CE for job submission.
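
A minimal sketch of a token-based version of the quoted test job, assuming the use_scitokens and scitokens_file submit commands are available in the installed HTCondor version. The token path is a hypothetical placeholder; all other lines are taken from the original test job and the grid_resource line above.

```
universe = grid
grid_resource = condor grid-htcondorce0.desy.de grid-htcondorce0.desy.de:9619
# Replaces use_x509userproxy / X509UserProxy from the original test job
use_scitokens = true
# Hypothetical path (assumption): point this at your own Sci/WLCG token file
scitokens_file = /path/to/bearer_token_file
executable = x.sh
output = stdout
error = stderr
log = logs
ShouldTransferFiles = YES
WhenToTransferOutput = ON_EXIT
+remote_jobuniverse = 5
+remote_requirements = True
+remote_ShouldTransferFiles = "YES"
+remote_WhenToTransferOutput = "ON_EXIT"
queue
```

If scitokens_file is left out, the token should instead be looked up via the BEARER_TOKEN_FILE environment variable; check the condor_submit manual for your version before relying on that.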

 - Jaime

> On Jan 26, 2023, at 9:56 AM, Thomas Hartmann <thomas.hartmann@xxxxxxx> wrote:
> 
> Hi all,
> 
> we are looking into how to best set up a somewhat self-contained grid access point. I.e., a node from which a user can submit their jobs to a CondorCE and retrieve their job outputs 'easily'.
> 
> We prepared a node (aiming for a containerized environment for users) with master+collector+scheduler daemons running locally plus a gridmanager/gahp daemon(?). The idea would be that one could submit grid jobs to the local collector, which relays them via the GAHP to a CondorCE as the remote access point.
> 
> Unfortunately, test jobs [1] do not progress beyond the local queue. The jobs are picked up by the grid helper [2]; however, the gahp helper always sees the remote AP as down (see [3]) :-/
> (actually, I have not seen any IPv4 or IPv6 traffic towards the CE with tcpdump, nor the submit node or its IPs in the CE logs)
> 
> Maybe there is a puzzle piece missing?
> 
> In the longer run, would Sci/WLCG token submission for users work with the gahp helpers? I.e., instead of x509*, would a user export BEARER_TOKEN_FILE or include it in the submit file?
> 
> Cheers,
>  Thomas
> 
> [1]
> universe = grid
> grid_resource = condor grid-htcondorce0.desy.de localhost
> use_x509userproxy = true
> X509UserProxy=$ENV(X509_USER_PROXY)
> executable = x.sh
> output = stdout
> error = stderr
> log = logs
> ShouldTransferFiles = YES
> WhenToTransferOutput = ON_EXIT
> +remote_jobuniverse = 5
> +remote_requirements = True
> +remote_ShouldTransferFiles = "YES"
> +remote_WhenToTransferOutput = "ON_EXIT"
> queue
> 
> [2]
>   CGroup: /system.slice/condor.service
>           ├─12562 /usr/sbin/condor_master -f
>           ├─12603 condor_procd -A /var/run/condor/procd_pipe -L /var/log/condor/ProcLog -R 1000000 -S 60 -C 25411
>           ├─12605 condor_shared_port
>           ├─12606 condor_collector
>           ├─12607 condor_schedd
>           ├─12638 condor_gridmanager -f -C (Owner=?="grid"&&JobUniverse==9) -o grid -S /tmp/condor_g_scratch.0x55f349f65100.12607
>           ├─12643 /usr/sbin/condor_c-gahp -f -s grid-htcondorce0.desy.de -P localhost
>           ├─12645 /usr/sbin/condor_c-gahp_worker_thread -f -s grid-htcondorce0.desy.de -P localhost
>           └─12646 /usr/sbin/condor_c-gahp_worker_thread -f -s grid-htcondorce0.desy.de -P localhost
> 
> [3]
> 01/26/23 16:35:15 [12638] Found job 9.0 --- inserting
> 01/26/23 16:35:15 [12638] Found job 8.0 --- inserting
> 01/26/23 16:35:15 [12638] Found job 11.0 --- inserting
> 01/26/23 16:35:15 [12638] Found job 7.0 --- inserting
> 01/26/23 16:35:15 [12638] Found job 10.0 --- inserting
> 01/26/23 16:35:15 [12638] (9.0) doEvaluateState called: gmState GM_INIT, remoteState 0
> 01/26/23 16:35:15 [12638] (8.0) doEvaluateState called: gmState GM_INIT, remoteState 0
> 01/26/23 16:35:15 [12638] (11.0) doEvaluateState called: gmState GM_INIT, remoteState 0
> 01/26/23 16:35:15 [12638] (7.0) doEvaluateState called: gmState GM_INIT, remoteState 0
> 01/26/23 16:35:15 [12638] (10.0) doEvaluateState called: gmState GM_INIT, remoteState 0
> 01/26/23 16:38:25 [12638] resource grid-htcondorce0.desy.de is still down
> 