Re: [HTCondor-users] mpi job stuck as idle
- Date: Mon, 22 Jan 2018 14:50:43 -0600
- From: Jason Patton <jpatton@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] mpi job stuck as idle
This config needs to be on every execute machine that should be
allowed to run parallel universe jobs; then run condor_reconfig on
each of them. The config tells the execute node to trust the submit
node (what I think you mean by frontend) as the dedicated scheduler
for parallel universe jobs.
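
As a rough sketch of those two steps (copy the file, then reconfig each execute node), something like the following could work. The node names and paths are taken from the quoted commands below; the NODES list, the RUN variable and the push_config function are hypothetical names used only for illustration, and passwordless ssh from the frontend is assumed. The destination is the /opt/condor/etc/config.d/ directory used in the quoted commands, which may differ from /etc/condor/config.d/ on other installs. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: push the dedicated-resource config to each execute node
# and ask the HTCondor daemons there to re-read their configuration.

# Paths as they appear in the quoted commands; adjust for your install.
CONFIG=/opt/condor/etc/example/condor_config.local.dedicated.resource
DEST=/opt/condor/etc/config.d/

# Hypothetical node list; extend it with the rest of your execute nodes.
NODES="compute-0-0 compute-0-1"

# Dry run by default: each command is echoed instead of executed.
# Export RUN="" (empty) to really run scp and ssh.
RUN=${RUN-echo}

push_config() {
    for node in $NODES; do
        # Copy the config, then reconfig only if the copy succeeded.
        $RUN scp "$CONFIG" "$node:$DEST" &&
        $RUN ssh "$node" condor_reconfig
    done
}

push_config
```

Running it unchanged prints one scp line and one ssh line per node, so the commands can be checked before setting RUN="" to execute them.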
Jason
On Mon, Jan 22, 2018 at 2:46 PM, Mahmood Naderan <nt_mahmood@xxxxxxxxx> wrote:
>>Your modified
>>condor_config.local.dedicated.resource should go in
>>/etc/condor/config.d/ on the execute machine, which is where condor
>>looks for supplemental config files.
>
>
> Do you mean I have to scp that file to all nodes, including the frontend? I
> just want to be sure what "execute machine" means here.
>
>
> # cp /opt/condor/etc/example/condor_config.local.dedicated.resource /opt/condor/etc/config.d/
> # scp /opt/condor/etc/example/condor_config.local.dedicated.resource compute-0-0:/opt/condor/etc/config.d/
> # scp /opt/condor/etc/example/condor_config.local.dedicated.resource compute-0-1:/opt/condor/etc/config.d/
> ...
>
>
> Then, should I run condor_restart on the frontend only? Or do I have to ssh
> to all nodes and run condor_restart on them?
>
>
> Regards,
> Mahmood
>