Jason,
Now I can run your MPI program and get the correct output. Thank you so much!
I am a little confused by the concept of a "machine". This presentation, https://meetings.internet2.edu/media/medialibrary/2015/10/19/20151008-thain-htcondor-admin-tutorial.pdf, says: "Machine - An individual computer, managed by one startd", which means a "machine" is a physical machine.
But when I run condor_status on my 24-core server (the only server I have), I get the following result:

              Machines Owner Claimed Unclaimed Matched Preempting Drain
X86_64/LINUX        24     0       4        20       0          0     0
       Total        24     0       4        20       0          0     0

Here "Machines" is 24, which suggests that a "machine" is not a physical machine but a core or a slot.
Could you please clarify this for me? In addition, what does "node" mean? My HTCondor version is 8.6.12 for CentOS.
hufh
On Fri, Nov 16, 2018 at 12:41 AM Jason Patton <jpatton@xxxxxxxxxxx> wrote:
On Thu, Nov 15, 2018 at 10:31 AM hufh <hufh2004@xxxxxxxxx> wrote:
Hi Jason,
Is "a.out" in your script an MPI program?
Yes. It has to be referenced in both the submit file (to be transferred to the execute node) and the wrapper script (to be exec'd).
Here's my code for reference:
---
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>  // for sleep()

int main(int argc, char** argv) {
    MPI_Init(NULL, NULL);

    // number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // rank of this process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // name of this processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // print hello world message
    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);

    // print arguments, one on each line
    for (int i = 1; i < argc; ++i) {
        printf("I was given argument %s\n", argv[i]);
    }

    sleep(5);

    MPI_Finalize();
    return 0;
}
---
Jason
hufh
On Thu, Nov 15, 2018 at 11:03 PM Jason Patton <jpatton@xxxxxxxxxxx> wrote:
Here's my submit file:
---
universe = parallel

executable = openmpiscript
arguments = mpi_wrapper.sh
transfer_input_files = a.out, mpi_wrapper.sh
getenv = true

should_transfer_files = yes
when_to_transfer_output = on_exit_or_evict
+ParallelShutdownPolicy = "WAIT_FOR_ALL"

output = out.$(NODE)
error = err.$(NODE)
log = log

request_cpus = 1
machine_count = 4

queue
---
Here's mpi_wrapper.sh:
---
#!/bin/sh

if [ "$_CONDOR_PROCNO" -lt 2 ]; then
    exec ./a.out "_CONDOR_PROCNO=$_CONDOR_PROCNO" args1
else
    exec ./a.out "_CONDOR_PROCNO=$_CONDOR_PROCNO" args2
fi
---
I'm using $_CONDOR_PROCNO to figure out which node of my MPI job is running and passing arguments to my MPI application (a.out) based on its value.
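The same dispatch could also be sketched with the argument choice pulled into a function, so it can be checked without running under HTCondor. This is only a rough sketch: the helper name args_for_node, the fallback to node 0 when $_CONDOR_PROCNO is unset, and the check that ./a.out exists are assumptions for illustration, not part of the wrapper above or of HTCondor itself.

```shell
#!/bin/sh

# Hypothetical helper: print the argument set for a given node number.
# Nodes 0 and 1 get args1, every other node gets args2, mirroring the
# "-lt 2" test in the wrapper above.
args_for_node() {
    case "$1" in
        0|1) echo "args1" ;;
        *)   echo "args2" ;;
    esac
}

# Assumption: fall back to node 0 when run outside HTCondor,
# where _CONDOR_PROCNO is not set.
node="${_CONDOR_PROCNO:-0}"

# Only hand off to the MPI binary if it is actually present and executable.
if [ -x ./a.out ]; then
    exec ./a.out "_CONDOR_PROCNO=$node" "$(args_for_node "$node")"
fi
```

Adding another group of nodes would then only mean adding another pattern to the case statement.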
Jason
On Thu, Nov 15, 2018 at 6:12 AM hufh <hufh2004@xxxxxxxxx> wrote:
Hi Jason,
Sorry for the late reply. I have tried your method, but it didn't work. Could you please send me your submit file and other files so that I can try it on my machines?
Thanks for your help!
hufh
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/