Hi,

I just wanted to know whether I am doing this right. The program, job script, log, and outputs are below. Please comment.

Samir

PS: I only see co1ndout.0 as an output file. Shouldn't there be something like co1ndout.1, co1ndout.2, and so on?

The program:
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>
#include <mpi.h>

int main (int argc, char *argv[])
{
    int myrank, size;
    char HOST[256];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    bzero(HOST, sizeof(HOST));
    gethostname(HOST, sizeof(HOST));
    printf("%s \n", (char *)HOST);
    MPI_Finalize();
}

Job script:
universe = parallel
#initialdir = /home/skhanal/
executable = /home/skhanal/mp1script
arguments = /home/skhanal/cpi
machine_count = 6
should_transfer_files = yes
when_to_transfer_output = on_exit
transfer_input_files = /home/skhanal/cpi
output = co1ndout.$(NODE)
error = co1nderr.$(NODE)
log = condor.log
queue

Content of co1ndout.0:
running /home/skhanal/cpi on 6 LINUX ch_p4 processors
Created /var/opt/condor/execute/dir_5207/PIazvckb5389
compute-0-1.local
compute-0-2.local
compute-0-3.local
compute-0-0.local
compute-0-4.local
compute-0-7.local

Content of co1nderr.0:
empty

Content of condor.log:
000 (082.000.000) 03/20 18:21:35 Job submitted from host: <129.1.64.210:32773>
...
014 (082.000.000) 03/20 18:23:41 Node 0 executing on host: <10.255.255.247:32785>
...
014 (082.000.001) 03/20 18:23:41 Node 1 executing on host: <10.255.255.254:32785>
...
014 (082.000.002) 03/20 18:23:41 Node 2 executing on host: <10.255.255.250:32785>
...
014 (082.000.003) 03/20 18:23:41 Node 3 executing on host: <10.255.255.253:32785>
...
014 (082.000.004) 03/20 18:23:41 Node 4 executing on host: <10.255.255.252:32785>
...
014 (082.000.005) 03/20 18:23:42 Node 5 executing on host: <10.255.255.251:32785>
...
001 (082.000.000) 03/20 18:23:42 Job executing on host: MPI_job
...
015 (082.000.000) 03/20 18:23:47 Node 0 terminated.
    (1) Normal termination (return value 0)
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
    1357 - Run Bytes Sent By Node
    333145 - Run Bytes Received By Node
    1357 - Total Bytes Sent By Node
    333145 - Total Bytes Received By Node
...
005 (082.000.000) 03/20 18:23:47 Job terminated.
    (1) Normal termination (return value 0)
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
    1357 - Run Bytes Sent By Job
    1998870 - Run Bytes Received By Job
    1357 - Total Bytes Sent By Job
    1998870 - Total Bytes Received By Job