Tanzima Zerin Islam wrote:
Hi all,

To follow up on what I have done to link 3 compiled files into 1 checkpointable MPI executable: it is the "IS" NAS Parallel Benchmarks application, written in C.

1. cc -g -o setparams setparams.c
2. cc -g -c -I/tmp/NPB3.3/NPB3.3-MPI/common is.c
3. cc -g -c -I/tmp/NPB3.3/NPB3.3-MPI/common c_print_results.c
4. cc -g -c -I/tmp/NPB3.3/NPB3.3-MPI/common c_timers.c

I'm sorry, but the Condor checkpointing technology is unable to checkpoint processes that use inter-process communication of almost any kind. It can't checkpoint MPI codes today.
If all the MPI nodes are going to run on the same machine, you might want to investigate the DMTCP checkpointing libraries on SourceForge.
Sorry, and good luck, -Greg