Hi All,
I am trying to run the CPI executable that comes with the MPI installation on a Condor pool of 4 machines. I have set up all the prerequisites for running MPI jobs, such as the "dedicated scheduler". To my surprise, the job runs fine as long as machine_count is 1. If we increase machine_count, it fails with an error like:
/*****************************************************/
rm_3948: (-) net_recv failed for fd = 3
rm_3948: p4_error: net_recv read, errno = : 104
/******************************************************/
I have gone through all the previous mails on this particular issue, but I am still facing the same problem.
Please help me out.
Neeraj