06/06/17 19:45:06 (1115.0) (2409): Request to run on <10.1.255.217:47300> <10.1.255.217:47300> was ACCEPTED
06/06/17 19:47:52 (1115.0) (2409): Can no longer talk to condor_starter <10.1.255.217:47300>
06/06/17 19:47:52 (1115.0) (2409): This job cannot reconnect to starter, so job exiting
06/06/17 19:47:52 (1115.0) (2409): ERROR "Can no longer talk to condor_starter <10.1.255.217:47300>" at line 208 in file /slots/11/dir_17560/userdir/src/condor_shadow.V6.1/NTreceivers.cpp
So what this says is that a little under three minutes into the job (accepted at 19:45:06, connection lost at 19:47:52), the starter either crashed or hung (or the network went away, but that seems unlikely), and the shadow doesn't know why. At this point, it would make sense to look at the execute node(s) -- specifically their startd and starter logs -- and see what was going on there at the same time.
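If it helps to line up the shadow log with the execute-node logs, here is a rough Python sketch that pulls the relevant timestamps and the starter address out of shadow log lines like the ones above. It assumes the line layout seen in this excerpt (MM/DD/YY HH:MM:SS, job id, pid, message); the function names parse_line and summarize_disconnect are made up for illustration and are not part of HTCondor.

```python
import re
from datetime import datetime

# Assumed layout, taken from the excerpt above:
#   MM/DD/YY HH:MM:SS (cluster.proc) (pid): message
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \((\d+\.\d+)\) \((\d+)\): (.*)$"
)
# Addresses appear as <ip:port> in the messages.
ADDR_RE = re.compile(r"<(\d+\.\d+\.\d+\.\d+):(\d+)>")


def parse_line(line):
    """Split one shadow log line into fields; return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    stamp, job_id, pid, message = m.groups()
    return {
        "time": datetime.strptime(stamp, "%m/%d/%y %H:%M:%S"),
        "job": job_id,
        "pid": int(pid),
        "message": message,
    }


def summarize_disconnect(lines):
    """Find the first ACCEPTED line and the first lost-starter line.

    Returns the job id, starter host/port, both timestamps, and the
    elapsed seconds between them, or None if either line is missing.
    """
    accepted = None
    lost = None
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        msg = entry["message"]
        if accepted is None and "was ACCEPTED" in msg:
            accepted = entry
        elif lost is None and msg.startswith("Can no longer talk to condor_starter"):
            lost = entry
    if accepted is None or lost is None:
        return None
    addr = ADDR_RE.search(lost["message"])
    return {
        "job": lost["job"],
        "host": addr.group(1) if addr else None,
        "port": int(addr.group(2)) if addr else None,
        "accepted": accepted["time"],
        "lost": lost["time"],
        "elapsed_seconds": int((lost["time"] - accepted["time"]).total_seconds()),
    }


# The first three lines of the excerpt above, used as sample input.
SAMPLE = [
    "06/06/17 19:45:06 (1115.0) (2409): Request to run on "
    "<10.1.255.217:47300> <10.1.255.217:47300> was ACCEPTED",
    "06/06/17 19:47:52 (1115.0) (2409): Can no longer talk to "
    "condor_starter <10.1.255.217:47300>",
    "06/06/17 19:47:52 (1115.0) (2409): This job cannot reconnect to "
    "starter, so job exiting",
]

if __name__ == "__main__":
    info = summarize_disconnect(SAMPLE)
    print(info["job"], info["host"], info["accepted"], info["lost"],
          info["elapsed_seconds"])
```

On the lines quoted above this gives job 1115.0, host 10.1.255.217, and 166 seconds between acceptance and the lost connection, so the window to search in the execute node's startd and starter logs is roughly 19:45 to 19:48 on 06/06/17.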
- ToddM