
Re: [Condor-users] Checkpointing Errors



Simon David Hammond wrote:
> Hi All,
>
> [snip]
> Our LOWPORT is 9000 and HIGHPORT is 9500 for servers and 9060 for clients.
> I'm confused as to why the checkpointing system is picking 53211 and I
> can't seem to find a configuration option to change it!
Is Condor configured to send the checkpoint back to the condor_shadow
process, or have you configured a checkpoint server?

If the former, do you consider the machine where the condor_shadow runs
(your submit machine) to be a client or a server?  If a client, perhaps
the roughly 60 ports in that range aren't enough. How many jobs are
simultaneously running from the submit machine?

Finally, there may be some good clues in the ShadowLog file from the
submit machine, around the time the job is trying to checkpoint.
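
One rough way to check whether the client port range is running out is to count how many ports in it can still be bound on the submit machine. The Python sketch below is only an illustration and is not part of Condor; the function names are made up for this example, and it assumes the 9000 to 9060 client range quoted above. Note that the range is inclusive, so it actually holds 61 ports.

```python
import socket


def ports_in_range(low: int, high: int) -> int:
    """Return the number of ports in the inclusive range [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return high - low + 1


def count_bindable(low: int, high: int, host: str = "0.0.0.0") -> int:
    """Count TCP ports in [low, high] that can be bound right now.

    A port that fails to bind is assumed to be in use (or otherwise
    unavailable to this user); this is only an approximation.
    """
    free = 0
    for port in range(low, high + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind((host, port))
            except OSError:
                continue  # port busy or not permitted
            free += 1
    return free


if __name__ == "__main__":
    # Assumed values: the client range mentioned in this thread.
    low, high = 9000, 9060
    print("ports in range:", ports_in_range(low, high))
    print("currently bindable:", count_bindable(low, high))
```

If the bindable count is at or near zero while jobs are running, that would be consistent with the range being too small for the number of simultaneous jobs, though it does not prove it.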
regards
Todd