
[Condor-users] SLOW_CKPT_SPEED and job results upload



Hi,

In our cluster, uploading the result files (several GB) from the
running machine(s) to the head node is overwhelming the head node.
This causes shadow exceptions, and the main problem is that the upload
never completes, so the head node never receives all the files.

I've noticed the SLOW_CKPT_SPEED setting in the Condor configuration
file, but from the docs it looks like it only applies to
checkpointing. Is this correct? Or does it apply to all the transfers
made by condor_shadow from the running machine to the head node? If
it does not, is there another setting to limit the transfer speed of
the results? We really need this.

We posted earlier about this problem, without luck:

https://lists.cs.wisc.edu/archive/condor-users/2008-March/msg00117.shtml

Regards,
Pasquale