[Condor-users] Network Bandwidth Problem
- Date: Wed, 1 Feb 2006 13:51:49 +0100 (CET)
- From: Jens Harting <jens@xxxxxxxxxxxxxxxxxxxx>
- Subject: [Condor-users] Network Bandwidth Problem
Hi,
we are running a small ~50 CPU Condor cluster (Linux) using version 6.7.6.
The cluster runs just great, but we have now got a user who uses the
standard universe and his jobs need about 800MB of RAM. Therefore, the
checkpoints are pretty big and the network connection is getting saturated
when his jobs try to write a checkpoint to the submit machine. Then, the
compute machine and the submit machine hang for about two minutes until
the whole checkpoint is written. We have tried adjusting the
PERIODIC_CHECKPOINT setting to keep the checkpoints from all being written
at the same time, but we could not see any real improvement.
Is there any way to limit the bandwidth used by Condor to a fixed rate,
for example 10MB/s? Or would our problem be solved if I updated the cluster
to a more recent version of Condor?
Thanks for your help,
Jens