I am trying to set up a small Linux lab to use as Condor execution nodes. I can run very small jobs that produce no more than about 10 KB of output, but anything bigger just seems to hang. The scheduler log contains messages of the form:

condor_read(): timeout reading 5 bytes from ...

The problem appears to be linked to the network configuration, since I can set up Condor on another Linux box outside the lab and it works fine. The network administrator says the following about the Linux lab:

-----------
The lab uses big frames for NFS performance. This means that when a lab machine does a large write to you, it may send back a packet of 6000 bytes or so. This isn't a problem, because the router in front of the lab will then fragment that packet down to a normal MTU. I've certainly seen issues with client/server software that doesn't check correctly for a short read in these situations, as part of the data can be delivered up to a socket before you get the whole thing.
-----------

Could this fragmentation of the packets cause problems for Condor?

Masao

--
Masao Fujinaga
fujinaga@xxxxxxxxxxx
Tel.: (780) 492-2117
Fax.: (780) 492-1729
Research Computing Support
Academic Information and Communication Technologies (AICT)
University of Alberta, Edmonton, Alberta, CANADA T6G 2H1
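
To make the "short read" the administrator mentions concrete: on a stream socket, a single receive call may return fewer bytes than were asked for, so a reader that needs a fixed number of bytes has to loop. The sketch below is only an illustration in Python; it is not taken from Condor, and the helper name recv_exact is hypothetical. Whether Condor's own read path is affected is exactly the open question above.

```python
import socket


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket (hypothetical helper).

    recv() may legitimately return fewer bytes than requested, so keep
    calling it until the full amount has arrived or the peer closes.
    """
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty result means the peer closed the connection early.
            raise ConnectionError(
                "connection closed with %d bytes still expected" % remaining
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    # Local demonstration: the sender delivers a 5-byte message in two pieces.
    sender, receiver = socket.socketpair()
    sender.sendall(b"he")
    sender.sendall(b"llo")
    print(recv_exact(receiver, 5))
    sender.close()
    receiver.close()
```

A reader that instead calls recv() once and assumes it received everything is the kind of software that breaks when data is delivered to the socket in pieces.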