Hello. I am writing to get some more ideas on a problem that is becoming rather
hard to tackle. My machine is a condor submitter, and unfortunately we have
realised that the local disk transfer speed for the log and spool files is too
slow and limits our maximum number of jobs. Replacing the disk with an SSD would
just bring the next bottleneck, processor speed, closer. I have therefore
decided to alter the config file so that the submit machine exchanges its data
with network storage over our very fast connection. I have spent the last few
days trying different things. I got it to work momentarily, and the number of
jobs I could run simultaneously went up from the earlier limit of 300 to 1200,
but then the processor maxed out and stopped taking on more jobs. We are
planning to split the administrative work across other PCs to get the
processing speed required and the maximum number of jobs running; a rough
sketch of the split follows.
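What I have in mind is roughly this (hostnames and the exact daemon placement
are only a sketch, nothing is set up yet):

## On a dedicated central manager PC:
DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR
## On each submit PC, just the master and the schedd:
DAEMON_LIST = MASTER, SCHEDD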
On the storage side, I have tried pointing the spool and log pathnames at the
network locations in the configuration file:
######################################################################
## Daemon-wide settings:
######################################################################
## Pathnames
LOG = \\PATHNAME\log
SPOOL = \\PATHNAME\spool
EXECUTE = $(LOCAL_DIR)/execute
BIN = $(RELEASE_DIR)/bin
LIB = $(RELEASE_DIR)/lib
INCLUDE = $(RELEASE_DIR)/include
SBIN = $(BIN)
LIBEXEC = $(BIN)
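In case the backslash form is part of the problem, the equivalent with forward
slashes (which, as far as I know, HTCondor also accepts on Windows) would be:

## Same UNC share, forward-slash form (untested suggestion):
LOG = //PATHNAME/log
SPOOL = //PATHNAME/spool

Running condor_config_val LOG and condor_config_val SPOOL should at least show
whether the values are being picked up at all.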
However, this does not work: the condor service shuts down, and I cannot
restart or query it unless I change the config file back to the initial one
(i.e. local log and spool folders). I am running condor on a Windows 7 machine.
Replacing the disks with SSDs is not an option, as the job sizes are quite
large and there are no funds to do that on a large scale, while the network
storage can provide the speed we are after. Any ideas?
Cheers
Antonis