
RE: [Condor-users] afs vs nfs vs separate file systems



In the style of IT everywhere: "it depends".

If your blast database for a given run is smaller than the available
main memory of the compute node, then you'll see minimal performance
difference.  This is because the Linux kernel will use spare RAM as a
disk buffer cache.  Thus, after the first sequence, as each subsequent
sequence is blasted, the blast program "reads" the database, which the
kernel conveniently already has in RAM.  Speed is good ;-)  As
speculation, I'd say that a shared filesystem might be slightly faster,
as processing will be interleaved during the initial read, whereas if
you copy the db to local disk, then your compute node does nothing while
it's copying the file, and only then starts processing.  But that's
probably a minor consideration, assuming you're blasting any significant
number of sequences.
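
A quick way to check which case you're in is to compare the database
size against the memory the node can spare for caching.  The following
is only a sketch in Python: the helper names and the 0.8 headroom
factor are my own assumptions, and it relies on the Linux
/proc/meminfo format (MemAvailable on newer kernels, MemFree as a
cruder fallback on older ones).

```python
def mem_available_bytes(meminfo_text):
    """Return MemAvailable (or MemFree as a fallback) from the text of
    /proc/meminfo, converted to bytes."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            # The memory fields used below are reported in kB.
            fields[name.strip()] = int(parts[0]) * 1024
    return fields.get("MemAvailable", fields.get("MemFree", 0))


def fits_in_cache(db_bytes, available_bytes, headroom=0.8):
    """Guess whether a database of db_bytes can stay in the buffer cache.

    headroom is an assumed safety factor: the blast process and other
    programs also need memory, so do not count on all of it.
    """
    return db_bytes <= available_bytes * headroom
```

On a node you would call it with something like
fits_in_cache(os.path.getsize(db_path), mem_available_bytes(open("/proc/meminfo").read())),
where db_path is whatever file holds your database.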

If your blast database does *not* fit into main memory, then your
performance will drop either way, but it will be *much* better to have
the blast db on local disk.  This is because in this case, the kernel
still uses memory as a disk buffer cache, but more data is read than
fits, so "old" data (the early parts of the database) is flushed and
"new" data (later parts of the database) is put into memory instead.
So every sequence ends up re-reading the database.  If you're using a
shared file system, this reading will translate into network traffic,
with all the associated latency, whereas a local copy will still be
slow, but will only be "disk speed" slow, not "network" slow.  ;-)
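
To illustrate why the cache stops helping, here is a toy simulation in
Python.  It assumes a pure least-recently-used cache and a strictly
sequential scan of the database for each query sequence; the real
kernel's replacement policy is more elaborate, so treat the numbers as
an illustration of the effect, not a measurement.  The function name
and block counts are made up for the example.

```python
from collections import OrderedDict


def scan_hit_rate(db_blocks, cache_blocks, passes):
    """Fraction of block reads served from a toy LRU cache when the
    whole database is scanned front to back `passes` times."""
    cache = OrderedDict()
    hits = 0
    for _ in range(passes):
        for block in range(db_blocks):
            if block in cache:
                cache.move_to_end(block)   # mark as most recently used
                hits += 1
            else:
                cache[block] = True        # "read from disk or network"
                if len(cache) > cache_blocks:
                    cache.popitem(last=False)  # evict least recently used
    return hits / (db_blocks * passes)


print(scan_hit_rate(100, 128, 10))  # database fits in the cache
print(scan_hit_rate(200, 128, 10))  # database is larger than the cache
```

In this model, a database that fits only misses on the first pass,
while one that is even somewhat too big misses on every read, because
each block has been evicted by the time the scan comes back around to
it.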

FWIW, we use local copies (not using Condor yet for this, but that's the
plan when we do), and we've also seen *significant* speedups by
splitting up our large databases into separate chunks, each of which can
fit into memory.  By doing that, our CPU usage actually hits 100%, as
opposed to barely hitting 20% (this is on dual 2.8GHz Xeons).
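
The chunking idea can be sketched as a simple greedy grouping.  This is
not our actual tooling, just a minimal Python illustration: the
function name is hypothetical, and it assumes you already have a list
of (record name, size in bytes) pairs and a per-chunk budget chosen to
fit in a node's spare memory.

```python
def split_into_chunks(record_sizes, max_chunk_bytes):
    """Group records, in order, into chunks whose total size stays at or
    under max_chunk_bytes.  A single record larger than the budget gets
    a chunk of its own."""
    chunks, current, current_bytes = [], [], 0
    for name, size in record_sizes:
        if current and current_bytes + size > max_chunk_bytes:
            chunks.append(current)          # close the full chunk
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        chunks.append(current)              # keep the final partial chunk
    return chunks


print(split_into_chunks([("a", 40), ("b", 40), ("c", 40), ("d", 10)], 100))
```

Each chunk then becomes its own database, and a run against one chunk
only has to keep that chunk in the buffer cache.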

HTH,
Craig Miskell

> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx 
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Michael Thon
> Sent: Friday, 17 September 2004 4:42 a.m.
> To: condor-users@xxxxxxxxxxx
> Subject: [Condor-users] afs vs nfs vs separate file systems
> 
> Greetings - we have had a lot of success in setting up a small condor
> pool of 4 linux boxes (workstations) in our lab.  Now we are planning
> to add more systems to the pool and to start actually using it for
> research and we need to make some decisions about how to configure the
> pool.  Specifically, I need to decide if we will use a shared file
> system.  Some of the programs we run have large input files and output
> files (sometimes > 500 MB; these are blast databases, for any of you
> biologists out there).  Currently all files are being transferred by
> condor.  From a network performance point of view, would it be better
> to put these files on a shared file system?  Another option is to
> mirror the blast databases on all of the machines, but this could take
> a lot of disk space on the nodes and can cause problems if the
> synchronization gets out of whack.
> If I use a shared file system should I use afs or nfs?  I have used nfs
> a little bit and I have no experience with afs.  I want to keep our
> config as simple as possible, even if I have to sacrifice a little
> performance of the pool.
> thanks for your comments
> Mike