Re: [HTCondor-devel] Squid cache flushing question


Date: Tue, 16 Jul 2013 11:14:45 -0500
From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
Subject: Re: [HTCondor-devel] Squid cache flushing question
Hi Donald -

The Squid caching infrastructure and setup are not baked into HTCondor per se, but are specific to the Open Science Grid (OSG).

I think you would have better luck sending the message below to the OSG Grid Operations Center at:

  goc@xxxxxxxxxxxxxxxxxxx

as my guess is that your observations are most likely related to some OSG sites not having a Squid cache configured, or to the Squid cache at some sites being undersized.

You may also consider sending it to

  htcondor-users@xxxxxxxxxxx

since many Open Science Grid admin types also read that list, more so than the list about HTCondor development.

regards,
Todd


On 7/16/2013 7:46 AM, Krieger, Donald N. wrote:
Dear List,

We are using the Open Science Grid extensively and would like to improve
the cache hit rate with squid.

We typically spawn 3000 jobs at a time, i.e. over about a 15 – 30 minute
period, with a new job entering the run state typically every 1-3 sec.

We are trying to use Condor’s squid file caching capability by sending a
common data staging file to the execute nodes via http.

The file is typically 12-35 Mbytes and is the same for all 3000 jobs.

The apache log (http server) on the machine from which the files are
staged typically shows 1400-2000 fetches of the file, suggesting that
our hit rate is only about 30-40%.  This is in spite of the fact that
our jobs are typically executing on only 10 – 25 grid facilities with
presumably only 1 http_proxy per facility.
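
As a rough check of the arithmetic above, here is a minimal Python sketch. It assumes that every job requests the file exactly once and that every request missing a cache shows up as one fetch in the apache log; the function name estimated_hit_rate is only illustrative. Under those assumptions, 1400-2000 origin fetches for 3000 jobs correspond to a hit rate of roughly 33% to 53%, so the lower end matches the estimate quoted above.

```python
def estimated_hit_rate(jobs: int, origin_fetches: int) -> float:
    """Estimate the cache hit rate from the origin server's fetch count.

    Assumes (hypothetically) one request per job and one origin fetch
    per cache miss, so hits = jobs - origin_fetches.
    """
    if jobs <= 0:
        raise ValueError("jobs must be positive")
    if not 0 <= origin_fetches <= jobs:
        raise ValueError("origin_fetches must be between 0 and jobs")
    return 1.0 - origin_fetches / jobs


if __name__ == "__main__":
    # Figures from the message: 3000 jobs, 1400-2000 fetches in the apache log.
    for fetches in (1400, 2000):
        rate = estimated_hit_rate(3000, fetches)
        print(f"{fetches} origin fetches -> hit rate {rate:.0%}")
```

If a single job can fetch the file more than once, or if some sites bypass the proxy entirely, this estimate no longer holds.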

We are wondering if our file is getting flushed from the cache very
quickly so that refetches are required and if there is any way to
control that.  A related question regards how the flushing mechanism
works: Is there something in place which takes account of repeated uses
of the same file to improve its persistence in the cache?  Or does the
flushing mechanism depend perhaps only on the size of the file and the
time since it was fetched?  Thanks for any insight and/or suggestions
you can provide.  And if this list is the wrong place to pose this, I
would be happy to be redirected.  It struck me in thinking about this
though that there might be subscribers to the list who are in a good
position to tinker with this and who might know it very well.

Thanks,

Don

Don Krieger, Ph.D.

Department of Neurological Surgery

University of Pittsburgh



_______________________________________________
HTCondor-devel mailing list
HTCondor-devel@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-devel



--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
HTCondor Technical Lead                1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685