The limiting factor in how quickly condor_submit completes is usually
    the time the schedd spends in fsync().  If your jobs have user logs,
    condor_submit itself also called fsync() in versions prior to 7.5. 
     
    To confirm whether this is your problem, you can start a long series
    of condor_submit invocations.  While those are running, periodically
    run gstack <schedd_pid>.  This will print out the schedd
    stack.  Is fsync() frequently listed? 
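
A rough sketch of how that sampling could be automated, in Python. It assumes gstack is on the PATH and that you have permission to attach to the schedd process. The function names, sample count, and interval are made up for illustration and are not part of Condor.

```python
import subprocess
import time


def count_fsync_samples(stack_dumps):
    """Return how many of the given stack dumps mention fsync in any frame."""
    return sum(1 for dump in stack_dumps if "fsync" in dump)


def sample_schedd(pid, samples=20, interval=0.5):
    """Run gstack against pid repeatedly and collect the text of each dump.

    Assumption: gstack is installed and can attach to the process.
    """
    dumps = []
    for _ in range(samples):
        result = subprocess.run(
            ["gstack", str(pid)], capture_output=True, text=True, check=False
        )
        dumps.append(result.stdout)
        time.sleep(interval)
    return dumps
```

While the condor_submit loop is running, something like `count_fsync_samples(sample_schedd(schedd_pid))` gives the number of samples in which fsync appeared. If that is most of them, the schedd is probably waiting on the disk.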
     
    Things you can do to speed up fsync: put $(SPOOL) on a fast disk
    that doesn't have a lot of other usage.  For testing, you can even
    stick it in /dev/shm, which is basically a ramdisk.  The same
    applies to user logs. 
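
To compare candidate locations before moving $(SPOOL), a small timing loop along these lines might help. This is only a sketch: the function name, write count, and payload size are arbitrary choices, and it measures plain write-plus-fsync on a temporary file, not the schedd's actual I/O pattern.

```python
import os
import tempfile
import time


def mean_fsync_seconds(directory, writes=50, payload=b"x" * 4096):
    """Average time for one write followed by fsync on a temp file in directory."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)
            os.fsync(fd)  # force the data to stable storage before continuing
        elapsed = time.perf_counter() - start
    finally:
        # Always clean up the temporary file, even if a write fails.
        os.close(fd)
        os.unlink(path)
    return elapsed / writes
```

Running it once against the current spool directory and once against /dev/shm (or the faster disk) should show roughly how much of the per-job cost is fsync latency.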
     
    --Dan 
     
    On 9/8/11 3:32 PM, Patty Bragger wrote:
    Thanks David, 
      I've tried adding the -disable flag, and it seems to help a little
      bit, but not a whole lot.   It's now averaging about 10 seconds
      per 100 instead of 11 seconds. 
       
      So this is still a pretty stark difference in performance from
      what you're seeing, and granted, my 4 core machine is probably
      pretty weak compared to a 16 core Nehalem, but I guess I was still
      expecting to see some kind of explanation by way of maxed out CPU, or
      something, but I'm not seeing that at all.  I submitted 1200
      jobs, just to sustain the "load" for a noticeable time of 2+
      minutes.  During that time, the load average didn't even break 1,
      and the CPU usage increased from about 10% to about 35%.  
       
      Oh well, this isn't the end of the world, thanks for all of the
      info. 
       
      -Patty 
       
      On Thu, Sep 8, 2011 at 3:37 PM, David J. Herzfeld <herzfeldd@xxxxxxxxx> wrote:
         
          On Thu, 2011-09-08 at 15:23 -0400, David J. Herzfeld wrote: 
            > Hi Patty: 
            > 
            > On Thu, 2011-09-08 at 14:40 -0400, Patty Bragger wrote: 
            > > So an average of about 9 jobs/sec, which is faster (but only a 
            > > little) than submitting through dag.  What kind of rates are 
            > > you guys getting?  Maybe this is normal? 
            > > 
            > 
            > My guess is that the numbers you are seeing are probably pretty 
            > normal (both for dagman and when calling directly from the 
            > command line). 
            > 
            > We see faster times (real = 0m2.454s, user = 0m1.315s, 
            > ~40 jobs/s), but have a pretty customized config. For instance, 
            > we set 
            > SUBMIT_SKIP_FILECHECKS = False 
            > SUBMIT_SEND_RESCHEDULE = False 
            > I would assume that both of these knobs would reduce submit 
            > times (although I haven't tested them myself). 
             
           
          Sorry, that should be: 
          SUBMIT_SKIP_FILECHECKS = True 
          (see 
          http://www.cs.wisc.edu/condor/manual/v7.6/3_3Configuration.html#SECTION004314000000000000000). 
          Sorry about that. 
           
          You should be able to emulate this behavior with the -disable
          flag to condor_submit (if you want to try to see if that
          increases your speed). 
           
          Best of luck, 
          
         
       
       
       
      
       
      _______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/
 
     
  