>
> Date: Thu, 03 May 2007 08:28:33 -0700 (PDT)
> From: James Wang <jameswang99@xxxxxxxxx>
>
> Hi Dan:
>
> Thanks a lot. I think one of the reasons that my benchmark
> invokes the OS is that it does quite a bit of I/O, which actually
> could be avoided.
>
Ugh. If you're doing I/O inside a transaction, that would be a pretty
bad error, which LogTM doesn't prevent you from doing. In fact, LogTM
doesn't prevent you from doing any system calls of any kind from
within a transaction, which is also almost certainly an error. (If
you wish, I can probably send you some code that hacks the trap
instruction to cause a *simulator* assertion violation if you ever
make any system call inside a transaction. Tracking that down has
cost me the better part of a week on two occasions in the past year or
so.)
Even if the I/O is outside a transaction, it should probably be
avoided. Screen I/O can definitely produce interrupts long after the
program returns from the I/O call, when the characters are
actually sent to the screen device. These interrupts can preempt your
program for what can seem to be a *long* time.
Better to just avoid it -- buffer it up and do all the I/O after the
transactional part of the test has finished (and after you've taken
any timing numbers you need).
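
Here is a rough sketch of the kind of buffering I mean, in plain C.
The names (log_append, log_flush, LOG_CAPACITY) are made up for
illustration and are not part of LogTM or GEMS. It assumes a single
fixed-size buffer, so a multithreaded test would want one buffer per
thread, and it silently drops any message that does not fit.

```c
#include <stdarg.h>
#include <stdio.h>

#define LOG_CAPACITY 4096              /* hypothetical size; tune to taste */

static char   log_buf[LOG_CAPACITY];   /* in-memory log, NUL-terminated */
static size_t log_len = 0;             /* bytes used, excluding the NUL */

/* Format a message into the in-memory log. This only touches memory,
 * so it makes no system call. Returns the number of bytes appended,
 * or 0 if the message did not fit and was dropped. */
int log_append(const char *fmt, ...)
{
    size_t room = LOG_CAPACITY - log_len;
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(log_buf + log_len, room, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= room) {
        log_buf[log_len] = '\0';       /* undo any partial write */
        return 0;
    }
    log_len += (size_t)n;
    return n;
}

/* Write the whole log out in one go and reset it. Call this only after
 * the transactional part has finished and the timing numbers are taken.
 * Returns the number of bytes written. */
size_t log_flush(FILE *out)
{
    size_t written = fwrite(log_buf, 1, log_len, out);
    fflush(out);
    log_len = 0;
    log_buf[0] = '\0';
    return written;
}
```

The benchmark would call log_append() wherever it used to call
printf(), and log_flush(stdout) once at the very end.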
>
> But I don't think processor_bind is just a suggestion.
>
That's my understanding as well. Note that I've been told (by
somebody who almost certainly knows) that for real applications,
binding threads to processors is a very bad idea -- it can cause the
scheduler to do really bad things. For LogTM runs, doing so is
absolutely necessary, however, because migrating a thread that's
currently inside a transaction would lead to utter chaos (it is my
understanding that LogTM is not at all equipped to correctly handle
such a thing).
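
For reference, binding the calling thread on Solaris looks roughly
like the sketch below. The wrapper name bind_self_to_cpu is made up
for illustration. The non-Solaris branch is only there so the file
compiles elsewhere; it does not bind anything and just reports
failure.

```c
#include <stdio.h>
#if defined(__sun)
#include <sys/types.h>
#include <sys/processor.h>
#include <sys/procset.h>
#endif

/* Bind the calling LWP to processor `cpu`.
 * Returns 0 on success, -1 on failure or when not built on Solaris. */
int bind_self_to_cpu(int cpu)
{
#if defined(__sun)
    /* P_LWPID + P_MYID selects the calling LWP; the last argument
     * could receive the previous binding, which is ignored here. */
    if (processor_bind(P_LWPID, P_MYID, (processorid_t)cpu, NULL) != 0) {
        perror("processor_bind");
        return -1;
    }
    return 0;
#else
    (void)cpu;                         /* no binding attempted here */
    return -1;
#endif
}
```

Each benchmark thread would call this once, before entering its first
transaction, and bail out if it returns -1.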
dann
>
> It is mandatory for user threads but maybe only a suggestion for
> OS-related threads.
>
> Regards
> James
>
> ----- Original Message ----
> From: Dan Gibson <degibson@xxxxxxxx>
> To: Gems Users <gems-users@xxxxxxxxxxx>
> Sent: Friday, May 4, 2007 3:16:25 AM
> Subject: Re: [Gems-users] Processor Lost
>
> As a corollary to this discussion, I'd like to add that this kind of
> issue is one of the reasons that full-system simulation is a great
> thing -- this sort of thing can happen to a real workload, too.
>
> James Wang wrote:
> > Hi Dan:
>
> > Thank you very much for your prompt reply. But I don't really
> > understand what the nature of this situation is. Why would the
> > OS want to deschedule my benchmark?
>
> The OS deschedules threads for a variety of reasons. Interrupts of
> any kind typically need to be handled, and if your benchmark
> initiates a blocking I/O operation the OS will probably deschedule
> your benchmark while the I/O completes (including, say, page
> faults). Moreover, there is always the real-time interrupt timer
> that the OS uses for timesharing anyway -- this timer can actually
> be a problem when running with Simics, as a common trick is to set
> Simics's clock frequency low to improve I/O performance -- at a low
> clock frequency, fewer cycles elapse between timer interrupts, so
> they interrupt your benchmark more often per unit of work than at
> higher Simics frequencies. For other reasons, have a look at your
> favourite OS textbook.
>
> > Also, I bound the thread to the processor, should it just stay there and run?
>
> See Kevin's response. In general, processor_bind() is a suggestion,
> not a command, to the OS.
>
> >
> > I did this with a four-processor simulated machine. Why are the
> > other processors not affected by this problem?
> >
>
> Solaris seems to favor P0 for a variety of reasons, chief among them
> simplicity.
>
> >
> > Regards
> > James
> >
>
> Regards,
> Dan
>
> > ----- Original Message ----
> > From: Dan Gibson <degibson@xxxxxxxx>
> > To: Gems Users <gems-users@xxxxxxxxxxx>
> > Sent: Friday, May 4, 2007 1:03:42 AM
> > Subject: Re: [Gems-users] Processor Lost
> >
> > The OS could be descheduling your transactional benchmark, though
> > I'm not sure why that might be happening. Try quiescing the system
> > by killing background processes, and then pre-fetch any binaries
> > or data you might be using in your benchmark by running it once to
> > completion before loading Ruby (and hence, without
> > synchronization). That should eliminate any I/O you might
> > inadvertently cause at runtime. It will also cause your benchmark
> > to run with different system interactions, and will hopefully fix
> > the Processor Lost/Processor Found problem.
> >
> > Regards,
> > Dan
> >
> > James Wang wrote:
> >
> >>
> >> Hi All:
> >>
> >> I am running a transactional memory benchmark using a
> >> customized SMP cache coherence protocol. For some reason, p0
> >> will run code other than the transactional benchmark, while the
> >> other processors finish fine. I cannot really tell what p0
> >> is doing. I tried a few different random seeds, and the same
> >> situation happens every time.
> >>
> >> Any idea?
> >> Thanks for any reply in advance.
> >>
> >> Regards
> >> James
> >>
> >
>