Re: [Gems-users] Getting Multiprocessor benchmarks running on multiple processors at once


Date: Fri, 08 Jul 2005 14:01:07 -0500
From: Alaa Alameldeen <alaa@xxxxxxxxxxx>
Subject: Re: [Gems-users] Getting Multiprocessor benchmarks running on multiple processors at once
Sean,

Also make sure that Simics is switching between processors every cycle. If you are not sure, please run the following before starting Ruby/Opal:

conf.sim.cpu_switch_time = 1

-Alaa

Mike Marty wrote:
I'm not sure what is going on.

You should check to see that this scheduling occurs without Opal to ensure
that your trace code is correct and that Opal is not at fault. You can
dump out memory requests that enter the Ruby memory system by entering the
following Ruby commands in Simics:

    ruby0.debug-output-file trace.txt
    ruby0.debug-start-time "1"




	I have modified Opal to print out traces of all memory
instructions. I call a function of the sequencer within the execute stage
of both memop objects. This function prints out the following:

m_local_cycles (which I am currently treating as the time)
the address of the instruction
whether it is a store
and the address being accessed.

Each sequencer has its own file.

When I merge these files and sort them based on processor/sequencer, I
observe that there are long strings in which only one processor accesses
the cache. For instance, I start fmm -p4 (fast multipole method on four
processors, from SPLASH-2), and do
c 1500000
to try to jump past some of the OS stuff. I then load Ruby and Opal,
initialize them, and run

opal0.sim-step 5000000

This produces several very large traces.  But sorting them and grouping
all adjacent memory accesses of the same processor as a single "string"
yields only 32 "strings", with an average length of 103,827 memory
accesses.  In other words, it appears that two threads are never executing
at the same time.  I get similar behavior from fft.  Does anyone have any
idea what I am doing wrong?

- Sean
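
The "string" analysis Sean describes (merge the per-sequencer traces, order them by time, then group adjacent accesses from the same processor) can be sketched roughly as below. This is only an illustration under stated assumptions: the record layout (a `(cycle, cpu)` tuple per memory access) and the helper names `run_lengths` and `summarize` are hypothetical and are not part of Opal, Ruby, or Simics. The actual trace files would first need to be parsed into this form.

```python
from itertools import groupby


def run_lengths(records):
    """Group time-ordered accesses into runs of consecutive same-CPU accesses.

    `records` is an iterable of (cycle, cpu) tuples merged from all
    per-sequencer trace files (assumed layout, not Opal's real format).
    Returns a list of (cpu, run_length) pairs in time order.
    """
    # sorted() is stable, so accesses with equal cycles keep their input order.
    ordered = sorted(records, key=lambda rec: rec[0])
    return [
        (cpu, sum(1 for _ in group))
        for cpu, group in groupby(ordered, key=lambda rec: rec[1])
    ]


def summarize(runs):
    """Return (number of runs, average run length)."""
    count = len(runs)
    average = sum(length for _, length in runs) / count if count else 0.0
    return count, average


# Synthetic data only: four CPUs taking turns every cycle...
interleaved = [(cycle, cycle % 4) for cycle in range(8)]
# ...versus each CPU running alone for a long stretch.
serialized = [(cycle, cycle // 4) for cycle in range(8)]

print(summarize(run_lengths(interleaved)))
print(summarize(run_lengths(serialized)))
```

Few runs with a large average length, as in the numbers reported above, would suggest the processors are being serialized rather than interleaved. Many short runs would suggest the accesses really do overlap in time.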