Sean,
This sounds strange to me. Can you provide more information, such as how you
merge and sort your traces? What does a long string look like?
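
For comparison, here is a minimal sketch of the merge and grouping I would
expect. It assumes each per-sequencer file has one record per line in the
form "cycle pc is_store address" (that layout, and the helper names
parse_trace, merge_by_cycle and runs, are my own placeholders, not anything
in opal), and that m_local_cycles is comparable across sequencers, which I
have not verified. The point is that the merged stream must be ordered by
cycle; if it is ordered by processor, every processor collapses into a
single long string.

```python
import heapq
from itertools import groupby


def parse_trace(lines, cpu):
    """Yield (cycle, cpu, pc, is_store, addr) tuples from one sequencer's trace."""
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        cycle, pc, is_store, addr = fields
        yield (int(cycle), cpu, pc, is_store == "1", addr)


def merge_by_cycle(traces):
    """Merge per-sequencer traces (each already in cycle order) into one
    stream ordered by cycle, breaking ties by cpu number."""
    streams = [parse_trace(lines, cpu) for cpu, lines in enumerate(traces)]
    return heapq.merge(*streams, key=lambda rec: (rec[0], rec[1]))


def runs(merged):
    """Return [(cpu, length), ...], one entry per run of consecutive
    accesses made by the same cpu in the merged stream."""
    return [(cpu, sum(1 for _ in grp))
            for cpu, grp in groupby(merged, key=lambda rec: rec[1])]


if __name__ == "__main__":
    # Hypothetical two-sequencer example; real input would be the trace files.
    traces = [
        ["10 0x1000 0 0xa0", "12 0x1004 1 0xa8", "30 0x1008 0 0xb0"],
        ["11 0x2000 0 0xc0", "31 0x2004 0 0xc8"],
    ]
    print(runs(merge_by_cycle(traces)))  # [(0, 1), (1, 1), (0, 2), (1, 1)]
```

If your merge differs from this (for example, a plain sort whose first key is
the processor number), that alone could explain the long strings.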
Thanks!
-Min
On Thu, 07 Jul 2005 Sean Ryan Leventhal wrote:
> I have modified opal to print out traces of all memory
> instructions.  I call a function of the sequencer within the execute stage 
> of both memop objects.  This function prints out the following:
> 
> m_local_cycles (which I am currently treating as the time)
> the address of the instruction
> whether it is a store
> and the address being accessed.
> 
> Each sequencer has its own file.
> 
> When I merge these files and sort them based on processor/sequencer, I
> observe that there are long strings in which only one processor accesses 
> the cache.  For instance, I start fmm -p4 (fast multipole method on four 
> processors from splash2), and do
> c 1500000
> to try to jump past some of the OS stuff.  I then load ruby and opal and 
> initialize them and run
> 
> opal0.sim-step 5000000
> 
> This produces several very large traces.  But sorting them and grouping 
> all adjacent memory accesses of the same processor as a single "string" 
> yields only 32 "strings", with an average length of 103,827 memory 
> accesses.  In other words, it appears that two threads are never executing 
> at the same time.  I get similar behavior from fft.  Does anyone have any 
> idea what I am doing wrong?
> 
> - Sean
> 
> _______________________________________________
> Gems-users mailing list
> Gems-users@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/gems-users
 