On Thu, Mar 19, 2009 at 12:15 PM, Dan Gibson  <degibson@xxxxxxxx> wrote: 
You are using a broadcast protocol. Simply because a request is observed does not imply a state transition. For instance, in S state, an Other_GETS doesn't need to change the coherence state.
  What you should look for is the data from the cache's - Transitions - section of the stats file. It will look something like this: 
OM  Load  0 <--
OM  Ifetch  0 <--
OM  Store  0 <--
OM  L1_Replacement  0 <--
OM  Own_GETX  0 <--
OM  Fwd_GETX  0 <--
OM  Fwd_GETS  0 <--
OM  Ack  0 <--
OM  All_acks  0 <--
 The format is: [CurrentState] [messageType] [count] <--
  You are looking for all the transitions that indicate a coherence miss, like: S Other_GETX 45 <-- or M Other_GETS 191 <--
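
[Editor's sketch] The counting described above could be automated roughly as follows. This is a minimal sketch, not part of GEMS: it assumes one transition per line in the form "[CurrentState] [messageType] [count]", treats the trailing "<--" as optional in case not every line carries it, and the function names and the list of state/event pairs are illustrative assumptions. Which pairs really count as coherence misses depends on the protocol.

```python
import re

# One transition per line: STATE EVENT COUNT, optionally followed by "<--".
# The optional marker is an assumption; adjust to match your stats file.
TRANSITION_RE = re.compile(r"^\s*(\S+)\s+(\S+)\s+(\d+)(?:\s+<--)?\s*$")

def parse_transitions(text):
    """Return {(state, event): count} for every line that looks like a transition."""
    counts = {}
    for line in text.splitlines():
        match = TRANSITION_RE.match(line)
        if match:
            state, event, count = match.groups()
            counts[(state, event)] = counts.get((state, event), 0) + int(count)
    return counts

def sum_transitions(counts, pairs):
    """Add up the counts of the chosen (state, event) pairs; missing pairs count as 0."""
    return sum(counts.get(pair, 0) for pair in pairs)

# Sample built from the two example lines quoted in the message above.
sample = """
S  Other_GETX  45 <--
M  Other_GETS  191 <--
OM  Load  0 <--
"""

# Hypothetical selection; extend it with whatever pairs your protocol needs.
COHERENCE_PAIRS = [("S", "Other_GETX"), ("M", "Other_GETS")]

counts = parse_transitions(sample)
total = sum_transitions(counts, COHERENCE_PAIRS)
print("coherence-miss transitions:", total)
```

To use it on a real run, pass the text of the Transitions section of the stats file to parse_transitions instead of the sample string.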
  
 Thanks again, Dan, for your prompt response. Accordingly, the total coherence misses from my ruby.stats file come to something like this:
   S  Other_GETX  25446 +   M  Other_GETS  26792 = 52238 (total coherence misses)
 But don't O  Other_GETX and M  Other_GETX cause coherence misses as well?
 
 
 
 
 I also notice that some of your instruction fetch stats appear to be zero. Is this intentional on your part? 
If not, verify that you invoke Simics with the -stall flag and that you issue 'instruction-fetch-mode instruction-fetch-trace' and 'istc-disable' to Simics before loading Ruby.

 Here is a copy of the Simics Driver Transaction Stats; instruction requests do seem to be 0.
 Simics Driver Transaction Stats
 ----------------------------------
 Insn requests: 0
 Data requests: 20097728
 Memory mapped IO register accesses: 6892964
 Device initiated accesses: 0
 Other initiated accesses: 0
 Atomic load accesses: 7861
 Exceptions: 10494
 Non stallable accesses: 166754
 Prefetches: 502504
 Cache Flush: 0
  However, I followed the guidelines from the wiki and the ISCA tutorial, as well as some suggestions from previous threads. My script looks like this:
 Load warm-checkpoint (actually, this checkpoint was created by loading a warm checkpoint and continuing until the first magic break, where the main computation starts)
  @sys.path.append("../../../gen-scripts")
  @import mfacet

  istc-disable
  dstc-disable
  instruction-fetch-mode instruction-fetch-trace
  cpu-switch-time 1
  magic-break-enable
  break-hap "Core_Magic_Instruction"

  load-module ruby
  ruby0.setparam g_NUM_PROCESSORS 8
  ruby0.setparam g_MEMORY_SIZE_BYTES 2147483648
  ruby0.setparam g_PROCS_PER_CHIP 1
  ruby0.setparam g_NUM_L2_BANKS 16
  ruby0.setparam L2_CACHE_NUM_SETS_BITS 13
  ruby0.init

  ruby0.load-caches fft-8p-caches.gz
  ruby0.clear-stats
  So, I am loading Ruby and making the necessary Simics changes at this point to speed up the simulation. Am I doing something wrong?
  Regards,
  Ed
 
 
 
On Thu, Mar 19, 2009 at 10:52 AM, Edward Lee  <edwl202@xxxxxxxxx> wrote: 
Thanks, Dan, for your reply. I assume you are referring to the "Chip Stats" section. I had looked at that, but I was a little confused.
  I am using MOSI_SMP_bcast, which means there is only one SLICC controller for the L1 and L2 caches. Is this the reason I only see "L1Cache" under Chip Stats? Here is my output showing the L1Cache and directory events. I think these are the totals across the various transitions from the various cache states, but in any case the totals are fine for my purpose.
 I just want to verify my understanding here as I am not very confident with my interpretation: 
 I am not sure how to tell whether I used inclusive caches, but I believe the L2 cache is inclusive and, since only one controller is present, cache-to-cache transfers only occur between different L2s. So the stats below are actually those L2 stats, and for coherence misses I should look at the "L1Cache" stats, not the directory stats.
  
 Finally, I am thinking of coherence misses as --> (Total of Other_*), and of the percentage of coherence misses as --> (Total of Other_*) / (Total of all event counts in cache stats).
   --- L1Cache ---
 - Event Counts -
 Load  133041
 Ifetch  0
 Store  56336
 L1_to_L2  176418
 L2_to_L1D  101743
 L2_to_L1I  0
 L2_Replacement  9650
 Own_GETS  82664
 Own_GET_INSTR  0
 Own_GETX  44650
 Own_PUTX  5091
 Other_GETS  578648
 Other_GET_INSTR  0
 Other_GETX  312550
 Other_PUTX  0
 Data  118632
  .....
 
   --- Directory ---
 - Event Counts -
 OtherAddress  0
 GETS  82664
 GET_INSTR  0
 GETX  44650
 PUTX_Owner  5091
 PUTX_NotOwner  0
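
[Editor's sketch] For reference, the ratio proposed in this message (total of Other_* over total of all L1Cache event counts) can be worked out from the numbers pasted above. The snippet is only an illustration of that arithmetic; the dictionary and variable names are assumptions. As the reply at the top of the thread points out, an observed request in a broadcast protocol does not necessarily cause a state transition, so this ratio likely overcounts coherence misses. The last check hints at why: with the 8 processors configured in the script, each Other_* count equals 7 times the matching Own_* count, which is consistent with every request simply being observed by the other 7 caches.

```python
# L1Cache event counts copied from the stats pasted in this message.
l1cache_events = {
    "Load": 133041, "Ifetch": 0, "Store": 56336,
    "L1_to_L2": 176418, "L2_to_L1D": 101743, "L2_to_L1I": 0,
    "L2_Replacement": 9650,
    "Own_GETS": 82664, "Own_GET_INSTR": 0, "Own_GETX": 44650, "Own_PUTX": 5091,
    "Other_GETS": 578648, "Other_GET_INSTR": 0, "Other_GETX": 312550,
    "Other_PUTX": 0, "Data": 118632,
}

# Proposed metric: all events whose name starts with "Other_" over all events.
other_total = sum(v for k, v in l1cache_events.items() if k.startswith("Other_"))
all_total = sum(l1cache_events.values())
ratio = other_total / all_total

print("Other_* total:   ", other_total)
print("All events total:", all_total)
print("Other_* share:    {:.1%}".format(ratio))

# Sanity check on what Other_* measures: 8 processors, so 7 observers per request.
NUM_OTHER_CACHES = 8 - 1
for kind in ("GETS", "GETX"):
    observed = l1cache_events["Other_" + kind]
    issued = l1cache_events["Own_" + kind]
    print(kind, "observed == 7 x issued:", observed == NUM_OTHER_CACHES * issued)
```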
  Regards,
  Ed

On Thu, Mar 19, 2009 at 9:04 AM, Dan Gibson  <degibson@xxxxxxxx> wrote:
 Total_misses are L2 misses -- probably not what you want. Towards the bottom of the stats file, there should be a summary of protocol transitions. Depending on your protocol, you should be able to get a notion of how many 'coherence misses' there are. 
 Regards, Dan
 
 
Let me try to summarize what I am trying to do; maybe I can get some feedback this time. I am running FFT on an 8-processor SMP target using the MOSI_SMP_bcast cache coherence protocol. I used warm caches and loaded Ruby for the main computation only. My goal is to measure the overhead of maintaining coherent caches. Accordingly, I would like to isolate the different types of cache misses, especially the coherence misses.
 I got the ruby.stats file, but I am not sure if I can use this output directly for what I need. I have the total misses, copied from my ruby.stats file, like this:

 Total_misses: 127314
 total_misses: 127314 [ 22540 16951 16571 16564 12979 12803 12781 16125 ]
 user_misses: 96467 [ 13183 12989 12742 12637 11184 11161 11144 11427 ]
 supervisor_misses: 30847 [ 9357 3962 3829 3927 1795 1642 1637 4698 ]

 I didn't paste the whole stats file as it is quite large, but my question is whether the ruby.stats file already contains any information that can isolate the different kinds of cache misses (a global count is fine). Or should I try to modify the profiler code to get this info?
 Also, I have the number of misses, but I don't see the total number of accesses in that section. So, would it be correct to use the "Data requests" from "Simics Driver Transaction Stats"? However, "Request missed" shows 189346 there, which is bigger than the misses shown above.
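
[Editor's sketch] Before comparing these miss counts against other sections of the stats file, it can help to confirm that they are internally consistent. The snippet below is a small illustration, assuming each line has the form "name: total [ per-processor counts ]" as pasted above; the regular expression and function name are assumptions, not part of GEMS. It checks that the per-processor counts add up to each total and that user plus supervisor misses equal the overall total.

```python
import re

# Matches "name: total [ n0 n1 ... ]"; lines without a bracketed list are skipped.
MISS_LINE_RE = re.compile(r"(\w+):\s+(\d+)\s+\[\s*([\d\s]+?)\s*\]")

def parse_miss_lines(text):
    """Return {name: (total, [per-processor counts])}."""
    stats = {}
    for name, total, per_cpu in MISS_LINE_RE.findall(text):
        stats[name] = (int(total), [int(x) for x in per_cpu.split()])
    return stats

# Lines copied from the message above.
sample = """
total_misses: 127314 [ 22540 16951 16571 16564 12979 12803 12781 16125 ]
user_misses: 96467 [ 13183 12989 12742 12637 11184 11161 11144 11427 ]
supervisor_misses: 30847 [ 9357 3962 3829 3927 1795 1642 1637 4698 ]
"""

stats = parse_miss_lines(sample)

# Each total should equal the sum of its per-processor counts.
for name, (total, per_cpu) in stats.items():
    print(name, "per-processor sum matches total:", sum(per_cpu) == total)

# User and supervisor misses should partition the total.
user_plus_supervisor = stats["user_misses"][0] + stats["supervisor_misses"][0]
print("user + supervisor == total:", user_plus_supervisor == stats["total_misses"][0])
```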
 I would really appreciate any input on this.
 Regards,
 Ed

On Sun, Mar 15, 2009 at 12:09 AM, Edward Lee  <edwl202@xxxxxxxxx> wrote:
 Hi,
  I am trying to isolate the cache misses according to their types. So, what would be the best way of differentiating cold, capacity and coherence misses? 
 Thanks,
  Ed 
  
 _______________________________________________ 
Gems-users mailing list 
Gems-users@xxxxxxxxxxx 
https://lists.cs.wisc.edu/mailman/listinfo/gems-users 
Use Google to search the GEMS Users mailing list by adding "site:https://lists.cs.wisc.edu/archive/gems-users/" to your search. 
 
 
 
 
  --  http://www.cs.wisc.edu/~gibson [esc]:wq! 