Hi Mike, 
Thanks a lot. 
 
The info about the real machine I used is given below. I am actually
using a modified version of the MSI_MOESI_CMP_directory protocol (tuned
for a single CMP), and for the interconnect I am using a 4x4 mesh with a
3-cycle hop latency. The caches are a 16 MB shared L2 (16 slices of 1 MB
each), a 64 KB L1D, and a 64 KB L1I.
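
As a rough back-of-the-envelope check on how these mesh parameters combine with the "essentially 4-hop" lock handoff mentioned in the quoted message below, here is a small Python sketch. It is only an estimate under assumptions that are not from the thread: the two endpoints of every protocol message are placed uniformly at random on the mesh, routing follows the shortest (Manhattan) path, and contention, router pipeline depth, and cache access latencies are ignored. The function names are hypothetical.

```python
from itertools import product

MESH_DIM = 4        # 4x4 mesh (from the message above)
HOP_LATENCY = 3     # cycles per link hop (from the message above)
PROTOCOL_HOPS = 4   # "essentially 4-hop" handoff (from the quoted reply)


def mean_link_hops(dim):
    """Mean Manhattan distance between two uniformly chosen mesh nodes.

    Assumption: source and destination are independent and uniform,
    and a node may be paired with itself (distance 0).
    """
    nodes = list(product(range(dim), repeat=2))
    total = sum(
        abs(ax - bx) + abs(ay - by)
        for (ax, ay) in nodes
        for (bx, by) in nodes
    )
    return total / (len(nodes) ** 2)


def handoff_network_cycles(dim=MESH_DIM, hop_latency=HOP_LATENCY,
                           protocol_hops=PROTOCOL_HOPS):
    """Estimated network-only cycles for one lock handoff (no contention)."""
    return protocol_hops * mean_link_hops(dim) * hop_latency


if __name__ == "__main__":
    print("mean link hops per message:", mean_link_hops(MESH_DIM))
    print("network cycles per handoff:", handoff_network_cycles())
```

Under these assumptions the estimate covers network traversal only; the L1 and L2 access latencies and any queuing at a blocking directory would add to it.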
 
Thanks, 
Hemayet 
------------------------------------- 
SunOS swym.cs.rochester.edu 5.10 Generic_118833-33 sun4u sparc SUNW,Sun-Fire
0       on-line   since 06/13/2007 09:32:40 
Status of virtual processor 0 as of: 12/13/2007 11:48:56 
  on-line since 06/13/2007 09:32:40. 
  The sparcv9 processor operates at 1200 MHz, 
        and has a sparcv9 floating point processor. 
1       on-line   since 06/13/2007 09:32:41 
2       on-line   since 06/13/2007 09:32:41 
3       on-line   since 06/13/2007 09:32:41 
8       on-line   since 06/13/2007 09:32:41 
9       on-line   since 06/13/2007 09:32:41 
10      on-line   since 06/13/2007 09:32:41 
11      on-line   since 06/13/2007 09:32:41 
12      on-line   since 06/13/2007 09:32:41 
13      on-line   since 06/13/2007 09:32:41 
14      on-line   since 06/13/2007 09:32:41 
15      on-line   since 06/13/2007 09:32:41 
20      on-line   since 06/13/2007 09:32:41 
21      on-line   since 06/13/2007 09:32:41 
22      on-line   since 06/13/2007 09:32:41 
23      on-line   since 06/13/2007 09:32:41 
------------------------------------- 
 
Mike Marty wrote:
What real machine did you use? 
   
Running Simics without Ruby is cheating since there is no
synchronization penalty (everything is 1 IPC). 
   
MESI_SCMP_directory is a blocking directory protocol, so any lock
handoffs are essentially 4-hop.  What interconnect topology were you
using?   
   
--Mike 
   
   
  On Dec 13, 2007 10:01 AM, Hemayet Hossain <hossain@xxxxxxxxxxxxxxxx> wrote:
   Hi All,
I am simulating some SPLASH-2 benchmarks using Ruby with Simics 2.2.19
(Solaris 10). To characterize the time spent in synchronization, I
have instrumented the synchronization calls, such as locks and barriers.
I have bound each thread to a specific processor (one-to-one) and
collect the time by calling the high-resolution timer gethrtime(). In a
real-machine run (16 processors, 16 threads) I get around 19% of the
time spent in synchronization for one program. If I run the same program
in Simics without Ruby, I get a similar percentage of time spent in
synchronization.
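
The kind of instrumentation described above can be sketched roughly as follows. This is a Python analogue, not the poster's actual code: it uses `time.perf_counter_ns()` in place of Solaris `gethrtime()`, it does not bind threads to processors, and the `TimedLock` class and `run` helper are hypothetical names introduced for illustration.

```python
import threading
import time


class TimedLock:
    """Lock wrapper that accumulates, per thread, the time spent in acquire()."""

    def __init__(self):
        self._lock = threading.Lock()
        self._local = threading.local()

    def acquire(self):
        start = time.perf_counter_ns()
        self._lock.acquire()
        waited = time.perf_counter_ns() - start
        self._local.wait_ns = getattr(self._local, "wait_ns", 0) + waited

    def release(self):
        self._lock.release()

    def wait_ns(self):
        """Nanoseconds the calling thread has spent acquiring this lock."""
        return getattr(self._local, "wait_ns", 0)


def worker(lock, iterations, results, idx):
    t0 = time.perf_counter_ns()
    for _ in range(iterations):
        lock.acquire()
        try:
            pass  # placeholder for the critical section
        finally:
            lock.release()
    total = time.perf_counter_ns() - t0
    results[idx] = (lock.wait_ns(), total)


def run(num_threads=4, iterations=1000):
    """Return a list of (sync_ns, total_ns) pairs, one per thread."""
    lock = TimedLock()
    results = [None] * num_threads
    threads = [
        threading.Thread(target=worker, args=(lock, iterations, results, i))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for i, (sync_ns, total_ns) in enumerate(run()):
        print(f"thread {i}: {100.0 * sync_ns / total_ns:.1f}% in synchronization")
```

Each thread reports the fraction of its own wall-clock time spent acquiring the lock, which is the quantity being compared across the real machine, Simics alone, and Simics with Ruby.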
     
But if I run the same program in Simics with Ruby, the time spent in
synchronization is much higher (around 75% of the total). I have
collected the time both from the program and from Ruby, and both report
almost the same percentage. I am using a MESI_SCMP_directory-like
protocol with 2 cycles for L1 access and 14 cycles for L2 access.
     
Does anyone have any idea what's going on? What is wrong with my setup? I
would really appreciate your reply.
Thanks, 
Hemayet 
     
_______________________________________________ 
Gems-users mailing list 
Gems-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/gems-users
Use Google to search the GEMS Users mailing list by adding "site:https://lists.cs.wisc.edu/archive/gems-users/" to your search.
     
   
   
   
  
 