Re: [Gems-users] Private L2 in a CMP...


Date: Tue, 16 Jun 2009 15:26:04 +0200
From: David Bonavila <david.bonavila@xxxxxxxxxxxxxxxxxx>
Subject: Re: [Gems-users] Private L2 in a CMP...

You are right, I confused unified and shared.

I want a 2-core CMP with private L2 caches (128KB per core).
So I have configured a 2-chip system with 1 processor per chip:

g_NUM_PROCESSORS: 2
g_PROCS_PER_CHIP: 1

and the Ruby stats file reports:

Chip Config
-----------
Total_Chips: 2

L1Cache_L2cacheMemory numberPerChip: 1
Cache config: L1Cache_0_L2
  cache_associativity: 2
  num_cache_sets_bits: 10
  num_cache_sets: 1024
  cache_set_size_bytes: 65536
  cache_set_size_Kbytes: 64
  cache_set_size_Mbytes: 0.0625
  cache_size_bytes: 131072
  cache_size_Kbytes: 128
  cache_size_Mbytes: 0.125
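
As a quick sanity check on the figures above, the total size should equal sets x associativity x block size. The sketch below only derives the block size implied by the printed stats; the 64-byte result is an inference from those numbers, not a value read from a Ruby parameter, and the variable names simply mirror the stats labels.

```python
# Values copied from the Ruby stats output above.
num_cache_sets_bits = 10
cache_associativity = 2
cache_size_bytes = 131072

# 2**10 sets, matching the reported num_cache_sets of 1024.
num_cache_sets = 1 << num_cache_sets_bits

# Block size implied by the stats (an inference, not a parameter read from Ruby).
implied_block_bytes = cache_size_bytes // (num_cache_sets * cache_associativity)

cache_size_kbytes = cache_size_bytes // 1024

print(num_cache_sets, implied_block_bytes, cache_size_kbytes)  # 1024 64 128
```

With 1024 sets, 2 ways and an implied 64-byte block, the total comes to 128KB per L2, which matches what I asked for.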

Is this configuration correct for simulating a 2-core CMP with a 128KB private L2 per core?

On the other hand, the network latencies for a CMP with 1 chip, 2 procs per chip and the MSI_MOSI_CMP_directory protocol are the following:

Network Configuration
---------------------
network: SIMPLE_NETWORK
topology: HIERARCHICAL_SWITCH

virtual_net_0: active, ordered
virtual_net_1: active, unordered
virtual_net_2: active, ordered
virtual_net_3: active, unordered
virtual_net_4: active, unordered

--- Begin Topology Print ---

Topology print ONLY indicates the _NETWORK_ latency between two machines
It does NOT include the latency within the machines

L1Cache-0 Network Latencies
  L1Cache-0 -> L1Cache-1 net_lat: 9
  L1Cache-0 -> L2Cache-0 net_lat: 9
  L1Cache-0 -> L2Cache-1 net_lat: 9
  L1Cache-0 -> Directory-0 net_lat: 9
  L1Cache-0 -> Directory-1 net_lat: 9

L1Cache-1 Network Latencies
  L1Cache-1 -> L1Cache-0 net_lat: 9
  L1Cache-1 -> L2Cache-0 net_lat: 9
  L1Cache-1 -> L2Cache-1 net_lat: 9
  L1Cache-1 -> Directory-0 net_lat: 9
  L1Cache-1 -> Directory-1 net_lat: 9

L2Cache-0 Network Latencies
  L2Cache-0 -> L1Cache-0 net_lat: 9
  L2Cache-0 -> L1Cache-1 net_lat: 9
  L2Cache-0 -> L2Cache-1 net_lat: 9
  L2Cache-0 -> Directory-0 net_lat: 9
  L2Cache-0 -> Directory-1 net_lat: 9

L2Cache-1 Network Latencies
  L2Cache-1 -> L1Cache-0 net_lat: 9
  L2Cache-1 -> L1Cache-1 net_lat: 9
  L2Cache-1 -> L2Cache-0 net_lat: 9
  L2Cache-1 -> Directory-0 net_lat: 9
  L2Cache-1 -> Directory-1 net_lat: 9

Directory-0 Network Latencies
  Directory-0 -> L1Cache-0 net_lat: 9
  Directory-0 -> L1Cache-1 net_lat: 9
  Directory-0 -> L2Cache-0 net_lat: 9
  Directory-0 -> L2Cache-1 net_lat: 9
  Directory-0 -> Directory-1 net_lat: 9

Directory-1 Network Latencies
  Directory-1 -> L1Cache-0 net_lat: 9
  Directory-1 -> L1Cache-1 net_lat: 9
  Directory-1 -> L2Cache-0 net_lat: 9
  Directory-1 -> L2Cache-1 net_lat: 9
  Directory-1 -> Directory-0 net_lat: 9

The network configuration for 2 chips, 1 proc per chip and the MOSI_SMP_bcast protocol is:

Network Configuration
---------------------
network: SIMPLE_NETWORK
topology: HIERARCHICAL_SWITCH

virtual_net_0: active, ordered
virtual_net_1: active, unordered
virtual_net_2: inactive
virtual_net_3: inactive
virtual_net_4: inactive

--- Begin Topology Print ---

Topology print ONLY indicates the _NETWORK_ latency between two machines
It does NOT include the latency within the machines

L1Cache-0 Network Latencies
  L1Cache-0 -> L1Cache-1 net_lat: 5
  L1Cache-0 -> Directory-0 net_lat: 5
  L1Cache-0 -> Directory-1 net_lat: 5

L1Cache-1 Network Latencies
  L1Cache-1 -> L1Cache-0 net_lat: 5
  L1Cache-1 -> Directory-0 net_lat: 5
  L1Cache-1 -> Directory-1 net_lat: 5

Directory-0 Network Latencies
  Directory-0 -> L1Cache-0 net_lat: 5
  Directory-0 -> L1Cache-1 net_lat: 5
  Directory-0 -> Directory-1 net_lat: 5

Directory-1 Network Latencies
  Directory-1 -> L1Cache-0 net_lat: 5
  Directory-1 -> L1Cache-1 net_lat: 5
  Directory-1 -> Directory-0 net_lat: 5

Are these latencies all right for what I am trying to do, or should I configure them differently?
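
To compare the two topology prints side by side, something like the following could be used. This is a minimal sketch: `parse_latencies` and `distinct_latencies` are hypothetical helpers (not part of GEMS), and the regular expression only assumes the `A -> B net_lat: N` line shape seen in the output above.

```python
import re

# Matches lines shaped like "  L1Cache-0 -> L2Cache-0 net_lat: 9".
LINE = re.compile(r"^\s*(\S+)\s*->\s*(\S+)\s+net_lat:\s*(\d+)\s*$")

def parse_latencies(text):
    """Return {(src, dst): latency} for every 'A -> B net_lat: N' line."""
    latencies = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            src, dst, lat = match.groups()
            latencies[(src, dst)] = int(lat)
    return latencies

def distinct_latencies(latencies):
    """Return the sorted list of distinct latency values in one print."""
    return sorted(set(latencies.values()))

# Short excerpts copied from the two topology prints above.
cmp_print = """
L1Cache-0 Network Latencies
  L1Cache-0 -> L1Cache-1 net_lat: 9
  L1Cache-0 -> L2Cache-0 net_lat: 9
"""
smp_print = """
L1Cache-0 Network Latencies
  L1Cache-0 -> L1Cache-1 net_lat: 5
  L1Cache-0 -> Directory-0 net_lat: 5
"""

cmp_lat = parse_latencies(cmp_print)
smp_lat = parse_latencies(smp_print)
print(distinct_latencies(cmp_lat))  # [9]
print(distinct_latencies(smp_lat))  # [5]
```

Both prints show a single uniform latency (9 in the CMP case, 5 in the SMP case), so the difference between the two setups is one number per link rather than a per-pair pattern.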

Thank you again!!


2009/6/15 David Bonavila <david.bonavila@xxxxxxxxxxxxxxxxxx>

Ruby says "Non-CMP protocol should set g_PROCS_PER_CHIP to 1" when I use an SMP protocol with g_PROCS_PER_CHIP greater than 1, so I guess I will have to set g_PROCS_PER_CHIP to 1 and g_NUM_PROCESSORS to 2 to model a 2-core CMP.
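
The constraint can be sketched as below. This is only my reading of the message, not Ruby's actual check: `chip_count` is a hypothetical helper, treating a protocol as CMP when its name contains "_CMP_" is an assumption, and so is computing the chip count as processors divided by processors per chip.

```python
def chip_count(protocol, num_processors, procs_per_chip):
    """Return the number of chips implied by the two g_* parameters."""
    # Assumption: a protocol counts as CMP when its name contains "_CMP_".
    is_cmp = "_CMP_" in protocol
    if not is_cmp and procs_per_chip != 1:
        # Message text quoted from Ruby; the condition itself is a guess.
        raise ValueError("Non-CMP protocol should set g_PROCS_PER_CHIP to 1")
    if num_processors % procs_per_chip != 0:
        raise ValueError("g_NUM_PROCESSORS must be a multiple of g_PROCS_PER_CHIP")
    return num_processors // procs_per_chip

# SMP protocol: 2 processors, 1 per chip.
print(chip_count("MOSI_SMP_bcast", 2, 1))          # 2
# CMP protocol: 2 processors on a single chip.
print(chip_count("MSI_MOSI_CMP_directory", 2, 2))  # 1
```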

There are 3 SMP protocols, and the first two use a unified L2 cache, so is MOESI_SMP_hammer the only one I can use?

Which network latency parameters should I set to get a good approximation of a CMP using 2 chips?

Thank you!!



2009/6/15 David Bonavila <david.bonavila@xxxxxxxxxxxxxxxxxx>

OK, thanks.
I need to model a 2-core CMP with private L2 for each of the cores.
How should I configure these parameters, then? I guess it should be something like this:


ruby0.setparam g_PROCS_PER_CHIP 1
ruby0.setparam g_NUM_PROCESSORS 2

Is that right? Should I set any other parameters or configuration?
I have read something about making a fast interconnect between chips. Do I need any of that?

Thank you again!!


2009/6/15 David Bonavila <david.bonavila@xxxxxxxxxxxxxxxxxx>


Hi.

I need to configure a CMP with a private L2 cache, and I would like to know which of these configs I should use:

1)
ruby0.setparam g_PROCS_PER_CHIP 1
ruby0.setparam g_NUM_PROCESSORS 2

2)
ruby0.setparam g_PROCS_PER_CHIP 2
ruby0.setparam g_NUM_PROCESSORS 2
ruby0.setparam_str g_CACHE_DESIGN PRIVATE_L2

3)
ruby0.setparam g_PROCS_PER_CHIP 1
ruby0.setparam g_NUM_PROCESSORS 2
ruby0.setparam_str g_CACHE_DESIGN PRIVATE_L2

As I want to simulate a CMP, I guess I should use config number 2, but I don't know if that is the right way to do what I want.
Does config number 1 (without the g_CACHE_DESIGN parameter) provide a private L2 cache?
Or are all of them acceptable, with the same result?

Thanks!!



