Re: [Gems-users] Abnormally high supervisor/user access ratio


Date: Tue, 6 Apr 2010 10:10:32 -0400
From: Polina Dudnik <pdudnik@xxxxxxxxx>
Subject: Re: [Gems-users] Abnormally high supervisor/user access ratio
Ikhwan,

Unfortunately, we are not in control of Simics, which is what
determines whether an access is classified as supervisor or user. So
your guess is as good as ours.
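
If you want to double-check where the time is going, you can sample the
privilege level from the Simics prompt while the run is in progress.
This is just an untested sketch, and it assumes your Simics 3.x Python
environment exposes SIM_number_processors(), SIM_get_processor(),
SIM_processor_privilege_level() and SIM_continue() the same way the C
API does:

ncpus = SIM_number_processors()
priv_samples = [0] * ncpus
SAMPLES = 1000
for n in range(SAMPLES):
    SIM_continue(10000)          # advance 10,000 steps between samples
    for i in range(ncpus):
        cpu = SIM_get_processor(i)
        # nonzero privilege level == supervisor mode on SPARC targets
        if SIM_processor_privilege_level(cpu) > 0:
            priv_samples[i] += 1
for i in range(ncpus):
    print "cpu%d: %.1f%% of samples in supervisor mode" \
          % (i, 100.0 * priv_samples[i] / SAMPLES)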

Polina


On Fri, Apr 2, 2010 at 7:12 PM, Ikhwan Lee <ikhwan@xxxxxxxxxxxxxxx> wrote:
> I've seen similar issues reported in the past, but could not find a
> good answer. I would appreciate it if someone could explain the
> results below.
>
> I'm running Splash-2 programs on a 16-core RUBY+OPAL simulation
> setting. Some of the important Ruby parameters are as follows:
>
> protocol: MOESI_SMP_directory
> simics_version: Simics 3.0.31
> OPAL_RUBY_MULTIPLIER: 2
> L1_CACHE_ASSOC: 2                        // 2KB
> L1_CACHE_NUM_SETS_BITS: 4
> L2_CACHE_ASSOC: 4                       // 16KB
> L2_CACHE_NUM_SETS_BITS: 6
> g_NUM_PROCESSORS: 16
> g_NUM_L2_BANKS: 16
> g_NUM_MEMORIES: 16
> g_NUM_CHIPS: 16
> g_NETWORK_TOPOLOGY: FILE_SPECIFIED   // 4x4 mesh
> g_GARNET_NETWORK: true
> g_DETAIL_NETWORK: true
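>
> (For reference, cache size = assoc * 2^num_sets_bits * block_size;
> assuming Ruby's default 64-byte data block:
>
>   >>> 2 * 2**4 * 64    # L1: 2 ways x 16 sets x 64 B
>   2048
>   >>> 4 * 2**6 * 64    # L2: 4 ways x 64 sets x 64 B
>   16384
>
> which matches the 2KB and 16KB comments above.)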
>
>
> When I grep for SupervisorMode access types in the Ruby stats files,
> almost all of the Splash-2 programs show an excessively high
> supervisor/user access ratio, as you can see below (a small script for
> pulling these numbers out follows the listing).
>
> BARNES/ruby.BARNES.16k.16p.16m.stats: L1D_cache_access_mode_type_SupervisorMode:   266621884    85.2083%
> BARNES/ruby.BARNES.16k.16p.16m.stats: L1I_cache_access_mode_type_SupervisorMode:   15695183    82.9426%
> BARNES/ruby.BARNES.16k.16p.16m.stats: L2_cache_access_mode_type_SupervisorMode:   145082595    83.6042%
> CHOLESKY/ruby.CHOLESKY.tk15.O.16p.16m.stats: L1D_cache_access_mode_type_SupervisorMode:   12393234    76.1279%
> CHOLESKY/ruby.CHOLESKY.tk15.O.16p.16m.stats: L1I_cache_access_mode_type_SupervisorMode:   10213059    88.0104%
> CHOLESKY/ruby.CHOLESKY.tk15.O.16p.16m.stats: L2_cache_access_mode_type_SupervisorMode:   5567772    82.6523%
> FFT/ruby.FFT.64k.16p.16m.stats: L1D_cache_access_mode_type_SupervisorMode:   4967929    87.4088%
> FFT/ruby.FFT.64k.16p.16m.stats: L1I_cache_access_mode_type_SupervisorMode:   353405    90.6429%
> FFT/ruby.FFT.64k.16p.16m.stats: L2_cache_access_mode_type_SupervisorMode:   2537530    92.726%
> FMM/ruby.FMM.16k.16p.16m.stats: L1D_cache_access_mode_type_SupervisorMode:   285541702    76.1547%
> FMM/ruby.FMM.16k.16p.16m.stats: L1I_cache_access_mode_type_SupervisorMode:   9056962    74.9888%
> FMM/ruby.FMM.16k.16p.16m.stats: L2_cache_access_mode_type_SupervisorMode:   109348107    93.9731%
> LUcon/ruby.FMM.16k.16p.16m.stats: L1D_cache_access_mode_type_SupervisorMode:   285541702    76.1547%
> LUcon/ruby.FMM.16k.16p.16m.stats: L1I_cache_access_mode_type_SupervisorMode:   9056962    74.9888%
> LUcon/ruby.FMM.16k.16p.16m.stats: L2_cache_access_mode_type_SupervisorMode:   109348107    93.9731%
> OCEAN_CONTIGUOUS/ruby.OCEAN_CONTIGUOUS.258.16p.16m.stats: L1D_cache_access_mode_type_SupervisorMode:   175987256    91.1572%
> OCEAN_CONTIGUOUS/ruby.OCEAN_CONTIGUOUS.258.16p.16m.stats: L1I_cache_access_mode_type_SupervisorMode:   12561588    85.7416%
> OCEAN_CONTIGUOUS/ruby.OCEAN_CONTIGUOUS.258.16p.16m.stats: L2_cache_access_mode_type_SupervisorMode:   91914021    89.7628%
> RADIOSITY/ruby.RADIOSITY.room.16p.16m.stats: L1D_cache_access_mode_type_SupervisorMode:   208131190    94.4257%
> RADIOSITY/ruby.RADIOSITY.room.16p.16m.stats: L1I_cache_access_mode_type_SupervisorMode:   3574778    28.1978%
> RADIOSITY/ruby.RADIOSITY.room.16p.16m.stats: L2_cache_access_mode_type_SupervisorMode:   99881703    97.481%
> RADIX/ruby.RADIX.1M.16p.16m.stats: L1D_cache_access_mode_type_SupervisorMode:   8949117    63.3698%
> RADIX/ruby.RADIX.1M.16p.16m.stats: L1I_cache_access_mode_type_SupervisorMode:   711276    93.5587%
> RADIX/ruby.RADIX.1M.16p.16m.stats: L2_cache_access_mode_type_SupervisorMode:   4191291    67.1359%
> RAYTRACE/ruby.RAYTRACE.car.16p.16m.stats: L1D_cache_access_mode_type_SupervisorMode:   5046186    42.2043%
> RAYTRACE/ruby.RAYTRACE.car.16p.16m.stats: L1I_cache_access_mode_type_SupervisorMode:   791814    8.57802%
> RAYTRACE/ruby.RAYTRACE.car.16p.16m.stats: L2_cache_access_mode_type_SupervisorMode:   2442022    31.3737%
> VOLREND/ruby.VOLREND.head.16p.16m.stats: L1D_cache_access_mode_type_SupervisorMode:   7130501    9.01353%
> VOLREND/ruby.VOLREND.head.16p.16m.stats: L1I_cache_access_mode_type_SupervisorMode:   5619630    37.7412%
> VOLREND/ruby.VOLREND.head.16p.16m.stats: L2_cache_access_mode_type_SupervisorMode:   3889090    71.005%
> WATER-SPATIAL/ruby.WATER-SPATIAL.512.16p.16m.stats: L1D_cache_access_mode_type_SupervisorMode:   10752580    53.7889%
> WATER-SPATIAL/ruby.WATER-SPATIAL.512.16p.16m.stats: L1I_cache_access_mode_type_SupervisorMode:   1149841    30.0857%
> WATER-SPATIAL/ruby.WATER-SPATIAL.512.16p.16m.stats: L2_cache_access_mode_type_SupervisorMode:   5518409    91.074%
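>
> In case it's useful, here is roughly how I pull those numbers out with
> Python instead of grep (a throwaway sketch; the stat key names and the
> "<count> <percent>%" layout are copied from the output above):
>
> import glob, re
>
> # Match lines like:
> #   "L1D_cache_access_mode_type_SupervisorMode:   266621884    85.2083%"
> pat = re.compile(r'(L1D|L1I|L2)_cache_access_mode_type_SupervisorMode:'
>                  r'\s+(\d+)\s+([\d.]+)%')
> for fname in sorted(glob.glob('*/ruby.*.stats')):
>     for line in open(fname):
>         m = pat.search(line)
>         if m:
>             print '%s: %s supervisor share = %s%%' \
>                   % (fname, m.group(1), m.group(3))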
>
>
> Then I turned on the PROFILE_HOT_LINES flag and got the following stats for FFT.
>
> Hot Data Blocks
> ---------------
>
> Total_entries_block_address: 75145
> Total_data_misses_block_address: 2474143
> total | load store atomic | user supervisor | sharing | touched-by
> block_address | 7.99368 % [0x300c0c0, line 0x300c0c0] 197775 | 197648 79 48 | 0 197775 | 0 | 16
> block_address | 7.98301 % [0x1b8920c0, line 0x1b8920c0] 197511 | 197349 115 47 | 0 197511 | 0 | 16
> block_address | 7.90132 % [0x1b0620c0, line 0x1b0620c0] 195490 | 195252 126 112 | 0 195490 | 0 | 16
> block_address | 7.82303 % [0x1b3c40c0, line 0x1b3c40c0] 193553 | 193443 73 37 | 0 193553 | 0 | 16
> block_address | 5.06996 % [0x1b0780c0, line 0x1b0780c0] 125438 | 125354 58 26 | 0 125438 | 0 | 16
> block_address | 4.83691 % [0xb6e080, line 0xb6e080] 119672 | 119672 0 0 | 0 119672 | 0 | 16
> ....
>
> Hot Instructions
> ----------------
>
> Total_entries_pc_address: 3889
> Total_data_misses_pc_address: 2474143
> total | load store atomic | user supervisor | sharing | touched-by
> pc_address | 30.4843 % [0x1055454, line 0x1055440] 754224 | 754224 0 0 | 0 754224 | 0 | 16
> pc_address | 20.6657 % [0x1055458, line 0x1055440] 511298 | 511298 0 0 | 0 511298 | 0 | 16
> pc_address | 19.1072 % [0x10554b4, line 0x1055480] 472740 | 472740 0 0 | 0 472740 | 0 | 16
> pc_address | 5.30531 % [0x1053f50, line 0x1053f40] 131261 | 131261 0 0 | 0 131261 | 0 | 16
> ......
>
>
> It looks as though just a few instructions, executed by the OS on a
> handful of data blocks shared by all 16 processors, account for most
> of the misses, which makes the statistics for each benchmark program
> almost meaningless. Any idea what is happening inside the OS? (The
> quick sketch below is how I summed the user/supervisor columns to
> check this.)
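>
> To quantify that, the sketch sums the user and supervisor columns of
> the hot-block profile. The field positions come from the block_address
> lines above, and 'fft_hot_lines.txt' is just a hypothetical file
> holding that section of the stats output:
>
> usr = sup = 0
> for line in open('fft_hot_lines.txt'):   # hypothetical: hot-lines section only
>     if not line.startswith('block_address'):
>         continue
>     # fields: addr | % [addr, line] total | load store atomic | user supervisor | ...
>     cols = line.split('|')
>     u, s = cols[3].split()               # the "user supervisor" column
>     usr += int(u)
>     sup += int(s)
> print 'user misses: %d   supervisor misses: %d' % (usr, sup)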
>
>
> Thanks,
> Ikhwan
> _______________________________________________
> Gems-users mailing list
> Gems-users@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/gems-users
> Use Google to search the GEMS Users mailing list by adding "site:https://lists.cs.wisc.edu/archive/gems-users/" to your search.
>
>