Date: Tue, 13 Mar 2007 10:29:46 -0600
From: Dan Gibson <degibson@xxxxxxxx>
Subject: Re: [Gems-users] Ruby and Simics stall flag
Thomas,
A lot of your data confirms what we know about the Simics+Ruby interaction. Let me address each of your observations separately, below.

Thomas De Schampheleire wrote:

Hi,

I am doing some initial simulations of a Solaris OS using Simics
2.2.19 and Gems 1.2. The simulations simply involve sitting at the
shell prompt (no real workload, except for the background services of
Solaris).
I am trying different combinations of several parameters: the Simics
cpu-switch-time option, the -stall flag, and running with and
without Ruby loaded.
First of all, my understanding of the -stall flag is this:
1) Ruby must run in "stall mode".
2) Stall mode is the default mode for Simics 2.2.19. (There might be some debate on this... certainly your data indicates a small difference between -stall and default.)
3) Simics 3.0+'s default is -fast. (This is certainly true, as Ruby won't operate unless -stall is specified.)
4) Stall mode uses a different processor model than -fast, one which allows stalling.

I am using the following command in Simics to do the simulation:
date; cc 75000000; date
Since the processors in this machine are all @ 75MHz, this should be
1s in simulated time.

I am getting the following results:

no ruby, no -stall
cpu-switch-time 1:      61s
cpu-switch-time 10:      15s
cpu-switch-time 100:      3s
cpu-switch-time 1000:      2s
cpu-switch-time 10000:      2s

This is consistent with what we know about the behaviour of Simics. Whether this is directly due to some internal working of Simics or simply due to locality in the host's memory hierarchy, we do not know.

no ruby, with -stall
cpu-switch-time 1:      60s
cpu-switch-time 10:      15s
cpu-switch-time 100:      9s
cpu-switch-time 1000:      8s
cpu-switch-time 10000:      8s

This data is interesting. As in my observation #2 above, I had thought that -stall was the default for Simics 2.2.x. This data suggests otherwise. Can you run this experiment with the -fast option? (no ruby, with -fast, without -stall)

These simulations without ruby show that the -stall flag has the
most impact at large cpu-switch-times.
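
Agreed. For what it's worth, you can quantify that directly from your two tables. Here is a quick throwaway Python sketch using the wall-clock times you reported (nothing GEMS-specific about it):

    # Wall-clock seconds from the two "no ruby" tables above,
    # keyed by cpu-switch-time.
    no_stall = {1: 61, 10: 15, 100: 3, 1000: 2, 10000: 2}
    with_stall = {1: 60, 10: 15, 100: 9, 1000: 8, 10000: 8}

    for switch_time in sorted(no_stall):
        ratio = with_stall[switch_time] / no_stall[switch_time]
        print(f"cpu-switch-time {switch_time}: -stall costs {ratio:.1f}x")

The ratio is roughly 1x at small cpu-switch-times and 3-4x at the large ones, which matches your observation.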

When adding ruby, I get:
with ruby, no -stall
cpu-switch-time 1:      31m
cpu-switch-time 10:      24m
cpu-switch-time 100:      11m
cpu-switch-time 1000:      9m
cpu-switch-time 10000:      9m
This is curious. Do the results (e.g., Ruby_cycles) match the results below, allowing for differences due to random seed? If so, then I would speculate that the -stall option isn't needed for Simics 2.2.x...
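
If it helps with the comparison, here is a throwaway Python helper. It assumes the Ruby stats dump contains a line of the form "Ruby_cycles: <N>" -- adjust the pattern to whatever your dump actually prints:

    # Pull Ruby_cycles out of each stats dump named on the command line,
    # for a quick side-by-side comparison. The "Ruby_cycles:" line format
    # is an assumption about the dump -- adjust the regex if needed.
    import re
    import sys

    def ruby_cycles(path):
        with open(path) as stats:
            for line in stats:
                match = re.match(r"\s*Ruby_cycles:\s*(\d+)", line)
                if match:
                    return int(match.group(1))
        return None  # no matching line found

    if __name__ == "__main__":
        for path in sys.argv[1:]:
            print(path, ruby_cycles(path))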

with ruby, with -stall
cpu-switch-time 1:      (still to run)
cpu-switch-time 10:      (still running, > 2h)
cpu-switch-time 100:      2h
cpu-switch-time 1000:      2h
cpu-switch-time 10000:      1h37m

Looking at a cpu-switch-time of 1, the slowdowns respectively are 60x
(without ruby), 1800x (with ruby, without -stall) and more than 7200x
(with ruby, with -stall).
Ruby *should* have no effect if -stall is not enabled. Again, I am curious as to what the default behaviour is. If the default behaviour is not -stall, then Ruby should never even be called by Simics... but that contradicts the (ruby, no -stall) experiment above.

When the cpu-switch-time 1 experiment (ruby, -stall) finishes, could you compare the ruby output to that of the (ruby, no -stall) experiment above?
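
As for the slowdown factors themselves: slowdown is just wall-clock time divided by simulated time. A trivial check in Python, taking your cpu-switch-time 1 rows and assuming cc 75000000 corresponds to 1s of simulated time (with the unfinished -stall run as a >2h lower bound):

    # Slowdown = wall-clock time / simulated time, for the
    # cpu-switch-time 1 rows above.
    simulated_seconds = 75_000_000 / 75_000_000  # 75M cycles at 75 MHz

    wall_seconds = {
        "no ruby, -stall": 60,
        "ruby, no -stall": 31 * 60,   # 31 minutes
        "ruby, -stall": 2 * 60 * 60,  # lower bound; run not finished
    }

    for config, wall in wall_seconds.items():
        print(f"{config}: {wall / simulated_seconds:.0f}x slowdown")

That reproduces your 60x, ~1800x, and >7200x figures.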

I found some example figures in the ISCA tutorial, but I am unsure how
to relate my figures to those. I thought that I was simulating 1s, but
on second thought, using 4 processors @ 75MHz, the simulated
75000000 cycles would ideally take only 250ms, right? In that case,
the slowdown factors I gave above are per processor and should be
multiplied by 4 to get the total slowdown. Is this correct?
Are the results I am getting reasonable?
I don't have a Simics reference nearby, but IF I remember correctly, and I might not: *cc* is an abbreviation for instruction-step, *c* is an abbreviation for cycle-step. I don't know how those commands are affected by the number of processors....
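
Either way, the arithmetic behind your 250ms guess is easy to lay out under both interpretations. A sketch in Python; which interpretation is right hinges on the cc semantics I can't confirm here:

    # Two interpretations of "cc 75000000" on a 4-CPU, 75 MHz machine.
    cycles = 75_000_000
    freq_hz = 75_000_000
    num_cpus = 4

    # (a) 75M cycles on EACH cpu: 1 s of simulated time, and your
    #     slowdown figures stand as computed.
    print(f"per-cpu interpretation: {cycles / freq_hz:.3f} s simulated")

    # (b) 75M cycles spread round-robin ACROSS the 4 cpus: each cpu
    #     advances only 250 ms, and the slowdowns should be scaled by 4,
    #     as you suggest.
    print(f"shared interpretation:  {cycles / (freq_hz * num_cpus):.3f} s simulated")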

In addition, I am a bit confused about the interaction between the
-stall flag and Ruby. In my understanding, the -stall flag adds some
realism to memory transactions by respecting their latencies. Ruby
adds cache structures, and thus adds latency due to cache misses. Is
that correct?
I'm forced to admit confusion as well, given your data. I eagerly await the notable differences in the Ruby output... from there I might be able to shed some light on your questions.

In the Ruby documentation, it is stated that the -stall flag should
always be used to ensure forward-compatibility with Simics 3. But in
Simics 2 I can choose to run with or without the -stall flag, so what
is the difference between the two scenarios?
Certainly, we have observed that -stall is needed to run with Simics 3+. Again, I'd like to see the notable differences in Ruby's output between the -stall and no -stall experiments. If you wouldn't mind re-running the simulations with the same g_RANDOM_SEED param, that would help quite a bit as well. (cpu-switch-time 1 would be the most meaningful to me).

There is one other option you might consider: the PERFECT_MEMORY_SYSTEM flag. I don't think this was released in GEMS 1.2, but you can have a look at its implementation in GEMS 1.4 -- it is straightforward.

Basically, turning on the PERFECT_MEMORY_SYSTEM flag causes all requests to take precisely PERFECT_MEMORY_SYSTEM_LATENCY cycles. Setting this latency to zero yields the same latencies as a simulation running without Ruby, so you can use it to measure the raw slowdown Simics incurs simply from having a timing module attached.
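
The idea amounts to a short-circuit in the request path. A minimal sketch of it in Python pseudocode (the real GEMS 1.4 code is C++, and the function and hook names here are illustrative, not the actual source):

    # Sketch of the PERFECT_MEMORY_SYSTEM idea; names are illustrative.
    PERFECT_MEMORY_SYSTEM = True
    PERFECT_MEMORY_SYSTEM_LATENCY = 0  # cycles; 0 mimics "no Ruby" timing

    def memory_request_latency(request, ruby_model):
        """Cycles charged to a memory request before it completes."""
        if PERFECT_MEMORY_SYSTEM:
            # Every request takes a fixed latency and never touches
            # the cache/coherence model.
            return PERFECT_MEMORY_SYSTEM_LATENCY
        # Normal path: full Ruby cache hierarchy + protocol timing.
        return ruby_model.latency_for(request)  # hypothetical hook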

Thanks, Thomas
Regards,
Dan

_______________________________________________
Gems-users mailing list
Gems-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/gems-users
Use Google to search the GEMS Users mailing list by adding "site:https://lists.cs.wisc.edu/archive/gems-users/" to your search.

