There could be several things going on:
1) To verify the setup: are you issuing identical commands to _Simics_
in both environments, e.g. dstc-disable, cpu-switch-time 1, etc.?
2) The RANDOM_SEED will vary from run to run unless you set it
explicitly. If your workload is sensitive to lock-acquisition order
(most are), this can make a big difference. This is why we run each
data point many times and report 95% confidence intervals. Since the
Ruby_cycles difference is only about 2.5%, it could easily be
attributable to a difference in RANDOM_SEED.
3) Resolving locks in a particular order *is* random noise, but it is
not something that should be discarded from simulations -- real
applications will acquire locks in many different orders, just as the
simulator does. A sound methodology is to run the same simulation a
few more times and take an average (see the sketch below), rather than
to "decide between A and B" from single runs.
Regards,
Dan
Nitin Bhardwaj wrote:
Hi,
My question is about the statistics produced by Ruby+Simics
simulations. If I have two copies of the simulator under different
logins (A and B, with the same versions of GEMS and Simics), run the
same binary on both simulators, load the same SLICC protocol, and use
the same Ruby configuration, will the results produced by the two
simulations be similar or not? If they are not exactly the same, can
there be a significant difference in the results? Below are some
statistics I gathered from the two simulations, run with a
16-processor configuration. Both runs use the same starting
checkpoint and the same application binary, thread binding is applied
in the application, randomization is off, and Ruby statistics are
cleared at the first magic breakpoint.
Protocol | Ruby_cycles | L1DMiss | L1IMiss | L2Miss | Simics_cycles | Instruction | L1Load  | L1Ifetch | L1Store
Sim A    |    48027564 |   57392 |    9845 |  12811 |      96055128 |    46668403 | 8947306 |     9845 | 3380127
Sim B    |    49265138 |   79726 |    6525 |  21283 |      98530276 |    42282268 | 8690709 |     6525 | 4649997
What could be the reason for such a large difference in the results?
For example, the Ruby cycles differ by 1237574. Because of this
discrepancy, the performance benefit (in terms of Ruby cycles) found
when comparing two SLICC protocols varies between the two simulation
setups. The other observation is that each of the setups A and B
always produces the same results across multiple runs (for the same
application).
My questions are basically:
1.) Is this variation in the results justified, or do you think there
is something wrong in the set-up?
2.) If this variation is justified to some extent, and assuming there
is nothing wrong in the set-up, what could be the reason for it?
3.) How can I choose a model (A or B) for measuring the performance
benefit, with confidence that the benefit is not caused by random
noise from the OS, from locks being resolved in a different order, or
from something else unknown?
I would really appreciate any help in resolving this issue.
-Thank You
Nitin Bhardwaj