Dan and Kevin
Thanks a lot for the answers.
Tourmaline Serializer works with barnes without breaking. But the
biggest drawback of Serializer for me is that it stalls all other
processors when it's in a transaction. For my work, I want to model
conflicts and aborts. Maybe I'll try making some modifications to
Serializer and see how it goes from there.
As you guys suggested, I tried running with a larger cache and low
latencies. This is the config I used:
#Ruby parameters
ruby0.setparam g_NUM_PROCESSORS 16
ruby0.setparam g_PROCS_PER_CHIP 1
ruby0.setparam g_MEMORY_SIZE_BYTES 1073741824
ruby0.setparam L1_CACHE_NUM_SETS_BITS 16
ruby0.setparam L2_CACHE_NUM_SETS_BITS 16
ruby0.setparam MEMORY_RESPONSE_LATENCY_MINUS_2 1
ruby0.setparam SIMICS_RUBY_MULTIPLIER 1
ruby0.setparam CACHE_RESPONSE_LATENCY 1
ruby0.setparam L1_RESPONSE_LATENCY 1
ruby0.setparam L2_RESPONSE_LATENCY 2
#Transactional Memory Params
ruby0.setparam_str REMOVE_SINGLE_CYCLE_DCACHE_FAST_PATH false
ruby0.setparam_str PROFILE_EXCEPTIONS false
ruby0.setparam_str PROFILE_XACT true
ruby0.setparam RETRY_LATENCY 100
ruby0.setparam g_DEADLOCK_THRESHOLD 400000
I changed the function that returns the memory latency to
MEMORY_RESPONSE_LATENCY_MINUS_2 + 2, so that it always returns 3 and
does not model the random variation.
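A minimal sketch of what that change amounts to, written as a toy Python model rather than Ruby's C++. The function name `memory_latency` is a hypothetical stand-in, since the original function is not quoted in this message; only the parameter name and its value come from the config above.

```python
# Toy model only: `memory_latency` is a hypothetical name, not Ruby's actual code.
MEMORY_RESPONSE_LATENCY_MINUS_2 = 1  # value set in the config above


def memory_latency() -> int:
    """Return a fixed memory latency with no random variation."""
    return MEMORY_RESPONSE_LATENCY_MINUS_2 + 2
```

With the config above, every call yields the same latency of 3 cycles.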
Before running barnes, I wanted to test it with the deque transactional
microbenchmark and got the following error:
failed assertion 'm_sequencer->isReady(logMsg)' at fn
MemoryTransactionResult
SimicsProcessor::makeRequest(memory_transaction_t*) in
simics/SimicsProcessor.C:442
failed assertion 'm_sequencer->isReady(logMsg)' at fn
MemoryTransactionResult
SimicsProcessor::makeRequest(memory_transaction_t*) in
simics/SimicsProcessor.C:442
At this point you might want to attach a debug to the running and get to the
crash site; otherwise press enter to continue
PID: 2452
I am not sure if this is caused by something I changed in the
microbenchmark or it is a ruby issue. I'm trying to run the barnes
benchmark now to see how it goes. And just to be on the safer side, I
set REMOVE_SINGLE_CYCLE_DCACHE_FAST_PATH to true.
As an aside, could someone elaborate on how logTM tries to stall the
conflicting processor instead of aborting it outright? I'm trying to
understand the protocol since it seems to be different from the paper.
So when would it fail to stall? As I mentioned earlier, I want to model
aborts and would like to see as many rollbacks as possible. Also, does
the attempt to stall a conflicting processor generate the NACKs instead
of ABORTs?
I'd really appreciate any feedback.
Thanks
shougata
------------------------------
Message: 5
Date: Tue, 06 Mar 2007 07:42:52 -0600
From: Dan Gibson <degibson@xxxxxxxx>
Subject: Re: [Gems-users] Running SPLASH2 benchmarks with LogTM
To: Gems Users <gems-users@xxxxxxxxxxx>
Message-ID: <45ED6FDC.2040209@xxxxxxxx>
Content-Type: text/plain; charset=us-ascii; format=flowed
Shougata,
Shougata Ghosh wrote:
Hi
I am trying to run SPLASH-2 benchmarks with logTM. I have replaced the
locks in the code with magic instructions (xaction begin and commit).
The cache coherence protocol I'm using is MESI_SMP_LogTM_directory. My
version of simics is 2.2.19 and GEMS is 1.3 (not hooking up opal).
I want to collect the memory traces (along with xaction begins, commits
and aborts) and analyse them offline. I don't really care about the
timing info generated by ruby. Running with ruby slows it down too much!
And since I don't need the timing info that ruby provides, I think this
slowdown is unjustified in my case!
I thought of running it with the PERFECT_MEMORY_SYSTEM=true and setting
PERFECT_MEMORY_RESPONSE_LATENCY = 0, but then I figured out that will
probably break the transactional memory part of the memory system. When
PERFECT_MEMORY_SYSTEM is true, ruby seems to completely bypass the cache
and simply return PERFECT_MEMORY_RESPONSE_LATENCY. That way, the xaction
conflicts will never be detected. Can someone verify this?
The PERFECT_MEMORY_SYSTEM flags will indeed completely bypass LogTM.
SPLASH would behave as an unsynchronized program. I would also expect
Ruby to behave in strange and unexpected ways if transactional binaries
were run with PERFECT_MEMORY_SYSTEM = true.
One alternative I thought of was using Tourmaline. While tourmaline
worked for some small microbenchmarks, it always breaks when I'm trying
to run the SPLASH2 benchmarks.
The released controllers that attempt to allow transactional concurrency
were not very richly developed. They have a hard time handling
virtualization events. However, the Serializer controller is fast and
simple, and much more robust. I don't know the specifics of your
requirements, however... if you need non-transactional CPUs to be making
meaningful requests then obviously Serializer is not an option.
Another option I tried is to let ruby do its thing but always return 0
to simics for stall cycles. Basically, in ruby_operate(), I call
mh_memorytracer_possible_cache_miss(mem_op) and then return 0. This
would slow the execution down somewhat but at least it won't stall
simics. Conceptually this made sense to me but when I ran it, it gave me
the following error right after the first xaction_begin:
simics-common: system/Sequencer.C:487: void Sequencer::makeRequest(const
CacheMsg&): Assertion `isReady(request)' failed.
*** Simics getting shaky, switching to 'safe' mode.
*** Simics (main thread) received an abort signal, probably an assertion.
Ruby manipulates Simics's stall condition in two ways.
1) By returning non-zero values from mh_memorytracer_possible_cache_miss().
2) By calling SIMICS_stall_cycle(), usually to unstall a processor
I would expect the above behaviour to persist if most of Ruby thinks the
processor is stalled when it is, in fact, not stalled. There are several
conditions that could be violated in isReady(), many of which are not
related to LogTM.
Regardless, forbidding Ruby from stalling Simics will break LogTM
anyway, since LogTM relies first on stalling to prevent aborts, rather
than aborting outright.
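To make "stall first, abort only when necessary" concrete, here is a toy Python sketch of the conflict rule as described in the published LogTM paper: a conflicting request is NACKed, the requester stalls and retries, and a transaction aborts only when it is NACKed by an older transaction after having itself NACKed an older one (a possible deadlock cycle). The class and function names are hypothetical, and the GEMS 1.3 protocol files may differ in detail from this description.

```python
from dataclasses import dataclass


@dataclass
class Xact:
    """Hypothetical per-processor transaction state (not a GEMS class)."""
    timestamp: int                # smaller value = older transaction
    possible_cycle: bool = False  # set once we NACK an older transaction


def on_incoming_conflict(owner: Xact, requester_ts: int) -> str:
    """Owner sees a conflicting request: it always NACKs rather than aborting."""
    if requester_ts < owner.timestamp:
        owner.possible_cycle = True  # we are now blocking an older transaction
    return "NACK"


def on_nack_received(requester: Xact, nacker_ts: int) -> str:
    """Requester was NACKed: stall and retry unless a deadlock cycle is possible."""
    if nacker_ts < requester.timestamp and requester.possible_cycle:
        return "ABORT"
    return "STALL_AND_RETRY"
```

Under this rule most conflicts resolve by stalling, which is why workloads with little cyclic contention show many NACKs and few aborts.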
I understand there were some logTM bugs in this version of GEMS (1.3)
which were fixed in the last release (1.4). Is this error being caused
by one of those bugs? Is it worth the trouble to install GEMS 1.4 and
try this method out or is there something fundamentally wrong with what
I'm doing and won't work in 1.4 either?
I'm sure one of the LogTM architects will be glad to comment on this. I,
for one, would recommend the latest version, simply because it's not
always straightforward to manually re-solve bugs in older versions of GEMS.
Any other ways of achieving what I'm trying to do?
If you're not interested in timing, try running with Ruby with
SIMICS_RUBY_MULTIPLIER = 1, L1 latency 1, L2 latency 2, and MM latency
3, link latencies small if needed, and make cache sizes huge (~GBs).
Empirically, we know that a lot of the Simics+Ruby slowdown occurs
because Simics is spinning on stalled processors. Reducing all the
latencies should help substantially.
Also, marginal increases in cpu-switch-time (e.g., from 1 to 5 or 10) would
probably speed things along somewhat, again at the expense of timing
accuracy.
Btw, I did try running barnes with the regular ruby+simics (where ruby
stalls simics) setup and I noticed ruby always returned 2000000000
cycles as the stall cycle! What's causing this???
2 Billion is used as "an arbitrarily long stall time" -- Ruby never
returns an exact number of cycles because Ruby will explicitly unstall
Simics when the request has completed. One cannot know reliably at
request-time the number of cycles that will be required for a given
request, hence Ruby simply stalls Simics (for 2 billion cycles), then
when the request has been satisfied (by Ruby's EventQueue.C and its
consumers), Ruby calls SIMICS_stall_cycle() to unstall Simics.
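The stall-then-unstall pattern described above can be sketched as a toy Python model. `ToyProcessor`, `ToyRuby`, and their methods are hypothetical stand-ins, not Ruby's or Simics's real classes, and returning 0 on a hit is an assumption made for illustration.

```python
ARBITRARILY_LONG_STALL = 2_000_000_000  # the "2 billion" value discussed above


class ToyProcessor:
    """Stand-in for a Simics CPU; it only tracks its remaining stall cycles."""

    def __init__(self):
        self.stall_cycles = 0


class ToyRuby:
    """Hypothetical model of the stall/unstall handshake."""

    def __init__(self):
        self.pending = {}  # request id -> processor waiting on that request

    def possible_cache_miss(self, cpu, request_id, hit):
        # Assumption: a hit needs no waiting, so report zero stall cycles.
        if hit:
            return 0
        # Completion time is unknown at request time, so stall "forever".
        self.pending[request_id] = cpu
        cpu.stall_cycles = ARBITRARILY_LONG_STALL
        return ARBITRARILY_LONG_STALL

    def request_completed(self, request_id):
        # Plays the role of the SIMICS_stall_cycle() call that unstalls the CPU.
        cpu = self.pending.pop(request_id)
        cpu.stall_cycles = 0
```

The returned stall value is therefore a placeholder; the real latency is determined by when the completion event fires.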
I'd really appreciate any ideas.
Thanks in advance
shougata