Re: [Gems-users] Issues with collecting memory access trace for logtm microbenchmark (tm-deque)


Date: Fri, 12 Jan 2007 15:40:13 -0600
From: Dan Gibson <degibson@xxxxxxxx>
Subject: Re: [Gems-users] Issues with collecting memory access trace for logtm microbenchmark (tm-deque)
The return value may be zero for normal hits in the L1 cache when FAST_PATH is enabled, which could also explain some of your other issues.

Shougata Ghosh wrote:
Thanks Jayaram and Dan for your replies. I was aware that simics sends 
memory requests to ruby more than once. What I am doing inside 
ruby_operate() is that I only record the transaction in my trace file if 
the return value of mh_memorytracer_possible_cache_miss(mem_op) is 
non-zero. Does that sound ok?
Creating a processor set with pset_create and then binding the threads 
to the cpus of this set kept all the other processes from 
interfering with my benchmark.
Thanks again
shougata

  
From: Dan Gibson <degibson@xxxxxxxx>
Subject: Re: [Gems-users] Issues with collecting memory access trace
	for logtm microbenchmark (tm-deque)
To: Gems Users <gems-users@xxxxxxxxxxx>
Message-ID: <45A6459B.3030301@xxxxxxxx>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Shougata,
Let me take a stab at the areas that I am confident in answering 
correctly (i.e., NOT LogTM):

**Too many requests:
Simics uses an "are you sure?" policy when issuing memory requests to 
Ruby. That is, each request is passed to ruby *twice* -- once to 
determine the stall time, and once when the stall time is elapsed and 
Simics is verifying that Ruby wants the operation to complete. These 
dual requests are handled in SimicsProcessor.C. For convenience (both 
the C++ language and the filtering effect), you may want to move your 
trace generation higher into Ruby's hierarchy (say, SimicsProcessor.C or 
Sequencer.C).
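
A minimal sketch of the filtering idea, independent of Simics and Ruby. Since each request is presented twice, a trace writer can remember outstanding requests and record only the first presentation. The `Request` struct and `DuplicateFilter` class are hypothetical names for illustration, not GEMS code, and the choice of key fields is an assumption.

```cpp
#include <cstdint>
#include <set>
#include <tuple>

// Hypothetical stand-in for the request fields that matter to the filter.
struct Request {
    int cpu;
    std::uint64_t phys_addr;
    bool is_write;
};

// Records a request the first time it is presented and swallows the
// second presentation of the same (cpu, address, type) triple.
class DuplicateFilter {
public:
    // Returns true if the request should be written to the trace.
    bool shouldRecord(const Request& r) {
        Key k = std::make_tuple(r.cpu, r.phys_addr, r.is_write);
        std::set<Key>::iterator it = pending_.find(k);
        if (it != pending_.end()) {
            pending_.erase(it);   // second presentation: drop it
            return false;
        }
        pending_.insert(k);       // first presentation: record it
        return true;
    }

private:
    typedef std::tuple<int, std::uint64_t, bool> Key;
    std::set<Key> pending_;
};
```

This assumes every request really is presented exactly twice. A request that completes after a single presentation would leave a stale entry behind and cause the next matching request to be dropped, so a filter like this is probably only safe at a level of the hierarchy where the pairing is guaranteed.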

**ASI:
Despite what the name might suggest, the ASI specifies the target 
address space, not the process. 128 is the vanilla address space for main 
memory... ASIs are detailed quite extensively in Sun's microprocessor 
manuals (google for "sun ultrasparc manual" and see the section on 
ASIs). For example, ASI 0x80 (aka 128) is ASI_PRIMARY, which is used for 
ordinary accesses to the primary address space, ASI 0x70 is 
ASI_BLOCK_AS_IF_USER_PRIMARY, ASI 0x58 is for accesses to the data TLB, 
0x59 is the data TSB, etc. There is not one ASI per process.
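
As a rough illustration, a trace tool could translate the numeric asi field into a name before printing it. The sketch below covers only a handful of ASIs defined by the SPARC V9 architecture; `asiName` is a hypothetical helper, and implementation-specific ASIs (such as the MMU ones) are deliberately left out.

```cpp
#include <string>

// Names for a few ASI values defined by the SPARC V9 architecture.
// Anything else is reported as unknown rather than guessed.
std::string asiName(unsigned asi) {
    switch (asi) {
        case 0x04: return "ASI_NUCLEUS";
        case 0x10: return "ASI_AS_IF_USER_PRIMARY";
        case 0x11: return "ASI_AS_IF_USER_SECONDARY";
        case 0x80: return "ASI_PRIMARY";
        case 0x81: return "ASI_SECONDARY";
        case 0x82: return "ASI_PRIMARY_NOFAULT";
        case 0x83: return "ASI_SECONDARY_NOFAULT";
        default:   return "unknown or implementation-specific";
    }
}
```

Calling `asiName(128)` yields ASI_PRIMARY, since decimal 128 is 0x80. Every ordinary user-level load or store uses that ASI regardless of which process issues it, so a constant value in a user-only trace is expected.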

Regards,
Dan

Shougata Ghosh wrote:
Hi
I am simulating a 16-processor UltraSPARC III system running Solaris 10. 
I loaded ruby (no opal) with simics. The protocol I used was 
MESI_SMP_LogTm_directory and I was running the tm-deque microbenchmark 
that comes with GEMS. My goal was to collect the memory traces (only data 
accesses, no instruction accesses) of tm-deque and analyse the trace file 
offline.
Let me first give a brief overview of how I collect the traces.

I print the clock_cycle (simics cycle), the cpu making the request, the 
physical address of the memory location, the type of access (r or w) and 
whether this cpu is currently executing a xaction (logTm). The format 
looks like this:

cycle    cpu    phys_addr    type(r/w)    in_xaction

This I print from inside ruby_operate() in ruby.c, since this function 
is called for every memory access simics makes.
In addition to this, in a different trace file, I print when a xaction 
begins, commits or aborts. This I print from 
magic_instruction_callback() in commands.C. The format is as follows:

cycle    cpu    xaction_type(B/C/A)    xaction_id(for nested xaction)
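
The two record layouts described above can be sketched as a pair of formatting helpers. `formatAccess` and `formatEvent` are hypothetical names, and the tab separator is an assumption made so the files are easy to split later; the original output may use different spacing.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Formats one access record as: cycle cpu phys_addr type in_xaction
std::string formatAccess(std::uint64_t cycle, int cpu, std::uint64_t phys_addr,
                         bool is_write, bool in_xaction) {
    std::ostringstream out;
    out << cycle << '\t' << cpu << '\t' << phys_addr << '\t'
        << (is_write ? 'w' : 'r') << '\t' << (in_xaction ? 1 : 0);
    return out.str();
}

// Formats one transaction event as: cycle cpu xaction_type xaction_id
// where kind is 'B' (begin), 'C' (commit) or 'A' (abort).
std::string formatEvent(std::uint64_t cycle, int cpu, char kind, int xaction_id) {
    std::ostringstream out;
    out << cycle << '\t' << cpu << '\t' << kind << '\t' << xaction_id;
    return out.str();
}
```

Keeping the cycle as the first field in both layouts means the two files can later be ordered with the same key.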

Once the simulation is completed, I combine the two trace files and sort 
the result by the clock cycle field.
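
The combine-and-sort step might look like the sketch below. `TraceLine`, `parseLine` and `mergeTraces` are hypothetical names. Placing transaction events ahead of accesses that share the same cycle is an assumption, not something the trace format dictates.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// One line from either trace file, keyed by its leading cycle field.
struct TraceLine {
    std::uint64_t cycle;
    std::string text;   // the full original line
};

// Parses the leading cycle number; returns false for blank or malformed lines.
bool parseLine(const std::string& line, TraceLine& out) {
    std::istringstream in(line);
    if (!(in >> out.cycle)) return false;
    out.text = line;
    return true;
}

// Merges access records and transaction events into one list ordered by
// cycle. Events are added first and stable_sort preserves input order for
// equal keys, so an event precedes any access with the same cycle.
std::vector<TraceLine> mergeTraces(const std::vector<std::string>& accesses,
                                   const std::vector<std::string>& events) {
    std::vector<TraceLine> merged;
    TraceLine t;
    for (std::size_t i = 0; i < events.size(); ++i)
        if (parseLine(events[i], t)) merged.push_back(t);
    for (std::size_t i = 0; i < accesses.size(); ++i)
        if (parseLine(accesses[i], t)) merged.push_back(t);
    std::stable_sort(merged.begin(), merged.end(),
                     [](const TraceLine& a, const TraceLine& b) {
                         return a.cycle < b.cycle;
                     });
    return merged;
}
```

Sorting on the parsed number rather than on the raw text avoids mis-ordering cycle counts with different digit counts.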

*****The biggest issue is with having too many requests. I want to 
isolate all the other processes making memory requests, except tm-deque. 
Right now, I'm isolating the kernel requests by inspecting the priv 
field in (v9_memory_transaction *) mem_op->priv. If the priv field is 1, 
I don't record that transaction. I believe this effectively keeps the 
kernel requests out of my trace. But there are other maintenance/service 
processes started by the kernel running in user space which access the 
memory and I want to isolate them. I have tried to detect the pid or 
some sort of a process id from inside ruby but haven't had any 
success/luck so far! Things I have looked into are:

- The ASID (address space id) field in (v9_memory_transaction *) 
mem_op->asi. This didn't work!! The ASID was a fixed 128 throughout. One 
possible reason is that perhaps the ASID changes between user space and 
kernel space. Since I'm only recording user-space accesses, I don't see 
any changes in ASID.

- The content of global register g7. From inspecting the opensolaris 
code, I noticed that the getpid() function gets the address of the 
current_thread structure from %g7. It then gets a pointer to the process 
the current_thread belongs to from the current_thread structure. Next, 
it reads the process_id from the process structure. Since I don't care 
about the exact pid, I inspected the value of the %g7 register. I didn't 
see any changes in that! One possibility was, of course, that %g7 stores the 
virtual address which could be the same for all processes. If all the 
processes are running just one thread, this seemed very likely. So, next 
I looked into the corresponding physical address. Unfortunately, that 
remained constant as well!
I'll try reading the content of the memory location pointed to by the 
physical address (thread_phys_addr). Maybe that will have a different 
value! I am yet to look into that.

On a side note, how does LogTm differentiate xactional requests from 
non-xactional ones if they both come from the same processor??

*****My second issue is with the clock cycle I print for timestamping. I 
am using SIM_cycle_count to timestamp the memory accesses. When I 
combine the two traces, I notice that after a xaction has begun, 
subsequent memory accesses printed from ruby_operate() don't have 
in_xaction set to 1! Here's an example of it:
9067854    13    189086172    r    0
9067856    13    185775464    w    0
9068573    13    B    0            <- xaction begins
9069382    13    185775464    w    0
9069387    13    185775468    r    0
.
.
.
9069558    13    185775468    w    0
9069566    13    185775468    w    0
9069611    13    185775272    r    1    <- first time in_xaction turns 1

There's always a lag of about 1000 cycles between xaction Begin and 
in_xaction turning into 1 in the memory access traces. I did make sure I 
set the cpu-switch-cycle to 1 in simics before I started my simulations! 
I get the value of in_xaction in the following way:
#define XACT_MGR \
  g_system_ptr->getChip(SIMICS_current_processor_number()/RubyConfig::numberOfProcsPerChip())->getTransactionManager(SIMICS_current_processor_number()%RubyConfig::numberOfProcsPerChip())
in_xaction = XACT_MGR->inTransaction();

As I mentioned earlier, I get the clock_cycle from SIM_cycle_count(*cpu). 
Any idea what could be causing this? Do you think I should try using 
ruby_cycles instead?
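
To check whether the lag is consistent across transactions, it could be measured offline from the merged trace. The sketch below is one way to do that. `Lag` and `beginLags` are hypothetical names, the input is assumed to be the merged lines in the two formats shown earlier, and nested transactions are not handled (a nested begin simply overwrites the outer one).

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Lag, in cycles, between a begin event and the first access flagged
// in_xaction on the same cpu.
struct Lag {
    int cpu;
    std::uint64_t begin_cycle;
    std::uint64_t lag;
};

std::vector<Lag> beginLags(const std::vector<std::string>& merged) {
    std::map<int, std::uint64_t> open;   // cpu -> cycle of unmatched begin
    std::vector<Lag> lags;
    for (std::size_t i = 0; i < merged.size(); ++i) {
        std::istringstream in(merged[i]);
        std::uint64_t cycle;
        int cpu;
        std::string third;
        if (!(in >> cycle >> cpu >> third)) continue;   // skip filler lines
        if (third == "B") {
            open[cpu] = cycle;                 // transaction begins
        } else if (third == "C" || third == "A") {
            open.erase(cpu);                   // commit or abort
        } else {
            // Access record: third is the physical address.
            std::string type;
            int in_xaction;
            if (!(in >> type >> in_xaction)) continue;
            std::map<int, std::uint64_t>::iterator it = open.find(cpu);
            if (in_xaction == 1 && it != open.end()) {
                Lag l = {cpu, it->second, cycle - it->second};
                lags.push_back(l);
                open.erase(it);
            }
        }
    }
    return lags;
}
```

For the excerpt above, the begin at cycle 9068573 and the first flagged access at cycle 9069611 on cpu 13 give a lag of 1038 cycles.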

*****Third issue is specific to the LogTm microbenchmark I was running. 
I was using the LogTm tm-deque microbenchmark. I ran it with 10 threads 
and set # of ops to 10. Initially I wanted small xactions without 
conflicts. When I look at the trace file, I don't see any interleaving 
threads. The 10 threads ran one after the other in the following order:
thread    cpu    start_cycle
T1        13     9068573
T2        9      10035999
T3        13     10944933
T4        2      11654399
T5        9      11781161
T6        13     11886113
T7        4      16280785
T8        13     16495097
T9        0      16917327
T10       6      17562721

Why aren't the threads running in parallel? The code dispatches all 10 
threads in a for-loop and later does a thread_join. I am simulating 16 
processors - I expected all 10 threads to run in parallel! Also, the 
number of clock cycles between the end of one thread and the start of 
the next one is quite large - it varied from 200,000 to 900,000!
Am I doing something wrong with the way I am collecting the clock_cycle 
with SIM_cycle_count(current_cpu) ?
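
One way to quantify how serial the run is would be to compute the largest number of threads active at the same cycle. The sketch below does that with a sweep over interval edges. `Interval` and `maxConcurrency` are hypothetical names, and the end cycle of each thread is an assumed input, since only start cycles are listed above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Cycle interval during which one thread was observed running.
struct Interval {
    std::uint64_t start;
    std::uint64_t end;   // exclusive
};

// Returns the largest number of intervals that overlap at any cycle.
int maxConcurrency(const std::vector<Interval>& runs) {
    // +1 at each start, -1 at each end. Pairs sort by cycle and then by
    // delta, so an end sorts before a start at the same cycle and
    // back-to-back intervals do not count as overlapping.
    std::vector<std::pair<std::uint64_t, int> > edges;
    for (std::size_t i = 0; i < runs.size(); ++i) {
        edges.push_back(std::make_pair(runs[i].start, +1));
        edges.push_back(std::make_pair(runs[i].end, -1));
    }
    std::sort(edges.begin(), edges.end());
    int current = 0, best = 0;
    for (std::size_t i = 0; i < edges.size(); ++i) {
        current += edges[i].second;
        best = std::max(best, current);
    }
    return best;
}
```

A result of 1 would confirm that the threads never overlapped. Anything higher would suggest the apparent serialization comes from how the trace is read rather than from the run itself.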

I would really appreciate it if anyone could share their thoughts/ideas on 
these issues.
Thanks a lot in advance.
-shougata

_______________________________________________
Gems-users mailing list
Gems-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/gems-users
Use Google to search the GEMS Users mailing list by adding "site:https://lists.cs.wisc.edu/archive/gems-users/" to your search.