Re: [Gems-users] Some questions about MOESI_CMP_directory


Date: Thu, 6 Sep 2007 16:43:17 -0500
From: "Lide Duan" <leaderduan@xxxxxxxxx>
Subject: Re: [Gems-users] Some questions about MOESI_CMP_directory
Thank you, Mike! Please see inline...

On 9/6/07, Mike Marty <mikem@xxxxxxxxxxx> wrote:
> > I have been studying the protocol MOESI_CMP_directory recently. I hope
> > you can give me some help on the following questions:
> >
> > 1. It's said that this protocol is a "two-level directory protocol",
> > so what does the "two-level" refer to? I found that any entry of L1
> > cache, L2 cache or memory has certain fields (e.g. CacheState,
> > Sharers, Owner, etc) to record the coherence information of each cache
> > line, then there are 3 levels. But there is only one localDirectory in
> > the L2 cache, so I got confused with the "two-level directory".
> >
>
> Directory coherence between CMPs and Directory coherence within a CMP.
> For an explanation, see this paper:
>
> http://www.cs.wisc.edu/multifacet/papers/hpca05_cmp_token.pdf
>
>
> > 2. Some previous posts mentioned that the localDirectory in the L2
> > cache approximates L1 shadow tags. Could you explain more about this?
> > Why do we need such a directory in the L2 cache considering that all
> > the coherence information has been recorded in the entries of
> > CacheMemory?
> >
>
> The first-level L2 directory *can* be implemented in the L2 cache tags if
> inclusion is enforced.  I have a version of MOESI_CMP_directory that does
> exactly this, but I have not (yet) incorporated it into a release of GEMS.
>
> The released version of MOESI_CMP_directory does not enforce inclusion.
> The hackish implementation will copy directory state from the tag to the
> localDirectory when an L2 line is evicted.  In reality, you wouldn't
> implement it this way in hardware.

So basically the localDirectory holds the states of recently evicted
L2 blocks, and it improves overall system performance because the
system will pick a different L2 block to evict if the candidate is
found in the shadow tags, i.e. the localDirectory. Am I right? If so,
this structure isn't necessary for the correctness of the system, is
it?
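
To make the copy-on-evict step described above concrete, here is a minimal Python sketch. It is only a toy model, not GEMS code: the names DirState, ToyL2Bank, fill, evict and lookup_dir are hypothetical, and it assumes the directory state lives in the L2 tag while the line is resident and is moved to a separate local_directory map when the line is evicted.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class DirState:
    """Per-block L1 tracking info (hypothetical fields for illustration)."""
    sharers: Set[int] = field(default_factory=set)  # L1 ids holding a copy
    owner: Optional[int] = None                     # L1 id owning the block


class ToyL2Bank:
    """Toy non-inclusive L2 bank; an assumption-laden sketch, not GEMS."""

    def __init__(self) -> None:
        self.l2_tags: Dict[int, DirState] = {}          # resident L2 lines
        self.local_directory: Dict[int, DirState] = {}  # evicted blocks

    def fill(self, addr: int) -> None:
        # On (re)allocation in L2, move any saved state back into the tag.
        self.l2_tags[addr] = self.local_directory.pop(addr, DirState())

    def evict(self, addr: int) -> None:
        # Without inclusion, L1 copies may outlive the L2 line, so their
        # tracking info is copied from the tag to the local directory.
        state = self.l2_tags.pop(addr)
        if state.sharers or state.owner is not None:
            self.local_directory[addr] = state

    def lookup_dir(self, addr: int) -> Optional[DirState]:
        # Consult the L2 tag first, then the local directory.
        if addr in self.l2_tags:
            return self.l2_tags[addr]
        return self.local_directory.get(addr)
```

For example, after fill(0x40), adding L1 number 1 as a sharer, and evict(0x40), lookup_dir(0x40) still reports that sharer even though the line is no longer in the L2 tags.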

>
>
> > 3. I suppose that MOESI_CMP_directory-dir.sm refers to the memory
> > although it was named "directory". If so, is it reasonable to maintain
> > a coherence entry for each memory block? Because the memories are
> > quite large in current machine, the cost would be very high to do so.
> > Did I miss something here?
> >
>
> Yes, all directory protocols implemented in DRAM have space overhead of
> storing directory state.
>
>
> > Moreover, I classified the msgs traversing the network during the
> > simulation with this protocol. All the data msgs were found in VN2
> > (Virtual Network) only, and the control msgs traversed through VN0 to
> > VN2, mainly in VN0 and VN2. VN3 has never been used. Is there any
> > specific reason to place those msgs on different virtual networks? or
> > what are the rules to do so?
> >
>
> With finite buffering, you need virtual networks to prevent protocol
> deadlock.  This is covered in the academic literature.
>
> With the default infinite buffering in GEMS, it is not so important to
> carefully manage virtual networks because protocol deadlock won't happen.
>
> For MOESI_CMP_directory, I did not carefully consider virtual networks for
> a wide variety of interconnects (or any for that matter).  I did separate
> request messages from response messages.  I don't recall what I did for
> unblock/writeback messages.  In reality, you want a separate virtual
> network for request messages, response messages, and unblock messages.
> There is also a dependence on writeback, but I think with
> MOESI_CMP_directory's 3-phase writeback scheme, a separate VNET may not be
> needed (but I'd have to think about this).

I am wondering whether Ruby prioritizes the different VNs. I remember
that the Throttle in the network source code satisfies the bandwidth
requirements from VN3 down to VN0, which means VN3 gets the highest
priority, but these priorities are inverted if the variable
m_wakeups_wo_switch in Throttle exceeds the constant
PRIORITY_SWITCH_LIMIT. Is there anywhere else where the VNs are
prioritized? Why did you do that?
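
The ordering I have in mind can be sketched in Python as follows. This reflects only my recollection as stated above, not the actual Throttle source: the function name vnet_service_order is hypothetical, and the limit is passed in as a parameter because I am not quoting the real value of PRIORITY_SWITCH_LIMIT.

```python
from typing import List


def vnet_service_order(num_vnets: int, wakeups_wo_switch: int,
                       priority_switch_limit: int) -> List[int]:
    """Return the order in which virtual networks are offered bandwidth.

    Assumed behaviour (from memory, not from the GEMS source): the highest
    numbered vnet goes first, and the order is inverted once the wakeup
    counter exceeds the limit.
    """
    highest_first = list(range(num_vnets - 1, -1, -1))  # e.g. [3, 2, 1, 0]
    if wakeups_wo_switch > priority_switch_limit:
        # Inverted priorities: the lowest numbered vnet is served first.
        return highest_first[::-1]
    return highest_first
```

With four vnets and a made-up limit of 10, a counter of 0 gives the order VN3, VN2, VN1, VN0, and a counter of 11 gives VN0, VN1, VN2, VN3.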

Actually I observed the following msgs in the simulation for each VN
(columns are VN0 to VN3; X means not zero):
    Request_Control      [ X X 0 0 ]
    Response_Control     [ 0 0 X 0 ]
    Writeback_Control    [ X X X 0 ]
    Forwarded_Control    [ X 0 0 0 ]
    Invalidate_Control   [ X 0 0 0 ]
    Unblock_Control      [ 0 0 X 0 ]
    Response_Data        [ 0 0 X 0 ]
    ResponseL2hit_Data   [ 0 0 X 0 ]
    ResponseLocal_Data   [ 0 0 X 0 ]
    Writeback_Data       [ 0 0 X 0 ]
I am confused about the different msgs. For example, is
Writeback_Control used when the L2 cache writes back to memory? Is
there any writeback between L1 and L2? What are Forwarded_Control and
Unblock_Control? Some specific examples would be helpful.
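
In case it helps to cross-check the classification, here is a small Python sketch that encodes the observations listed above and derives which message classes appear on each VN. The 1/0 entries only mean "nonzero" or "zero" (I am not giving actual counts), and the helper names classes_per_vnet, unused_vnets and data_vnets are my own.

```python
from typing import Dict, List

# 1 = at least one message of that class was seen on VN0..VN3, 0 = none.
OBSERVED: Dict[str, List[int]] = {
    "Request_Control":    [1, 1, 0, 0],
    "Response_Control":   [0, 0, 1, 0],
    "Writeback_Control":  [1, 1, 1, 0],
    "Forwarded_Control":  [1, 0, 0, 0],
    "Invalidate_Control": [1, 0, 0, 0],
    "Unblock_Control":    [0, 0, 1, 0],
    "Response_Data":      [0, 0, 1, 0],
    "ResponseL2hit_Data": [0, 0, 1, 0],
    "ResponseLocal_Data": [0, 0, 1, 0],
    "Writeback_Data":     [0, 0, 1, 0],
}


def classes_per_vnet(observed: Dict[str, List[int]]) -> Dict[int, List[str]]:
    """Map each vnet index to the message classes seen on it."""
    num_vnets = len(next(iter(observed.values())))
    return {vn: [name for name, row in observed.items() if row[vn]]
            for vn in range(num_vnets)}


def unused_vnets(observed: Dict[str, List[int]]) -> List[int]:
    """Vnets on which no message class was seen."""
    return [vn for vn, names in classes_per_vnet(observed).items() if not names]


def data_vnets(observed: Dict[str, List[int]]) -> List[int]:
    """Vnets that carried at least one *_Data message class."""
    return sorted({vn for name, row in observed.items()
                   if name.endswith("_Data")
                   for vn, seen in enumerate(row) if seen})
```

This agrees with what I described earlier: unused_vnets(OBSERVED) returns [3] and data_vnets(OBSERVED) returns [2].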


>
> I recall that I assumed the inter-CMP network was completely isolated from
> the intra-CMP network even though Ruby treats them the same.  Therefore
> VN0 for intra-CMP messages can be thought of as separate from VN0 for
> inter-CMP messages.
>
> --mike
>

Thanks,
Lide