On Tue, Apr 1, 2008 at 1:30 PM, Enrique Vallejo <enrique@xxxxxxxxxxxxx> wrote:
> Did you try defining BIGSET in $GEMS/ruby/common/Set.h? I haven't tried
> processor simulations, but there's a comment in that file:
>
> // Define this to use the BigSet class which is slower, but supports
> // sets of size larger than 32.
>
> // #define BIGSET
>
> Given that Netdests are defined using these sets, that's probably a source
> of conflict. Good luck,
>
> Enrique Vallejo
> University of Cantabria
> http://www.atc.unican.es/~~enrique/
>
>
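A minimal sketch of why a 32-element limit could matter with 64 nodes. This is not the GEMS Set class: SmallSet and LargeSet are hypothetical names, and the one-word bitmask layout is an assumption made only to illustrate the comment quoted from Set.h.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a fixed-width set: one bit per node in a
// single 32-bit word. Illustration only, not the GEMS implementation.
struct SmallSet {
    static constexpr int kCapacity = 32;
    std::uint32_t bits = 0;

    // Returns false when the index does not fit in the word.
    bool add(int index) {
        if (index < 0 || index >= kCapacity) return false;
        bits |= (std::uint32_t{1} << index);
        return true;
    }

    bool contains(int index) const {
        if (index < 0 || index >= kCapacity) return false;
        return ((bits >> index) & 1u) != 0;
    }
};

// Hypothetical stand-in for a larger set, sized here for 64 nodes.
struct LargeSet {
    static constexpr int kCapacity = 64;
    std::bitset<kCapacity> bits;

    bool add(int index) {
        if (index < 0 || index >= kCapacity) return false;
        bits.set(static_cast<std::size_t>(index));
        return true;
    }

    bool contains(int index) const {
        if (index < 0 || index >= kCapacity) return false;
        return bits.test(static_cast<std::size_t>(index));
    }
};
```

In this sketch, adding node 46 to a SmallSet is rejected, while a LargeSet accepts it. Whether the real Set class rejects, truncates, or asserts on such an index is not something the sketch claims to know.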
> -----Original Message-----
> From: gems-users-bounces@xxxxxxxxxxx [mailto:gems-users-bounces@xxxxxxxxxxx]
> On behalf of Konstantinos Aisopos
> Sent: Tuesday, April 1, 2008 9:20
> To: Gems Users
> Subject: Re: [Gems-users] MESI_SCMP_ protocol crush
>
>
> I found some more properties of my problem: it is independent of the
> protocol (I get the same assertion failure when using
> MSI_MOSI_CMP_directory), but it depends on the number of nodes.
> When simulating 16 nodes (in the tester or in ruby) everything works fine,
> but when I try 64, the assertion fails.
>
> The assertion has to do with the mapping of memory addresses to an L2
> tile. It's inside the function addSharer, and makes sure (whenever a
> sharer is added) that this address should indeed be mapped in this
> tile:
>
> void addSharer(Address addr, MachineID requestor) {
>   DEBUG_EXPR(machineID);
>   DEBUG_EXPR(requestor);
>   DEBUG_EXPR(addr);
>   assert(map_L1CacheMachId_to_L2Cache(addr, requestor) == machineID);  // <-- FAILED
>   L2cacheMemory[addr].Sharers.add(requestor);
> }
>
> I found the code that does the mapping but didn't really understand
> what's happening (I paste it at the end of this email)
>
> Huan,
>
> My topology is an 8x8 mesh, which I initially created with
> GarnetFileMaker.py, and then added the memory nodes manually.
>
> Mike,
>
> Here's a trace of MSI_MOSI_CMP_directory, simulating 64 cores
> (parameters: -p 64 -e 64 -a 64 -m 64 -n FILE_SPECIFIED -l 1 -s 1):
>
> Request trace enabled to output file 'ruby.trace.gz'
> 2 46 -1 Seq Begin > [0x62c0, line 0x62c0] ST
> 4 18 -1 Seq Begin > [0x39c0, line 0x39c0] ST
> 6 31 -1 Seq Begin > [0x2cc0, line 0x2cc0] ST
> 6 0 46 L1Cache Store NP>L1_IM [0x62c0, line 0x62c0]
> 7 0 31 L1Cache Store NP>L1_IM [0x2cc0, line 0x2cc0]
> 8 13 -1 Seq Begin > [0x59c0, line 0x59c0] ST
> 8 0 18 L1Cache Store NP>L1_IM [0x39c0, line 0x39c0]
> 10 0 13 L1Cache Store NP>L1_IM [0x59c0, line 0x59c0]
> 10 51 -1 Seq Begin > [0x3fc0, line 0x3fc0] ATOMIC
> 12 40 -1 Seq Begin > [0x9c0, line 0x9c0] ST
> 13 0 51 L1Cache Store NP>L1_IM [0x3fc0, line 0x3fc0]
> 14 49 -1 Seq Begin > [0x60c0, line 0x60c0] ST
> 16 0 49 L1Cache Store NP>L1_IM [0x60c0, line 0x60c0]
> 16 0 40 L1Cache Store NP>L1_IM [0x9c0, line 0x9c0]
> 16 18 -1 Seq Begin > [0x4c0, line 0x4c0] ST
> 18 2 -1 Seq Begin > [0x56c0, line 0x56c0] ST
> 19 0 18 L1Cache Store NP>L1_IM [0x4c0, line 0x4c0]
> 20 14 -1 Seq Begin > [0x5dc0, line 0x5dc0] ST
> 21 0 2 L1Cache Store NP>L1_IM [0x56c0, line 0x56c0]
> 22 35 -1 Seq Begin > [0x52c0, line 0x52c0] ATOMIC
> 23 0 14 L1Cache Store NP>L1_IM [0x5dc0, line 0x5dc0]
> 24 4 -1 Seq Begin > [0x44c0, line 0x44c0] ST
> Runtime Error at ../protocols/MSI_MOSI_CMP_directory-L2cache.sm:275,
> Ruby Time: 24: assert failure, PID: 30232
> press return to continue.
>
> I'm looking into how to print the machineID and requestor in the
> trace, but it seems the assertion fails the *first time* the code
> reaches it (the first time a sharer is added), so it looks like a
> mapping problem, not a coherence protocol problem.
>
> thoughts?
>
> thanks a bunch for the help,
> -Kostas
>
> ------- mapping code: ruby/slicc_interface/RubySlicc_ComponentMapping.h ------
>
> // input parameter is the base ruby node of the L1 cache
> // returns a value between 0 and total_L2_Caches_within_the_system
> inline
> MachineID map_L1CacheMachId_to_L2Cache(const Address& addr, MachineID L1CacheMachId)
> {
>   int L2bank = 0;
>   MachineID mach = {MACHINETYPE_L2CACHE_ENUM, 0};
>
>   if (RubyConfig::L2CachePerChipBits() > 0) {
>     if (MAP_L2BANKS_TO_LOWEST_BITS) {
>       L2bank = addr.bitSelect(RubyConfig::dataBlockBits(),
>                               RubyConfig::dataBlockBits()+RubyConfig::L2CachePerChipBits()-1);
>     } else {
>       L2bank = addr.bitSelect(RubyConfig::dataBlockBits()+L2_CACHE_NUM_SETS_BITS,
>                               RubyConfig::dataBlockBits()+L2_CACHE_NUM_SETS_BITS+RubyConfig::L2CachePerChipBits()-1);
>     }
>   }
>
>   assert(L2bank < RubyConfig::numberOfL2CachePerChip());
>   assert(L2bank >= 0);
>
>   mach.num = RubyConfig::L1CacheNumToL2Base(L1CacheMachId.num)*RubyConfig::numberOfL2CachePerChip() // base #
>     + L2bank;   // bank #
>   assert(mach.num < RubyConfig::numberOfL2Cache());
>   return mach;
> }
>
>
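To make the quoted mapping easier to follow, here is a standalone sketch of the bank-selection arithmetic. It is not GEMS code: bit_select only mimics how the quoted function appears to use addr.bitSelect(low, high), and the constants (64-byte blocks, 64 L2 banks on one chip) and the chip_base parameter are assumptions chosen to resemble the 64-node setup discussed in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Extracts bits [low, high] (inclusive) of addr. Assumed to match the
// semantics of addr.bitSelect(low, high) in the quoted code.
inline std::uint64_t bit_select(std::uint64_t addr, int low, int high) {
    assert(low >= 0 && high >= low && high < 64);
    const int width = high - low + 1;
    const std::uint64_t mask =
        (width == 64) ? ~std::uint64_t{0}
                      : ((std::uint64_t{1} << width) - 1);
    return (addr >> low) & mask;
}

// Hypothetical parameters: 64-byte blocks, 64 L2 banks on a single chip.
constexpr int kDataBlockBits = 6;
constexpr int kL2BanksPerChipBits = 6;
constexpr int kL2BanksPerChip = 1 << kL2BanksPerChipBits;

// Bank index when the bank bits sit directly above the block offset,
// as in the MAP_L2BANKS_TO_LOWEST_BITS branch of the quoted code.
inline int l2_bank_lowest_bits(std::uint64_t addr) {
    return static_cast<int>(bit_select(
        addr, kDataBlockBits, kDataBlockBits + kL2BanksPerChipBits - 1));
}

// Global L2 index: chip base times banks per chip, plus the bank,
// mirroring the shape of the mach.num expression in the quoted code.
inline int l2_cache_index(std::uint64_t addr, int chip_base) {
    const int bank = l2_bank_lowest_bits(addr);
    assert(bank >= 0 && bank < kL2BanksPerChip);
    return chip_base * kL2BanksPerChip + bank;
}
```

Under these assumed parameters, address 0x62c0 from the trace selects bank 11 and 0x2cc0 selects bank 51. The failing assert compares a value computed this way with the machineID of the L2 controller that received the request, so a mismatch means the request arrived at a different bank than the one the mapping function computes.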
> On Sun, Mar 30, 2008 at 10:22 PM, Mike Marty <mike.marty@xxxxxxxxx> wrote:
> > I have no idea why that assertion would be triggered. I would print
> > out the machineID and requestor. See the wiki for generating a
> > protocol debug trace. Grep on the block address that causes the
> > assertion. Add extra debugging information to the trace using
> > APPEND_TRANSITION_COMMENT and DEBUG_EXPR.
> >
> > --Mike
> >
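The idea of printing both sides of the comparison before the assert fires can be sketched in plain C++. This does not use the SLICC macros mentioned above: MachineIdSketch and describe_mapping are hypothetical names, and the output format is made up for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical mirror of a machine identifier: a type tag plus an index.
struct MachineIdSketch {
    int type;
    int num;
};

inline bool operator==(const MachineIdSketch& a, const MachineIdSketch& b) {
    return a.type == b.type && a.num == b.num;
}

// Builds one diagnostic line showing the L2 index the mapping expects
// and the index of the controller that actually received the request.
inline std::string describe_mapping(const MachineIdSketch& expected,
                                    const MachineIdSketch& actual,
                                    unsigned long long addr) {
    std::ostringstream out;
    out << "addr=0x" << std::hex << addr << std::dec
        << " expected_l2=" << expected.num
        << " this_l2=" << actual.num
        << (expected == actual ? " OK" : " MISMATCH");
    return out.str();
}
```

Logging such a line just before the assert shows which of the two values is unexpected, which helps tell a mapping problem apart from a routing or topology problem.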
> >
> > On Sun, Mar 30, 2008 at 3:27 PM, Konstantinos Aisopos <kaisopos@xxxxxxxxx> wrote:
> > > Hello again,
> > >
> > > any ideas about my problem? any idea what this assertion prevents from
> > > happening? Should I provide you more information? Does the MESI_SCMP
> > > require any other parameters to be set that I don't know??
> > >
> > > I thought it was a topology problem so I created the file:
> > >
> > >   ruby/network/simple/Network_Files/NUCA_Procs-64_ProcsPerChip-64_L2Banks-64_Memories-64.txt
> > > and set these parameters:
> > > ruby0.setparam_str g_CACHE_DESIGN NUCA
> > > ruby0.setparam_str g_NETWORK_TOPOLOGY FILE_SPECIFIED
> > > ... the problem still persists. I got rid of opal to make the
> > > simulation simpler, and the problem persists. Also, if I don't load
> > > ruby, the simulation works fine.
> > >
> > > help please :P
> > >
> > > -Kostas
> > >
> > >
> > >
> > >
> > > On Thu, Mar 27, 2008 at 10:54 PM, Konstantinos Aisopos <kaisopos@xxxxxxxxx> wrote:
> > > > Hi list,
> > > >
> > > > I am using the MESI_SCMP_bankdirectory protocol to simulate a 64-core
> > > > system. I haven't touched the protocol or the simulator. I am
> > > > executing the following script:
> > > >
> > > > instruction-fetch-mode instruction-fetch-trace
> > > > istc-disable
> > > > dstc-disable
> > > > cpu-switch-time 1
> > > > load-module ruby
> > > > load-module opal
> > > > ruby0.setparam g_NUM_PROCESSORS 64
> > > > ruby0.setparam g_PROCS_PER_CHIP 64
> > > > ruby0.setparam g_NUM_L2_BANKS 64
> > > > ruby0.setparam g_NUM_MEMORIES 64
> > > > ruby0.setparam NUMBER_OF_VIRTUAL_NETWORKS 5
> > > > ruby0.setparam g_MEMORY_SIZE_BYTES 4294967296
> > > > ruby0.setparam g_endpoint_bandwidth 1000
> > > > ruby0.init
> > > > opal0.init
> > > > opal0.sim-start "results.opal"
> > > > opal0.sim-step 10000000000
> > > >
> > > > and I am getting the following error when I execute
> > > > "opal0.sim-step 10000000000":
> > > >
> > > > Runtime Error at ../protocols/MESI_SCMP_bankdirectory-L2cache.sm:224,
> > > > Ruby Time: 23: assert failure, PID: 1335
> > > >
> > > > the 224 line is:
> > > > assert(map_L1CacheMachId_to_L2Cache(addr,requestor) == machineID)
> > > >
> > > > any idea what might be wrong?
> > > >
> > > > thanks,
> > > >
> > > > Kostas
> > > >
> >
> > > _______________________________________________
> > > Gems-users mailing list
> > > Gems-users@xxxxxxxxxxx
> > > https://lists.cs.wisc.edu/mailman/listinfo/gems-users
> > > Use Google to search the GEMS Users mailing list by adding
> > > "site:https://lists.cs.wisc.edu/archive/gems-users/" to your search.