Re: [Gems-users] Trouble with MOESI_CMP_NUCA


Date: Fri, 7 Apr 2006 10:44:43 -0600
From: Steve Barrus <sbarrus@xxxxxxxxxxxx>
Subject: Re: [Gems-users] Trouble with MOESI_CMP_NUCA
Here is the part of the trace around the time of the request that
deadlocks.  Do you need more of the trace?

-Steve

 314495   0  -1        Seq               Begin       >       [0x2d0ec0, line 0x2d0ec0] 
 314499   0  -1        Seq                Done       >       [0x2d0ecc, line 0x2d0ec0] 4 cycles L1Cache No
 314499   0   0    L1Cache               Store     MM>MM     [0x2d0ec0, line 0x2d0ec0] 
 314500   1  -1        Seq               Begin       >       [0x2d0f40, line 0x2d0f40] 
 314500   2  -1        Seq               Begin       >       [0x30a840, line 0x30a840] 
 314500   3  -1        Seq               Begin       >       [0x30a840, line 0x30a840] 
 314500   4  -1        Seq               Begin       >       [0x30a840, line 0x30a840] 
 314500   5  -1        Seq               Begin       >       [0x30a840, line 0x30a840] 
 314500   7  -1        Seq               Begin       >       [0x30a840, line 0x30a840] 
 314501   0  -1        Seq               Begin       >       [0x14000, line 0x14000] 
 314504   0   7    L1Cache                Load     NP>IS     [0x30a840, line 0x30a840] 
 314504   0   5    L1Cache                Load     NP>IS     [0x30a840, line 0x30a840] 
 314504   0   4    L1Cache                Load     NP>IS     [0x30a840, line 0x30a840] 
 314504   0   3    L1Cache                Load     NP>IS     [0x30a840, line 0x30a840] 
 314504   0   2    L1Cache                Load     NP>IS     [0x30a840, line 0x30a840] 
 314504   1  -1        Seq                Done       >       [0x2d0f4c, line 0x2d0f40] 4 cycles L1Cache No
 314504   0   1    L1Cache               Store     MM>MM     [0x2d0f40, line 0x2d0f40] 
 314505   0  -1        Seq                Done       >       [0x14020, line 0x14000] 4 cycles L1Cache No
 314505   0   0    L1Cache                Load      M>M      [0x14000, line 0x14000] 
 314506   0  -1        Seq               Begin       >       [0x14040, line 0x14040] 
 314510   0  49    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314510   0  -1        Seq                Done       >       [0x1404c, line 0x14040] 4 cycles L1Cache No
 314510   0   0    L1Cache                Load      M>M      [0x14040, line 0x14040] 
 314510   0 113    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314510   0  33    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314510   0  65    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314510   0  81    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314513   0  -1        Seq               Begin       >       [0x2d0ec0, line 0x2d0ec0] 
 314517   0  -1        Seq                Done       >       [0x2d0ecc, line 0x2d0ec0] 4 cycles L1Cache No
 314517   0   0    L1Cache                Load     MM>MM     [0x2d0ec0, line 0x2d0ec0] 
 314518   0  -1        Seq               Begin       >       [0x2d0ec0, line 0x2d0ec0] 
 314522   0  -1        Seq                Done       >       [0x2d0ecc, line 0x2d0ec0] 4 cycles L1Cache No
 314522   0   0    L1Cache               Store     MM>MM     [0x2d0ec0, line 0x2d0ec0] 
 314524   0  -1        Seq               Begin       >       [0x14000, line 0x14000] 
 314526   0   3  Collector       Miss_Get_last Col_NP>Col_P  [0x30a840, line 0x30a840] 
 314526   0   5  Collector       Miss_Get_last Col_NP>Col_P  [0x30a840, line 0x30a840] 
 314527   0   3  Collector        Issue_L2_Get  Col_P>Col_P  [0x30a840, line 0x30a840] 
 314527   0   5  Collector        Issue_L2_Get  Col_P>Col_P  [0x30a840, line 0x30a840] 
 314528   0  -1        Seq                Done       >       [0x14020, line 0x14000] 4 cycles L1Cache No
 314528   0   0    L1Cache                Load      M>M      [0x14000, line 0x14000] 
 314528   0   4  Collector       Miss_Get_last Col_NP>Col_P  [0x30a840, line 0x30a840] 
 314529   0   4  Collector        Issue_L2_Get  Col_P>Col_P  [0x30a840, line 0x30a840] 
 314529   0  -1        Seq               Begin       >       [0x14040, line 0x14040] 
 314530   0   7  Collector       Miss_Get_last Col_NP>Col_P  [0x30a840, line 0x30a840] 
 314531   0   7  Collector        Issue_L2_Get  Col_P>Col_P  [0x30a840, line 0x30a840] 
 314532   0   2  Collector       Miss_Get_last Col_NP>Col_P  [0x30a840, line 0x30a840] 
 314533   0  -1        Seq                Done       >       [0x1404c, line 0x14040] 4 cycles L1Cache No
 314533   0   0    L1Cache                Load      M>M      [0x14040, line 0x14040] 
 314533   0   2  Collector        Issue_L2_Get  Col_P>Col_P  [0x30a840, line 0x30a840] 
 314536   0 241    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314536   0 241    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314536   0  -1        Seq               Begin       >       [0x2d0ec0, line 0x2d0ec0] 
 314538   0 241    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314538   0 145    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314538   0 145    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314538   0 129    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314538   0 129    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314538   0  17    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314540   0 241    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314540   0  17    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314540   0 129    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314540   0 145    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 
 314540   0  -1        Seq                Done       >       [0x2d0ecc, line 0x2d0ec0] 4 cycles L1Cache No
 314540   0   0    L1Cache                Load     MM>MM     [0x2d0ec0, line 0x2d0ec0] 
 314541   0  -1        Seq               Begin       >       [0x2d0ec0, line 0x2d0ec0] 
 314542   0  17    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840] 

...
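
A minimal sketch for tallying a trace excerpt like the one above by component, event, and state transition. The column layout in the regular expression is inferred from this excerpt only, and the names `TRACE_RE`, `parse_trace`, `summarize`, `col2`, and `col3` are hypothetical (the meaning of the second and third columns is not stated in this thread). On the sample lines, every L1_GETS at an L2 bank leaves the bank in L2_NP.

```python
import re
from collections import Counter

# Assumed layout of one trace line, inferred from the excerpt:
#   time  col2  col3  component  event  old>new  [address, line address]
# col2/col3 are kept as opaque integers because their meaning is not
# documented in this thread.
TRACE_RE = re.compile(
    r"^\s*(?P<time>\d+)\s+(?P<col2>-?\d+)\s+(?P<col3>-?\d+)\s+"
    r"(?P<component>\w+)\s+(?P<event>\w+)\s+"
    r"(?P<old>\w*)>(?P<new>\w*)\s+"
    r"\[(?P<addr>0x[0-9a-f]+), line (?P<line>0x[0-9a-f]+)\]"
)


def parse_trace(text):
    """Return one dict per line that matches the assumed trace layout."""
    records = []
    for raw in text.splitlines():
        m = TRACE_RE.match(raw)
        if m is None:
            continue  # skip warnings, blank lines, "..." and so on
        rec = m.groupdict()
        for key in ("time", "col2", "col3"):
            rec[key] = int(rec[key])
        records.append(rec)
    return records


def summarize(records, line_addr):
    """Count (component, event, old state, new state) for one cache line."""
    counts = Counter()
    for rec in records:
        if rec["line"] == line_addr:
            counts[(rec["component"], rec["event"], rec["old"], rec["new"])] += 1
    return counts


# A few lines copied from the trace above.
SAMPLE = """\
 314500   3  -1        Seq               Begin       >       [0x30a840, line 0x30a840]
 314504   0   3    L1Cache                Load     NP>IS     [0x30a840, line 0x30a840]
 314510   0  49    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840]
 314526   0   3  Collector       Miss_Get_last Col_NP>Col_P  [0x30a840, line 0x30a840]
 314527   0   3  Collector        Issue_L2_Get  Col_P>Col_P  [0x30a840, line 0x30a840]
 314536   0 241    L2Cache             L1_GETS  L2_NP>L2_NP  [0x30a840, line 0x30a840]
"""

records = parse_trace(SAMPLE)
summary = summarize(records, "0x30a840")
for key, count in summary.most_common():
    print(count, *key)
```

Running the same functions over the full trace file instead of `SAMPLE` would show whether any component ever leaves its initial state for line 0x30a840.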

Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:103: Possible Deadlock detected
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:104: request is [CacheMsg: Address=[0x30a850, line 0x30a840] Type=LD ProgramCounter=[0x41e808, line 0x41e800] AccessMode=SupervisorMode Size=8 Prefetch=No Version=0 Aborted=0 Time=314500 ]
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:105: m_chip_ptr->getID() is 0
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:106: m_version is 3
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:107: keys.size() is 1
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:108: current_time is 400001
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:109: request.getTime() is 314500
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:110: current_time - request.getTime() is 85501
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:111: *m_readRequestTable_ptr is [ [0x30a840, line 0x30a840]=[CacheMsg: Address=[0x30a850, line 0x30a840] Type=LD ProgramCounter=[0x41e808, line 0x41e800] AccessMode=SupervisorMode Size=8 Prefetch=No Version=0 Aborted=0 Time=314500 ] ]
Fatal Error: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:112: Aborting
***  Simics getting shaky, switching to 'safe' mode.
***  Simics (main thread) received an abort signal, probably an assertion.
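
A minimal sketch for pulling the integer fields out of the Sequencer warnings above and checking how long the request had been outstanding. The pattern is an assumption based on the "name is value" shape of these lines, and `WARN_RE` and `deadlock_fields` are hypothetical names. Lines whose value is not a plain integer (the CacheMsg dump and the request table) are skipped.

```python
import re

# Matches warnings of the form "...Sequencer.C:<n>: <name> is <integer>".
WARN_RE = re.compile(
    r"Sequencer\.C:(?P<lineno>\d+): (?P<name>.+?) is (?P<value>-?\d+)\s*$"
)


def deadlock_fields(log_text):
    """Map each reported name to its integer value."""
    fields = {}
    for raw in log_text.splitlines():
        m = WARN_RE.search(raw)
        if m:
            fields[m.group("name")] = int(m.group("value"))
    return fields


# Lines copied from the warning output above.
LOG = """\
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:103: Possible Deadlock detected
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:105: m_chip_ptr->getID() is 0
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:106: m_version is 3
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:107: keys.size() is 1
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:108: current_time is 400001
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:109: request.getTime() is 314500
Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:110: current_time - request.getTime() is 85501
"""

fields = deadlock_fields(LOG)
elapsed = fields["current_time"] - fields["request.getTime()"]
print("version", fields["m_version"], "waited", elapsed, "cycles")
```

The computed difference agrees with the value the simulator reports on line 110 of Sequencer.C (85501 cycles), and the request time of 314500 matches the cycle at which the Seq Begin events for line 0x30a840 appear in the trace.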


On Thu, Apr 06, 2006 at 06:43:35PM -0500, Mike Marty wrote:
> I suspect a configuration issue, but it would be helpful to see a trace.
> 
> The following command should start a trace at time 1 (immediately):
> 
> ruby0.debug-start-time "1"
> 
> Or, to see exactly where the deadlocked request started:
> 
> ruby0.debug-start-time "349000"
> 
> --Mike
> 
> 
> > I have been trying to get the MOESI_CMP_NUCA protocol to work in Ruby,
> > but I haven't had any luck.  Sequencer::wakeup() always reports a
> > "Possible Deadlock detected" and aborts the simulation.  Can anyone tell
> > me why this might be happening?  The commands and error message are
> > shown below.  Thanks.
> >
> > -Steve
> >
> >
> > read-configuration test-check
> > instruction-fetch-mode instruction-fetch-trace
> > istc-disable
> > dstc-disable
> > load-module ruby
> > ruby0.setparam g_NUM_PROCESSORS 8
> > ruby0.setparam g_MEMORY_SIZE_BYTES 4294967296
> > ruby0.setparam g_PROCS_PER_CHIP 8
> > ruby0.setparam g_NUM_L2_BANKS 256
> > ruby0.setparam g_NUM_MEMORIES 8
> > ruby0.setparam g_NUM_DNUCA_BANK_SETS 16
> > ruby0.setparam_str g_NETWORK_TOPOLOGY FILE_SPECIFIED
> > ruby0.setparam_str g_DYNAMIC_TIMEOUT_ENABLED false
> > ruby0.setparam_str REMOVE_SINGLE_CYCLE_DCACHE_FAST_PATH true
> > ruby0.setparam_str g_CACHE_DESIGN NUCACOL
> > ruby0.setparam_str g_adaptive_routing false
> > ruby0.setparam NUMBER_OF_VIRTUAL_NETWORKS 7
> > ruby0.setparam g_endpoint_bandwidth 1000
> > ruby0.setparam_str g_NUCA_PREDICTOR_CONFIG DNUCA
> > ruby0.setparam_str ENABLE_MIGRATION true
> > ruby0.setparam_str COLLECTOR_HANDLES_OFF_CHIP_REQUESTS true
> > ruby0.setparam_str PERFECT_DNUCA_SEARCH true
> > ruby0.init
> > c
> >
> >
> > [Turbo] Trampoline found at block start.
> > Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:103: Possible Deadlock detected
> > Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:104: request is [CacheMsg: Address=[0x30a850, line 0x30a840] Type=LD ProgramCounter=[0x41e808, line 0x41e800] AccessMode=SupervisorMode Size=8 Prefetch=No Version=0 Aborted=0 Time=349000 ]
> > Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:105: m_chip_ptr->getID() is 0
> > Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:106: m_version is 4
> > Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:107: keys.size() is 1
> > Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:108: current_time is 400500
> > Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:109: request.getTime() is 349000
> > Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:110: current_time - request.getTime() is 51500
> > Warning: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:111: *m_readRequestTable_ptr is [ [0x30a840, line 0x30a840]=[CacheMsg: Address=[0x30a850, line 0x30a840] Type=LD ProgramCounter=[0x41e808, line 0x41e800] AccessMode=SupervisorMode Size=8 Prefetch=No Version=0 Aborted=0 Time=349000 ] ]
> > Fatal Error: in fn virtual void Sequencer::wakeup() in system/Sequencer.C:112: Aborting
> > *** Simics getting shaky, switching to 'safe' mode.
> > *** Simics (main thread) received an abort signal, probably an assertion.
> >
> > _______________________________________________
> > Gems-users mailing list
> > Gems-users@xxxxxxxxxxx
> > https://lists.cs.wisc.edu/mailman/listinfo/gems-users
> >