Re: [Gems-users] running barnes-hut


Date: Thu, 02 Jun 2005 09:31:22 -0500
From: Alaa Alameldeen <alaa@xxxxxxxxxxx>
Subject: Re: [Gems-users] running barnes-hut
Ayse,

Does this error occur only when you run with Gems, or does it occur when you just run Barnes in Simics?

-Alaa

yilmazer@xxxxxxxxxxx wrote:
The output below is from the target system console. What could explain the OS
syncing?

# ./LU -n512 -p8

Blocked Dense LU Factorization
     512 by 512 Matrix
     8 Processors
     16 by 16 Element Blocks


panic: failed to stop cpu4

panic[cpu0]/thread=2a10007dd40: send mondo timeout (target 0x4) [196984 NACK 0
BUSY]

000002a10007d5a0 SUNW,UltraSPARC-III+:send_one_mondo+104 (2, a698f397, 1471800,
146f000, 4000000000000, 300001b1f00)
  %l0-3: 00000000a698f1cd 00000000a698f2e6 0000000000000000 0000000000000001
  %l4-7: 0000000000030178 0000000000000004 0000000000000004 0000002d78cd0690
000002a10007d650 unix:xt_one_unchecked+f0 (b, 0, 1479f2c, 0, 1, ff)
  %l0-3: 0000000000000000 0000000000000004 0000000000000000 0000000000000004
  %l4-7: 000000000100accc 0000000000010001 00000300024dcd20 000002a10007d708
000002a10007d750 TS:ts_tick+1e8 (300024ddce0, 116cc18, 1700000, 1700000, 1, 0)
  %l0-3: 00000000014fd400 0000000000000000 0000030001f2d780 0000000000000000
  %l4-7: 000000007fffffff 0000000001000000 0000030002136a40 00000300013ac8c0
000002a10007d820 genunix:clock_tick+30 (300024ddce0, 1, 20, 3905f, bebc20, c3)
  %l0-3: 000000000118bc04 000003000241a708 0000000000000000 0000000000000002
  %l4-7: 0000000000000001 00000300013ac8c0 0000000001424e30 0000000000000000
000002a10007d8d0 genunix:clock+51c (300013ac8c0, 3000241a708, 0, 1495400,
1471800, 1481800)
  %l0-3: 0000030001c58000 00000300024ddce0 0000000000000000 0000000000000000
  %l4-7: 00000000014ed000 0000000000000000 0000000000000000 0000000000000000
000002a10007d9b0 genunix:cyclic_softint+b0 (1082668, d92, 3, 10, 1,
30000b78fb0)
  %l0-3: 0000030000a4fed0 0000000000000002 0000030000b78f48 0000000000000001
  %l4-7: 0000030000a4fea8 0000030000a6c880 0000000000000008 0000000000000000
000002a10007daa0 unix:cbe_level10+8 (0, 10003, 1400000, 2a10007dd40, 200060,
100be24)
  %l0-3: 0000000000000000 0000000000010000 0000000000000000 0000000000000000
  %l4-7: 0000000000000000 0000000000000000 0000000000000000 0000000000000000

syncing file systems...





Quoting yilmazer@xxxxxxxxxxx:


Thank you Luke.
But I wonder why it causes the operating system to sync on the target system.

Thanks again.
ayse.
Quoting Luke Yen <lyen@xxxxxxxxxxx>:


Hi:

 I have also noticed a lot of "patches" on our version of the barnes
workload.  However, I don't recall seeing the decodeFails messages.  This
message is displayed whenever Opal cannot properly decode an instruction.
The best way to debug this is to go through the code in Opal where
instructions are decoded (opal/system/static.C, in the
decodeInstructionInfo() function), and track down which switch statements
are causing the failures.  From your output this seems to be isolated to a
single instruction.

FYI decodeFail() gets called whenever the FAIL macro gets invoked inside decodeInstructionInfo().
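
As a rough aid for that kind of debugging, the failing word from the log (0x737472) can be split into its top-level fields before stepping through decodeInstructionInfo(). The sketch below is not Opal code: the helper names are made up for illustration, and it assumes only the standard SPARC V9 field positions for op (bits 31:30) and op2 (bits 24:22). It also renders the word's bytes as ASCII, since a word that reads as text may turn out to be data rather than an instruction.

```python
def format_bits(word):
    """Render a 32-bit word in the layout used by the decodeFails log:
    space-separated nibbles with ':' between the two half-words."""
    bits = f"{word & 0xFFFFFFFF:032b}"
    nibbles = [bits[i:i + 4] for i in range(0, 32, 4)]
    return " ".join(nibbles[:4]) + ":" + " ".join(nibbles[4:])


def top_level_fields(word):
    """Extract the op and op2 fields (hypothetical helper, not part of Opal).
    op2 is only meaningful when op == 0."""
    return {"op": (word >> 30) & 0x3, "op2": (word >> 22) & 0x7}


def as_ascii(word):
    """Show the four big-endian bytes as text, '.' for non-printable bytes."""
    raw = (word & 0xFFFFFFFF).to_bytes(4, "big")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    word = 0x737472  # value taken from the decodeFails lines in this thread
    print(format_bits(word))
    print(top_level_fields(word))
    print(as_ascii(word))
```

For 0x737472 this gives op = 0 and op2 = 1, and the low three bytes read as the ASCII text "str", which may be worth keeping in mind when checking which switch case rejects the word.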


Hope this helps, Luke

On Wed, 1 Jun 2005 yilmazer@xxxxxxxxxxx wrote:


Hello,
When I ran the SPLASH benchmarks (fft, lu), I first got the warning messages
below, and after that the operating system began to sync. What could be the
problem?



### statici::decodeFails. line:877. inst=0x737472
0000 0000 0111 0011:0111 0100 0111 0010
### statici::decodeFails. line:877. inst=0x737472
0000 0000 0111 0011:0111 0100 0111 0010
### statici::decodeFails. line:877. inst=0x737472
0000 0000 0111 0011:0111 0100 0111 0010
### statici::decodeFails. line:877. inst=0x737472
0000 0000 0111 0011:0111 0100 0111 0010
### statici::decodeFails. line:877. inst=0x737472
0000 0000 0111 0011:0111 0100 0111 0010
### statici::decodeFails. line:877. inst=0x737472
0000 0000 0111 0011:0111 0100 0111 0010
### statici::decodeFails. line:877. inst=0x737472
0000 0000 0111 0011:0111 0100 0111 0010
### statici::decodeFails. line:877. inst=0x737472
0000 0000 0111 0011:0111 0100 0111 0010
patch  NPC: 0xffffffffff3b3840 0xff3b3840
patch  NPC: 0xffffffffff3b3840 0xff3b3840
patch  NPC: 0xffffffffff3b3840 0xff3b3840
patch  NPC: 0xffffffffff3b3840 0xff3b3840
patch  PC: 0xff104fe8 0x1000400
patch  NPC: 0xff104fec 0x1000404
patch  PC: 0x1000480 0x1000820
patch  NPC: 0x1000484 0x1000824
patch  PC: 0x1000d00 0xf000d294
patch  NPC: 0x1000d04 0xf000d298
patch  PC: 0x1000d00 0xf000d294
patch  NPC: 0x1000d04 0xf000d298
patch  PC: 0x1000d00 0xf000d294
patch  NPC: 0x1000d04 0xf000d298
patch  PC: 0x1000d00 0xf000d294
patch  NPC: 0x1000d04 0xf000d298
patch  PC: 0x1000d00 0xf000d294
patch  NPC: 0x1000d04 0xf000d298
patch  PC: 0x1000d00 0xf000d294
patch  NPC: 0x1000d04 0xf000d298
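
To check whether the decodeFails warnings come from one instruction word or several, the log above can be tallied with a short script. This is only a sketch: the regular expression is inferred from the line format shown above, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the log lines in this thread (an assumption, not an
# Opal-documented format): source line number, then the failing word in hex.
DECODE_FAIL = re.compile(
    r"statici::decodeFails\. line:(\d+)\. inst=(0x[0-9a-fA-F]+)"
)


def tally_decode_fails(log_text):
    """Count decodeFails warnings per (source line, instruction word)."""
    counts = Counter()
    for match in DECODE_FAIL.finditer(log_text):
        key = (int(match.group(1)), int(match.group(2), 16))
        counts[key] += 1
    return counts


sample = """\
### statici::decodeFails. line:877. inst=0x737472
0000 0000 0111 0011:0111 0100 0111 0010
### statici::decodeFails. line:877. inst=0x737472
0000 0000 0111 0011:0111 0100 0111 0010
patch  NPC: 0xffffffffff3b3840 0xff3b3840
"""

if __name__ == "__main__":
    for (line, inst), n in tally_decode_fails(sample).items():
        print(f"line {line}, inst {inst:#x}: {n} occurrence(s)")
```

A single key in the result means every warning refers to the same instruction word.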




Quoting Mike Marty <mikem@xxxxxxxxxxx>:


This looks like an operating system issue. Make sure your OS is configured
with enough locks. Different systems have different parameters to set,
usually in /etc/system, but you might want to check your OS reference
manual.


And be sure that it works without Ruby/Opal loaded.

--Mike


_______________________________________________ Gems-users mailing list Gems-users@xxxxxxxxxxx https://lists.cs.wisc.edu/mailman/listinfo/gems-users



