Re: [Gems-users] LogTM Transactions Hanging (Gems 2.1)


Date: Wed, 18 Jun 2008 17:34:41 +1200
From: "Fuad Tabba" <fuad@xxxxxxxxxxxxxxxxx>
Subject: Re: [Gems-users] LogTM Transactions Hanging (Gems 2.1)
Hi again,

I'm having a hard time building deque with the Sun C compiler, and I
don't have gcc on my virtual Sun box. Could someone please send me a
precompiled deque binary? (Setting up gcc would take ages.)

Thanks,
/Fuad


On Wed, Jun 18, 2008 at 4:23 PM, Fuad Tabba <fuad@xxxxxxxxxxxxxxxxx> wrote:
> Thanks for your reply Jayaram.
>
>> I am not sure what instruction-cache-access-trace is and how it differs
>> from instruction-fetch-trace. The difference
>> if any could also affect trap handling.
>
> Not really sure why it is instruction-cache-access-trace as
> opposed to instruction-fetch-trace. Is there a parameter I could
> change?
>
> I've changed my compiler options (Sun C compiler) for my benchmark from:-
>
> -xO3 -xarch=v9 -xtarget=native -m32
>
> to
>
> -xO3 -m32 -xarch=sparcvis -xregs=no%appl
>
> since the compiler suggests that "-xarch=v8plusa is deprecated, use
> -m32 -xarch=sparcvis".
>
> Anyway, now instead of crashing, thread 1 finishes its transactions
> but thread 0 gets stuck in exceptions. Not sure if that's an
> improvement or if just moving things around displaced the real
> problem. I updated the dump file at:-
>
> http://www.cs.auckland.ac.nz/~fuad/dump.BASE
>
>> Were you able to run any of the distributed microbenchmarks (deque or
>> btree)?
>
> Nope, I'm trying to figure out how to generate the scripts and run
> them now. Thought I'd send off this email first in case you or anyone
> else has any other ideas.
>
> Thanks again for your help.
>
> Cheers,
> /Fuad
>
> On Wed, Jun 18, 2008 at 3:31 PM, Jayaram Bobba <bobba@xxxxxxxxxxx> wrote:
>>
>> Fred,
>>
>> Were you able to run any of the distributed microbenchmarks (deque or
>> btree)?
>> I looked at your dump file that you sent with an earlier post.
>> The following lines look suspicious
>>
>> [cpu0 info] Note that on this cpu, instruction-fetch-trace is
>> implemented using instruction-cache-access-trace with a suitable cache
>> line size.
>> [cpu1 info] Note that on this cpu, instruction-fetch-trace is
>> implemented using instruction-cache-access-trace with a suitable cache
>> line size.
>> [cpu2 info] Note that on this cpu, instruction-fetch-trace is
>> implemented using instruction-cache-access-trace with a suitable cache
>> line size.
>> [cpu3 info] Note that on this cpu, instruction-fetch-trace is
>> implemented using instruction-cache-access-trace with a suitable cache
>> line size.
>>
>>  134461 0 [0,0] TRAP TO HANDLER: TID: 0 TRAP_TYPE 1 TRAP ADDRESS
>> 0x925604c NUM_RETRIES 0 LOG_SIZE 204 XACT_LEVEL 1
>> XACT_LOWEST_CONFLICT_LEVEL 1 Handler Address = [0x17c70, line
>> 0x17c40] PC = [0x100707c, line 0x1007040]
>>  134461 0 [0,0] Begin ESCAPE ACTION - ESCAPE DEPTH: 1 PC [0x100707c,
>> line 0x1007040]
>>  134598 0 [0,0] ADD XACT FRAME oldLogFramePointer: [0x3209020, line
>> 0x3209000] newLogFramePointer: [0x32090ec, line 0x32090c0] 1
>>  134598 0 [0,0] BEGIN XACT: TID 0 XID 0 XACT_LEVEL: 2 PC: [0x169cc, line
>> 0x169c0]
>>
>> The expected behaviour is for the escape action to unroll the log and
>> then jump back to the simulator
>> for restoring the checkpoint. So you need to see an End ESCAPE ACTION
>> before the transaction is restarted.
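
The check described above (an End ESCAPE ACTION should appear before the transaction restarts) can be automated over a dump file. The following is only a rough sketch in Python, not part of GEMS. It assumes each trace event sits on one line in the form "<cycle> <proc> [<a>,<b>] <message>", that the second column identifies the processor, and that the closing event starts with "End ESCAPE ACTION"; the function name find_unfinished_escapes is made up for illustration.

```python
import re
from collections import defaultdict

# Assumed layout of a trace line: "<cycle> <proc> [<a>,<b>] <message>".
LINE_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s+\[\s*\d+\s*,\s*\d+\s*\]\s+(.*)$")


def find_unfinished_escapes(lines):
    """Return (cycle, proc, message) for every BEGIN XACT that is logged
    while an escape action is still open on the same processor."""
    open_escapes = defaultdict(int)  # open escape actions per processor
    suspicious = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not a trace event (e.g. Simics console output)
        cycle, proc, msg = int(match.group(1)), int(match.group(2)), match.group(3)
        if msg.startswith("Begin ESCAPE ACTION"):
            open_escapes[proc] += 1
        elif msg.startswith("End ESCAPE ACTION"):
            open_escapes[proc] = max(0, open_escapes[proc] - 1)
        elif msg.startswith("BEGIN XACT") and open_escapes[proc] > 0:
            suspicious.append((cycle, proc, msg))
    return suspicious


# The three events quoted in this message, each joined onto one line.
SAMPLE = [
    " 134461 0 [0,0] Begin ESCAPE ACTION - ESCAPE DEPTH: 1 PC [0x100707c, line 0x1007040]",
    " 134598 0 [0,0] ADD XACT FRAME oldLogFramePointer: [0x3209020, line 0x3209000] "
    "newLogFramePointer: [0x32090ec, line 0x32090c0] 1",
    " 134598 0 [0,0] BEGIN XACT: TID 0 XID 0 XACT_LEVEL: 2 PC: [0x169cc, line 0x169c0]",
]

if __name__ == "__main__":
    for cycle, proc, msg in find_unfinished_escapes(SAMPLE):
        print(f"cycle {cycle}, proc {proc}: {msg}")
```

To scan a real dump, pass the open file object instead of SAMPLE. If the dump wraps events across lines, they would need to be joined first.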
>>
>> How are you compiling your workload? I would recommend compiling with
>> the same flags as those used by
>> the sample workloads (at least for transaction.c). Specifically, the
>> offsets are important for the simulator
>> to know exactly where to jump for trap handling (see
>> set_transaction_registers() in transaction.c)
>>
>> I am not sure what instruction-cache-access-trace is and how it differs
>> from instruction-fetch-trace. The difference
>> if any could also affect trap handling.
>>
>> Jayaram
>>
>>
>> Fuad Tabba wrote:
>> > I've uploaded the dump files to
>> >
>> > http://www.cs.auckland.ac.nz/~fuad/dump.TIMESTAMP
>> > http://www.cs.auckland.ac.nz/~fuad/dump.BASE
>> >
>> > so that you don't need to download/extract the tar in the previous email.
>> >
>> > Thanks,
>> > /Fuad
>> >
>> > On Wed, Jun 18, 2008 at 2:30 PM, Fuad Tabba <fuad@xxxxxxxxxxxxxxxxx> wrote:
>> >
>> >     Hello again,
>> >
>> >     Sorry to bump this thread, but I have tried playing around with
>> >     the settings, performing a clean installation of LogTM again, and
>> >     compiling my binaries with a lower optimization level, and I still
>> >     can't get LogTM-SE (MESI_CMP_FILTER) to work with more than one
>> >     thread (ATMTP, on the other hand, is working fine for the same
>> >     binaries, with different paths for begin/commit transaction, obviously).
>> >
>> >     I am using the default settings of microbench.py , except for:-
>> >
>> >
>> >     g_NETWORK_TOPOLOGY: PT_TO_PT
>> >     RETRY_LATENCY: 10
>> >     XACT_MEMORY: true
>> >     REMOVE_SINGLE_CYCLE_DCACHE_FAST_PATH: true
>> >     NUMBER_OF_VIRTUAL_NETWORKS: 5
>> >     g_PROCS_PER_CHIP=4 (since my checkpoint has four processors)
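
When comparing runs (for example BASE against TIMESTAMP), it can help to keep the overrides in a form that is easy to diff. The following is a small Python sketch for that purpose, not a GEMS tool. The helper name parse_overrides is hypothetical, and the only assumption is that each override is written as "NAME: value" or "NAME=value", as in the list above.

```python
import re

# Matches "NAME: value" or "NAME=value"; anything after the value is ignored.
OVERRIDE_RE = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*[:=]\s*(\S+)")


def parse_overrides(text):
    """Map parameter name -> value string for every override line in text."""
    overrides = {}
    for raw in text.splitlines():
        match = OVERRIDE_RE.match(raw.strip())
        if match:
            overrides[match.group(1)] = match.group(2)
    return overrides


# The settings listed in this message, copied as written.
SETTINGS = """
g_NETWORK_TOPOLOGY: PT_TO_PT
RETRY_LATENCY: 10
XACT_MEMORY: true
REMOVE_SINGLE_CYCLE_DCACHE_FAST_PATH: true
NUMBER_OF_VIRTUAL_NETWORKS: 5
g_PROCS_PER_CHIP=4 (since my checkpoint has four processors)
"""

if __name__ == "__main__":
    for name, value in sorted(parse_overrides(SETTINGS).items()):
        print(f"{name} = {value}")
```

Two such dictionaries can then be compared key by key to see exactly which parameters differ between runs.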
>> >
>> >     If I use the BASE conflict resolution scheme, I get an abort
>> >     followed by (for details refer to dump.BASE):-
>> >      137639   1 [1,0] TID 1 XACT ABORT 0 caused by 0 [ 0, 0 ] xid: 0
>> >     address: [0x13248040, line 0x13248040] delay: 3142  PC [0x1be10,
>> >     line 0x1be00]  *PC 0xf624e00c 'stw %i3, [%l3 + 12]'
>> >     Starting command line. (May have skipped commands in script files.)
>> >     [cpu1] v:0x0000000000015c74 p:0x00018871c74  ba 0x15c8c
>> >     Setting new inspection cpu: cpu1
>> >     Traceback (most recent call last):
>> >       File "../../../gen-scripts/mfacet.py", line 308, in
>> >     console_branch_internal
>> >         wait_for_string(get_console(), __prompt)
>> >       File
>> >     "/home/fuad/Desktop/NoBackup/simics-3.0.30/x86-linux/lib/python/text_console_common.py",
>> >     line 10, in wait_for_string
>> >         wait_for_obj_hap("Xterm_Break_String", obj, break_id)
>> >       File
>> >     "/home/fuad/Desktop/NoBackup/simics-3.0.30/x86-linux/lib/python/cli_impl.py",
>> >     line 3374, in wait_for_obj_hap
>> >         return wait_for_hap_common([hap_name, name, idx0])
>> >       File
>> >     "/home/fuad/Desktop/NoBackup/simics-3.0.30/x86-linux/lib/python/cli_impl.py",
>> >     line 3352, in wait_for_hap_common
>> >     simics>     raise SimExc_Break, "Script branch interrupted"
>> >     sim_core.SimExc_Break: Script branch interrupted
>> >
>> >     On the other hand, if I use XACT_CONFLICT_RES=TIMESTAMP, I fall
>> >     (into exceptions) and cannot get up (dump.TIMESTAMP).
>> >
>> >     I'm completely baffled and would appreciate any help.
>> >
>> >     My benchmark is a red-black tree (that I wrote, not the one that
>> >     comes with GEMS), and what I'm doing is spawning two threads, then
>> >     starting Ruby. I then run a few transactions on each thread (for
>> >     warmup), clear the Ruby statistics, and then run some more transactions.
>> >
>> >     Cheers,
>> >     /Fuad
>> >
>> >     On Mon, Jun 16, 2008 at 11:35 AM, Fuad Tabba
>> >     <fuad@xxxxxxxxxxxxxxxxx> wrote:
>> >
>> >         Hi,
>> >
>> >         I recently had to reinstall gems2.1 and things have been kind
>> >         of acting up. What I'm trying to do is to run more than one
>> >         thread with transactions using LogTM (this particular run has
>> >         2 threads):-
>> >         XACT_CONFLICT_RES=TIMESTAMP
>> >         g_NETWORK_TOPOLOGY=PT_TO_PT
>> >
>> >         otherwise, everything else is set to the default values.
>> >
>> >         One thread runs fine. However, two threads or more act up.
>> >         Initially I get a bunch of "XACT CONSISTENCY CHECK FAILURE"
>> >         and then a "SIMICS SEG FAULT", but it still continues to run.
>> >         Finally however, I get a "Begin ESCAPE ACTION" and the
>> >         simulation doesn't terminate (and the trace produces nothing
>> >         afterwards).
>> >
>> >         As I mentioned, 1 thread works fine, and so does ATMTP (for
>> >         any number of threads). I believe that I'm doing the
>> >         TM_Workload_Setup as well. Any ideas what I'm doing wrong?
>> >
>> >         I've attached the dump for the run at debug level 2.
>> >
>> >         Thanks,
>> >         /Fuad
>> >
>> >
>> >
>> > ------------------------------------------------------------------------
>> >
>> > _______________________________________________
>> > Gems-users mailing list
>> > Gems-users@xxxxxxxxxxx
>> > https://lists.cs.wisc.edu/mailman/listinfo/gems-users
>> > Use Google to search the GEMS Users mailing list by adding "site:https://lists.cs.wisc.edu/archive/gems-users/" to your search.
>> >
>> >
>>
>