Re: [Gems-users] LogTM Transactions Hanging (Gems 2.1)


Date: Tue, 17 Jun 2008 22:31:04 -0500
From: Jayaram Bobba <bobba@xxxxxxxxxxx>
Subject: Re: [Gems-users] LogTM Transactions Hanging (Gems 2.1)
Fred,

Were you able to run any of the distributed microbenchmarks (deque or btree)?
I looked at your dump file that you sent with an earlier post.
The following lines look suspicious:

[cpu0 info] Note that on this cpu, instruction-fetch-trace is implemented using instruction-cache-access-trace with a suitable cache line size.
[cpu1 info] Note that on this cpu, instruction-fetch-trace is implemented using instruction-cache-access-trace with a suitable cache line size.
[cpu2 info] Note that on this cpu, instruction-fetch-trace is implemented using instruction-cache-access-trace with a suitable cache line size.
[cpu3 info] Note that on this cpu, instruction-fetch-trace is implemented using instruction-cache-access-trace with a suitable cache line size.

134461 0 [0,0] TRAP TO HANDLER: TID: 0 TRAP_TYPE 1 TRAP ADDRESS 0x925604c NUM_RETRIES 0 LOG_SIZE 204 XACT_LEVEL 1 XACT_LOWEST_CONFLICT_LEVEL 1 Handler Address = [0x17c70, line 0x17c40] PC = [0x100707c, line 0x1007040]
134461 0 [0,0] Begin ESCAPE ACTION - ESCAPE DEPTH: 1 PC [0x100707c, line 0x1007040]
134598 0 [0,0] ADD XACT FRAME oldLogFramePointer: [0x3209020, line 0x3209000] newLogFramePointer: [0x32090ec, line 0x32090c0] 1
134598 0 [0,0] BEGIN XACT: TID 0 XID 0 XACT_LEVEL: 2 PC: [0x169cc, line 0x169c0]

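The expected behaviour is for the escape action to unroll the log and then jump back to the simulator to restore the checkpoint. So you should see an End ESCAPE ACTION before the transaction is restarted. In the lines above, a new transaction begins (BEGIN XACT at XACT_LEVEL 2) with no End ESCAPE ACTION in between.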
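
One way to check a whole dump for this pattern is a small script along the following lines. It is only a sketch under assumptions: it treats each trace record as one line of the form "<cycle> <cpu> [x,y] <message>" (inferred from the excerpt above, with the second column assumed to be the processor number), and it assumes the closing message starts with the literal text "End ESCAPE ACTION". The function name find_unclosed_escapes is made up for illustration and is not part of GEMS.

```python
import re

# Assumed record shape, inferred from the trace excerpt:
#   <cycle> <cpu> [<x>,<y>] <message>
LINE_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s+\[\d+,\d+\]\s+(.*)$")


def find_unclosed_escapes(lines):
    """Return (cycle, cpu) for each BEGIN XACT that occurs while an
    escape action is still open on the same cpu."""
    open_depth = {}  # cpu -> escape actions begun but not yet ended
    problems = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that is not a trace record
        cycle, cpu, msg = int(m.group(1)), int(m.group(2)), m.group(3)
        if msg.startswith("Begin ESCAPE ACTION"):
            open_depth[cpu] = open_depth.get(cpu, 0) + 1
        elif msg.startswith("End ESCAPE ACTION"):  # assumed message text
            open_depth[cpu] = max(0, open_depth.get(cpu, 0) - 1)
        elif msg.startswith("BEGIN XACT") and open_depth.get(cpu, 0) > 0:
            problems.append((cycle, cpu))
    return problems


# The three records quoted above, one per line.
sample = [
    "134461 0 [0,0] Begin ESCAPE ACTION - ESCAPE DEPTH: 1 PC [0x100707c, line 0x1007040]",
    "134598 0 [0,0] ADD XACT FRAME oldLogFramePointer: [0x3209020, line 0x3209000] newLogFramePointer: [0x32090ec, line 0x32090c0] 1",
    "134598 0 [0,0] BEGIN XACT: TID 0 XID 0 XACT_LEVEL: 2 PC: [0x169cc, line 0x169c0]",
]
print(find_unclosed_escapes(sample))
```

On the three records quoted above this reports the BEGIN XACT at cycle 134598 on cpu 0. To scan a full dump, pass the open file object instead of the sample list. If the real End ESCAPE ACTION message is worded differently, the prefix in the script needs adjusting.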

How are you compiling your workload? I would recommend compiling with the same flags as those used by the sample workloads (at least for transaction.c). Specifically, the offsets are important for the simulator to know exactly where to jump for trap handling (see set_transaction_registers() in transaction.c).

I am not sure what instruction-cache-access-trace is or how it differs from instruction-fetch-trace. The difference, if any, could also affect trap handling.

Jayaram


Fuad Tabba wrote:
I've uploaded the dump files to

http://www.cs.auckland.ac.nz/~fuad/dump.TIMESTAMP
http://www.cs.auckland.ac.nz/~fuad/dump.BASE

so that you don't need to download/extract the tar in the previous email.

Thanks,
/Fuad

On Wed, Jun 18, 2008 at 2:30 PM, Fuad Tabba <fuad@xxxxxxxxxxxxxxxxx> wrote:

    Hello again,

    Sorry to bump this thread, but I have tried playing around with
    the settings, performing a clean installation of LogTM again, and
    compiling my binaries with a lower optimization level, and I still
    can't get LogTM-SE (MESI_CMP_FILTER) to work with more than one
    thread (ATMTP, on the other hand, is working fine for the same
    binaries; different paths for begin/commit transaction, obviously).

    I am using the default settings of microbench.py, except for:


    g_NETWORK_TOPOLOGY: PT_TO_PT
    RETRY_LATENCY: 10
    XACT_MEMORY: true
    REMOVE_SINGLE_CYCLE_DCACHE_FAST_PATH: true
    NUMBER_OF_VIRTUAL_NETWORKS: 5
    g_PROCS_PER_CHIP=4 (since my checkpoint has four processors)

    If I use the BASE conflict resolution scheme, I get an abort
    followed by the output below (for details refer to dump.BASE):

    137639 1 [1,0] TID 1 XACT ABORT 0 caused by 0 [ 0, 0 ] xid: 0 address: [0x13248040, line 0x13248040] delay: 3142 PC [0x1be10, line 0x1be00]  *PC 0xf624e00c 'stw %i3, [%l3 + 12]'
    Starting command line. (May have skipped commands in script files.)
    [cpu1] v:0x0000000000015c74 p:0x00018871c74  ba 0x15c8c
    Setting new inspection cpu: cpu1
    Traceback (most recent call last):
      File "../../../gen-scripts/mfacet.py", line 308, in console_branch_internal
        wait_for_string(get_console(), __prompt)
      File "/home/fuad/Desktop/NoBackup/simics-3.0.30/x86-linux/lib/python/text_console_common.py", line 10, in wait_for_string
        wait_for_obj_hap("Xterm_Break_String", obj, break_id)
      File "/home/fuad/Desktop/NoBackup/simics-3.0.30/x86-linux/lib/python/cli_impl.py", line 3374, in wait_for_obj_hap
        return wait_for_hap_common([hap_name, name, idx0])
      File "/home/fuad/Desktop/NoBackup/simics-3.0.30/x86-linux/lib/python/cli_impl.py", line 3352, in wait_for_hap_common
    simics>     raise SimExc_Break, "Script branch interrupted"
    sim_core.SimExc_Break: Script branch interrupted

    On the other hand, if I use XACT_CONFLICT_RES=TIMESTAMP, I fall
    (into exceptions) and cannot get up (dump.TIMESTAMP).

    I'm completely baffled and would appreciate any help.

    My benchmark is a red-black tree (one that I wrote, not the one that
    comes with GEMS). I spawn two threads, then start Ruby. I then run
    a few transactions on each thread (for warmup), clear the Ruby
    statistics, and then run some more transactions.

    Cheers,
    /Fuad

    On Mon, Jun 16, 2008 at 11:35 AM, Fuad Tabba <fuad@xxxxxxxxxxxxxxxxx> wrote:

        Hi,

        I recently had to reinstall gems2.1 and things have been kind
        of acting up. What I'm trying to do is to run more than one
        thread with transactions using LogTM (this particular run has
        2 threads):-
        XACT_CONFLICT_RES=TIMESTAMP
        g_NETWORK_TOPOLOGY=PT_TO_PT

        otherwise, everything else is set to the default values.

        One thread runs fine. However, two threads or more act up.
        Initially I get a bunch of "XACT CONSISTENCY CHECK FAILURE"
        and then a "SIMICS SEG FAULT", but it still continues to run.
        Finally however, I get a "Begin ESCAPE ACTION" and the
        simulation doesn't terminate (and the trace produces nothing
        afterwards).

        As I mentioned, 1 thread works fine, and so does ATMTP (for
        any number of threads). I believe that I'm doing the
        TM_Workload_Setup as well. Any ideas what I'm doing wrong?

        I've attached the dump for the run at debug level 2.

        Thanks,
        /Fuad


------------------------------------------------------------------------

_______________________________________________
Gems-users mailing list
Gems-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/gems-users
Use Google to search the GEMS Users mailing list by adding "site:https://lists.cs.wisc.edu/archive/gems-users/" to your search.
