[DynInst_API:] Fwd: DynInst problem of generating extra spill code


Date: Fri, 8 Sep 2023 20:00:02 +0000
From: "Tallent, Nathan R" <Nathan.Tallent@xxxxxxxx>
Subject: [DynInst_API:] Fwd: DynInst problem of generating extra spill code
Dear Bart,

You might recall that I mentioned a use of DynInst where instrumentation is generated with register saves/restores when none should be needed.

We have packaged a test case to illustrate the problem. 

A detailed description of the problem is in the hyperlink below. We are also happy to have a quick chat with one of your team members to make any needed clarifications.

Thank you!
__________________________________________
Nathan R. Tallent, PhD
Scalable Computing & Data Team Lead
High Performance Computing Group
Pacific Northwest National Laboratory
509.372.4206 | https://hpc.pnnl.gov/people/tallent/

Begin forwarded message:

From: "Suriyakumar, Yasodha" <yasodha@xxxxxxxx>
Subject: Re: DynInst problem of generating extra spill code
Date: September 7, 2023 at 12:13:07 PM PDT
To: "Tallent, Nathan R" <Nathan.Tallent@xxxxxxxx>
Cc: "Kilic, Ozgur" <okilic@xxxxxxx>, "Velugoti, Nanda K" <nanda.velugoti@xxxxxxxx>

Hello Nathan,
The issue is described under item (2) at https://github.com/pnnl/memgaze/tree/master/xlib#dyninst
I have attached the testcase with all the relevant files. I have included a copy of the README.txt from the testcase.
The testcase covers a correct mapping as well as the added spill code. If there is any additional information needed, let me know. If not, it can be sent to the Dyninst group.
Thanks,
Yasodha.
 

From: Tallent, Nathan R <Nathan.Tallent@xxxxxxxx>
Date: Thursday, August 17, 2023 at 9:35 AM
To: Suriyakumar, Yasodha <yasodha@xxxxxxxx>
Cc: Kilic, Ozgur <okilic@xxxxxxx>, Velugoti, Nanda K <nanda.velugoti@xxxxxxxx>
Subject: DynInst problem of generating extra spill code

[With Ozgur's correct email - stupid mac mail still pulls pnnl]
 
Hi Yasodha,
 
I talked to Bart Miller (head of DynInst group) specifically about our problem of occasionally generating extra spill code. He said (a) it was not expected (as we expected) and (b) they would be happy to take a look.
 
When there is a convenient moment, we should prepare a package showing (a) the normal case and (b) the anomalous case.
 
Thanks!
__________________________________________
Nathan R. Tallent, PhD
Scalable Computing & Data Team Lead
High Performance Computing Group
Pacific Northwest National Laboratory
509.372.4206 | https://hpc.pnnl.gov/people/tallent/
 

Attachment: dyninst-spill-testcase.tgz
Description: dyninst-spill-testcase.tgz

This testcase illustrates the spill code added by Dyninst as part of the MemGaze instrumentation.

MemGaze:
https://github.com/pnnl/memgaze

Original binary:
SpMM kernel with COO storage format for the Sparse Matrix
Source code - https://github.com/pnnl/HiParTI
Binary file location - build/benchmark/matrix/spmm_mat

Files included in this directory
spmm_mat                  - Copy of the original binary
spmm_mat-memgaze          - Instrumented binary using MemGaze instrumentor
obj_spmm                  - Object dump of original binary 
obj_spmm_mat_inst         - Object dump of instrumented binary 
run_dyninst_spill.sh      - Script to run MemGaze instrumentor (there's not much here:
                                it copies the files and makes a single call to the MemGaze instrumentor)
spmm_mat-memgaze.binanlys - File with mapping information between original and instrumented binary
      Format : 
               first column - instrumented binary IP
               sixth column - original binary IP
             seventh column - function in original binary
spmm_mat-memgaze.binanlys.log - auxiliary file generated by MemGaze instrumentor
spmm_mat-memgaze.hpcstruct    - auxiliary file generated by MemGaze instrumentor
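
As an illustration of the .binanlys format described above, here is a minimal Python sketch that extracts the three documented columns from one line. The names MapEntry and parse_binanlys_line are hypothetical (they are not part of MemGaze), and columns 2-5 are kept as raw strings because their meaning is not described in this README. It assumes whitespace-separated fields, with both IP columns in hexadecimal.

```python
from typing import NamedTuple, Optional


class MapEntry(NamedTuple):
    inst_ip: int      # column 1: IP in the instrumented binary
    orig_ip: int      # column 6: IP in the original binary
    function: str     # column 7: function in the original binary
    other: tuple      # columns 2-5, kept verbatim (meaning not documented here)


def parse_binanlys_line(line: str) -> Optional[MapEntry]:
    """Parse one mapping line; return None if it does not look like an entry."""
    fields = line.split()
    if len(fields) < 7:
        return None
    try:
        # int(..., 16) accepts both "0x1060ee" and "77a4".
        inst_ip = int(fields[0], 16)
        orig_ip = int(fields[5], 16)
    except ValueError:
        return None
    # Join the remainder in case a function name contains spaces.
    function = " ".join(fields[6:])
    return MapEntry(inst_ip, orig_ip, function, tuple(fields[1:5]))


if __name__ == "__main__":
    sample = "0x1060ee 1 0x50 0x1 0 77a4 ptiOmpSparseMatrixMulMatrix_Reduce._omp_fn.2"
    entry = parse_binanlys_line(sample)
    print(hex(entry.inst_ip), hex(entry.orig_ip), entry.function)
```

Building a dictionary keyed by orig_ip from these entries gives a quick way to look up where a given original instruction landed in the instrumented binary.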

To identify spill code - if run with the included original binary:
1. Example of correct mapping
  a) Entry in spmm_mat-memgaze.binanlys
      0x1060ee 1 0x50 0x1 0 77a4 ptiOmpSparseMatrixMulMatrix_Reduce._omp_fn.2
  b) obj_spmm
      77a4: 4d 8b 6d 50           mov    0x50(%r13),%r13
  c) obj_spmm_mat_inst
      1060ee: f3 49 0f ae e5        ptwrite %r13
      1060f3: 4d 8b 6d 50           mov    0x50(%r13),%r13

2. Spill code - if run with the included original binary:
  a) Find the mapping for original-to-instrumented in spmm_mat-memgaze.binanlys for 7800, 780a
    Entry in spmm_mat-memgaze.binanlys
      0x10618a 2 0x0 0x4 0 7800 ptiOmpSparseMatrixMulMatrix_Reduce._omp_fn.2
      0x1061b7 2 0x0 0x1 0 780a ptiOmpSparseMatrixMulMatrix_Reduce._omp_fn.2
  b) obj_spmm
     77fd: 41 39 c9              cmp    %ecx,%r9d
     7800: f3 41 0f 10 04 ba     movss  (%r10,%rdi,4),%xmm0
     7806: f3 0f 59 c1           mulss  %xmm1,%xmm0
     780a: f3 0f 58 06           addss  (%rsi),%xmm0
  c) obj_spmm_mat_inst has the following code instead of "ptwrite"
    10618a: 48 8d 64 24 80        lea    -0x80(%rsp),%rsp   <--- Start of spill code
    10618f: 50                    push   %rax
    106190: 9f                    lahf
    106191: 0f 90 c0              seto   %al
    106194: 50                    push   %rax
    106195: f3 49 0f ae e2        ptwrite %r10              <--- Intended ptwrite
    10619a: f3 48 0f ae e7        ptwrite %rdi
    10619f: 58                    pop    %rax
    1061a0: 80 c0 7f              add    $0x7f,%al
    1061a3: 9e                    sahf
    1061a4: 58                    pop    %rax
    1061a5: 48 8d a4 24 80 00 00  lea    0x80(%rsp),%rsp
    1061ac: 00
    1061ad: f3 41 0f 10 04 ba     movss  (%r10,%rdi,4),%xmm0
    1061b3: f3 0f 59 c1           mulss  %xmm1,%xmm0
    1061b7: 48 8d 64 24 80        lea    -0x80(%rsp),%rsp     <--- Start of spill code
    1061bc: 50                    push   %rax
    1061bd: 9f                    lahf
    1061be: 0f 90 c0              seto   %al
    1061c1: 50                    push   %rax
    1061c2: f3 48 0f ae e6        ptwrite %rsi                <--- Intended ptwrite
    1061c7: 58                    pop    %rax
    1061c8: 80 c0 7f              add    $0x7f,%al
    1061cb: 9e                    sahf
    1061cc: 58                    pop    %rax
    1061cd: 48 8d a4 24 80 00 00  lea    0x80(%rsp),%rsp

3. To find spill code if a regenerated binary is used for instrumentation
  a) Find ptiOmpSparseMatrixMulMatrix in the object dump of the instrumented binary.
  b) Search for "seto" within that section.
  c) Notice that the mapping in spmm_mat-memgaze.binanlys records the IP of the "lea" instruction occurring before the "seto", instead of the IP of the "ptwrite".
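
The search in step 3 can be sketched in Python as follows. This is only an illustration under assumptions: the function names are hypothetical, the input is expected to be objdump-style text lines (pass only the lines of the function of interest to restrict the search), and the 4-instruction look-back window is a guess based on the sequences shown in step 2, not a documented Dyninst property.

```python
import re

ADDR_RE = re.compile(r"^\s*([0-9a-fA-F]+):\s+(.*)$")
BYTE_RE = re.compile(r"[0-9a-fA-F]{2}")


def parse_insn(line):
    """Return (address, mnemonic, operands) for an objdump line, else None."""
    m = ADDR_RE.match(line)
    if not m:
        return None
    tokens = m.group(2).split()
    i = 0
    # Skip the leading encoded bytes (two hex digits each).
    while i < len(tokens) and BYTE_RE.fullmatch(tokens[i]):
        i += 1
    if i == len(tokens):
        return None  # continuation line holding only bytes, e.g. "1061ac: 00"
    return int(m.group(1), 16), tokens[i], " ".join(tokens[i + 1:])


def find_spill_sites(objdump_lines, window=4):
    """For each 'seto', return the address of the nearest preceding 'lea'."""
    insns = [p for p in map(parse_insn, objdump_lines) if p is not None]
    sites = []
    for idx, (_, mnemonic, _) in enumerate(insns):
        if mnemonic != "seto":
            continue
        # Walk backwards over at most `window` instructions.
        for back in range(idx - 1, max(idx - 1 - window, -1), -1):
            if insns[back][1] == "lea":
                sites.append(insns[back][0])
                break
    return sites


def check_against_mapping(sites, mapped_inst_ips):
    """Report, per spill site, whether its 'lea' IP appears in the mapping."""
    return {site: site in mapped_inst_ips for site in sites}


if __name__ == "__main__":
    dump = [
        "    10618a: 48 8d 64 24 80        lea    -0x80(%rsp),%rsp",
        "    10618f: 50                    push   %rax",
        "    106190: 9f                    lahf",
        "    106191: 0f 90 c0              seto   %al",
        "    106194: 50                    push   %rax",
        "    106195: f3 49 0f ae e2        ptwrite %r10",
    ]
    sites = find_spill_sites(dump)
    # 0x10618a is the first-column IP of the 7800 entry in the .binanlys file.
    for site, mapped in check_against_mapping(sites, {0x10618A}).items():
        print(hex(site), "in mapping" if mapped else "not in mapping")
```

Any address reported as "in mapping" is a place where the mapping file points at the start of the save sequence rather than at the "ptwrite" itself.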