Diffstat (limited to 'Documentation/s390/Debugging390.txt')
-rw-r--r-- Documentation/s390/Debugging390.txt 2142
1 files changed, 0 insertions, 2142 deletions
diff --git a/Documentation/s390/Debugging390.txt b/Documentation/s390/Debugging390.txt
deleted file mode 100644
index 5ae7f868a007..000000000000
--- a/Documentation/s390/Debugging390.txt
+++ /dev/null
@@ -1,2142 +0,0 @@
-
- Debugging on Linux for s/390 & z/Architecture
- by
- Denis Joseph Barrow (djbarrow@de.ibm.com,barrow_dj@yahoo.com)
- Copyright (C) 2000-2001 IBM Deutschland Entwicklung GmbH, IBM Corporation
- Best viewed with fixed width fonts
-
-Overview of Document:
-=====================
-This document is intended to give a good overview of how to debug Linux for
-s/390 and z/Architecture. It is not intended as a complete reference and not a
-tutorial on the fundamentals of C & assembly. It doesn't go into
-390 IO in any detail. It is intended to complement the documents in the
-reference section below & any other worthwhile references you get.
-
-Like the Enterprise Systems Architecture/390 Reference Summary, it is intended
-to be printed out & used as a quick self-help cheat sheet when problems occur.
-
-Contents
-========
-Register Set
-Address Spaces on Intel Linux
-Address Spaces on Linux for s/390 & z/Architecture
-The Linux for s/390 & z/Architecture Kernel Task Structure
-Register Usage & Stackframes on Linux for s/390 & z/Architecture
-A sample program with comments
-Compiling programs for debugging on Linux for s/390 & z/Architecture
-Debugging under VM
-s/390 & z/Architecture IO Overview
-Debugging IO on s/390 & z/Architecture under VM
-GDB on s/390 & z/Architecture
-Stack chaining in gdb by hand
-Examining core dumps
-ldd
-Debugging modules
-The proc file system
-SysRq
-References
-Special Thanks
-
-Register Set
-============
-The current architectures have the following registers.
-
-16 General purpose registers, 32 bit on s/390 and 64 bit on z/Architecture,
-r0-r15 (or gpr0-gpr15), used for arithmetic and addressing.
-
-16 Control registers, 32 bit on s/390 and 64 bit on z/Architecture, cr0-cr15,
-kernel usage only, used for memory management, interrupt control, debugging
-control etc.
-
-16 Access registers (ar0-ar15), 32 bit on both s/390 and z/Architecture,
-normally not used by normal programs but potentially could be used as
-temporary storage. These registers have a 1:1 association with general
-purpose registers and are designed to be used in the so-called access
-register mode to select different address spaces.
-Access register 0 (and access register 1 on z/Architecture, which needs a
-64 bit pointer) is currently used by the pthread library as a pointer to
-the currently running thread's private area.
-
-16 64 bit floating point registers (fp0-fp15), IEEE & HFP floating point
-format compliant on G5 upwards, plus a floating point control register (FPC).
-Older machines have only 4 64 bit registers (fp0, fp2, fp4 & fp6), HFP only.
-Note:
-Linux (currently) always uses IEEE & emulates the G5 IEEE format on older
-machines (provided the kernel is configured for this).
-
-
-The PSW is the most important register on the machine. It is 64 bit on s/390
-& 128 bit on z/Architecture & serves the roles of a program counter (pc),
-condition code register & memory space designator.
-In IBM standard notation I am counting bit 0 as the MSB.
-It has several advantages over a normal program counter in that you can
-change address translation & program counter in a single instruction.
-Without this, changing address translation, e.g. switching address
-translation off, would require a logical=physical mapping for the address
-you are currently running at.
-
-        Bit
-s/390   z/Architecture  Value
-0       0               Reserved ( must be 0 ), otherwise a specification
-                        exception occurs.
-
-1       1               Program Event Recording, 1=PER enabled.
-                        PER is used to facilitate debugging, e.g. single
-                        stepping.
-
-2-4     2-4             Reserved ( must be 0 ).
-
-5       5               Dynamic address translation, 1=DAT on.
-
-6       6               Input/Output interrupt mask.
-
-7       7               External interrupt mask, used primarily for
-                        interprocessor signalling and clock interrupts.
-
-8-11    8-11            PSW key, used for a complex memory protection
-                        mechanism ( not used under Linux ).
-
-12      12              1 on s/390, 0 on z/Architecture.
-
-13      13              Machine check mask, 1=enable machine check
-                        interrupts.
-
-14      14              Wait state. Set this to 1 to stop the processor
-                        except for interrupts and give time to other
-                        LPARs. Used in CPU idle in the kernel to increase
-                        overall usage of processor resources.
-
-15      15              Problem state ( if set to 1 certain instructions
-                        are disabled ). All Linux user programs run with
-                        this bit set to 1 ( useful info for debugging
-                        under VM ).
-
-16-17   16-17           Address Space Control
-
-                        00 Primary Space Mode:
-                           The register CR1 contains the primary
-                           address-space control element (PASCE), which
-                           points to the primary space region/segment
-                           table origin.
-
-                        01 Access register mode
-
-                        10 Secondary Space Mode:
-                           The register CR7 contains the secondary
-                           address-space control element (SASCE), which
-                           points to the secondary space region or
-                           segment table origin.
-
-                        11 Home Space Mode:
-                           The register CR13 contains the home space
-                           address-space control element (HASCE), which
-                           points to the home space region/segment
-                           table origin.
-
-                        See "Address Spaces on Linux for s/390 &
-                        z/Architecture" below for more information about
-                        address space usage in Linux.
-
-18-19   18-19           Condition codes (CC)
-
-20      20              Fixed point overflow mask, if 1=FPU exceptions for
-                        this event occur ( normally 0 )
-
-21      21              Decimal overflow mask, if 1=FPU exceptions for
-                        this event occur ( normally 0 )
-
-22      22              Exponent underflow mask, if 1=FPU exceptions for
-                        this event occur ( normally 0 )
-
-23      23              Significance mask, if 1=FPU exceptions for this
-                        event occur ( normally 0 )
-
-24-31   24-30           Reserved, must be 0.
-
-        31              Extended Addressing Mode
-        32              Basic Addressing Mode
-                        Used to set addressing mode
-                        PSW 31   PSW 32
-                          0        0      24 bit
-                          0        1      31 bit
-                          1        1      64 bit
-
-32                      1=31 bit addressing mode, 0=24 bit addressing mode
-                        ( for backward compatibility ), Linux always runs
-                        with this bit set to 1.
-
-33-63                   Instruction address.
-        33-63           Reserved, must be 0.
-        64-127          Address
-                        In 24 bit mode bits 64-103=0, bits 104-127 Address
-                        In 31 bit mode bits 64-96=0, bits 97-127 Address
-                        Note: unlike 31 bit mode on s/390, bit 96 must be
-                        zero when loading the address with LPSWE, otherwise
-                        a specification exception occurs. LPSW is fully
-                        backward compatible.
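-
-As a rough illustration of the s/390 column of the table above, the sketch
-below pulls a few fields out of a 64 bit PSW value using the same bit
-numbering ( bit 0 = MSB ). The struct & function names are made up for this
-example & are not kernel interfaces.
-
```c
#include <stdint.h>

/* Decoded fields of a 64 bit s/390 PSW (hypothetical struct, illustration only). */
struct psw_fields {
    unsigned per;      /* bit 1:      Program Event Recording enabled */
    unsigned dat;      /* bit 5:      dynamic address translation on  */
    unsigned io_mask;  /* bit 6:      I/O interrupt mask              */
    unsigned ext_mask; /* bit 7:      external interrupt mask         */
    unsigned wait;     /* bit 14:     wait state                      */
    unsigned problem;  /* bit 15:     1 = problem (user) state        */
    unsigned asc;      /* bits 16-17: address space control           */
    unsigned cc;       /* bits 18-19: condition code                  */
    unsigned amode31;  /* bit 32:     1 = 31 bit addressing           */
    uint32_t address;  /* bits 33-63: instruction address             */
};

/* Extract `width` bits starting at IBM bit number `first` (bit 0 = MSB). */
static uint64_t psw_bits(uint64_t psw, unsigned first, unsigned width)
{
    return (psw >> (64 - first - width)) & ((1ULL << width) - 1);
}

static struct psw_fields decode_psw(uint64_t psw)
{
    struct psw_fields f;

    f.per      = (unsigned)psw_bits(psw, 1, 1);
    f.dat      = (unsigned)psw_bits(psw, 5, 1);
    f.io_mask  = (unsigned)psw_bits(psw, 6, 1);
    f.ext_mask = (unsigned)psw_bits(psw, 7, 1);
    f.wait     = (unsigned)psw_bits(psw, 14, 1);
    f.problem  = (unsigned)psw_bits(psw, 15, 1);
    f.asc      = (unsigned)psw_bits(psw, 16, 2);
    f.cc       = (unsigned)psw_bits(psw, 18, 2);
    f.amode31  = (unsigned)psw_bits(psw, 32, 1);
    f.address  = (uint32_t)psw_bits(psw, 33, 31);
    return f;
}
```
-
-For example decode_psw(0x070DC00080400394) reports DAT on, problem state
-set, address space control 11 & instruction address 0x400394. That PSW value
-is only made up to exercise the bit arithmetic.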
-
-
-Prefix Page(s)
---------------
-This per cpu memory area is too intimately tied to the processor not to mention.
-It exists between the real addresses 0-4096 on s/390 and between 0-8192 on
-z/Architecture and is exchanged with one page on s/390 or two pages on
-z/Architecture in absolute storage by the set prefix instruction during Linux
-startup.
-This page is mapped to a different prefix for each processor in an SMP
-configuration (assuming the OS designer is sane of course).
-Bytes 0-512 (200 hex) on s/390 and 0-512, 4096-4544, 4604-5119 currently on
-z/Architecture are used by the processor itself for holding such information
-as exception indications and entry points for exceptions.
-Bytes after 0xc00 are used by Linux for per processor globals on s/390 and
-z/Architecture (there is currently a gap on z/Architecture between 0xc00 and
-0x1000, too, which is used by Linux).
-The closest thing to this on traditional architectures is the interrupt
-vector table. This is a good thing & does simplify some of the kernel coding,
-however it means that we now cannot catch stray NULL pointers in the
-kernel without hard coded checks.
-
-
-
-Address Spaces on Intel Linux
-=============================
-
-The traditional Intel Linux is approximately mapped as follows ( forgive
-the ascii art ).
-0xFFFFFFFF 4GB Himem           *****************
-                               *               *
-                               * Kernel Space  *
-                               *               *
-                               *****************          ****************
-User Space Himem               *  User Stack   *          *              *
-(typically 0xC0000000 3GB )    *****************          *              *
-                               *  Shared Libs  *          * Next Process *
-                               *****************          *     to       *
-                               *               *    <==   *     Run      *  <==
-                               *  User Program *          *              *
-                               *   Data BSS    *          *              *
-                               *    Text       *          *              *
-                               *   Sections    *          *              *
-0x00000000                     *****************          ****************
-
-Now it is easy to see that on Intel it is quite easy to recognise a kernel
-address as being one greater than user space himem (in this case 0xC0000000),
-and addresses of less than this are the ones in the current running program on
-this processor (if an smp box).
-If using the virtual machine ( VM ) as a debugger it is quite difficult to
-know which user process is running as the address space you are looking at
-could be from any process in the run queue.
-
-The limitation of Intel's addressing technique is that the Linux
-kernel uses a very simple real address to virtual address mapping
-of Real Address=Virtual Address-User Space Himem.
-This means that on Intel the Linux kernel can typically only address
-Himem=0xFFFFFFFF-0xC0000000=1GB & this is all the RAM these machines
-can typically use.
-They can lower User Himem to 2GB or lower & thus be
-able to use 2GB of RAM, however this shrinks the maximum size
-of User Space from 3GB to 2GB. They have a no win limit of 4GB unless
-they go to 64 Bit.
-
-
-On 390 our limitations & strengths make us slightly different.
-For backward compatibility we are only allowed to use 31 bits (2GB)
-of our 32 bit addresses, however, we use entirely separate address
-spaces for the user & kernel.
-
-This means we can support 2GB of non Extended RAM on s/390, & more
-with the Extended memory management swap device, &
-currently 4TB of physical memory on z/Architecture.
-
-
-Address Spaces on Linux for s/390 & z/Architecture
-==================================================
-
-Our addressing scheme is basically as follows:
-
-                                   Primary Space               Home Space
-Himem 0x7fffffff 2GB on s/390    *****************          ****************
-currently 0x3ffffffffff (2^42)-1 *  User Stack   *          *              *
-on z/Architecture.               *****************          *              *
-                                 *  Shared Libs  *          *              *
-                                 *****************          *              *
-                                 *               *          *    Kernel    *
-                                 *  User Program *          *              *
-                                 *   Data BSS    *          *              *
-                                 *    Text       *          *              *
-                                 *   Sections    *          *              *
-0x00000000                       *****************          ****************
-
-This also means that we need to look at the PSW problem state bit and the
-address space control bits to decide whether we are looking at user or
-kernel space.
-
-User space runs in primary address mode (or access register mode within
-the vdso code).
-
-The kernel usually runs in home space mode, however when accessing
-user space the kernel switches to primary or secondary address mode if
-the mvcos instruction is not available or if a compare-and-swap (futex)
-instruction on a user space address is performed.
-
-When also looking at the ASCE control registers, this means:
-
-User space:
-- runs in primary or access register mode
-- cr1 contains the user asce
-- cr7 contains the user asce
-- cr13 contains the kernel asce
-
-Kernel space:
-- runs in home space mode
-- cr1 contains the user or kernel asce
- -> the kernel asce is loaded when a uaccess requires primary or
- secondary address mode
-- cr7 contains the user or kernel asce, (changed with set_fs())
-- cr13 contains the kernel asce
-
-In case of uaccess the kernel changes to:
-- primary space mode in case of a uaccess (copy_to_user) and uses
- e.g. the mvcp instruction to access user space. However the kernel
- will stay in home space mode if the mvcos instruction is available
-- secondary space mode in case of futex atomic operations, so that the
- instructions come from primary address space and data from secondary
- space
-
-In case of KVM, the kernel runs in home space mode, but cr1 gets switched
-to contain the gmap asce before the SIE instruction gets executed. When
-the SIE instruction is finished, cr1 will be switched back to contain the
-user asce.
-
-
-Virtual Addresses on s/390 & z/Architecture
-===========================================
-
-A virtual address on s/390 is made up of 3 parts:
-The SX (segment index, roughly corresponding to the PGD & PMD in Linux
-terminology) being bits 1-11.
-The PX (page index, corresponding to the page table entry (pte) in Linux
-terminology) being bits 12-19.
-The BX (byte index, the offset in the page) being the remaining bits,
-i.e. bits 20 to 31.
-
-On z/Architecture in Linux we currently make up an address from 4 parts:
-The region index (RX) being bits 0-32, of which we currently use bits 22-32.
-The segment index (SX) being bits 33-43.
-The page index (PX) being bits 44-51.
-The byte index (BX) being bits 52-63.
-
-Notes:
-1) s/390 has no PMD so the PMD is really the PGD also.
-A lot of this stuff is defined in pgtable.h.
-
-2) Also seeing as s/390's page tables are only 1k in size
-( bits 12-19 x 4 bytes per pte ) we use 1 page ( 4k )
-to make the best use of memory by updating 4 segment index
-entries each time we mess with a PMD & use offsets
-0, 1024, 2048 & 3072 in this page for our segment indexes.
-On z/Architecture our page tables are now 2k in size
-( bits 44-51 x 8 bytes per pte ), so we do a similar trick
-but only mess with 2 segment indices each time we mess with
-a PMD.
-
-3) Although z/Architecture supports up to a massive 5-level page table lookup,
-we can only use 3 currently on Linux ( as this is all the generic kernel
-currently supports ), however this may change in future.
-This allows us to access ( according to my sums )
-4TB of virtual storage per process, i.e.
-4096*512(PTES)*1024(PMDS)*2048(PGD) = 4398046511104 bytes,
-enough for another 2 or 3 years I think :-).
-To do this we use a region-third-table designation type in
-our address space control registers.
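-
-The bit ranges given in this section translate into shifts & masks roughly as
-in the sketch below. The struct & function names are invented for this
-example ( the real definitions live in pgtable.h & look different ), & the
-z/Architecture version only covers the region bits 22-32 that the text says
-are currently used.
-
```c
#include <stdint.h>

struct vaddr31 { unsigned sx, px, bx; };      /* s/390 address parts */
struct vaddr64 { unsigned rx, sx, px, bx; };  /* z/Architecture address parts */

/* Split a 31 bit s/390 virtual address (IBM bit 0 = MSB of the 32 bit word). */
static struct vaddr31 split_vaddr31(uint32_t addr)
{
    struct vaddr31 v;

    v.sx = (addr >> 20) & 0x7ff;  /* bits 1-11:  segment index */
    v.px = (addr >> 12) & 0xff;   /* bits 12-19: page index    */
    v.bx = addr & 0xfff;          /* bits 20-31: byte index    */
    return v;
}

/* Split a z/Architecture virtual address (IBM bit 0 = MSB of the 64 bit word). */
static struct vaddr64 split_vaddr64(uint64_t addr)
{
    struct vaddr64 v;

    v.rx = (unsigned)((addr >> 31) & 0x7ff);  /* bits 22-32: region index  */
    v.sx = (unsigned)((addr >> 20) & 0x7ff);  /* bits 33-43: segment index */
    v.px = (unsigned)((addr >> 12) & 0xff);   /* bits 44-51: page index    */
    v.bx = (unsigned)(addr & 0xfff);          /* bits 52-63: byte index    */
    return v;
}
```
-
-The four z/Architecture fields add up to 11+11+8+12 = 42 bits, which matches
-the 4TB ( 2^42 bytes ) worked out in note 3.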
-
-
-The Linux for s/390 & z/Architecture Kernel Task Structure
-==========================================================
-Each process/thread under Linux for S390 has its own kernel task_struct
-defined in linux/include/linux/sched.h.
-On initialisation & resuming of a process on a cpu, the S390 code sets
-the __LC_KERNEL_STACK variable in the spare prefix area for this cpu
-(which we use for per-processor globals).
-
-The kernel stack pointer is intimately tied with the task structure for
-each process as follows.
-
-                     s/390
-            ************************
-            *  1 page kernel stack *
-            *        ( 4K )        *
-            ************************
-            *   1 page task_struct *
-            *        ( 4K )        *
-8K aligned  ************************
-
-                 z/Architecture
-            ************************
-            *  2 page kernel stack *
-            *        ( 8K )        *
-            ************************
-            *   2 page task_struct *
-            *        ( 8K )        *
-16K aligned ************************
-
-What this means is that we don't need to dedicate any register or global
-variable to point to the current running process & can retrieve it with the
-following very simple construct for s/390 & one very similar for z/Architecture.
-
-static inline struct task_struct * get_current(void)
-{
-        struct task_struct *current;
-        __asm__("lhi   %0,-8192\n\t"
-                "nr    %0,15"
-                : "=r" (current) );
-        return current;
-}
-
-i.e. just anding the current kernel stack pointer with the mask -8192.
-Thankfully, because Linux doesn't have support for nested IO interrupts
-& our devices have large buffers & can survive interrupts being shut for
-short amounts of time, we don't need a separate stack for interrupts.
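-
-The masking trick can be tried out in plain C without any inline assembly.
-This is only a sketch of the arithmetic: the function & macro names are
-invented here, & the sizes are the 8K & 16K areas from the diagrams above.
-
```c
#include <stdint.h>

#define S390_STACK_AREA   8192u   /* 4K task_struct + 4K kernel stack */
#define ZARCH_STACK_AREA 16384u   /* 8K task_struct + 8K kernel stack */

/*
 * Round a kernel stack pointer down to the start of its aligned
 * task_struct + stack area. area_size must be a power of two.
 */
static uintptr_t task_struct_base(uintptr_t sp, uintptr_t area_size)
{
    return sp & ~(area_size - 1);
}
```
-
-With a stack pointer of 0x0FA95E78 & the 8K area this gives 0x0FA94000,
-the same result as anding with -8192.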
-
-
-
-
-Register Usage & Stackframes on Linux for s/390 & z/Architecture
-=================================================================
-Overview:
----------
-This is the code that gcc produces at the top & the bottom of
-each function. It usually is fairly consistent & similar from
-function to function & if you know its layout you can probably
-make some headway in finding the ultimate cause of a problem
-after a crash without a source level debugger.
-
-Note: To follow stackframes requires a knowledge of C or Pascal &
-limited knowledge of one assembly language.
-
-It should be noted that there are some differences between the
-s/390 and z/Architecture stack layouts as the z/Architecture stack layout
-didn't have to maintain compatibility with older linkage formats.
-
-Glossary:
----------
-alloca:
-This is a built in compiler function for runtime allocation
-of extra space on the caller's stack, which is freed up
-on function exit ( e.g. the caller may choose to allocate nothing
-or a buffer of 4k if required for temporary purposes ). It generates
-very efficient code ( a few cycles ) when compared to alternatives
-like malloc.
-
-automatics: These are local variables on the stack,
-i.e. they aren't in registers & they aren't static.
-
-back-chain:
-This is a pointer to the value the stack pointer had before entering a
-framed function's ( see frameless function ) prologue. It is got by
-dereferencing the address of the current stack pointer,
-i.e. by accessing the 32 bit value at the stack pointer's
-current location.
-
-base-pointer:
-This is a pointer to the back of the literal pool which
-is an area just behind each procedure used to store constants
-in each function.
-
-call-clobbered: The caller probably needs to save these registers if there
-is something of value in them, on the stack or elsewhere before making a
-call to another procedure so that it can restore it later.
-
-epilogue:
-The code generated by the compiler to return to the caller.
-
-frameless-function
-A frameless function in Linux for s390 & z/Architecture is one which doesn't
-need more than the register save area (96 bytes on s/390, 160 on z/Architecture)
-given to it by the caller.
-A frameless function never:
-1) Sets up a back chain.
-2) Calls alloca.
-3) Calls other normal functions
-4) Has automatics.
-
-GOT-pointer:
-This is a pointer to the global-offset-table in ELF
-( Executable and Linkable Format, Linux's most common executable format ),
-all globals & shared library objects are found using this pointer.
-
-lazy-binding
-Calls to routines in ELF shared libraries are typically only resolved when
-the routines are actually first called at runtime. This is lazy binding.
-
-procedure-linkage-table
-This is a table found from the GOT which contains pointers to routines
-in other shared libraries which can't be called to by easier means.
-
-prologue:
-The code generated by the compiler to set up the stack frame.
-
-outgoing-args:
-This is an extra area allocated on the stack of the calling function if the
-parameters for the callee cannot all be put in registers. The same
-area can be reused by each function the caller calls.
-
-routine-descriptor:
-A COFF executable format based concept of a procedure reference
-actually being 8 bytes or more as opposed to a simple pointer to the routine.
-This is typically defined as follows
-Routine Descriptor offset 0=Pointer to Function
-Routine Descriptor offset 4=Pointer to Table of Contents
-The table of contents/TOC is roughly equivalent to a GOT pointer.
-& it means that shared libraries etc. can be shared between several
-environments each with their own TOC.
-
-
-static-chain: This is used in nested functions, a concept adopted from Pascal
-by gcc & not used in ANSI C or C++ ( although quite useful ). Basically it
-is a pointer used to reference local variables of enclosing functions.
-You might come across this stuff once or twice in your lifetime.
-
-e.g.
-The function below should return 11, though gcc may get upset & toss warnings
-about unused variables.
-int FunctionA(int a)
-{
-        int b;
-        /* nested function ( gcc extension ), reaches b via the static chain */
-        void FunctionC(int c)
-        {
-                b=c+1;
-        }
-        FunctionC(10);
-        return(b);
-}
-
-
-s/390 & z/Architecture Register usage
-=====================================
-r0        used by syscalls/assembly                  call-clobbered
-r1        used by syscalls/assembly                  call-clobbered
-r2        argument 0 / return value 0                call-clobbered
-r3        argument 1 / return value 1 (if long long) call-clobbered
-r4        argument 2                                 call-clobbered
-r5        argument 3                                 call-clobbered
-r6        argument 4                                 saved
-r7        pointer-to arguments 5 to ...              saved
-r8        this & that                                saved
-r9        this & that                                saved
-r10       static-chain ( if nested function )        saved
-r11       frame-pointer ( if function used alloca )  saved
-r12       got-pointer                                saved
-r13       base-pointer                               saved
-r14       return-address                             saved
-r15       stack-pointer                              saved
-
-f0        argument 0 / return value ( float/double ) call-clobbered
-f2        argument 1                                 call-clobbered
-f4        z/Architecture argument 2                  saved
-f6        z/Architecture argument 3                  saved
-The remaining floating points
-f1,f3,f5 f7-f15 are call-clobbered.
-
-Notes:
-------
-1) The only requirement is that registers which are used
-by the callee are saved, e.g. the compiler is perfectly
-capable of using r11 for purposes other than a
-frame pointer if a frame pointer is not needed.
-2) In functions with variable arguments e.g. printf the calling procedure
-is identical to one without variable arguments & the same number of
-parameters. However, the prologue of this function is somewhat more
-hairy owing to it having to move these parameters to the stack to
-get va_start, va_arg & va_end to work.
-3) Access registers are currently unused by gcc but are used in
-the kernel. Possibilities exist to use them at the moment for
-temporary storage but it isn't recommended.
-4) Only 4 of the floating point registers are used for
-parameter passing as older machines such as G3 only have 4
-& it keeps the stack frame compatible with other compilers.
-However with IEEE floating point emulation under Linux on the
-older machines you are free to use the other 12.
-5) A long long or double parameter cannot have the
-first 4 bytes in a register & the second four bytes in the
-outgoing args area. It must be purely in the outgoing args
-area if crossing this boundary.
-6) Floating point parameters are mixed with outgoing args
-on the outgoing args area in the order they are passed in as parameters.
-7) Floating point arguments 2 & 3 are saved in the outgoing args area for
-z/Architecture.
-
-
-Stack Frame Layout
-------------------
-s/390     z/Architecture
-0         0              back chain ( a 0 here signifies end of back chain )
-4         8              eos ( end of stack, not used on Linux for S390,
-                         used in other linkage formats )
-8         16             glue used in other s/390 linkage formats for saved
-                         routine descriptors etc.
-12        24             glue used in other s/390 linkage formats for saved
-                         routine descriptors etc.
-16        32             scratch area
-20        40             scratch area
-24        48             saved r6 of caller function
-28        56             saved r7 of caller function
-32        64             saved r8 of caller function
-36        72             saved r9 of caller function
-40        80             saved r10 of caller function
-44        88             saved r11 of caller function
-48        96             saved r12 of caller function
-52        104            saved r13 of caller function
-56        112            saved r14 of caller function
-60        120            saved r15 of caller function
-64        128            saved f4 of caller function
-72        132            saved f6 of caller function
-80                       undefined
-96        160            outgoing args passed from caller to callee
-96+x      160+x          possible stack alignment ( 8 bytes desirable )
-96+x+y    160+x+y        alloca space of caller ( if used )
-96+x+y+z  160+x+y+z      automatics of caller ( if used )
-0                        back-chain
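-
-Because offset 0 of every frame holds the back chain, a stack can be walked
-frame by frame until a 0 is found. The sketch below does this over a small
-simulated s/390 stack held in an array; the function names, the addresses &
-the 96 byte frame spacing in the demo are all assumptions made for the
-example.
-
```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count stack frames by following the back chain through a simulated
 * 31 bit stack. mem models memory as 32 bit words starting at address
 * base; sp is the current stack pointer. Returns the number of frames
 * visited, or -1 if the chain leaves the modelled memory or loops.
 */
static int count_frames(const uint32_t *mem, size_t nwords,
                        uint32_t base, uint32_t sp)
{
    int frames = 0;

    while (sp != 0) {
        if (sp < base || (sp - base) % 4 != 0 || (sp - base) / 4 >= nwords)
            return -1;
        if (frames >= (int)nwords)
            return -1;               /* more frames than words: a loop */
        frames++;
        sp = mem[(sp - base) / 4];   /* offset 0 of a frame: back chain */
    }
    return frames;
}

/* Three made-up frames at 0x1000, 0x1060 & 0x10c0; the last ends the chain. */
static int demo_chain_depth(void)
{
    uint32_t stack[64] = {0};

    stack[0]  = 0x1060;   /* frame at 0x1000 was called from frame 0x1060 */
    stack[24] = 0x10c0;   /* 0x1060 is word 24; called from frame 0x10c0  */
    /* stack[48] stays 0: end of back chain */
    return count_frames(stack, 64, 0x1000, 0x1000);
}
```
-
-This is the same walk that is done by hand in the debugger when no source
-level backtrace is available.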
-
-A sample program with comments.
-===============================
-
-Comments on the function test
------------------------------
-1) It didn't need to set up a pointer to the constant pool in gpr13 as it is
-not used ( :-( ).
-2) This is a frameless function & no stack is bought.
-3) The compiler was clever enough to recognise that it could return the
-value in r2 as well as use it for the passed in parameter ( :-) ).
-4) The basr ( branch & save, register form ) trick works as follows: the
-instruction has a special case where r0 as the branch target is understood as
-the literal value 0, meaning no branch is taken ( some risc architectures also
-do this ). So we simply fall through to the next instruction & the address of
-that instruction ( the new program counter ) is in r13. We now subtract the
-size of the function prologue we have executed + the size of the literal pool
-to get to the top of the literal pool.
-0040037c int test(int b)
-{ # Function prologue below
- 40037c: 90 de f0 34 stm %r13,%r14,52(%r15) # Save registers r13 & r14
- 400380: 0d d0 basr %r13,%r0 # Set up pointer to constant pool using
- 400382: a7 da ff fa ahi %r13,-6 # basr trick
- return(5+b);
- # Huge main program
- 400386: a7 2a 00 05 ahi %r2,5 # add 5 to r2
-
- # Function epilogue below
- 40038a: 98 de f0 34 lm %r13,%r14,52(%r15) # restore registers r13 & 14
- 40038e: 07 fe br %r14 # return
-}
-
-Comments on the function main
------------------------------
-1) The compiler did this function optimally ( 8-) )
-
-Literal pool for main.
-400390: ff ff ff ec .long 0xffffffec
-main(int argc,char *argv[])
-{ # Function prologue below
- 400394: 90 bf f0 2c stm %r11,%r15,44(%r15) # Save necessary registers
- 400398: 18 0f lr %r0,%r15 # copy stack pointer to r0
- 40039a: a7 fa ff a0 ahi %r15,-96 # Make area for callee saving
- 40039e: 0d d0 basr %r13,%r0 # Set up r13 to point to
- 4003a0: a7 da ff f0 ahi %r13,-16 # literal pool
- 4003a4: 50 00 f0 00 st %r0,0(%r15) # Save backchain
-
- return(test(5)); # Main Program Below
- 4003a8: 58 e0 d0 00 l %r14,0(%r13) # load relative address of test from
- # literal pool
- 4003ac: a7 28 00 05 lhi %r2,5 # Set first parameter to 5
- 4003b0: 4d ee d0 00 bas %r14,0(%r14,%r13) # jump to test setting r14 as return
- # address using branch & save instruction.
-
- # Function Epilogue below
- 4003b4: 98 bf f0 8c lm %r11,%r15,140(%r15)# Restore necessary registers.
- 4003b8: 07 fe br %r14 # return to do program exit
-}
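-
-The address arithmetic in the two listings above can be checked with a few
-lines of C. This is just a sketch with invented function names; it ignores
-the addressing mode bit which 31 bit mode keeps in the top bit of the link
-register.
-
```c
#include <stdint.h>

/* Address left in the link register by a 2 byte basr located at basr_addr. */
static uint32_t basr_link_address(uint32_t basr_addr)
{
    return basr_addr + 2;
}

/* r13 after "basr %r13,%r0" followed by "ahi %r13,adjust". */
static uint32_t literal_pool_base(uint32_t basr_addr, int32_t adjust)
{
    return basr_link_address(basr_addr) + (uint32_t)adjust;
}

/*
 * Target of "bas %r14,0(%r14,%r13)" where r14 holds a pool relative
 * offset loaded from the literal pool. Unsigned 32 bit wraparound gives
 * the effect of adding a signed offset.
 */
static uint32_t bas_target(uint32_t pool_base, uint32_t literal)
{
    return pool_base + literal;
}
```
-
-For main, the basr at 0x40039e with the adjustment of -16 gives the literal
-pool at 0x400390, & adding the literal 0xffffffec ( -20 ) lands on test at
-0x40037c.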
-
-
-Compiler updates
-----------------
-
-main(int argc,char *argv[])
-{
- 4004fc: 90 7f f0 1c stm %r7,%r15,28(%r15)
- 400500: a7 d5 00 04 bras %r13,400508 <main+0xc>
- 400504: 00 40 04 f4 .long 0x004004f4
- # compiler now puts the constant pool in the code so it saves an instruction
- 400508: 18 0f lr %r0,%r15
- 40050a: a7 fa ff a0 ahi %r15,-96
- 40050e: 50 00 f0 00 st %r0,0(%r15)
- return(test(5));
- 400512: 58 10 d0 00 l %r1,0(%r13)
- 400516: a7 28 00 05 lhi %r2,5
- 40051a: 0d e1 basr %r14,%r1
- # compiler adds 1 extra instruction to epilogue, this is done to
- # avoid processor pipeline stalls owing to data dependencies on g5 &
- # above as register 14 in the old code was needed directly after being loaded
- # by the lm %r11,%r15,140(%r15) for the br %r14.
- 40051c: 58 40 f0 98 l %r4,152(%r15)
- 400520: 98 7f f0 7c lm %r7,%r15,124(%r15)
- 400524: 07 f4 br %r4
-}
-
-
-Hartmut ( our compiler developer ) also has been threatening to take out the
-stack backchain in optimised code as this also causes pipeline stalls, you
-have been warned.
-
-64 bit z/Architecture code disassembly
---------------------------------------
-
-If you understand the stuff above you'll understand the stuff
-below too so I'll avoid repeating myself & just say that
-some of the instructions have g's on the end of them to indicate
-they are 64 bit & the stack offsets are bigger.
-The only other difference you'll find between 32 & 64 bit is that
-we now use f4 & f6 for floating point arguments on 64 bit.
-00000000800005b0 <test>:
-int test(int b)
-{
- return(5+b);
- 800005b0: a7 2a 00 05 ahi %r2,5
- 800005b4: b9 14 00 22 lgfr %r2,%r2 # sign extend the 32 bit int result to 64 bit
- 800005b8: 07 fe br %r14
- 800005ba: 07 07 bcr 0,%r7
-
-
-}
-
-00000000800005bc <main>:
-main(int argc,char *argv[])
-{
- 800005bc: eb bf f0 58 00 24 stmg %r11,%r15,88(%r15)
- 800005c2: b9 04 00 1f lgr %r1,%r15
- 800005c6: a7 fb ff 60 aghi %r15,-160
- 800005ca: e3 10 f0 00 00 24 stg %r1,0(%r15)
- return(test(5));
- 800005d0: a7 29 00 05 lghi %r2,5
- # brasl allows jumps > 64k & is overkill here, bras would do fine
- 800005d4: c0 e5 ff ff ff ee brasl %r14,800005b0 <test>
- 800005da: e3 40 f1 10 00 04 lg %r4,272(%r15)
- 800005e0: eb bf f0 f8 00 04 lmg %r11,%r15,248(%r15)
- 800005e6: 07 f4 br %r4
-}
-
-
-
-Compiling programs for debugging on Linux for s/390 & z/Architecture
-====================================================================
--gdwarf-2 now works. It should be considered the default debugging
-format for s/390 & z/Architecture as it is more reliable for debugging
-shared libraries. Normal -g debugging works much better now
-thanks to the IBM java compiler developers' bug reports.
-
-This is typically done by adding/appending the flags -g or -gdwarf-2 to the
-CFLAGS & LDFLAGS variables in the Makefile of the program concerned.
-
-If using gdb & you would like accurate displays of registers &
-stack traces, compile without optimisation, i.e. make sure
-that there is no -O2 or similar on the CFLAGS line of the Makefile &
-the emitted gcc commands. Obviously this will produce worse code
-( not advisable for shipment ) but it is an aid to the debugging process.
-
-This aids debugging because the compiler will copy parameters passed in
-in registers onto the stack so backtracing & looking at passed in
-parameters will work, however some larger programs which use inline functions
-will not compile without optimisation.
-
-Debugging with optimisation has since much improved after fixing
-some bugs. Please make sure you are using gdb-5.0 or later, developed
-after Nov'2000.
-
-
-
-Debugging under VM
-==================
-
-Notes
------
-Addresses & values in the VM debugger are always hex, never decimal.
-Address ranges are of the format <HexStart>-<HexEnd> or
-<HexStart>.<HexLength>
-For example, the address range 0x2000 to 0x3000 can be described as 2000-3000
-or 2000.1000
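-
-To make the two range notations concrete, here is a small sketch which turns
-either form into a start & end address, following the 2000-3000 / 2000.1000
-example above. The struct & function names are invented for illustration &
-have nothing to do with how CP itself parses ranges.
-
```c
#include <stdio.h>

struct vm_range {
    int ok;                     /* 1 if the text could be parsed */
    unsigned long start, end;   /* both as hex addresses */
};

/* Parse "<start>-<end>" or "<start>.<length>", all values in hex. */
static struct vm_range parse_vm_range(const char *text)
{
    struct vm_range r = {0, 0, 0};
    unsigned long a, b;
    char sep;

    if (sscanf(text, "%lx%c%lx", &a, &sep, &b) != 3)
        return r;
    if (sep == '-')
        r.end = b;              /* second value is the end address */
    else if (sep == '.')
        r.end = a + b;          /* second value is a length */
    else
        return r;
    r.start = a;
    r.ok = 1;
    return r;
}
```
-
-Both parse_vm_range("2000-3000") & parse_vm_range("2000.1000") give a start
-of 0x2000 & an end of 0x3000.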
-
-The VM Debugger is case insensitive.
-
-VM's strengths are usually other debuggers' weaknesses: you can get at any
-resource no matter how sensitive, e.g. memory management resources, & change
-address translation in the PSW. For kernel hacking you will reap dividends if
-you get good at it.
-
-The VM Debugger displays operators but not operands; the author of the code
-probably felt that it was a good idea not to go over the 80 columns on the
-screen & to use the rest of the line for other useful information.
-This isn't as unintuitive as it may seem as the s/390 instructions are easy to
-decode mentally and you can make a good guess at a lot of them as all the
-operands are nibble ( half byte ) aligned.
-So if you have an objdump listing to hand, it is quite easy to follow, and if
-you don't have an objdump listing keep a copy of the s/390 Reference Summary
-or alternatively the s/390 principles of operation next to you.
-e.g. even I can guess that
-0001AFF8' LR 180F CC 0
-is a ( load register ) lr r0,r15
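-
-The nibble alignment is what makes this guessing game work. For a 2 byte
-register to register instruction like the LR above, the decoding is no more
-than the following sketch ( the names are mine, & it only covers this one
-instruction format ).
-
```c
#include <stdint.h>

struct rr_insn {
    unsigned opcode;  /* first byte, e.g. 0x18 for LR */
    unsigned r1;      /* first operand, high nibble of the second byte  */
    unsigned r2;      /* second operand, low nibble of the second byte */
};

/* Decode a 2 byte register to register instruction. */
static struct rr_insn decode_rr(uint16_t halfword)
{
    struct rr_insn i;

    i.opcode = halfword >> 8;
    i.r1 = (halfword >> 4) & 0xf;
    i.r2 = halfword & 0xf;
    return i;
}
```
-
-decode_rr(0x180F) gives opcode 0x18 with operands 0 & 15, i.e. lr r0,r15.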
-
-Also it is very easy to tell the length of a 390 instruction from the 2 most
-significant bits in the instruction (not that this info is really useful except
-if you are trying to make sense of a hexdump of code).
-Here is a table
-Bits    Instruction Length
-------------------------------------------
-00      2 Bytes
-01      4 Bytes
-10      4 Bytes
-11      6 Bytes
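-
-In C the table above boils down to a look at the top 2 bits of the first
-opcode byte. A minimal sketch, with a function name of my own choosing:
-
```c
/* Length in bytes of a 390 instruction, given its first opcode byte. */
static unsigned insn_length(unsigned char first_byte)
{
    switch (first_byte >> 6) {   /* the 2 most significant bits */
    case 0:
        return 2;                /* 00 */
    case 3:
        return 6;                /* 11 */
    default:
        return 4;                /* 01 and 10 */
    }
}
```
-
-This agrees with the listings earlier on, e.g. 18 ( lr ) is 2 bytes,
-a7 ( ahi ) is 4 bytes & eb ( stmg ) is 6 bytes.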
-
-The debugger also displays other useful info on the same line such as the
-addresses being operated on, destination addresses of branches & condition
-codes.
-e.g.
-00019736' AHI A7DAFF0E CC 1
-000198BA' BRC A7840004 -> 000198C2' CC 0
-000198CE' STM 900EF068 >> 0FA95E78 CC 2
-
-
-
-Useful VM debugger commands
----------------------------
-
-I suppose I'd better mention this before I start:
-to list the current active traces do
-Q TR
-There can be a maximum of 255 of these per set
-( more about trace sets later ).
-To stop traces issue a
-TR END
-To delete a particular breakpoint issue
-TR DEL <breakpoint number>
-
-The PA1 key drops to CP mode so you can issue debugger commands.
-Doing alt c ( on my 3270 console at least ) clears the screen.
-Hitting b <enter> comes back to the running operating system
-from cp mode ( in our case linux ).
-It is typically useful to add shortcuts to your profile.exec file
-if you have one ( this is roughly equivalent to autoexec.bat in DOS ).
-Here are a few from mine.
-/* this gives me command history on issuing f12 */
-set pf12 retrieve
-/* this continues */
-set pf8 imm b
-/* goes to trace set a */
-set pf1 imm tr goto a
-/* goes to trace set b */
-set pf2 imm tr goto b
-/* goes to trace set c */
-set pf3 imm tr goto c
-
-
-
-Instruction Tracing
--------------------
-Setting a simple breakpoint
-TR I PSWA <address>
-To debug a particular function try
-TR I R <function address range>
-TR I on its own will single step.
-TR I DATA <MNEMONIC> <OPTIONAL RANGE> will trace for particular mnemonics
-e.g.
-TR I DATA 4D R 0197BC.4000
-will trace for BAS'es ( opcode 4D ) in the range 0197BC.4000
-If you were inclined you could add traces for all branch instructions &
-suffix them with RUN so you would have a backtrace on screen
-when a program crashes.
-TR BR <INTO OR FROM> will trace branches into or out of an address.
-e.g.
-TR BR INTO 0 is often quite useful if a program is getting awkward & deciding
-to branch to 0 & crashing as this will stop at the address before it jumps to 0.
-TR I R <address range> RUN cmd d g
-single steps a range of addresses but stays running &
-displays the gprs on each step.
-
-
-
-Displaying & modifying Registers
---------------------------------
-D G will display all the gprs
-Adding an extra G to all the commands is necessary to access the full 64 bit
-content in VM on z/Architecture. Obviously this isn't required for access
-registers as these are still 32 bit.
-e.g. DGG instead of DG
-D X will display all the control registers
-D AR will display all the access registers
-D AR4-7 will display access registers 4 to 7
-CPU ALL D G will display the GPRs of all CPUs in the configuration
-D PSW will display the current PSW
-st PSW 2000 will put the value 2000 into the PSW &
-crash your machine.
-D PREFIX displays the prefix offset
-
-
-Displaying Memory
------------------
-To display memory mapped using the current PSW's mapping try
-D <range>
-To make VM display a message each time it hits a particular address and
-continue try
-D I<range> will disassemble/display a range of instructions.
-ST addr 32 bit word will store a 32 bit aligned address
-D T<range> will display the EBCDIC in an address (if you are that way inclined)
-D R<range> will display real addresses ( without DAT ) but with prefixing.
-There are other complex options to display memory. If you need to get at, say,
-home space but are in primary space, the easiest thing to do is to temporarily
-modify the PSW to the other addressing mode, display the stuff & then
-restore it.
-
-
-
-Hints
------
-If you want to issue a debugger command without halting your virtual machine
-with the PA1 key try prefixing the command with #CP e.g.
-#cp tr i pswa 2000
-Also suffixing most debugger commands with RUN will cause them not
-to stop, just display the mnemonic at the current instruction on the console.
-If you have several breakpoints you want to put into your program &
-you get fed up with cross referencing with System.map
-you can do the following trick for several symbols.
-grep do_signal System.map
-which emits the following among other things
-0001f4e0 T do_signal
-now you can do
-
-TR I PSWA 0001f4e0 cmd msg * do_signal
-This sends a message to your own console each time do_signal is entered.
-As an aside, I once wrote a perl script which automatically generated a REXX
-script with breakpoints on every kernel procedure. This isn't a good idea
-because there are thousands of these routines & VM can only set 255 breakpoints
-at a time, so you nearly had to spend as long pruning the file down as you would
-entering the msgs by hand. However, the trick might be useful for a single
-object file. In the 3270 terminal emulator x3270 there is a very useful option
-in the file menu called "Save Screen In File" - this is very good for keeping a
-copy of traces.
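-
-Turning a System.map line into the trace command shown above is easy to
-automate. The following is a minimal sketch in C rather than perl; the
-function name is invented, it only accepts text symbols ( type T or t ) &
-it returns a static buffer, so it is not reentrant.
-
```c
#include <stdio.h>

/*
 * Build a "TR I PSWA ..." command from one System.map line such as
 * "0001f4e0 T do_signal". Returns NULL if the line does not parse or
 * the symbol is not in the text section.
 */
static const char *trace_cmd_for(const char *map_line)
{
    static char cmd[160];
    char addr[17], type, name[65];

    if (sscanf(map_line, "%16[0-9a-fA-F] %c %64s", addr, &type, name) != 3)
        return NULL;
    if (type != 'T' && type != 't')
        return NULL;
    snprintf(cmd, sizeof cmd, "TR I PSWA %s cmd msg * %s", addr, name);
    return cmd;
}
```
-
-Fed the do_signal line from the grep above, this produces exactly the
-TR I PSWA 0001f4e0 cmd msg * do_signal command. Remember the limit of 255
-traces per set when generating these in bulk.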
-
-From CMS, help <command name> will give you online help on a particular command.
-e.g.
-HELP DISPLAY
-
-Also CP has a file called profile.exec which automatically gets called
-on startup of CMS ( like autoexec.bat ). Keeping to the DOS analogy,
-CP has a feature similar to doskey, so it may be useful for you to
-use profile.exec to define some keystrokes.
-e.g.