author Heiko Carstens <heiko.carstens@de.ibm.com> 2019-07-25 09:23:39 +0200
committer Vasily Gorbik <gor@linux.ibm.com> 2019-08-21 12:41:43 +0200
commit f62f7dcbf023160ca47eb4bc7228ece8207f8a8e
tree 2d661e3a7fc877bedcd4d17588721fbf79e30f26
parent 8c72e0c85212df4e7c77fca55556e423fe17e801
Documentation/s390: remove outdated debugging390 documentation
This file would need a lot of work to make sense again. Thomas Huth started
working on that four years ago, but that wasn't finished. Therefore remove this.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-rw-r--r-- Documentation/s390/debugging390.rst 2613
-rw-r--r-- Documentation/s390/index.rst 1
2 files changed, 0 insertions, 2614 deletions
diff --git a/Documentation/s390/debugging390.rst b/Documentation/s390/debugging390.rst
deleted file mode 100644
index 73ad0b06c666..000000000000
--- a/Documentation/s390/debugging390.rst
+++ /dev/null
@@ -1,2613 +0,0 @@
-=============================================
-Debugging on Linux for s/390 & z/Architecture
-=============================================
-
-Denis Joseph Barrow (djbarrow@de.ibm.com,barrow_dj@yahoo.com)
-
-Copyright (C) 2000-2001 IBM Deutschland Entwicklung GmbH, IBM Corporation
-
-.. Best viewed with fixed width fonts
-
-Overview of Document:
-=====================
-This document is intended to give a good overview of how to debug Linux for
-s/390 and z/Architecture. It is not intended as a complete reference and not a
-tutorial on the fundamentals of C & assembly. It doesn't go into
-390 IO in any detail. It is intended to complement the documents in the
-reference section below & any other worthwhile references you get.
-
-It is intended like the Enterprise Systems Architecture/390 Reference Summary
-to be printed out & used as a quick cheat sheet self help style reference when
-problems occur.
-
-.. Contents
- ========
- Register Set
- Address Spaces on Intel Linux
- Address Spaces on Linux for s/390 & z/Architecture
- The Linux for s/390 & z/Architecture Kernel Task Structure
- Register Usage & Stackframes on Linux for s/390 & z/Architecture
- A sample program with comments
- Compiling programs for debugging on Linux for s/390 & z/Architecture
- Debugging under VM
- s/390 & z/Architecture IO Overview
- Debugging IO on s/390 & z/Architecture under VM
- GDB on s/390 & z/Architecture
- Stack chaining in gdb by hand
- Examining core dumps
- ldd
- Debugging modules
- The proc file system
- SysRq
- References
- Special Thanks
-
-Register Set
-============
-The current architectures have the following registers.
-
-16 General purpose registers, 32 bit on s/390 and 64 bit on z/Architecture,
-r0-r15 (or gpr0-gpr15), used for arithmetic and addressing.
-
-16 Control registers, 32 bit on s/390 and 64 bit on z/Architecture, cr0-cr15,
-kernel usage only, used for memory management, interrupt control, debugging
-control etc.
-
-16 Access registers (ar0-ar15), 32 bit on both s/390 and z/Architecture,
-normally not used by normal programs but potentially could be used as
-temporary storage. These registers have a 1:1 association with general
-purpose registers and are designed to be used in the so-called access
-register mode to select different address spaces.
-Access register 0 (and access register 1 on z/Architecture, which needs a
-64 bit pointer) is currently used by the pthread library as a pointer to
-the current running threads private area.
-
-16 64-bit floating point registers (fp0-fp15 ) IEEE & HFP floating
-point format compliant on G5 upwards & a Floating point control reg (FPC)
-
-4 64-bit registers (fp0,fp2,fp4 & fp6) HFP only on older machines.
-
-Note:
- Linux (currently) always uses IEEE & emulates G5 IEEE format on older
- machines, ( provided the kernel is configured for this ).
-
-
-The PSW is the most important register on the machine. It
-is 64 bit on s/390 & 128 bit on z/Architecture & serves the roles of
-a program counter (pc), condition code register & memory space designator.
-In IBM standard notation I am counting bit 0 as the MSB.
-It has several advantages over a normal program counter
-in that you can change address translation & program counter
-in a single instruction. Changing address translation,
-e.g. switching address translation off, requires that you
-have a logical=physical mapping for the address you are
-currently running at.
-
-+-------------------------+-------------------------------------------------+
-| Bit | |
-+--------+----------------+ Value |
-| s/390 | z/Architecture | |
-+========+================+=================================================+
-| 0 | 0 | Reserved (must be 0) otherwise specification |
-| | | exception occurs. |
-+--------+----------------+-------------------------------------------------+
-| 1 | 1 | Program Event Recording 1 PER enabled, |
-| | | PER is used to facilitate debugging e.g. |
-| | | single stepping. |
-+--------+----------------+-------------------------------------------------+
-| 2-4 | 2-4 | Reserved (must be 0). |
-+--------+----------------+-------------------------------------------------+
-| 5 | 5 | Dynamic address translation 1=DAT on. |
-+--------+----------------+-------------------------------------------------+
-| 6 | 6 | Input/Output interrupt Mask |
-+--------+----------------+-------------------------------------------------+
-| 7 | 7 | External interrupt Mask used primarily for |
-| | | interprocessor signalling and clock interrupts. |
-+--------+----------------+-------------------------------------------------+
-| 8-11 | 8-11 | PSW Key used for complex memory protection |
-| | | mechanism (not used under linux) |
-+--------+----------------+-------------------------------------------------+
-| 12 | 12 | 1 on s/390 0 on z/Architecture |
-+--------+----------------+-------------------------------------------------+
-| 13 | 13 | Machine Check Mask 1=enable machine check |
-| | | interrupts |
-+--------+----------------+-------------------------------------------------+
-| 14 | 14 | Wait State. Set this to 1 to stop the processor |
-| | | except for interrupts and give time to other |
-| | | LPARS. Used in CPU idle in the kernel to |
-| | | increase overall usage of processor resources. |
-+--------+----------------+-------------------------------------------------+
-| 15 | 15 | Problem state (if set to 1 certain instructions |
-| | | are disabled). All linux user programs run with |
-| | | this bit 1 (useful info for debugging under VM).|
-+--------+----------------+-------------------------------------------------+
-| 16-17 | 16-17 | Address Space Control |
-| | | |
-| | | 00 Primary Space Mode: |
-| | | |
-| | | The register CR1 contains the primary |
-| | | address-space control element (PASCE), which |
-| | | points to the primary space region/segment |
-| | | table origin. |
-| | | |
-| | | 01 Access register mode |
-| | | |
-| | | 10 Secondary Space Mode: |
-| | | |
-| | | The register CR7 contains the secondary |
-| | | address-space control element (SASCE), which |
-| | | points to the secondary space region or |
-| | | segment table origin. |
-| | | |
-| | | 11 Home Space Mode: |
-| | | |
-| | | The register CR13 contains the home space |
-| | | address-space control element (HASCE), which |
-| | | points to the home space region/segment |
-| | | table origin. |
-| | | |
-| | | See "Address Spaces on Linux for s/390 & |
-| | | z/Architecture" below for more information |
-| | | about address space usage in Linux. |
-+--------+----------------+-------------------------------------------------+
-| 18-19 | 18-19 | Condition codes (CC) |
-+--------+----------------+-------------------------------------------------+
-| 20 | 20 | Fixed point overflow mask if 1=FPU exceptions |
-| | | for this event occur (normally 0) |
-+--------+----------------+-------------------------------------------------+
-| 21 | 21 | Decimal overflow mask if 1=FPU exceptions for |
-| | | this event occur (normally 0) |
-+--------+----------------+-------------------------------------------------+
-| 22 | 22 | Exponent underflow mask if 1=FPU exceptions |
-| | | for this event occur (normally 0) |
-+--------+----------------+-------------------------------------------------+
-| 23 | 23 | Significance Mask if 1=FPU exceptions for this |
-| | | event occur (normally 0) |
-+--------+----------------+-------------------------------------------------+
-| 24-31 | 24-30 | Reserved Must be 0. |
-| +----------------+-------------------------------------------------+
-| | 31 | Extended Addressing Mode |
-| +----------------+-------------------------------------------------+
-| | 32 | Basic Addressing Mode |
-| | | |
-| | | Used to set addressing mode:: |
-| | | |
-| | | +---------+----------+----------+ |
-| | | | PSW 31 | PSW 32 | | |
-| | | +---------+----------+----------+ |
-| | | | 0 | 0 | 24 bit | |
-| | | +---------+----------+----------+ |
-| | | | 0 | 1 | 31 bit | |
-| | | +---------+----------+----------+ |
-| | | | 1 | 1 | 64 bit | |
-| | | +---------+----------+----------+ |
-+--------+----------------+-------------------------------------------------+
-| 32 | | 1=31 bit addressing mode 0=24 bit addressing |
-| | | mode (for backward compatibility), linux |
-| | | always runs with this bit set to 1 |
-+--------+----------------+-------------------------------------------------+
-| 33-63 | | Instruction address. |
-| +----------------+-------------------------------------------------+
-| | 33-63 | Reserved must be 0 |
-| +----------------+-------------------------------------------------+
-| | 64-127 | Address |
-| | | |
-| | | - In 24 bits mode bits 64-103=0 bits 104-127 |
-| | | Address |
-| | | - In 31 bits mode bits 64-96=0 bits 97-127 |
-| | | Address |
-| | | |
-| | | Note: |
-| | | unlike 31 bit mode on s/390 bit 96 must be |
-| | | zero when loading the address with LPSWE |
-| | | otherwise a specification exception occurs, |
-| | | LPSW is fully backward compatible. |
-+--------+----------------+-------------------------------------------------+
-
-Prefix Page(s)
---------------
-This per cpu memory area is too intimately tied to the processor not to mention.
-It exists between the real addresses 0-4096 on s/390 and between 0-8192 on
-z/Architecture and is exchanged with one page on s/390 or two pages on
-z/Architecture in absolute storage by the set prefix instruction during Linux
-startup.
-
-This page is mapped to a different prefix for each processor in an SMP
-configuration (assuming the OS designer is sane of course).
-
-Bytes 0-512 (200 hex) on s/390 and 0-512, 4096-4544, 4604-5119 currently on
-z/Architecture are used by the processor itself for holding such information
-as exception indications and entry points for exceptions.
-
-Bytes after 0xc00 hex are used by linux for per processor globals on s/390 and
-z/Architecture (there is a gap on z/Architecture currently between 0xc00 and
-0x1000, too, which is used by Linux).
-
-The closest thing to this on traditional architectures is the interrupt
-vector table. This is a good thing & does simplify some of the kernel coding
-however it means that we now cannot catch stray NULL pointers in the
-kernel without hard coded checks.
-
-
-
-Address Spaces on Intel Linux
-=============================
-
-The traditional Intel Linux is approximately mapped as follows (forgive
-the ASCII art)::
-
- 0xFFFFFFFF 4GB Himem *****************
- * *
- * Kernel Space *
- * *
- ***************** ****************
- User Space Himem * User Stack * * *
- (typically 0xC0000000 3GB ) ***************** * *
- * Shared Libs * * Next Process *
- ***************** * to *
- * * <== * Run * <==
- * User Program * * *
- * Data BSS * * *
- * Text * * *
- * Sections * * *
- 0x00000000 ***************** ****************
-
-Now it is easy to see that on Intel it is quite easy to recognise a kernel
-address as being one greater than user space himem (in this case 0xC0000000),
-and addresses of less than this are the ones in the current running program on
-this processor (if an smp box).
-
-If using the virtual machine ( VM ) as a debugger it is quite difficult to
-know which user process is running as the address space you are looking at
-could be from any process in the run queue.
-
-The limitation of Intel's addressing technique is that the Linux
-kernel uses a very simple real address to virtual addressing technique
-of Real Address=Virtual Address-User Space Himem.
-This means that on Intel the Linux kernel can typically only address
-Himem=0xFFFFFFFF-0xC0000000=1GB & this is all the RAM these machines
-can typically use.
-
-They can lower User Himem to 2GB or lower & thus be
-able to use 2GB of RAM, however this shrinks the maximum size
-of User Space from 3GB to 2GB. They have a no win limit of 4GB unless
-they go to 64 Bit.
-
-
-On 390 our limitations & strengths make us slightly different.
-For backward compatibility we are only allowed to use 31 bits (2GB)
-of our 32 bit addresses, however, we use entirely separate address
-spaces for the user & kernel.
-
-This means we can support 2GB of non Extended RAM on s/390, & more
-with the Extended memory management swap device, &
-currently 4TB of physical memory on z/Architecture.
-
-
-Address Spaces on Linux for s/390 & z/Architecture
-==================================================
-
-Our addressing scheme is basically as follows::
-
- Primary Space Home Space
- Himem 0x7fffffff 2GB on s/390 ***************** ****************
- currently 0x3ffffffffff (2^42)-1 * User Stack * * *
- on z/Architecture. ***************** * *
- * Shared Libs * * *
- ***************** * *
- * * * Kernel *
- * User Program * * *
- * Data BSS * * *
- * Text * * *
- * Sections * * *
- 0x00000000 ***************** ****************
-
-This also means that we need to look at the PSW problem state bit and the
-addressing mode to decide whether we are looking at user or kernel space.
-
-User space runs in primary address mode (or access register mode within
-the vdso code).
-
-The kernel usually runs in home space mode, however when accessing
-user space the kernel switches to primary or secondary address mode if
-the mvcos instruction is not available or if a compare-and-swap (futex)
-instruction on a user space address is performed.
-
-When also looking at the ASCE control registers, this means:
-
-User space:
-
-- runs in primary or access register mode
-- cr1 contains the user asce
-- cr7 contains the user asce
-- cr13 contains the kernel asce
-
-Kernel space:
-
-- runs in home space mode
-- cr1 contains the user or kernel asce
-
- - the kernel asce is loaded when a uaccess requires primary or
- secondary address mode
-
-- cr7 contains the user or kernel asce, (changed with set_fs())
-- cr13 contains the kernel asce
-
-In case of uaccess the kernel changes to:
-
-- primary space mode in case of a uaccess (copy_to_user) and uses
- e.g. the mvcp instruction to access user space. However the kernel
- will stay in home space mode if the mvcos instruction is available
-- secondary space mode in case of futex atomic operations, so that the
- instructions come from primary address space and data from secondary
- space
-
-In case of KVM, the kernel runs in home space mode, but cr1 gets switched
-to contain the gmap asce before the SIE instruction gets executed. When
-the SIE instruction is finished, cr1 will be switched back to contain the
-user asce.
-
-
-Virtual Addresses on s/390 & z/Architecture
-===========================================
-
-A virtual address on s/390 is made up of 3 parts:
-The SX (segment index, roughly corresponding to the PGD & PMD in Linux
-terminology) being bits 1-11.
-
-The PX (page index, corresponding to the page table entry (pte) in Linux
-terminology) being bits 12-19.
-
-The remaining bits, BX (the byte index), are the offset in the page,
-i.e. bits 20 to 31.
-
-On z/Architecture in linux we currently make up an address from 4 parts.
-
-- The region index (RX) being bits 0-32, we currently use bits 22-32
-- The segment index (SX) being bits 33-43
-- The page index (PX) being bits 44-51
-- The byte index (BX) being bits 52-63
-
-Notes:
- 1) s/390 has no PMD so the PMD is really the PGD also.
- A lot of this stuff is defined in pgtable.h.
-
- 2) Also seeing as s/390's page tables are only 1k in size
- (bits 12-19 x 4 bytes per pte ) we use 1 ( page 4k )
- to make the best use of memory by updating 4 segment index
- entries each time we mess with a PMD & use offsets
- 0,1024,2048 & 3072 in this page for our segment indexes.
- On z/Architecture our page tables are now 2k in size
- ( bits 44-51 x 8 bytes per pte ) we do a similar trick
- but only mess with 2 segment indices each time we mess with
- a PMD.
-
- 3) Although z/Architecture supports up to a massive 5-level page table
- lookup we only use 3 currently on Linux ( as this is all the generic
- kernel currently supports ), however this may change in future.
- This allows us to access ( according to my sums )
- 4TB of virtual storage per process i.e.
- 4096*512(PTES)*1024(PMDS)*2048(PGD) = 4398046511104 bytes,
- enough for another 2 or 3 years I think :-).
- To do this we use a region-third-table designation type in
- our address space control registers.
-
-
-The Linux for s/390 & z/Architecture Kernel Task Structure
-==========================================================
-Each process/thread under Linux for S390 has its own kernel task_struct
-defined in linux/include/linux/sched.h.
-The S390 on initialisation & resuming of a process on a cpu sets
-the __LC_KERNEL_STACK variable in the spare prefix area for this cpu
-(which we use for per-processor globals).
-
-The kernel stack pointer is intimately tied with the task structure for
-each processor as follows::
-
- s/390
- ************************
- * 1 page kernel stack *
- * ( 4K ) *
- ************************
- * 1 page task_struct *
- * ( 4K ) *
- 8K aligned ************************
-
- z/Architecture
- ************************
- * 2 page kernel stack *
- * ( 8K ) *
- ************************
- * 2 page task_struct *
- * ( 8K ) *
- 16K aligned ************************
-
-What this means is that we don't need to dedicate any register or global
-variable to point to the current running process & can retrieve it with the
-following very simple construct for s/390 & one very similar for
-z/Architecture::
-
- static inline struct task_struct * get_current(void)
- {
- struct task_struct *current;
- __asm__("lhi %0,-8192\n\t"
- "nr %0,15"
- : "=r" (current) );
- return current;
- }
-
-i.e. just anding the current kernel stack pointer with the mask -8192.
-Thankfully because Linux doesn't have support for nested IO interrupts
-& our devices have large buffers & can survive interrupts being shut for
-short amounts of time we don't need a separate stack for interrupts.
-
-
-
-
-Register Usage & Stackframes on Linux for s/390 & z/Architecture
-=================================================================
-Overview:
----------
-This is the code that gcc produces at the top & the bottom of
-each function. It usually is fairly consistent & similar from
-function to function & if you know its layout you can probably
-make some headway in finding the ultimate cause of a problem
-after a crash without a source level debugger.
-
-Note: To follow stackframes requires a knowledge of C or Pascal &
-limited knowledge of one assembly language.
-
-It should be noted that there are some differences between the
-s/390 and z/Architecture stack layouts as the z/Architecture stack layout
-didn't have to maintain compatibility with older linkage formats.
-
-Glossary:
----------
-alloca:
- This is a built in compiler function for runtime allocation
- of extra space on the caller's stack which is obviously freed
- up on function exit ( e.g. the caller may choose to allocate nothing
- or a buffer of 4k if required for temporary purposes ), it generates
- very efficient code ( a few cycles ) when compared to alternatives
- like malloc.
-
-automatics:
- These are local variables on the stack, i.e they aren't in registers &
- they aren't static.
-
-back-chain:
- This is a pointer to the stack pointer before entering a
- framed function's ( see frameless function ) prologue, got by
- dereferencing the address of the current stack pointer,
- i.e. got by accessing the 32 bit value at the stack pointer's
- current location.
-
-base-pointer:
- This is a pointer to the back of the literal pool which
- is an area just behind each procedure used to store constants
- in each function.
-
-call-clobbered:
- The caller probably needs to save these registers if there
- is something of value in them, on the stack or elsewhere before making a
- call to another procedure so that it can restore it later.
-
-epilogue:
- The code generated by the compiler to return to the caller.
-
-frameless-function:
- A frameless function in Linux for s390 & z/Architecture is one which doesn't
- need more than the register save area (96 bytes on s/390, 160 on z/Architecture)
- given to it by the caller.
-
- A frameless function never:
-
- 1) Sets up a back chain.
- 2) Calls alloca.
- 3) Calls other normal functions
- 4) Has automatics.
-
-GOT-pointer:
- This is a pointer to the global-offset-table in ELF
- ( Executable and Linkable Format, Linux's most common executable format ),
- all globals & shared library objects are found using this pointer.
-
-lazy-binding:
- ELF shared libraries are typically only loaded when routines in the shared
- library are actually first called at runtime. This is lazy binding.
-
-procedure-linkage-table:
- This is a table found from the GOT which contains pointers to routines
- in other shared libraries which can't be called to by easier means.
-
-prologue:
- The code generated by the compiler to set up the stack frame.
-
-outgoing-args:
- This is an extra area allocated on the stack of the calling function if the
- parameters for the callee cannot all be put in registers, the same
- area can be reused by each function the caller calls.
-
-routine-descriptor:
- A COFF executable format based concept of a procedure reference
- actually being 8 bytes or more as opposed to a simple pointer to the routine.
- This is typically defined as follows:
-
- - Routine Descriptor offset 0=Pointer to Function
- - Routine Descriptor offset 4=Pointer to Table of Contents
-
- The table of contents/TOC is roughly equivalent to a GOT pointer.
- & it means that shared libraries etc. can be shared between several
- environments each with their own TOC.
-
-static-chain:
- This is used in nested functions, a concept adopted from Pascal
- by gcc & not used in ANSI C or C++ ( although quite useful ), basically it
- is a pointer used to reference local variables of enclosing functions.
- You might come across this stuff once or twice in your lifetime.
-
- e.g.
-
- The function below should return 11 though gcc may get upset & toss warnings
- about unused variables::
-
- int FunctionA(int a)
- {
- int b;
- void FunctionC(int c)
- {
- b=c+1;
- }
- FunctionC(10);
- return(b);
- }
-
-
-s/390 & z/Architecture Register usage
-=====================================
-
-======== ========================================== ===============
-r0 used by syscalls/assembly call-clobbered
-r1 used by syscalls/assembly call-clobbered
-r2 argument 0 / return value 0 call-clobbered
-r3 argument 1 / return value 1 (if long long) call-clobbered
-r4 argument 2 call-clobbered
-r5 argument 3 call-clobbered
-r6 argument 4 saved
-r7 pointer-to arguments 5 to ... saved
-r8 this & that saved
-r9 this & that saved
-r10 static-chain ( if nested function ) saved
-r11 frame-pointer ( if function used alloca ) saved
-r12 got-pointer saved
-r13 base-pointer saved
-r14 return-address saved
-r15 stack-pointer saved
-
-f0 argument 0 / return value ( float/double ) call-clobbered
-f2 argument 1 call-clobbered
-f4 z/Architecture argument 2 saved
-f6 z/Architecture argument 3 saved
-======== ========================================== ===============
-
-The remaining floating points
-f1,f3,f5 f7-f15 are call-clobbered.
-
-Notes:
-------
-1) The only requirement is that registers which are used
- by the callee are saved, e.g. the compiler is perfectly
- capable of using r11 for purposes other than a
- frame pointer if a frame pointer is not needed.
-2) In functions with variable arguments e.g. printf the calling procedure
- is identical to one without variable arguments & the same number of
- parameters. However, the prologue of this function is somewhat more
- hairy owing to it having to move these parameters to the stack to
- get va_start, va_arg & va_end to work.
-3) Access registers are currently unused by gcc but are used in
- the kernel. Possibilities exist to use them at the moment for
- temporary storage but it isn't recommended.
-4) Only 4 of the floating point registers are used for
- parameter passing as older machines such as G3 only have 4
- & it keeps the stack frame compatible with other compilers.
- However with IEEE floating point emulation under linux on the
- older machines you are free to use the other 12.
-5) A long long or double parameter cannot have the
- first 4 bytes in a register & the second four bytes in the
- outgoing args area. It must be purely in the outgoing args
- area if crossing this boundary.
-6) Floating point parameters are mixed with outgoing args
- on the outgoing args area in the order they are passed in as parameters.
-7) Floating point arguments 2 & 3 are saved in the outgoing args area for
- z/Architecture
-
-
-Stack Frame Layout
-------------------
-
-========= ============== ======================================================
-s/390 z/Architecture
-========= ============== ======================================================
-0 0 back chain ( a 0 here signifies end of back chain )
-4 8 eos ( end of stack, not used on Linux for S390 used
- in other linkage formats )
-8 16 glue used in other s/390 linkage formats for saved
- routine descriptors etc.
-12 24 glue used in other s/390 linkage formats for saved
- routine descriptors etc.
-16 32 scratch area
-20 40 scratch area
-24 48 saved r6 of caller function
-28 56 saved r7 of caller function
-32 64 saved r8 of caller function
-36 72 saved r9 of caller function
-40 80 saved r10 of caller function
-44 88 saved r11 of caller function
-48 96 saved r12 of caller function
-52 104 saved r13 of caller function
-56 112 saved r14 of caller function
-60 120 saved r15 of caller function
-64 128 saved f4 of caller function
-72 132 saved f6 of caller function
-80 undefined
-96 160 outgoing args passed from caller to callee
-96+x 160+x possible stack alignment ( 8 bytes desirable )
-96+x+y 160+x+y alloca space of caller ( if used )
-96+x+y+z 160+x+y+z automatics of caller ( if used )
-0 back-chain
-========= ============== ======================================================
-
-A sample program with comments.
-===============================
-
-Comments on the function test
------------------------------
-1) It didn't need to set up a pointer to the constant pool gpr13 as it is not
- used ( :-( ).
-2) This is a frameless function & no stack is bought.
-3) The compiler was clever enough to recognise that it could return the
- value in r2 as well as use it for the passed in parameter ( :-) ).
-4) The basr ( branch & save register ) trick works as follows: the instruction
- has a special case where r0 as the branch target is understood as the literal
- value 0, i.e. no branch is taken ( some risc architectures also do this ). So
- now we continue at the next address & the address of that next instruction is
- in r13, so now we subtract the size of the function prologue we have executed
- + the size of the literal pool to get to the top of the literal pool::
-
-
- 0040037c int test(int b)
- { # Function prologue below
- 40037c: 90 de f0 34 stm %r13,%r14,52(%r15) # Save registers r13 & r14
- 400380: 0d d0 basr %r13,%r0 # Set up pointer to constant pool using
- 400382: a7 da ff fa ahi %r13,-6 # basr trick
- return(5+b);
- # Huge main program
- 400386: a7 2a 00 05 ahi %r2,5 # add 5 to r2
-
- # Function epilogue below
- 40038a: 98 de f0 34 lm %r13,%r14,52(%r15) # restore registers r13 & 14
- 40038e: 07 fe br %r14 # return
- }
-
-Comments on the function main
------------------------------
-1) The compiler did this function optimally ( 8-) )::
-
- Literal pool for main.
- 400390: ff ff ff ec .long 0xffffffec
- main(int argc,char *argv[])
- { # Function prologue below
- 400394: 90 bf f0 2c stm %r11,%r15,44(%r15) # Save necessary registers
- 400398: 18 0f lr %r0,%r15 # copy stack pointer to r0
- 40039a: a7 fa ff a0 ahi %r15,-96 # Make area for callee saving
- 40039e: 0d d0 basr %r13,%r0 # Set up r13 to point to
- 4003a0: a7 da ff f0 ahi %r13,-16 # literal pool
- 4003a4: 50 00 f0 00 st %r0,0(%r15) # Save backchain
-
- return(test(5)); # Main Program Below
- 4003a8: 58 e0 d0 00 l %r14,0(%r13) # load relative address of test from
- # literal pool
- 4003ac: a7 28 00 05 lhi %r2,5 # Set first parameter to 5
- 4003b0: 4d ee d0 00 bas %r14,0(%r14,%r13) # jump to test setting r14 as return
- # address using branch & save instruction.
-
- # Function Epilogue below
- 4003b4: 98 bf f0 8c lm %r11,%r15,140(%r15)# Restore necessary registers.
- 4003b8: 07 fe br %r14 # return to do program exit
- }
-
-
-Compiler updates
-----------------
-
-::
-
- main(int argc,char *argv[])
- {
- 4004fc: 90 7f f0 1c stm %r7,%r15,28(%r15)
- 400500: a7 d5 00 04 bras %r13,400508 <main+0xc>
- 400504: 00 40 04 f4 .long 0x004004f4
- # compiler now puts constant pool in code too so it saves an instruction
- 400508: 18 0f lr %r0,%r15
- 40050a: a7 fa ff a0 ahi %r15,-96
- 40050e: 50 00 f0 00 st %r0,0(%r15)
- return(test(5));
- 400512: 58 10 d0 00 l %r1,0(%r13)
- 400516: a7 28 00 05 lhi %r2,5
- 40051a: 0d e1 basr %r14,%r1
- # compiler adds 1 extra instruction to epilogue this is done to
- # avoid processor pipeline stalls owing to data dependencies on g5 &
- # above as register 14 in the old code was needed directly after being loaded
- # by the lm %r11,%r15,140(%r15) for the br %r14.
- 40051c: 58 40 f0 98 l %r4,152(%r15)
- 400520: 98 7f f0 7c lm %r7,%r15,124(%r15)
- 400524: 07 f4 br %r4
- }
-
-
-Hartmut ( our compiler developer ) also has been threatening to take out the
-stack backchain in optimised code as this also causes pipeline stalls, you
-have been warned.
-
-64 bit z/Architecture code disassembly
---------------------------------------
-
-If you understand the stuff above you'll understand the stuff
-below too so I'll avoid repeating myself & just say that
-some of the instructions have g's on the end of them to indicate
-they are 64 bit & the stack offsets are bigger.
-The only other difference you'll find between 32 & 64 bit is that
-we now use f4 & f6 for floating point arguments on 64 bit::
-
- 00000000800005b0 <test>:
- int test(int b)
- {
- return(5+b);
- 800005b0: a7 2a 00 05 ahi %r2,5
- 800005b4: b9 14 00 22 lgfr %r2,%r2 # downcast to integer
- 800005b8: 07 fe br %r14
- 800005ba: 07 07 bcr 0,%r7
-
-
- }
-
- 00000000800005bc <main>:
- main(int argc,char *argv[])
- {
- 800005bc: eb bf f0 58 00 24 stmg %r11,%r15,88(%r15)
- 800005c2: b9 04 00 1f lgr %r1,%r15
- 800005c6: a7 fb ff 60 aghi %r15,-160
- 800005ca: e3 10 f0 00 00 24 stg %r1,0(%r15)
- return(test(5));
- 800005d0: a7 29 00 05 lghi %r2,5
- # brasl allows jumps > 64k & is overkill here bras would do fine
- 800005d4: c0 e5 ff ff ff ee brasl %r14,800005b0 <test>
- 800005da: e3 40 f1 10 00 04 lg %r4,272(%r15)
- 800005e0: eb bf f0 f8 00 04 lmg %r11,%r15,248(%r15)
- 800005e6: 07 f4 br %r4
- }
-
-
-
-Compiling programs for debugging on Linux for s/390 & z/Architecture
-====================================================================
--gdwarf-2 now works. It should be considered the default debugging
-format for s/390 & z/Architecture as it is more reliable for debugging
-shared libraries. Normal -g debugging works much better now,
-thanks to the IBM java compiler developers' bug reports.
-
-This is typically done by adding/appending the flags -g or -gdwarf-2 to the
-CFLAGS & LDFLAGS variables in the Makefile of the program concerned.
-
-If using gdb & you would like accurate displays of registers &
-stack traces compile without optimisation i.e. make sure
-that there is no -O2 or similar on the CFLAGS line of the Makefile &
-the emitted gcc commands, obviously this will produce worse code
-( not advisable for shipment ) but it is an aid to the debugging process.
-
-This aids debugging because the compiler will copy parameters passed in
-in registers onto the stack so backtracing & looking at passed in
-parameters will work, however some larger programs which use inline functions
-will not compile without optimisation.
-
-Debugging with optimisation has since much improved after fixing
-some bugs, please make sure you are using gdb-5.0 or later developed
-after Nov'2000.
-
-
-
-Debugging under VM
-==================
-
-Notes
------
-Addresses & values in the VM debugger are always hex, never decimal.
-Address ranges are of the format <HexValue1>-<HexValue2> or
-<HexValue1>.<HexValue2>.
-For example, the address range 0x2000 to 0x3000 can be described as 2000-3000
-or 2000.1000.
-
-The VM Debugger is case insensitive.
-
-VM's strengths are usually other debuggers' weaknesses: you can get at any
-resource no matter how sensitive e.g. memory management resources, change
-address translation in the PSW. For kernel hacking you will reap dividends if
-you get good at it.
-
-The VM Debugger displays operators but not operands, and also the debugger
-displays useful information on the same line as the author of the code probably
-felt that it was a good idea not to go over the 80 columns on the screen.