http://www.sgi.com/tech/relnotes/compiler_dev.html (Silicon Surf Promotional CD, 01/1995)

compiler_dev Release Notes

1 Introduction

2 Installation Information

3 Changes and Additions

4 Bug Fixes

5 Known Problems and Workarounds

A Dynamic Shared Objects

       1.  Introduction

       These notes describe the Base Compiler Development portion
       (compiler_dev) of the 5.2 IRIS Development Option from
       Silicon Graphics, Inc.  They include discussion of compiler
       tools, header files, libraries, dynamic shared objects, and
       KPIC directives.

       Note:  Packaged with the IRIS Development Option software is
              a separate sheet that contains the Software License
              Agreement.  This software is provided to you solely
              under the terms and conditions of the Software
              License Agreement.  Please take a few moments to
              review the Agreement.

       This document contains the following chapters:

         1.  Introduction

         2.  Installation Information

         3.  Changes and Additions

         4.  Bug Fixes

         5.  Known Problems and Workarounds

       In addition, Appendix A discusses dynamic shared objects
       (DSOs).


       1.1  Release_Identification_Information

       Following is the release identification information for the
       Base Compiler Development portion (compiler_dev) of the 5.2
       IRIS Development Option:

       Software Product               Compiler_dev

       Version                        3.18
       Product Code                   SC4-IDO-5.2

       System Software Requirements   IRIX 5.2 or later

       1.2  Online_Release_Notes

       After you install the online documentation for a product
       (the relnotes subsystem), you can view the release notes on
       your screen.

       If you have a graphics system, select ``Release Notes'' from
       the Tools submenu of the Toolchest. This displays the
       grelnotes(1) graphical browser for the online release notes.

       Refer to the grelnotes(1) man page for information on
       options to this command.

       If you have a nongraphics system, you can use the relnotes
       command.  Refer to the relnotes(1) man page for accessing
       the online release notes.

       1.3  Product_Support

       Silicon Graphics, Inc., provides a comprehensive product
       support maintenance program for its products.

       If you are in the U.S. or Canada and would like support for
       your Silicon Graphics-supported products, contact the
       Technical Assistance Center at 1-800-800-4SGI.  If you are
       outside these areas, contact the Silicon Graphics subsidiary
       or authorized distributor in your country.

       2.  Installation_Information

       The IRIS Software Installation Guide fully documents the
       process for installing the Base Compiler Development
       software.  In addition, each compiler has its own set of
       release notes that describes product-specific installation
       information.

       2.1  3.18_Base_Compiler_Development_Subsystems

       The 3.18 Base Compiler Development software (compiler_dev)
       includes these subsystems:

       compiler_dev.books         Base compiler books

       compiler_dev.books.dbx     Base compiler dbx User's Guide

       compiler_dev.hdr           Base compiler headers

       compiler_dev.hdr.internal  Base compiler internal headers

       compiler_dev.hdr.lib       Base compiler environment headers

       compiler_dev.man.base      Base compiler components man
                                  pages

       compiler_dev.man.ld        Base compiler loader man pages

       compiler_dev.man.perf      Base compiler performance man
                                  pages

       compiler_dev.man.util      Base compiler utility man pages

       compiler_dev.sw            Base compiler software

       compiler_dev.sw.abi        Base compiler ABI software

       compiler_dev.sw.base       Base compiler components

       compiler_dev.sw.ld         Base compiler loader

       compiler_dev.sw.perf       Base compiler performance tools

       compiler_dev.sw.util       Base compiler utilities

       compiler_dev.man.dbx       dbx manual page

       compiler_dev.man.lib       Development environment manual
                                  pages

       compiler_dev.man.relnotes  These release notes

       compiler_dev.sw.dbx        dbx debugger

       compiler_dev.sw.lib        Development libraries

       2.1.1  Subsystem_Disk_Space_Requirements  This section lists
       the compiler_dev subsystems (and their sizes).

       If you are installing this software for the first time, the
       subsystems marked ``default'' are those selected for
       installation automatically.  They will be installed when you
       give the go command unless you explicitly request (with the
       keep command) that they not be installed.

       Those marked ``miniroot'' must be installed from the
       miniroot.

       Note:  The listed subsystem sizes are approximate.  Refer to
              the IRIS Software Installation Guide for information
              on finding exact sizes.

       Subsystem Name                        Subsystem Size
                                             (512-byte blocks)
       compiler_dev.hdr.internal                     455
       compiler_dev.man.base (default)                48
       compiler_dev.man.ld (default)                  50
       compiler_dev.man.perf (default)                53
       compiler_dev.man.util (default)                82
       compiler_dev.sw.base (default)               8415
       compiler_dev.sw.ld (default)                 1549
       compiler_dev.sw.perf (default)               2250
       compiler_dev.sw.util (default)               2684
       compiler_dev.hdr.lib (default)               2691
       compiler_dev.man.dbx (default)                 95
       compiler_dev.man.lib (default)               5405
       compiler_dev.man.relnotes (default)           147
       compiler_dev.sw.dbx (default)                1656
       compiler_dev.sw.lib (default)               12299
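
       As a rough check of the disk space involved, the sizes
       listed above can be summed and converted from 512-byte
       blocks.  The sketch below uses only the approximate figures
       from the table; the variable and function names are
       illustrative.

```python
# Approximate subsystem sizes from the table above, in 512-byte blocks.
SUBSYSTEM_BLOCKS = {
    "compiler_dev.hdr.internal": 455,   # the only entry not marked default
    "compiler_dev.man.base": 48,
    "compiler_dev.man.ld": 50,
    "compiler_dev.man.perf": 53,
    "compiler_dev.man.util": 82,
    "compiler_dev.sw.base": 8415,
    "compiler_dev.sw.ld": 1549,
    "compiler_dev.sw.perf": 2250,
    "compiler_dev.sw.util": 2684,
    "compiler_dev.hdr.lib": 2691,
    "compiler_dev.man.dbx": 95,
    "compiler_dev.man.lib": 5405,
    "compiler_dev.man.relnotes": 147,
    "compiler_dev.sw.dbx": 1656,
    "compiler_dev.sw.lib": 12299,
}
BLOCK_SIZE = 512  # bytes per block, as stated in the table header


def blocks_to_mib(blocks):
    """Convert a count of 512-byte blocks to mebibytes."""
    return blocks * BLOCK_SIZE / (1024 * 1024)


total_blocks = sum(SUBSYSTEM_BLOCKS.values())
default_blocks = total_blocks - SUBSYSTEM_BLOCKS["compiler_dev.hdr.internal"]

print("all listed subsystems: %d blocks (about %.1f MiB)"
      % (total_blocks, blocks_to_mib(total_blocks)))
print("default subsystems:    %d blocks (about %.1f MiB)"
      % (default_blocks, blocks_to_mib(default_blocks)))
```

       Because the listed sizes are approximate, the totals are
       estimates as well.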

       3.  Changes_and_Additions

       The features in this chapter are new or significantly
       changed in the Base Compiler Development software since the
       IRIX 4.0.5 Maintenance release.  Except as noted, changes
       apply to all versions.

       3.1  Compiler_System

       This section lists changes and additions to compilers and
       development tools since the IRIX 4.0.5 Maintenance release.

       3.1.1  Obsoleting_libmld  libmld, either in the form of an
       archive or a DSO, will not be released or supported in
       future releases.  If you use functions in the existing
       libmld library, contact the Technical Assistance Center for
       details concerning the migration of your libmld function
       calls in your existing source code to other calls, probably
       in libraries such as libelf, the ELF object-file support
       library, and libraries containing symbol table manipulation
       routines.

       3.1.2  Dynamic_Linking_and_DSOs  In earlier versions of IRIX
       (pre-5.0), executables were only statically linked:  all
       references had to be resolved (and their addresses fixed) at
       link time by ld(1).  In this release, such programs are
       referred to as non-shared, even though they might use
       pre-5.0 shared libraries (now referred to as static shared
       libraries).  They are produced by compiling and linking with
       the -non_shared option.  The code so created is not
       position-independent (PIC).

       In 5.0 and later IRIX releases, in addition to being
       statically linked by ld(1), programs are, by default,
       compiled as PIC code and dynamically linked, that is, part
       of the program may be relocated dynamically at run time.
       There are two types of dynamically linked objects:

          o The executable itself.  This consists of your main
            program and PIC code extracted from all archive
            libraries linked with it.  Code within the executable
            is not relocated at run time, but some of its
            references will be.  The executable is linked
            -call_shared.

          o External sharable dynamically linked objects called
            dynamic shared objects (DSOs), which are not part of
            the executable itself.  DSOs and their references may
            be dynamically relocated at run time.  DSOs are linked
            -shared.  DSOs by convention have the extension .so.  A
            DSO may be shared by several users and/or programs,
            possibly at different addresses.

       You cannot mix non-shared objects and PIC objects in the
       same executable.

       In this and future releases, static shared libraries are
       supported only for the use of existing (pre-5.0) executables
       that reference them.  You can neither create new static
       shared libraries nor link new code with existing static
       shared libraries.

       PIC code satisfies references indirectly by using a Global
       Offset Table (GOT), which allows code to be relocated simply
       by updating the GOT.  Your executable has one GOT, and each
       DSO it uses has one GOT.
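
       The indirection can be pictured with a toy model.  The
       sketch below is purely illustrative and does not reflect
       how rld or the MIPS GOT is actually laid out:  references
       go through a per-object table, so moving the object means
       rewriting the table rather than the code that uses it.  The
       class, symbol names, and addresses are all hypothetical.

```python
class ToyObject:
    """Toy stand-in for an executable or DSO that owns one GOT."""

    def __init__(self, name, symbol_offsets, base):
        self.name = name
        # symbol -> offset of that symbol from the object's load address
        self.symbol_offsets = dict(symbol_offsets)
        self.base = None
        self.got = {}
        self.relocate(base)

    def relocate(self, new_base):
        # Only the table is rewritten; callers of address_of() are untouched.
        self.base = new_base
        self.got = {sym: new_base + off
                    for sym, off in self.symbol_offsets.items()}

    def address_of(self, symbol):
        # PIC-style reference: fetch the address through the GOT.
        return self.got[symbol]


dso = ToyObject("libfoo.so", {"foo_init": 0x40, "foo_data": 0x1000},
                base=0x5F000000)
before = dso.address_of("foo_data")
dso.relocate(0x5E000000)          # the object is mapped somewhere else
after = dso.address_of("foo_data")
print(hex(before), hex(after))
```

       The lookup code is identical before and after the move;
       only the table contents differ.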

       When a dynamically linked executable is started, the runtime
       linker, rld(1), is invoked to prepare the program for
       execution.  This preparation involves:

          o Filling in certain global values.

          o Relocating any dynamic shared objects (DSOs) that your
            program references.

          o Resolving data symbols in DSOs that were unresolved at
            static link time by ld(1).

       With very few exceptions, all executable objects in this
       release are dynamically linked.  A new component, the
       runtime linker /lib/rld, and all standard DSOs (file
       extension .so) are necessary for programs to execute.

       More information about these types of objects appears in
       Appendix A, ``Frequently Asked Questions about DSOs,'' and
       in the IRIX System Programming Guide.

       3.1.3  Object_File_Format_Changes  The compiler tools and
       the link editor now produce ELF format objects and
       executables by default.  DSOs are supported only in ELF
       executables and object files.  COFF files are run on IRIX
       5.0 and later releases with the IRIX 4.0.5 ABI, and ELF
       files are run with the IRIX 5.0 and later ABI; hence, the
       linker refuses to mix (pre-5.0) COFF and ELF objects.

       Two new header files are associated with ELF objects:
       /usr/include/elf.h contains definitions that are generic to
       all implementations.  /usr/include/sys/elf.h contains
       definitions specific to the MIPS architecture.  See the
       System V Application Binary Interface and System V
       Application Binary Interface MIPS Processor Supplement,
       published by Prentice Hall.

       A new object file reader, elfdump(1), is associated with ELF
       format files.  This program is known on some other SVR4-
       compliant systems as dump.

       3.1.4  ABI_Development  For information about ABI
       development issues, see the man pages abicc(1), abild(1),
       check_abi_compliance, check_abi_interface and
       check_for_syscalls.

       3.1.5  Versioning_of_Shared_Objects  In the 5.0.1 release, a
       mechanism for the versioning of shared objects was
       introduced for SGI-specific shared objects and executables.
       Note that this mechanism is outside the scope of the ABI,
       and, thus, must not be relied on for code that must be ABI-
       compliant and run on non-SGI platforms.  Currently, all
       executables produced on SGI systems are marked SGI_ONLY,
       which allows use of the versioning mechanism.

       Versioning allows the creator of a shared object to update
       it in a way that may be incompatible with executables
       previously linked against the shared object.  This is
       accomplished by renaming the original shared object and
       providing it along with the (incompatible) new version.
       Versioning is mainly of interest to developers of shared
       objects.  It may not be of interest to you if you simply
       use shared objects.

       3.1.5.1  What_Is_a_Version?  A version is part or all of an
       identifying version_string that can be associated with a
       shared object by using the -set_version version_string
       option to ld(1) when the shared object is created.

       A version_string consists of one or more versions separated
       by colons (:). A single version has the form:

       [comment#]sgimajor.minor

       where

       comment        is an optional comment string, which is
                      ignored by the versioning mechanism.  It
                      consists of any sequence of characters
                      followed by a #.

       sgi            is the literal string sgi.

       major          is the major version number, which is a
                      string of digits [0-9].

       .              is a literal period.

       minor          is the minor version number, which is a
                      string of digits [0-9].

       Here is what to do when building your shared library:

          o When you first build your shared library, give it an
            initial version, say sgi1.0.  Thus, add the option
            -set_version sgi1.0 to the command to build your shared
            library (cc -shared, ld -shared).

          o Whenever you make a compatible change to the shared
            object, create another version by changing the minor
            version number, for example, sgi1.1, and add it to the
            end of the version_string.  The option to set the
            version of the shared library might now look like
            -set_version "sgi1.0:sgi1.1".

          o When you make an incompatible change to the shared
            object:

               - Change the filename of the old shared object by
                 adding a dot followed by the major number of one
                 of the versions to the filename of the shared
                 object.  Do not change the soname of the shared
                 object or its contents.  Simply rename the file.

               - Update the major version number and set the
                 version_string of the shared object when you
                 create it to this new version, for example,
                 -set_version sgi2.0.

       Here is how this versioning mechanism affects executables:

          o When an executable is linked against a shared object,
            the last version in the shared object's version_string
            is recorded in the executable as part of the liblist.
            This can be examined by elfdump -Dl.

          o When you run an executable, rld looks for the proper
            filename using its usual search procedure.

          o If a file with the correct name is found, the version
            specified in the executable for this shared object is
            compared to each of the versions in the version_string
            in the shared object.  If one of the versions in the
            version_string matches the executable's version exactly
            (ignoring comments), then that library is used.

          o If no proper match is found, a new filename for the
            shared object is built by taking the soname specified
            in the executable for this shared object and the major
            number found in the version specified in the executable
            for this shared object, and putting them together as
            soname.major.  (Remember that you did not change the
            soname of the object, only the filename.)  The new file
            is searched for using rld's usual search procedure.
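
       The lookup steps above can be sketched as follows.  This
       models the matching rule only and is not rld's
       implementation.  The function names and the library name
       libx.so are hypothetical, the file is assumed to be found
       under its soname, and comments are assumed to contain no
       colon.

```python
import re

# A bare version: the literal "sgi", a major number, a period, a minor number.
VERSION_RE = re.compile(r"sgi(\d+)\.(\d+)")


def bare_versions(version_string):
    """Split a version_string into its versions with comments removed."""
    versions = []
    for item in version_string.split(":"):
        # Everything up to and including the last '#' is a comment.
        bare = item.rsplit("#", 1)[-1].strip()
        if VERSION_RE.fullmatch(bare) is None:
            raise ValueError("not a valid version: %r" % item)
        versions.append(bare)
    return versions


def version_matches(needed, version_string):
    """True if the version recorded in the executable appears exactly."""
    return needed in bare_versions(version_string)


def fallback_filename(soname, needed):
    """Build soname.major from the version recorded in the executable."""
    major = VERSION_RE.fullmatch(needed).group(1)
    return "%s.%s" % (soname, major)


def choose_file(soname, needed, found_version_string):
    """Return the filename to use, given the version_string of the found file."""
    if version_matches(needed, found_version_string):
        return soname
    return fallback_filename(soname, needed)


print(choose_file("libx.so", "sgi1.0", "sgi1.0:sgi1.1"))  # libx.so
print(choose_file("libx.so", "sgi1.0", "sgi2.0"))         # libx.so.1
```

       Versions are compared as strings, which follows the
       requirement that a version match exactly once comments are
       ignored.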

       3.1.5.2  Example:  Suppose you have a shared object foo.so
       with initial version sgi10.0.  Over time, you make two
       compatible changes for foo.so, which result in the following
       final version_string for foo.so:

       initial version #sgi10.0: upgrade I/O#sgi10.1:new devices#sgi10.2

       You then link an executable that uses this shared object,
       useoldfoo.  This executable specifies version sgi10.2 for
       soname foo.so.  (Remember that the executable inherits the
       last version in the version_string of the shared object.)

       The time comes to upgrade foo.so in an incompatible way.
       Note that the major version of foo.so is 10, so you move the
       existing foo.so to the filename foo.so.10 and create a new
       foo.so with the version_string:

       efficient interfaces #sgi11.0

       New executables linked with foo.so use it directly.  Older
       executables, like useoldfoo, attempt to use foo.so, but find
       that its version (sgi11.0) is not the version they need
       (sgi10.2).  They then attempt to find a foo.so in the
       filename foo.so.10 with version sgi10.2.
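
       From the library developer's side, the bookkeeping in this
       example can be sketched as below.  The helper names are
       hypothetical.  Incrementing the minor number by one for a
       compatible change, and the major number by one for an
       incompatible change, is an assumption made for
       illustration; the text above requires only that the numbers
       change.

```python
import re

# Matches the final "sgi<major>.<minor>" of a version_string.
LAST_VERSION_RE = re.compile(r"sgi(\d+)\.(\d+)$")


def last_version(version_string):
    """Return (major, minor) of the last version in a version_string."""
    last = version_string.split(":")[-1]
    match = LAST_VERSION_RE.search(last)
    if match is None:
        raise ValueError("no version found in %r" % last)
    return int(match.group(1)), int(match.group(2))


def compatible_change(version_string, comment=""):
    """Append a new minor version to the end of the version_string."""
    major, minor = last_version(version_string)
    new = "sgi%d.%d" % (major, minor + 1)
    if comment:
        new = comment + "#" + new
    return version_string + ":" + new


def incompatible_change(filename, version_string, comment=""):
    """Return (new filename for the old object, version_string for the new one)."""
    major, _ = last_version(version_string)
    renamed_old = "%s.%d" % (filename, major)
    new = "sgi%d.0" % (major + 1)
    if comment:
        new = comment + "#" + new
    return renamed_old, new


vs = "initial version #sgi10.0"
vs = compatible_change(vs, "upgrade I/O")
vs = compatible_change(vs, "new devices")
print(vs)

old_name, new_vs = incompatible_change("foo.so", vs, "efficient interfaces ")
print(old_name, "/", new_vs)
```

       The resulting strings would be passed to ld with
       -set_version when the shared object is built.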

       3.1.6  Runtime_Link_Editor_rld(1) and libdl

          o rld is a new program that is invoked when running a
            dynamic executable.  It maps in shared objects used by
            this executable, resolves relocations as ld does at
            static link time, and allocates common if required.
            rld is mapped in at program startup time by the kernel.
            Its path is /lib/rld, but you can change it with the
            _RLD_PATH environment variable.

            There are two versions: rld and rld.debug.  The first
            is faster; the second provides debugging support.  Both
            are described on the rld(1) man page.

          o Options to rld can be specified via the _RLD_ARGS
            environment variable. It is possible to replace
            libraries without recompiling, get extra information
            from the runtime linker, and alter some of the dynamic
            linking semantics by specifying arguments in this way.
            See the manual page rld(1) for details.

          o The functionality previously available in
            /usr/lib/libdl.so, a user interface to the dynamic
            linker for manipulating the shared objects used by a
            dynamic executable, is now part of libc.so.1.
            Specifically, this includes the function calls
            dlopen(3), dlclose(3), dlsym(3), and dlerror(3).

       The following change in rld(1) was made in the 5.0.1 release
       of IRIX:

          o In release 5.0, rld zeroed the stack space it had used
            before invoking the main program.  As of release 5.0.1,
            it no longer zeroes this space.  If your program had a
            bug that relied on an uninitialized automatic variable
            being zero, the bug may be uncovered by this rld
            change.  If you suspect this to be the case, the
            previous behavior (rld clearing its used stack space at
            exit) can be obtained temporarily by adding the option
            -clearstack to the environment variable _RLD_ARGS when
            you run the program. However, do not rely on this
            mechanism; there is no guarantee that the stack space
            your program is relying on being zero will not be
            dirtied by other startup code in future releases.  The
            buggy behavior in your program must be corrected.  Note
            that these problems most often will occur relatively
            early in the call graph of your program.

       The following change was made to functions in libdl in the
       5.0.1 release:

          o In the 5.0 release, when a shared object was opened via
            dlopen(3x), its symbols became globally visible.  This
            behavior has been changed to be consistent with SVR4.
            As of the 5.0.1 release, objects loaded by one
            invocation of dlopen may not directly reference symbols
            from objects loaded by a different dlopen invocation.
            Those symbols may, however, be referenced indirectly
            using dlsym(3x).

            See the NOTES section of the dlopen(3x) manual page for
            further information.
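
       The indirect route, looking a symbol up through the handle
       of the object that defines it, can be tried on a
       present-day Unix-like system (not IRIX 5) with Python's
       ctypes module, which wraps dlopen and dlsym.  The sketch
       assumes that passing None to ctypes.CDLL opens the running
       program itself, so that C library symbols such as strlen
       are reachable through the returned handle.

```python
import ctypes

# Equivalent to dlopen(NULL, ...): a handle for the running program
# and the objects it has already loaded (assumed to include the C library).
handle = ctypes.CDLL(None)

# Attribute lookup on the handle performs the dlsym() call.
strlen = handle.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

length = strlen(b"compiler_dev")
print(length)
```

       The symbol is obtained from a specific handle rather than
       by relying on it being globally visible.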

       3.1.7  Changes_to_dbx(1)

          o In 5.0.1 and later, you can set the variable
            $assumenormalframe to decrease the time dbx takes to
            produce a stack trace (by the where command), by using:

            set $assumenormalframe=1

            This variable should be set to zero (the default) when
            requesting a stack trace if you are stopped in the
            function prologue.

          o Two new commands in dbx(1) deal with shared objects:
            listobj and whichobj.

            There are three new printing commands: printo, printx,
            and printd.  These print in octal, hexadecimal, and
            decimal, respectively.

            Command-line editing similar to that available in
            emacs(1) is now available in dbx.

            See /usr/lib/dbx.help for details on these new
            commands.

          o The dbx help system has been enhanced.

          o The -f and -F options to dbx have been removed.  The
            readsyms and readglobals commands have been removed.
            dbx now always does fast startup (the -f option) so
            these options and commands are no longer needed.

       3.1.8  Archiver_ar(1)  The default format for the archive
       symbol table has been changed.  The default is now the same
       as ar E and produces an SVR4-compatible symbol table.  If
       you want to produce the old symbol table format, use ar C.

       3.1.9  Link_Editor_ld(1)  The following changes have been
       made to the linker ld(1):

          o As of release 5.0.1, the linker can adjust executables
            to avoid certain problems with early versions of the
            R4000.  If the -no_jump_at_eop flag is on (it is on by
            default), small amounts of padding are added between
            component objects to avoid placing a branch instruction
            at the end of a page.  Slightly smaller executables and
            significantly faster executables can result from turning
            this option off (using the -allow_jump_at_eop flag).
            Binaries built either way should be compatible across
            all Silicon Graphics systems, but those made with
            -no_jump_at_eop (the default) often show performance
            gains on R4000 systems.

          o New options have been added to ld(1) for aligning
            variables in the global uninitialized data area (bss).
            See the manual page for ld(1) for options with names
            beginning with -X.  These new options are unique to
            IRIX and might change across releases.

          o The default object and executable file format has been
            changed to ELF.  Under no circumstances can you link
            together ELF and (old) COFF objects.

          o Static shared libraries are replaced by dynamic shared
            objects. The linker no longer supports linking with
            static shared libraries. However, existing executables
            linked with static shared libraries continue to work.

          o By default, the linker reports all undefined and
            unresolved symbols and exits with non-zero status.
            However, for shared linking, it is possible to allow
            unresolved symbols at static link time and rely on the
            runtime linker to complete the resolution at run time.
            If you specify -ignore_unresolved, the linker does not
            consider unresolved symbols to be errors.  This option
            is turned on by the driver if the environment variable
            SGI_SVR4 is set.

          o The linker now reports a maximum of 50 warning
            messages.  If you want all warning messages to be
            printed, specify -wall.

          o The following new flags are related to DSO support.
            Please refer to the manual page for details: -B
            symbolic, -non_shared, -call_shared (default), -shared,
            -all, -exclude, -no_archive, -transitive_link (default),
            -check_registry, -update_registry, -set_version,
            -ignore_unresolved (default), -no_unresolved,
            -no_library_replacement, -soname, -delay_load, and
            -export.

       3.1.10  Optimizer_(uopt(5))  New optimizations and
       improvements to existing optimizations have been added to
       uopt.

          o -strictIEEE

            The optimizer performs some floating point expression
            simplification in the presence of floating point
            constants, which can cause different behavior in
            programs that rely on strict adherence to the IEEE
            floating point standard.  An example is the
            substitution of zero for multiplication by zero.  This
            flag suppresses such optimizations.

          o -Wo,-nomultibbunroll

            The optimizer now unrolls loops whose bodies contain
            branches (that is, loop bodies made up of multiple
            basic blocks).  This internal optimizer flag suppresses
            such unrolls.

          o -noinline

            This option disables the inlining operation performed
            by umerge under -O3.  This flag is not meaningful if
            -O3 is not specified.

          o -inline_to 

            The default value of this parameter is 0.  A positive
            value of this parameter asks umerge to perform
            additional inlining of calls to leaf routines up to the
            specified level, in addition to its automatic decision
            mechanism.  A value of 1 causes all calls to leaf
            procedures to be inlined.  A value of 2 additionally
            causes all calls to procedures that became leaves due
            to level 1 inlining to be inlined, etc.  Under this
            option, a procedure becomes a leaf in the inlined
            output code if and only if the procedure's maximum
            distance from a leaf in the call graph is less than or
            equal to the value of this parameter.  This option is
            not affected by the -noinline option and is meaningful
            only if -O3 is specified.

          o -nokpicopt

            This option tells uopt not to perform the special
            optimization for accesses of global variables when
            compiling shared.  (-kpicopt is the default for shared
            compilations.)

          o -kpicopt

            This option tells uopt to perform the special
            optimization for accesses of global variables that are
            not gp-relative whether compiling shared or non-shared.
            (-nokpicopt is the default for non-shared compilations;
            however, some programs, particularly if compiled -G 0,
            might benefit from this optimization even if compiled
            -non_shared.)
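
       The criterion given for -inline_to can be illustrated with
       a small call-graph model.  This is a sketch of the stated
       rule only, not umerge's implementation.  The call graph and
       function names are hypothetical, and the graph is assumed
       to be acyclic (no recursion).

```python
from functools import lru_cache

# Hypothetical call graph: procedure -> procedures it calls.
CALLS = {
    "main": ["parse", "report"],
    "parse": ["next_token", "lookup"],
    "report": ["format_line"],
    "format_line": ["pad"],
    "next_token": [],
    "lookup": [],
    "pad": [],
}


@lru_cache(maxsize=None)
def max_leaf_distance(proc):
    """Longest call chain from proc down to a leaf (a leaf itself is 0)."""
    callees = CALLS[proc]
    if not callees:
        return 0
    return 1 + max(max_leaf_distance(callee) for callee in callees)


def leaves_after_inlining(level):
    """Procedures whose maximum distance from a leaf is at most level."""
    return sorted(p for p in CALLS if max_leaf_distance(p) <= level)


for level in (0, 1, 2):
    print(level, leaves_after_inlining(level))
```

       At level 1, parse and format_line join the original leaves
       because every procedure they call is a leaf.  At level 2,
       report follows, since its only callee became a leaf at
       level 1.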

       3.1.11  Assembler_(as(1))

          o Several new assembler directives have been added to
            support generation of PIC (Position-Independent Code).
            You should also become familiar with the MIPS ABI
            Supplement and the PIC coding model it describes. See
            Section 3.2, ``KPIC Directives.''

          o The assembler generates ELF object file format. Whether
            the resulting object is PIC depends on whether an
            .option pic0 or .option pic2 directive appears in the
            assembler file and on command-line arguments.  (The
            directive appearing in the .s file takes precedence.)
            In the .option directive, pic0 indicates non-PIC, and
            pic2 indicates PIC code.  PIC code can also be
            specified on the command line (in the absence of an
            .option directive) by the switch -KPIC.  If no .option
            is present in the assembler file and -KPIC does not
            appear on the command line, the default is non-PIC.

          o A number of new optimizations have been added to the
            assembler.  They are invoked automatically at
            optimization level 2 (-O2) and above.  See the as man
            page for more information about -peep, -swpipe, and
            -symregs.

          o Cross basic-block scheduling is now enabled by default
            at optimization levels 2 and above. It can be disabled
            with the -Wb,-noxbb option.  This optimization moves
            instructions from one basic block to another to allow
            for better scheduling.

          o Since the last release, enhancements have been made in
            the software pipelining and peephole optimizations in
            the assembler.

       3.1.12  Libraries  The following changes to the libraries
       that are part of the compiler system were made in the 5.0.1
       release.

          o The exception handling library, libexc.so, has been
            changed to allow for correct handling of exceptions in
            Ada code and for the correct functioning of non-local
            GOTOs in Pascal code.  Before this release, non-local
            GOTOs appearing in Pascal code in a shared object did
            not function correctly.  Because correcting this
            problem required implementation changes in the handling
            of non-local GOTOs, all Pascal code, whether in a
            shared object or not, should be recompiled and relinked
            in 5.0.1 and later.  If you are certain that none of
            your Pascal code uses non-local GOTOs, you can ignore
            this requirement.

          o With the 5.0.1 and later releases, C++ code is linked
            by default with the new shared object libC.so, which is
            a shared version of libC.a.  See the C++ release notes
            for further information.

       3.1.13  Performance_Tools  This section includes changes to
       pixie(1), pixstats(1), and prof(1).  It also includes a
       detailed note (with an example) on using these tools with
       DSOs.

          o The program cord(1) is not provided in this release.

          o These tools will not work on executables produced on
            IRIX 4 systems.  For IRIX 4 functionality, you should
            invoke the IRIX 4 pixie, prof, and pixstats explicitly.
            They will not be run automatically under the IRIX
            compatibility mode.

       The following changes to pixie have occurred in the 3.18
       Base Compiler Development release.  See the pixie(1) manual
       page for more information.

          o pixie no longer produces a .Addrs file.  This
            information is now contained in a section called
            ``.MIPS.Addrs'' in the instrumented object.

          o At runtime, only one .Counts file is generated per
            thread.  Previously, there was one .Counts file for
            every DSO and main program in a thread.  Multiple
            .Counts files are still produced when the program
            forks or makes multiprocessing calls.

          o pixie now automatically instruments all shared
            libraries in the program's internal liblist.  This
            means that for most shared programs, you only need to
            invoke pixie on the main executable.

          o The old -o option has been renamed -pixie_file.  The
            -pixie_file option allows the user to rename the
            instrumented output executable.  The default is to
            name the output file the same as the input file with
            the suffix .pixie added.

          o The option previously named -bbcounts has been renamed
            -counts_file.  The -counts_file option allows the user
            to rename the output counts file.  The default is to
            name the output file the same as the input file with
            the suffix .Counts added.

          o The -branchcounts option is now default.  See the
            description of the -branchcounts option below.

          o -verbose permits printing most pixie transformation
            messages.

          o The new -liblist option causes pixie to write out a
            list of dependent dynamic shared libraries to a file
            with the same base name as the main executable with
            .liblist as the suffix.  This has no effect when used
            on libraries or non-shared programs.  The command:

            pixie -liblist my_prog

            generates a file my_prog.liblist.

          o The new -autopixie option tells pixie to instrument all
            dependent dynamic shared libraries recursively.  This
            has no effect when used on libraries or non-shared
            programs.  -autopixie is on by default.

          o When the new -longbranch option is used, pixie
            transforms branches into jumps.  This should only be
            used when pixie complains about branches out of range.
            A branch can go out of range because pixie inserts
            code into the executable to gather runtime performance
            data, which can move a target that was previously
            within range beyond the reach of the branch.

       In addition, the following changes of note have occurred in
       pixie in recent 5.x releases.

          o When instrumenting a shared library, the text segment
            could grow to overlap with the data segment.  In the
            current implementation, the data segment is moved to a
            higher region in the virtual address space to avoid
            this conflict.  For the main program, if the user
            compiled it with the -ld option to specify the text and
            data address, it is the user's responsibility to leave
            enough space in the data segment to have it
            instrumented properly.  Otherwise, pixie will generate
            an error message.

          o Signal handling is done by intercepting the ksigaction
            system call at runtime and instrumenting the sigreturn
            system call when pixie is run.  Pixie image register
            values not saved in the sigcontext structure are thus
            saved and restored.

          o The -branchcounts option causes pixie to add more
            counting code so the instrumented program produces
            specific information on branch use.  pixstats
            automatically understands the new information.
            Specifically, information is produced for the following
            events:

               - Branch to branch taken

               - Branch to branch untaken

               - Untaken conditional branches

               - Taken conditional branches

               - Taken conditional branches with branch nops

               - Untaken conditional branches with branch nops

               - Direction-predicted conditional branches with
                 branch nops

               - Non-sequential fetches

               - Taken branches per conditional branch

               - Forward taken branches per conditional branch

               - Forward untaken branches per conditional branch

               - Backward taken branches per conditional branch

               - Backward untaken branches per conditional branch

          o The -pids option tells pixie to append the process ID
            number on the end of the .Counts name.  This is handy
            if you want to run the program instrumented with pixie
            through a variety of tests before generating the
            statistics with pixstats. This option should be used
            with the -pids option to pixstats, which is available
            on the 5.0.1 and later releases.

          o -threeway may be used on the 5.0.1 and later releases
            to suppress pixie transformations on
            threeway transfers (low-level graphics hardware
            access).  If you are instrumenting libgl.so with pixie
            on a system that has VGX, GTX or Reality Engine
            graphics, your program may use this special mechanism
            for some graphics operations.  If you experience
            problems running your instrumented graphics application
            on these systems (problems usually result in the
            graphics simply being black), re-instrument your
            libgl.so with the correct -threeway option.  Use
            -threeway 3000 for RealityEngine systems and -threeway
            6000 for VGX and GTX systems.

          o -quiet was added in 5.0.1 to suppress most pixie
            transformation messages.

          o -table can be used in 5.0.1 and later releases to cause
            pixie to write a copy of its translation table to the
            stdout device.  The translation table is a map of the
            original addresses to the instrumented addresses.

          o Static shared libraries are no longer supported.

          o -oldtrace is no longer supported.

          o Several options to pixie meant for internal use only
            are no longer available.  These are:

               - -get_shared_data

               - -calculate_registers

               - -sharedlib

       The following changes to pixstats(1) have been made in the
       5.x release.  See the manual page for more information.

          o -excludelibs tells pixstats to ignore statistics from
            libraries.  By default, pixstats outputs statistics
            that include all libraries.

          o -pids pid1 pid2 ... tells pixstats to combine the
            statistics found in prog.Counts.pid1, prog.Counts.pid2,
            etc., in its output.  If your program uses sproc(2) or
            fork(2), is compiled with Power Fortran or Power C, or
            was instrumented with the -pids option, the .Counts
            file resulting from its execution will be placed in
            prog.Counts.pid, and you must use pixstats -pids to
            process it.

          o The .Counts and .Addrs files generated by 4.0.5 pixie
            are no longer supported.  You cannot use old versions
            of these files with the performance tools on this
            release.

          o pixstats now looks at the file header to choose the
            timing table. If the file header indicates:

            MIPS3   r4000 timing is used

            MIPS2   r6000 timing is used

            MIPS1   r2000 timing is used

          o -disassemble disassembles basic blocks with zero
            counts.  The old behavior can be produced with
            -dislimit 1.

          o -source or -S option has been added to provide source
            listing with disassembly.

          o -mips2 has been added as a synonym for -r6000.

          o -mips3 has been added as a synonym for -r4000.

       3.1.13.1  Using_pixie(1)_and_pixstats(1)_with_DSOs  DSOs can
       be instrumented for basic block counting.  All shared
       libraries used by an instrumented executable must also be
       instrumented.

       3.1.13.1.1  Example:  Instrument a Program with Shared
       Libraries  To run a program instrumented with pixie, you
       must instrument all the dependent DSOs. pixie will now
       instrument the main program and the needed libraries
       automatically:

       pixie my_prog

       Or, you can instrument each one individually:

       pixie -noautopixie my_prog
       pixie lib1
       pixie lib2
           :
       pixie libn

       pixie tells you which libraries need to be instrumented if
       you use the -liblist option. With this option, pixie
       produces a file named my_prog.liblist that contains the
       names of the needed dynamic shared libraries with their full
       paths. This is convenient if you wish to build a dependency
       list for a makefile or shell script. For example:

       pixie -liblist -noautopixie my_prog
       foreach lib (`cat my_prog.liblist`)
               pixie $lib
       end

       WARNING: During static instrumentation, pixie cannot
       accurately detect dynamic shared libraries that are loaded
       with calls to dlopen().  rld will detect that the main
       program has been instrumented and will append .pixie to the
       name of any file to be opened with dlopen().  However, you
       still need to instrument these libraries yourself.

       The runtime linker (rld) needs to know where the
       instrumented libraries are.  Set the environment variable
       LD_LIBRARY_PATH to the directory where you keep the
       libraries or put the instrumented libraries in the current
       default search path for rld.

       setenv LD_LIBRARY_PATH `pwd`

       or

       setenv LD_LIBRARY_PATH .

       tells rld to look in the current directory.

       You could just as easily put all of your instrumented
       libraries in a single directory and set LD_LIBRARY_PATH to
       that path.  Just remember that to profile the program, both
       pixstats and prof need either the original or a link to:

          o The original DSOs and a.out

          o The instrumented DSOs and a.out

          o The .Counts files that were produced by running the
            instrumented program

       You can gather statistics for the whole program or a
       specific DSO:

       pixstats   gives the statistics (including DSOs).

       pixstats  -excludelibs gives the statistics
                           (excluding DSOs).

       pixstats      gives the statistics of a DSO.

       3.1.13.1.2  Example: Instrument a Program That Uses Multiple
       DSOs

         1.  Run pixie on the program to instrument both the main
             program and the shared libraries it depends on:

             pixie my_prog

         2.  Run the program to completion:

             my_prog.pixie file1 file2

             There should now be one .Counts file,
             my_prog.Counts, which was created when the
             application ran.

         3.  Run pixstats to generate the statistics:

             pixstats my_prog > my_prog.stat

       3.1.13.1.3  Example:_Instrument_an_MP_Program  In this
       example, you instrument a Fortran Multiprocessing Program.

         1.  Compile an MP Fortran program:

             f77 -o myprog -mp myprog.f

         2.  Instrument the program and its libraries:

             pixie myprog

         3.  Run the program to completion:

             setenv LD_LIBRARY_PATH .
             myprog.pixie

             There should be one .Counts file per thread.  For
             example, running myprog.pixie with four threads
             produces:

              myprog.Counts.1001, myprog.Counts.1002,
              myprog.Counts.1003, myprog.Counts.1004,
              .
              .

         4.  Analyze the output in one of the following ways:

                o To analyze each of the threads:

                  pixstats myprog myprog.Counts.1001
                  pixstats myprog myprog.Counts.1002
                  pixstats myprog myprog.Counts.1003
                  pixstats myprog myprog.Counts.1004

                o To analyze the sum of the threads:

                  pixstats myprog myprog.Counts.*

                o To analyze the sum of the threads excluding all
                  libraries:

                  pixstats myprog -dso myprog

                o To analyze a thread using prof:

                  prof -pixie myprog myprog.Counts.1004

                o To analyze all threads together using prof:

                  prof -pixie myprog myprog.Counts.*

       3.2  KPIC_Directives

       PIC code is generated if either the directive .option pic2
       appears in the assembler file or the assembler (as(1)) is
       invoked with -KPIC in the absence of an explicit .option
       pic0 or .option pic2 in the assembler file. Unless PIC code
       is being generated, the other options in this section are
       ignored by the assembler, with the exception of .gpword,
       which becomes .word.  Thus, you can easily use the same
       assembler file for generating PIC and non-PIC (that is,
       non-shared) objects by not placing .option pic0 or .option
       pic2 in the assembler file and invoking the assembler
       without -KPIC (for non-shared) or with -KPIC (for PIC code).

          o  .option pic2

            This directive forces the assembler to mark the output
            object file as containing PIC code and activates the
            following directives.  It overrides the command line
            argument.  Normally, you don't need to specify this
            directive.  Instead, you should use -KPIC or
            -non_shared to toggle between generating PIC or non-
            PIC.

            Note that even though -KPIC is the default for the
            high-level language driver (cc/pc/f77), it is not the
            default for assembly sources.  In the absence of an
            .option pic0 or .option pic2, you must explicitly
            specify -KPIC for compiling .s files to get PIC code.

          o  .cpload reg

            This directive expands into three instructions that set
            the gp register to the context pointer value for the
            current function. It should always be placed in a
            noreorder area (that is, it should be preceded by .set
            noreorder and followed by .set reorder).  The
            expansion is:

            lui    gp,_gp_disp
            addiu  gp,gp,_gp_disp
            addu   gp,gp,reg

            _gp_disp is a reserved symbol that the linker sets to
            the distance between the lui instruction and the
            context pointer.  This directive is required at the
            beginning of each subroutine that uses the gp register.

            You must add this directive at the beginning of every
            procedure, with the exception of leaf procedures that
            do not access any global variables and procedures that
            are static (that is, not marked .globl or .extern).

            Note:  The MIPS ABI requires that .cpload use register
                   $25.

          o  .cprestore  offset

            This directive causes the assembler to issue:

            sw      gp,offset(sp)

            where it appears.  Additionally, it causes the
            assembler to emit:

            lw      gp,offset(sp)

            after every jump-and-link (jal) (but not branch-and-
            link (bal)) operation, thereby restoring the gp
            register after function calls.  You are responsible for
            allocating the stack space for the gp.  This space
            should be in the saved register area of the stack frame
            to remain consistent with calling and debugger
            conventions.

          o  .gpword local-sym

            This directive is similar to .word, except that the
            relocation entry for local-sym has the R_MIPS_GPREL32
            type.  After linkage, this results in a 32-bit value
            that is the distance between local-sym and the context
            pointer (that is, the gp).  local-sym must be local.
            It is currently used for PIC switch tables.

          o  .cpadd  reg

            This directive adds the value of the context pointer
            (gp) to reg.

       EXAMPLES:

       The following is a simplified version of the hello world
       program:

           .option pic2
           .data
           .align  2
       $$5:
           .ascii  "hello world\X0A\X00"
           .text
           .align  2
       main:
           .set     noreorder
           .cpload $25
           .set     reorder
           subu    $sp, 40
           sw      $31, 36($sp)
           .cprestore      32
           la      $4, $$5
           jal     printf
           move    $2, $0
           lw      $31, 36($sp)
           addu    $sp, 40
           j       $31

       The actual instructions generated by the assembler will be:

           lui     gp,0            #
           addiu   gp,gp,0         # generated by .cpload
           addu    gp,gp,t9        #
           lw      a0,0(gp)        # gp-relative addressing used
           lw      t9,0(gp)        # t9 is used for func. call
           addiu   sp,sp,-40
           sw      ra,36(sp)
           sw      gp,32(sp)       # from .cprestore
           jalr    ra,t9           # jal is changed to jalr
           addiu   a0,a0,0
           lw      ra,36(sp)
           lw      gp,32(sp)       # activated by .cprestore
           move    v0,zero
           jr      ra
           addiu   sp,sp,40
           nop

       PIC Linkage Conventions

          o The MIPS ABI requires register t9 ($25) to be used for
            indirect function calls, so .cpload should always use
            $25.  Noreorder mode must be in effect when the .cpload
            directive is encountered.  Also, make sure that t9 is
            not in use before any function call, as its value will
            be destroyed.

          o If your program uses an indirect jump (jalr), you must
            also use t9 as the jump register.

          o If you have an unconditional jump to an external label:

            j  _cerror

            you have to rewrite it into an indirect jump via t9,
            that is:

            la t9,_cerror
            j  t9

          o If you use a branch-and-link (bal) instruction for
            calling a function in the same file, and the target
            procedure begins with a .cpload, your bal must be to an
            alternate entry point in the function after the
            .cpload:

            foo: .set    noreorder       # callee
                 .cpload $25
                 .set    reorder
            $$1:        ...              # alternate entry point
                 ...
                 j       $31             # foo returns

            bar:        ...              # caller
                 ...
                 bal     $$1             # bypass the .cpload
                 ...

            This is very important because .cpload assumes register
            $25 contains the address of foo, but in this case, $25
            is not set up.  Note that because both foo and bar
            reside in the same file, they must have the same value
            for $gp.  So the .cpload instructions can be and must
            be bypassed.  However, because foo can still be called
            from outside, the .cpload is still required.

            Alternatively, if you don't want to have an alternate
            entry point, you can set up register $25 before the
            bal:

                 la      t9,foo
                 bal     foo

            or, if foo is an external symbol, you can simply use a
            jal (and allow the assembler to set up t9 for you).

            Both of these methods are slightly less efficient than
            adding an alternate entry.

          o  .gpword and .cpadd are used together to implement a
            position-independent jump table (or any table of text
            addresses).  Entries of the address table created by
            .gpword are converted into displacements from the
            context pointer.  To get the correct text address, use
            .cpadd to add the value of gp back to them.  Because
            the gp is updated by the runtime linker, the correct
            text address can be reconstructed regardless of the
            location of the DSO.
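
       The displacement-plus-base idea behind .gpword and .cpadd
       can be sketched in C as a loose analogy.  The names
       offsets, resolve, and entry_survives_relocation are
       hypothetical, and a character buffer stands in for the text
       segment and the gp; this is not the code the assembler
       emits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a table built with .gpword: each entry is a
 * displacement from a base address rather than an absolute address. */
static const ptrdiff_t offsets[3] = { 0, 6, 12 };

/* Analogue of .cpadd: add the base back to a stored displacement to
 * recover a usable address. */
static const char *resolve(const char *base, int index)
{
    assert(index >= 0 && index < 3);
    return base + offsets[index];
}

/* Copy the "segment" to a new location and resolve the same entry
 * against both bases.  The table itself is never rewritten.
 * Returns 1 if both copies yield the same entry contents. */
static int entry_survives_relocation(int index)
{
    static const char segment[] = "case0\0case1\0case2";
    char relocated[sizeof segment];

    memcpy(relocated, segment, sizeof segment);
    return strcmp(resolve(segment, index),
                  resolve(relocated, index)) == 0;
}
```

       Because the table holds only displacements, resolve() finds
       the right entry in either copy of the buffer, in the same
       way that a .gpword table remains usable wherever the DSO is
       mapped.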


       3.3  Library_and_System_Call_Functionality

       The following additions and changes were made to library and
       system call functionality between versions 4.1 and 5.2 of
       the IRIS Development Option.

          o The source programming interfaces to system calls and
            system libraries in IRIX 5.0.1 and later are
            compatible with those in IRIX 4.0.  Code that compiled
            under IRIX 4.0 and uses commonly recognized practices
            for writing portable code should compile without
            modification on IRIX 5.0.1 and later.

          o Reentrant versions of some libc functions have been
            provided.  These correspond to the POSIX 1003.4a
            specification for reentrant functions.  These functions
            are present in the default compilation mode.  If you
            are compiling in POSIX-compliant mode (_POSIX_SOURCE
            defined), programs should be compiled with the feature
            test macro _SGI_REENTRANT_FUNCTIONS defined.

          o The POSIX 1003.4a specification for making stdio
            multi-thread safe has been implemented.  In the
            default compilation mode, all stdio functions are
            thread safe.  In POSIX or ANSI compilation mode, the
            program must define the feature test macro
            _SGI_MP_SOURCE in order to get the thread safe versions
            of stdio functions and macros.

          o The handling of the global error value, errno, has
            changed from IRIX 4.0.  If the program includes
            <errno.h> and defines the feature test macro
            _SGI_MP_SOURCE, references to errno actually reference
            a per-thread errno; otherwise, the global variable
            errno is accessed.  All system calls update both the
            per-thread and global versions of errno.

          o The MIPS ABI mutual exclusion library libmutex.so is
            supported.  The actual implementation of the routines
            is in libc.so.1.  These routines, init_lock,
            acquire_lock, release_lock, and stat_lock, provide
            low-level portable access to a mutual exclusion
            primitive (see abilock(3x)).

          o The math library libm.a has been carefully checked to
            ensure its conformance with both the SVID 3rd Edition
            and ANSI X3.159-1989.  Specific information can be
            found in the man pages sinh, exp, bessel, floor, gamma,
            math, hypot, sqrt, and trig.

          o The interface to the function scalb(3m) has changed to
            conform to SVR4.  In previous releases, the type of the
            second argument to scalb (the exponent) was int.  In
            this release, the type of the second argument is
            double.  In addition, the functions scalb and rint have
            been moved from the math library to the C library.

          o A new option, flush_to_zero, has been added to
            libfpe.a.  On an R4000-based system, using this option
            can improve execution performance if many floating
            point underflows occur.
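
       The errno handling described in this list can be sketched
       as follows.  This is a minimal sketch, assuming a C
       compiler with the standard <errno.h> and <stdio.h> headers;
       the helper name open_errno is hypothetical.  On systems
       other than IRIX, defining _SGI_MP_SOURCE is assumed to have
       no effect, so the code simply reads whichever errno the
       system provides.

```c
/* Assumption: on IRIX this feature test macro selects the per-thread
 * errno; elsewhere it is just an unused macro. */
#define _SGI_MP_SOURCE 1

#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Try to open a file for reading and report the errno value left by
 * the attempt: 0 on success, the failure code otherwise. */
static int open_errno(const char *path)
{
    FILE *fp;

    assert(path != NULL);
    errno = 0;                 /* clear any stale value first */
    fp = fopen(path, "r");
    if (fp == NULL)
        return errno;          /* read through the errno macro */
    fclose(fp);
    return 0;
}
```

       The source is the same whether errno resolves to the global
       variable or to a per-thread value; only the feature test
       macro at the top changes which one is referenced.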

       4.  Bug_Fixes

       This section lists the significant bugs fixed in the base
       compilers since the IRIX 4.0.1 release.

       4.1  Compiler_Bug_Fixes

       4.1.1  Linker_(ld(1))

          o The default cache size was changed to the size of the
            R4000 cache (8K) in 5.0.1.  This default may still be
            changed by use of the -Xcachesize size option to ld.

          o The size of the bss is now one-half what it was in IRIX
            4.0.1.  The bss region in an a.out is now essentially
            the same size as it would have been in IRIX 3.3.3.

          o Incremental linking using the -A option has been
            fixed.  Adding -allow_jump_at_eop to an ld -A link is
            no longer necessary.

          o The -Xlocaldata option now works correctly, including
            its special symbols.

          o Many memory leaks in the linker have been fixed.  This
            regains most of the linker performance lost in the
            previous release.

       4.1.2  Run-time_Linker_(rld(1))_and_libdl(3x)  The following
       bugs were fixed in the 5.0.1 release of rld and the dynamic
       linking library libdl.

          o In 5.0.1, dlopen(3x) of a shared object which was
            created with the -init option calls the -init routine
            before dlopen returns.  In 5.0, the -init routine was
            not called at dlopen.

          o In 5.0, libdl routines could call exit(2) under certain
            circumstances (for example, if the desired library
            could not be opened).  In 5.0.1, the libdl routines
            return an error value under these circumstances as
            documented in their manual pages.

          o In the 5.0 release, when a shared object was opened via
            dlopen(3x), its symbols became globally visible.  This
            behavior has been changed to be consistent with SVR4.
            In the 5.0.1 release, objects loaded by one invocation
            of dlopen may not directly reference symbols from
            objects loaded by a different dlopen invocation.  Those
            symbols may, however, be referenced indirectly using
            dlsym(3x).

            See the NOTES section of the dlopen(3x) manual page for
            further information.
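
       The indirect reference through dlsym(3x) mentioned above
       can be sketched as follows.  The helper name lookup_symbol
       is hypothetical.  Passing a null path to dlopen to obtain a
       handle for the running program is an assumption that holds
       on current POSIX systems and may not apply to the release
       described here.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Open the object at `path` (NULL means the running program itself)
 * and look up `name` explicitly through the returned handle.
 * Returns NULL if the object cannot be opened or the symbol is absent.
 * The handle is deliberately left open so the address stays valid;
 * a real program would keep it and call dlclose() when finished. */
static void *lookup_symbol(const char *path, const char *name)
{
    void *handle;

    assert(name != NULL);
    handle = dlopen(path, RTLD_LAZY);
    if (handle == NULL)
        return NULL;           /* dlerror() describes the failure */
    return dlsym(handle, name);
}
```

       Code in one dlopen'ed object that needs a symbol from
       another would call a helper of this kind instead of naming
       the symbol directly.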

       4.1.3  Assembler_(as(1,5))  Several bugs in the assembler
       have been fixed since the previous release.  These include
       bugs in the various assembler optimizations such as software
       pipelining and peephole optimization.

       4.1.4  Optimizer_(uopt(5))  Numerous significant bugs have
       been fixed since the IRIX 4.0.5 release.

       4.1.5  Code_Generator_(ugen(5))  Several problems with code
       generation have been fixed since IRIX 4.0.5.

          o Several problems with unaligned data accesses have been
            fixed. (1127521, 129034)

          o Code generation for FORTRAN's SIGN function has been
            fixed.

          o An overflow problem with Pascal passing large objects
            has been fixed (126986).

       4.1.6  The_Debugger_dbx(1)  In the version of dbx released
       with 5.0, attempts to use the

       stop expression

       or

       trace expression

       constructs failed.  The dbx documentation states:

       ``If an expression is given, that expression is assumed to
       be a pointer and the thing-pointed-at is inspected at the
       `appropriate' points.''

       In the 5.0 version, the expression itself was inspected at
       `appropriate' points, rather than the thing-pointed-at by
       the expression.  The result was an inoperative trace or
       stop command.

       This problem was fixed in 5.0.1.

       4.1.7  Performance_Tools  The stability of pixie was greatly
       improved in the 5.0.1 release.  In addition, as of 5.0.1 it
       is possible to instrument a multiprocessing program with
       pixie.

       As of the 3.18/5.2 release, prof can now collect statistics
       about dynamic shared libraries.  In addition, multiprocessor

       support is now working.

       4.1.8  Libraries  The following bugs have been fixed in
       libraries.

          o The exception handling library, libexc, has been
            changed to allow for correct functioning of non-local
            GOTOs in Pascal code.  In previous releases, non-local
            GOTOs appearing in Pascal code in a shared object did
            not function correctly.  Due to implementation changes
            in the handling of non-local GOTOs necessary to correct
            this problem, all Pascal code, whether in a shared
            object or not, should be compiled and relinked in 5.0.1
            and later.  If you are certain that none of your Pascal
            code uses non-local GOTOs, you can ignore this
            requirement.

          o The atof and strtod functions now return correctly
            signed HUGE_VAL for arguments too large in magnitude.
            In addition, strtod sets errno to ERANGE.

          o The ldexp function now correctly returns HUGE_VAL and
            sets errno to ERANGE if the result overflows.

          o The precision of conversion between ASCII and binary
            floating point has been significantly improved in this
            release.

          o Rounding into the least-significant digit of an output
            floating point format is now done correctly in all
            cases.  In previous releases, printing .00053 with a
            format of %.3f printed 0.000 instead of the (correct)
            0.001.

          o Various bugs against math library manual pages have
            been fixed.
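
       The overflow and rounding behavior listed above can be
       checked with a short C sketch.  The helper names are
       hypothetical, and snprintf is assumed to be available (it
       comes from C99).

```c
#include <assert.h>
#include <errno.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Convert `text` with strtod and store the errno value it leaves
 * behind in *err (0 if no error was reported). */
static double parse_double(const char *text, int *err)
{
    double value;

    assert(text != NULL && err != NULL);
    errno = 0;
    value = strtod(text, NULL);
    *err = errno;
    return value;
}

/* Scale x by 2 to the power `exponent`.  A result too large for a
 * double comes back as HUGE_VAL; errno is cleared first so the
 * caller can inspect it afterwards. */
static double scale_by_power_of_two(double x, int exponent)
{
    errno = 0;
    return ldexp(x, exponent);
}

/* Return 1 if `value` printed with the %.3f format yields exactly
 * the string `expected`. */
static int prints_as_3f(double value, const char *expected)
{
    char buf[64];

    assert(expected != NULL);
    snprintf(buf, sizeof buf, "%.3f", value);
    return strcmp(buf, expected) == 0;
}
```

       For example, parse_double("1e999", &err) is expected to
       return HUGE_VAL with err set to ERANGE, and
       prints_as_3f(.00053, "0.001") is expected to return 1.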

       5.  Known_Problems_and_Workarounds

       This section lists known problems with the 3.18 base
       compiler portion of the IRIS Development Option.

       5.1  Optimizer_(uopt(5))

          o In certain cases (usually with very large subroutines),
            uopt has grown unreasonably large while running (over
            70 MB).  This causes systems with smaller amounts of
            memory to thrash and, in extreme cases, to run out of
            available swap space.  This should be suspected if uopt
            dies with a ``signal 9,'' which means that the process
            was killed externally (for example, by the operating
            system), rather than by a bug that caused an internal
            failure.

            Almost all optimizer problems can be narrowed to a
            single subroutine.  By identifying the problem
            routine(s), you do not need to suppress optimization on
            the whole program, only on the smaller subset.

          o A considerable number of new optimizations have been
            added to the assembler.  These optimizations are turned
            on at level -O2; if they fail, they tend to look like
            optimizer problems.

       5.2  Performance_Tools

          o The following known problems exist in pixie(1):

               - Trace features are currently not supported.  That
                 is, they have not been tested and thus cannot be
                 guaranteed to work.

               - Objects loaded using dlopen() cannot be
                 instrumented automatically.

          o The following problem exists in pixstats(1):

            The DSOs must be in or linked to the current directory
            when executing pixstats.

          o The following problems exist in prof(1):

               - prof (-pixie) -testcoverage or -gprof cannot
                 process basic block counts for shared libraries.
                 If you need to process basic block counts, compile
                 the code with the -non_shared flag.

               - prof cannot process information from dynamic
                 shared libraries that have been opened with











                                  - 2 -



                 dlopen() and have the same name but different
                 paths, for example:

                 /path1/libl.so
                 /path2/libl.so
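
       A program that opens DSOs by path can detect this situation
       before it calls dlopen().  The following is a minimal
       sketch; base_name and names_collide are hypothetical
       helpers, not part of prof or of any SGI library.

```c
#include <string.h>

/* Return the part of path after the last '/', or path itself if
 * it contains no '/'. */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Returns 1 if the two paths are different but end in the same
 * file name (the case prof cannot handle), 0 otherwise. */
static int names_collide(const char *a, const char *b)
{
    return strcmp(a, b) != 0 &&
           strcmp(base_name(a), base_name(b)) == 0;
}
```

       For the two paths shown above, names_collide() returns 1.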


       5.3  Libraries

       These are known problems in compiler-associated libraries:

          o In general, routines in the -lm43 library might not
            conform to either SVR4 or IEEE with respect to
            diagnostics or return values.  These discrepancies are,
            however, described in the manual pages of the
            constituent functions.  (See Section 3.5 for math
            library changes.)  The following particular problems
            are known (these problems exist in -lm43 routines, but
            not in -lm routines):

               - The -lm43 functions pow, hypot, and cabs might
                 fail to return NaN when given a NaN argument.  The
                 return value in these cases is Infinity for hypot
                 and cabs and either Infinity or zero for pow.

               - If the magnitude of their argument is greater than
                 one, the -lm43 functions acos and asin return
                 zero, pi/2, or pi rather than the (correct) NaN.

               - The -lm43 y0, y1, and yn functions return NaN
                 (instead of -Infinity) when the argument is zero.
                 These functions also produce underflow
                 inconsistently (with respect to -lm).

               - The version of gamma in the -lm43 library loops
                 indefinitely if it is given Infinity as an
                 argument.

          o The single-precision version of log, logf, is
            imprecise.  In particular, logf(x) might not
            approximate -logf(1/x) as well as expected.  The
            double-precision version does not exhibit this
            behavior.
















 



                                  - 1 -



       1.  Dynamic_Shared_Objects

       A Dynamic Shared Object, or DSO, is an ELF format object
       file, very similar in structure to an executable program but
       with no "main".  It has a shared component, consisting of
       shared text and read-only data; a private component,
       consisting of data and the GOT (Global Offset Table);
       several sections that hold information necessary to load and
       link the object; and a liblist, the list of other shared
       objects referenced by this object. Most of the libraries
       supplied by SGI are available as dynamic shared objects.

       A DSO is relocatable at runtime; it can be loaded at any
       virtual address.  A consequence of this is that all
       references to external symbols must be resolved at runtime.
       References from the private region (e.g. from private data)
       are resolved once at load-time; references from the shared
       region (e.g. from shared text) must go through an
       indirection table (GOT) and hence have a small performance
       penalty associated with them.

       Code compiled for use in a shared object is referred to as
       Position Independent Code (PIC), whereas non-PIC is usually
       referred to as non-shared.  Non-shared code and PIC cannot
       be mixed in the same object.

       At runtime, exec loads the main program and then loads rld,
       the runtime linking loader, which finishes the exec
       operation.  Starting with main's liblist, rld loads each
       shared object on the list, reads that object's liblist, and
       repeats the operation until all shared objects have been
       loaded.  Next, rld allocates common and fixes up symbolic
       references in each loaded object.  (This is necessary
       because we don't know until runtime where the object will be
       loaded.)  Next, each object's init code is executed.
       Finally, control is transferred to "__start".
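
       The liblist traversal described above can be modeled with a
       short C sketch.  It is an illustration only and is not rld's
       actual code: the struct object type, the example table, and
       the breadth-first visiting order are assumptions, and symbol
       fix-up and init code are not modeled.

```c
#include <string.h>

#define MAX_OBJECTS 16
#define MAX_DEPS     4

/* Hypothetical stand-in for an object file and its liblist. */
struct object {
    const char *name;
    const char *liblist[MAX_DEPS];  /* unused slots are NULL */
};

/* Hypothetical example: a main program and three DSOs. */
static const struct object example[] = {
    { "main",    { "libA.so", "libB.so" } },
    { "libA.so", { "libc.so" } },
    { "libB.so", { "libc.so" } },
    { "libc.so", { NULL } },
};

/* Return the index of name in table, or -1 if it is absent. */
static int find_object(const struct object *table, int count,
                       const char *name)
{
    for (int i = 0; i < count; i++)
        if (strcmp(table[i].name, name) == 0)
            return i;
    return -1;
}

/* Fill order[] with table indices in the order the objects would
 * be loaded, starting from table[start].  Each object is loaded
 * once, then its own liblist is read.  count must not exceed
 * MAX_OBJECTS.  Returns the number of objects loaded. */
static int load_order(const struct object *table, int count,
                      int start, int *order)
{
    int loaded[MAX_OBJECTS] = {0};
    int n = 0;

    order[n++] = start;
    loaded[start] = 1;
    for (int head = 0; head < n; head++) {
        const struct object *obj = &table[order[head]];
        for (int d = 0; d < MAX_DEPS && obj->liblist[d]; d++) {
            int idx = find_object(table, count, obj->liblist[d]);
            if (idx >= 0 && !loaded[idx]) {
                loaded[idx] = 1;
                order[n++] = idx;
            }
        }
    }
    return n;
}
```

       For the example table, load_order(example, 4, 0, order)
       visits main, libA.so, libB.so, and then libc.so, loading
       libc.so only once even though two liblists name it.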

       For a more complete discussion of DSOs, including answers to
       questions frequently asked about them, see the dso(5) man
       page.