6.0 Power Fortran Accelerator Release Notes

________________________________________________

Contributors: Written by Bron Nelson
________________________________________________

© Copyright 1993, Silicon Graphics, Inc. - All rights reserved

This document contains proprietary information of Silicon Graphics,
Inc. The contents of this document may not be disclosed to third
parties, copied, or duplicated in any form, in whole or in part,
without the prior written permission of Silicon Graphics, Inc.

Restricted Rights Legend

Use, duplication, or disclosure of the technical data contained in
this document by the Government is subject to restrictions as set
forth in subdivision (c) (1) (ii) of the Rights in Technical Data and
Computer Software clause at DFARS 52.227-7013, and/or in similar or
successor clauses in the FAR, or the DOD or NASA FAR Supplement.
Unpublished rights reserved under the Copyright Laws of the United
States. Contractor/manufacturer is Silicon Graphics, Inc., 2011 N.
Shoreline Blvd., Mountain View, CA 94039-7311.

Document Number 007-1669-010

Silicon Graphics, Inc.
Mountain View, California

Silicon Graphics and IRIS are registered trademarks and POWER Fortran
Accelerator, POWER Series, Personal IRIS, IRIS Crimson, IRIS Indigo,
and IRIX are trademarks of Silicon Graphics, Inc. VMS is a trademark
of Digital Equipment Corporation.

1. Introduction

The Silicon Graphics® Power Fortran Accelerator (PFA) optimizes
Fortran 77 code for Silicon Graphics' multiprocessor systems. It
performs data dependency analysis and inserts special compiler
directives to parallelize DO loops where possible.

The PFA option includes:

o  The Power Fortran Accelerator, pfa (this is the 32-bit version)

o  The MIPSPro Fortran 77 parallel front end, fef77p (this is the
   64-bit version)

o  The pfa(1) manual page

You can use the PFA option (-pfa) with the Fortran 77 compiler f77(1)
on any IRIS system.
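
As a rough illustration of the kind of directive involved, the
following sketch shows a DO loop with no loop-carried dependence,
marked with a DOACROSS directive of the form discussed later in these
notes. The program name, array names, and sizes are hypothetical. The
LOCAL clause is an assumption taken from the general SGI directive
set and is not described in this document. The directive PFA would
actually emit for a given loop may differ.

```fortran
      PROGRAM AXPY
C     Hypothetical example: names and sizes are illustrative only.
      INTEGER N
      PARAMETER (N = 1000)
      REAL A(N), B(N), S
      INTEGER I
      S = 2.0
      DO 10 I = 1, N
         A(I) = 0.0
         B(I) = REAL(I)
   10 CONTINUE
C     Each iteration writes a distinct A(I) and reads only B(I)
C     and S, so no iteration depends on another. The directive
C     below (clause list is an assumption) marks the loop as
C     parallel. It starts with C in column 1, so a compiler that
C     does not recognize it treats it as an ordinary comment.
C$DOACROSS LOCAL(I), SHARE(A, B, S)
      DO 20 I = 1, N
         A(I) = A(I) + S * B(I)
   20 CONTINUE
      PRINT *, A(1), A(N)
      END
```

Because the directive is a comment line in standard Fortran 77, the
same source still compiles and runs serially where the directive is
not recognized.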

Note: You can develop, compile, and even run parallel code on any
Silicon Graphics IRIS workstation. However, the performance benefits
of parallel execution on multiple processors are possible only when
you run the application program on an IRIS multiprocessor system.

This document contains the following chapters:

1.  Introduction
2.  Installation Information
3.  New Features for This Release
4.  Known Problems and Workarounds

Note: Packaged with this software is a separate sheet that contains
the Software License Agreement. This software is provided to you
solely under the terms and conditions of the Software License
Agreement. Please take a few moments to review the Agreement.

1.1 Release Identification Information

Following is the release identification information for PFA:

Software Option Product         Power Fortran Accelerator
Version                         6.0
Product Code                    SC4-PFTN-6.0
System Software Requirements    IRIX 6.0
                                IRIS Development Option 6.0

1.2 Online Release Notes

After you install the online documentation for a product (the
relnotes subsystem), you can view the release notes on your screen.

If you have a graphics system, select ``Release Notes'' from the
Tools submenu of the Toolchest. This displays the grelnotes(1)
graphical browser for the online release notes. Refer to the
grelnotes(1) man page for information on options to this command.

If you do not have a graphics system, you can use the relnotes
command. Refer to the relnotes(1) man page for information on
accessing the online release notes.

1.3 Product Support

Silicon Graphics, Inc., provides a comprehensive product support
maintenance program for its products.

If you are in the U.S. or Canada and would like support for your
Silicon Graphics-supported products, contact the Technical Assistance
Center at 1-800-800-4SGI.

If you are outside the U.S. or Canada, contact the Silicon Graphics
subsidiary or authorized distributor in your country.

2. Installation Information

This chapter lists supplemental information to the IRIS Software
Installation Guide. The information listed here is product-specific;
use it with the installation guide to install this product.

2.1 PFA Subsystems

Following is a description of the PFA subsystems:

pfa_dev.sw.pfa          The Power Fortran Accelerator executable
                        images
pfa_dev.man.pfa         The on-line manual page
pfa_dev.man.relnotes    This document

2.2 PFA Subsystem Disk Space Requirements

The PFA subsystem occupies about 10 Mbytes of disk space.

2.3 Installation Method

All of the subsystems for PFA can be installed using IRIX. You do not
need to use the miniroot. Refer to the IRIS Software Installation
Guide for complete installation instructions.

The procedure for installing the CROSS64 development option (to be
installed from the 6.0 IDO CD-ROM onto a machine running IRIX 5.2) is
specialized. For the details of this procedure, consult the 6.0 IRIX
Development Option release notes. For information about using the
CROSS64 development option, see the release notes for the 6.0 Base
Compiler Development Option.

2.4 Prerequisites

To use PFA 6.0 you must be running version 6.0 of the MIPSPro
Fortran 77 compiler and IRIX release 6.0.

3. New Features for This Release

This chapter covers the changes and additions to the Power Fortran
Accelerator (PFA) since the 3.10 release (IRIX 4.0.5 system release).
Other changes of interest to PFA users are documented in the 4.0.1
Fortran 77 Release Notes.

o  The syntax for the -save option has changed. See the man page for
   details.

o  CASEVision/WorkShop Pro MPF is a new programming tool integrated
   with WorkShop 2.0. It helps you to understand the structure and
   parallelization of your Fortran program. It consists of a new
   program, the Parallel Analyzer View, cvpav, which reads analysis
   files generated by version 4.0 (or greater) of PFA and displays
   information about the loops in those files in a window.

   The analysis file contains the information currently shown in
   PFA's listing file, plus some additional, more detailed
   information. It has the extension ".anl" and is generated when the
   "-pfa keep" option is used. The Parallel Analyzer View presents
   that information in a more comprehensible form with a graphical
   user interface.

   The Parallel Analyzer View allows examination of a program's loops
   in conjunction with a performance experiment on a run compiled for
   a uniprocessor. (The data for parallelized loops in a run compiled
   for a multiprocessor cannot be properly retrieved at this time.)
   When run in this mode, the source displays are annotated with
   line-level performance data, and the list of loops may be sorted
   in order of performance cost, so that you can focus your attention
   on the important loops.

o  It is now legal to have a symbolic PARAMETER name in the SHARE
   clause of a DOACROSS directive. Previously, only variable names
   were allowed.
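
The SHARE clause change in the last item can be sketched as follows.
This is a minimal example under stated assumptions: the program name,
the names FACTOR, N, and X, and the LOCAL clause are all illustrative
and do not come from these notes.

```fortran
      PROGRAM SHRPAR
C     Hypothetical example: names and sizes are illustrative only.
      INTEGER N
      REAL FACTOR
      PARAMETER (N = 100, FACTOR = 0.5)
      REAL X(N)
      INTEGER I
      DO 10 I = 1, N
         X(I) = REAL(I)
   10 CONTINUE
C     FACTOR and N are symbolic PARAMETER constants, not
C     variables. According to the release note, naming them in
C     the SHARE clause is now accepted; previously only variable
C     names such as X were allowed there.
C$DOACROSS LOCAL(I), SHARE(X, FACTOR, N)
      DO 20 I = 1, N
         X(I) = X(I) * FACTOR
   20 CONTINUE
      PRINT *, X(1), X(N)
      END
```

With an earlier release, the likely workaround would have been to
leave the PARAMETER names out of the SHARE list.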

4. Known Problems and Workarounds

This chapter describes known problems with the current release of PFA
and how to work around them.

o  This release is a transitional release, moving from the old
   scheme, in which PFA runs as a separate pass, toward full
   integration of PFA with the Fortran front end. In this initial
   release, 32-bit compilations use the old method, while 64-bit
   compilations use the new method. As a result, there are numerous
   minor differences.
   In particular, a number of bug fixes have been applied only to the
   newer version and have not been retrofitted to the old.

o  The 64-bit loader uses a different syntax for the option that
   gives each thread in an MP program its own copy of a common block.
   In the 32-bit loader, the option is

        -Xlocaldata common_name_

   while the 64-bit loader uses

        -Wl,-Xlocal,common_name_

   Also, the 32-bit loader allows a list of common names with a
   single -Xlocaldata option, while the 64-bit loader allows only one
   name per -Xlocal option (but does allow multiple -Xlocal options).

o  Currently, the 64-bit version of PFA generates incorrect code if
   any of the variables in a DO statement (i.e., the loop index, the
   base, the bound, or the stride) are of type integer*8. Such
   variables should be declared to be integer*4. The effect of this
   bug is mitigated somewhat by the related bug that arrays may not
   contain more than 2**31 elements (see the Fortran release notes),
   so an integer*4 index should be sufficient.

o  Source lines with a leading tab character should be allowed to be
   of any length (a VMS extension). However, PFA enforces the line
   length limit (default = 72) even on lines with leading tabs. To
   work around this problem, use the -extend_source or -col120 option
   to f77. This bug is fixed in the 64-bit version.

o  There were considerable changes to the syntax used to do procedure
   inlining. Most of the old command line syntax is still supported.
   In particular, the syntax -create -lib=foo still works. However,
   PFA no longer accepts the reverse ordering (that is, -lib=foo
   -create is no longer accepted). The 64-bit version no longer
   accepts any of the old inlining syntax.

o  Occasionally, the Fortran compiler gives the warning

        assignment to static scalar: xxx in multi processed region.

   For PFA-generated code, you can ignore this warning safely.
   It is caused when PFA performs scalar optimizations making a
   particular variable extraneous, but then fails to delete all
   assignments to that (now useless) variable. The Fortran compiler
   becomes suspicious, and so it gives a warning. Although the code
   is less than perfect, this useless assignment wastes only a little
   bit of time and does not affect the correctness of the code.

o  The line numbering directives used for debugging are frequently
   off by one line.

o  Syntax errors from an included file are reported using the name of
   the original file, not the name of the included file. However, the
   line number used in the error message is relative to the beginning
   of the included file, not the original file. This is fixed in the
   64-bit version.

o  PFA becomes confused if there are line number directives embedded
   inside a single continued line. This can occur if cpp is used to
   include the continuation part of a line. For example:

              a(i) = b(i) +
        #include "more.h"
             x       d(i)

   where the file more.h contains

             x       c(i) +

   Here, the source file (and line numbers) change while in the
   middle of processing a single Fortran statement. To compile
   something of this nature, use a two-step process:

   1.  Use the -P option to f77 to run cpp on the source without
       producing the line numbering directives.

   2.  Run f77 on the generated .i file.

   For example:

        f77 -P foo.f
        f77 -c -pfa keep foo.i

o  The limit on the length of a source line is enforced for line
   number directives, even though it should not be. If you have very
   long file names (or include files with very long path names) PFA
   gives a bogus error message. Typically this (bogus) message is:

        illegal characters in an octal constant

   To work around this problem, use the -extend_source or -col120
   option to f77. This is fixed in the 64-bit version.

o  If a routine uses Fortran style INCLUDEs, and the first line of
   the file begins with C$, and the file is compiled with -nocpp, and
   the -I command line option is used, then fcom will be unable to
   locate the files to be included. This problem only affects the
   32-bit version of PFA.

o  PFA tends to ``leak'' memory. The size of its swap image grows
   during the course of a single compile. When compiling a single
   file of 15 to 20 thousand lines or more, PFA's swap image might
   exceed the default size of the swap partition. However, PFA
   generally remains well-behaved with respect to virtual memory
   paging even in this case. To work around the problem, either break
   the single large file into two or more smaller files, or compile
   on a system that has a large swap partition.

o  There is no direct analogue to the fcom -backslash option for the
   32-bit PFA. The best you can do is use the -syntax=a option, if
   that is possible for your application.

o  If a variable used for a REDUCTION also appears on the left-hand
   side of an assignment statement after the corresponding DOACROSS,
   the 32-bit Fortran compiler will occasionally wrongly complain
   about an illegal assignment to the reduction variable. This can be
   worked around by introducing a new temporary variable to do the
   reduction within the loop, and then assigning this temporary back
   to the original variable.
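
The REDUCTION workaround described in the list above can be sketched
as follows. This is a minimal example under stated assumptions: the
program name and the names TOTAL, TMP, and A are hypothetical, and
the LOCAL clause is assumed from the general SGI directive set rather
than taken from these notes.

```fortran
      PROGRAM REDTMP
C     Hypothetical example: names and sizes are illustrative only.
      INTEGER N
      PARAMETER (N = 100)
      REAL A(N), TOTAL, TMP
      INTEGER I
      DO 10 I = 1, N
         A(I) = REAL(I)
   10 CONTINUE
C     Reduce into the temporary TMP instead of TOTAL, so that
C     TOTAL never appears in the REDUCTION clause.
      TMP = 0.0
C$DOACROSS LOCAL(I), SHARE(A), REDUCTION(TMP)
      DO 20 I = 1, N
         TMP = TMP + A(I)
   20 CONTINUE
C     Copy the result back. Later assignments to TOTAL no longer
C     involve a reduction variable.
      TOTAL = TMP
      TOTAL = TOTAL / REAL(N)
      PRINT *, TOTAL
      END
```

The temporary costs one extra scalar assignment after the loop, and
the loop body itself is unchanged apart from the variable name.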

1. Dynamic Shared Objects

A Dynamic Shared Object, or DSO, is an ELF format object file, very
similar in structure to an executable program but with no "main". It
has the following parts:

o  A shared component, consisting of shared text and read-only data

o  A private component, consisting of data and the GOT (Global Offset
   Table)

o  Several sections that hold information necessary to load and link
   the object

o  A liblist, the list of other shared objects referenced by this
   object

Most of the libraries supplied by SGI are available as dynamic shared
objects.

A DSO is relocatable at runtime; it can be loaded at any virtual
address. A consequence of this is that all references to external
symbols must be resolved at runtime. References from the private
region (e.g., from private data) are resolved once at load time;
references from the shared region (e.g., from shared text) must go
through an indirection table (the GOT) and hence have a small
performance penalty associated with them.
Code compiled for use in a shared object is referred to as Position
Independent Code (PIC), whereas non-PIC is usually referred to as
non-shared. Non-shared code and PIC cannot be mixed in the same
object.

At runtime, exec loads the main program and then loads rld, the
runtime linking loader, which finishes the exec operation. Starting
with main's liblist, rld loads each shared object on the list, reads
that object's liblist, and repeats the operation until all shared
objects have been loaded. Next, rld allocates common and fixes up
symbolic references in each loaded object. (This is necessary because
the address at which an object will be loaded is not known until
runtime.) Next, each object's init code is executed. Finally, control
is transferred to "__start".

For a more complete discussion of DSOs, including answers to
questions frequently asked about them, see the dso(5) man page.