S1 A Practical Guide to Java
Ian G. Angus, Boeing Information and Support Services
Level: 50% Beginner, 25% Intermediate, 25% Advanced
In the span of just one year, Java has risen from an experimental language to become the Internet programming language du jour. The question naturally arises: what is Java and how is it used? In this tutorial we will go beyond the hype. We will: introduce the Java programming language and explain the concepts (and buzzwords) that define it, with careful coverage of both its strengths and weaknesses; show with simple examples how Java can be used on the WWW and how you can take advantage of it; and demonstrate Java's greater potential with a discussion of selected non-WWW applications for which Java provides unique capabilities.

S2 Understanding and Developing Virtual Reality Systems
Henry A. Sowizral, Sun Microsystems
Level: 50% Beginner, 30% Intermediate, 20% Advanced
This tutorial provides an in-depth look at virtual reality (VR) systems and their construction. It introduces virtual reality with an overview of how a VR system operates, a brief history, and videos showing a collection of VR applications in operation. It continues with a survey of the component parts of a VR system, both hardware and software, including information on how those components operate and pointers to suppliers of various products. The tutorial then delves into the many topics involved in making the VR experience more "real," such as correcting for errors introduced by the display's optical pathway, correcting for tracker errors and lag, using the graphics hardware most effectively, handling scene complexity and inserting an egocentric human model (avatar) into the scene. The tutorial concludes with a description of augmented environments and their operation.

S3 An Intensive and Practical Introduction to the Message Passing Interface (MPI)
William Saphir, NASA Ames Research Center/MRJ
Level: 15% Beginner, 55% Intermediate, 30% Advanced
MPI has taken hold as the library of choice for portable, high-performance message-passing applications. The complexity of MPI presents a short but steep learning curve. This tutorial provides a rapid introduction to MPI, motivating and describing its core functionality. It is suitable both for beginning users who want a rapid introduction to MPI and for intermediate users who want more than just the "how to." The approach is practical rather than comprehensive, explaining what is important, what is not and why. The emphasis will be on obtaining high performance, on differentiating between theoretical performance advantages and those that matter in the real world, and on avoiding common mistakes. There will be an opinionated discussion of datatypes, communication modes, topologies and other advanced MPI features. The tutorial will describe techniques for obtaining high performance, illustrated with numerous examples. For users trying to choose between MPI and PVM, the tutorial will include a comparison of the two libraries and a discussion of porting. It will also cover the most recent developments in MPI-2, including dynamic process management, one-sided communication and I/O.
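
To give a concrete flavor of the core functionality such an introduction covers, here is a minimal sketch of MPI point-to-point messaging in C (illustrative only, not drawn from the tutorial materials): rank 0 sends one integer to rank 1.

    /* Minimal MPI point-to-point example.  Compile with an MPI
       wrapper such as mpicc and run with at least two processes. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, value;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0 && size > 1) {
            value = 42;
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
            printf("rank 1 received %d\n", value);
        }

        MPI_Finalize();
        return 0;
    }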

S4 Message-Passing Programming for Scientists and Engineers
Cherri M. Pancake, Oregon State University
Hugh M. Caffey, Hewlett-Packard Company
Level: 70% Beginner, 30% Intermediate
In this tutorial, the principles of parallel programming in a message-passing environment will be introduced in terms that make sense to non-computer scientists. Emphasis will be on practical information, with a series of example programs used to guide newcomers through the important stages in writing and tuning message-passing codes. The tutorial will not address details of parallel architectures, algorithms or theoretical models. Instead, it will offer a minimal-trauma introduction to the issues at stake in deciding whether or not to parallelize an application, basic approaches to adding parallelism, and techniques for debugging, evaluating and tuning parallel programs.

S5 High Performance Fortran in Practice
Charles Koelbel, Rice University
Level: 30% Beginner, 50% Intermediate, 20% Advanced
High Performance Fortran (HPF) was defined in 1993 to provide a portable syntax for expressing data-parallel computations in Fortran. A major revision of HPF (termed HPF 2.0) will be completed by SC'96. Since the appearance of the High Performance Fortran Language Specification (available as an issue of Scientific Programming and by ftp, gopher and WWW), several commercial compilers have appeared. There has also been great interest in HPF as a language for efficient parallel computation. The purpose of this tutorial is three-fold:
1. To introduce programmers to the most important features of HPF 2.0.
2. To illustrate how these features can be used in practice on algorithms for scientific computation.
3. To inform users of the future direction of HPF, including recommended extensions to HPF 2.0 in the areas of advanced data mapping, task parallelism and external interfaces.
The tutorial will both broaden the appeal of HPF and help users achieve its maximum potential.

S6 The Science and Practice of Supercomputing Benchmarking
Aad J. van der Steen, University of Utrecht
Level: 35% Beginner, 50% Intermediate, 15% Advanced
This tutorial presents a scientific approach to benchmarking and follows the methodology of the first "Parkbench" committee report. It defines a clear set of units and symbols, followed by a carefully defined set of performance parameters and metrics and, finally, a hierarchy of parallel benchmarks to measure them. A new theory of performance scaling is presented through the concept of "computational similarity," which allows the scaling of an application for all computers and all problem sizes to be represented in a single dimensionless diagram. Benchmarking practice covers the general principles of properly setting up benchmarks, assessing their results and relating them to other techniques such as simulation and machine modeling. Results on current machines such as the Cray T90, Cray T3E, Hitachi SR2201, HP/Convex SPP-1600, Fujitsu VPP700 and NEC SX-4 will be discussed.

S7 Performance Programming for Scientific Computation
Bowen Alpern, IBM T. J. Watson Research Center
Larry Carter, University of California at San Diego
Level: 30% Beginner, 50% Intermediate, 20% Advanced
Performance programming is the design, writing and tuning of programs to sustain near-peak performance. This tutorial will present a unified framework for understanding and overcoming the bottlenecks to high performance. The goal of the course is to make performance programming a science rather than a craft. Development of high-performance programs has always required an acute sensitivity to details of processor and memory hierarchy architecture. The advent of modern workstations and supercomputers brings to the fore another concern: parallelism. The tutorial will identify four requirements for attaining high performance at any level of computation. General techniques for satisfying these requirements can be applied to improve performance of the processor, of the memory hierarchy and of parallel processors. Applications of these techniques are illustrated with a variety of examples.
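
As one illustration of the kind of memory-hierarchy technique such a framework addresses, consider loop blocking (tiling). The sketch below is ours, not the presenters', and N and B are illustrative values; it restructures a matrix multiply in C so that B x B sub-blocks stay cache-resident while they are reused:

    #include <stdio.h>

    #define N 512
    #define B 64                    /* block size; tune to the cache */

    static double a[N][N], b[N][N], c[N][N];

    int main(void)
    {
        /* fill the operands with sample data */
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++) {
                a[i][j] = 1.0;
                b[i][j] = 2.0;
            }

        /* blocked multiply: each B x B sub-block is reused while it
           is still resident in cache, instead of streaming whole rows
           and columns through memory on every pass */
        for (int ii = 0; ii < N; ii += B)
            for (int kk = 0; kk < N; kk += B)
                for (int jj = 0; jj < N; jj += B)
                    for (int i = ii; i < ii + B; i++)
                        for (int k = kk; k < kk + B; k++)
                            for (int j = jj; j < jj + B; j++)
                                c[i][j] += a[i][k] * b[k][j];

        printf("c[0][0] = %.1f\n", c[0][0]);  /* expect 2.0 * N = 1024.0 */
        return 0;
    }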

S8 Memory Consistency Models for Shared-Memory Multiprocessors
Sarita V. Adve, Rice University
Kourosh Gharachorloo, Digital Equipment Corporation
Level: 50% Beginner, 30% Intermediate, 20% Advanced
A memory consistency model for a shared-memory system specifies how the memory operations of a program will appear to execute to the programmer. The most commonly assumed memory model is Lamport's sequential consistency (SC), which requires a multiprocessor to appear like a multiprogrammed uniprocessor. While SC provides a familiar interface for programmers, it restricts the use of several common uniprocessor hardware and compiler optimizations, thereby limiting performance. For higher performance, alternative memory models have been proposed. These models, however, present a more complex programming interface. Thus, the choice of memory model involves a difficult but important tradeoff between performance and ease of use. This tutorial will survey several currently proposed memory models, place them within a common framework and assess them on the basis of their performance potential and ease of use. We will cover: the problem of memory consistency models; implementing sequential consistency; alternative memory models, including models adopted by Digital, IBM and Sun; interaction with other latency-hiding techniques; more aggressive implementations of memory consistency models; and relaxed consistency models for software DSM systems (e.g., Munin, TreadMarks, Midway). The tutorial will assume rudimentary knowledge of shared-memory multiprocessor organization. It will cover both the basic problem in detail and advanced issues that represent ongoing research.
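
The classic "store buffering" test gives a flavor of what is at stake. The sketch below is ours, written in later C11 atomics notation purely for concreteness, not material from the tutorial: under sequential consistency at most one of r1 and r2 can end up 0, while under the relaxed ordering shown both can.

    /* Store-buffering litmus test in C11 (requires a C11 compiler
       and library with <threads.h> support). */
    #include <stdio.h>
    #include <threads.h>
    #include <stdatomic.h>

    atomic_int x = 0, y = 0;
    int r1, r2;   /* each written by one thread, read after join */

    int thread_a(void *arg)
    {
        (void)arg;
        atomic_store_explicit(&x, 1, memory_order_relaxed);
        r1 = atomic_load_explicit(&y, memory_order_relaxed);
        return 0;
    }

    int thread_b(void *arg)
    {
        (void)arg;
        atomic_store_explicit(&y, 1, memory_order_relaxed);
        r2 = atomic_load_explicit(&x, memory_order_relaxed);
        return 0;
    }

    int main(void)
    {
        thrd_t t1, t2;
        thrd_create(&t1, thread_a, NULL);
        thrd_create(&t2, thread_b, NULL);
        thrd_join(t1, NULL);
        thrd_join(t2, NULL);
        /* r1 == 0 && r2 == 0 is possible here; it would be forbidden
           if every access used memory_order_seq_cst instead. */
        printf("r1=%d r2=%d\n", r1, r2);
        return 0;
    }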

M1 Interactive Visualization of Supercomputer Simulations
Terry Disz, Michael Papka, Rick Stevens, Argonne National Laboratory; Matthew Szymanski, University of Illinois at Chicago
Level: 100% Intermediate
This tutorial discusses the integration of interactive visualization environments with supercomputers used for simulation of scientific applications. The topics include an introduction to interactive visualization technology (tracking, display systems, sound, modeling), communication mechanisms (software and hardware) needed for system integration, system performance and the use of multiple visualization systems. The presenters' experience using the CAVE Automatic Virtual Environment (CAVE) connected to an IBM SP machine will be used to illustrate concepts. Participants will gain knowledge of how to link massively parallel supercomputing simulations with virtual environments for display, interaction and control. The tutorial will conclude with a discussion of the critical performance points in the coupled supercomputer/virtual environment experience.

M2 Tuning MPI Applications for Peak Performance
William Gropp, Rusty Lusk, Argonne National Laboratory
Level: 50% Intermediate, 50% Advanced
MPI is now widely accepted as a standard for message-passing parallel computing libraries. Both applications and important benchmarks are being ported from other message-passing libraries to MPI. In most cases the translation can be made in a fairly straightforward way, preserving the semantics of the original program. On the other hand, MPI provides many opportunities for increasing the performance of parallel applications through its more advanced features, and straightforward translations of existing programs might not take advantage of them. New parallel applications are also being written in MPI, and an understanding of performance-critical issues for message-passing programs, along with an explanation of how to address them using MPI, can enable the applications programmer to deliver a greater percentage of the hardware's peak performance to the application. This tutorial will discuss performance-critical issues in message-passing programs, explain how to examine the performance of an application using MPI-oriented tools and show how the features of MPI can be used to attain peak application performance. We assume attendees will have an understanding of the basic elements of MPI. Experience with message-passing parallel applications will be helpful but not required.
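
One example of such a feature is nonblocking communication. In this illustrative ring-shift sketch (ours, not the presenters'), MPI_Irecv and MPI_Isend are posted early so that independent computation can overlap the transfer:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, left, right, sendval, recvval;
        MPI_Request reqs[2];
        MPI_Status stats[2];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        left  = (rank + size - 1) % size;      /* ring neighbors */
        right = (rank + 1) % size;

        sendval = rank;
        /* post the receive and send immediately, without blocking */
        MPI_Irecv(&recvval, 1, MPI_INT, left,  0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Isend(&sendval, 1, MPI_INT, right, 0, MPI_COMM_WORLD, &reqs[1]);

        /* ... computation that does not need recvval can run here,
           overlapping the communication ... */

        MPI_Waitall(2, reqs, stats);
        printf("rank %d received %d from rank %d\n", rank, recvval, left);
        MPI_Finalize();
        return 0;
    }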

M3 Reinventing the Supercomputer Center
William Kramer, William McCurdy, Horst Simon, Lawrence Berkeley National Laboratory
Level: 50% Intermediate, 50% Advanced
This is a time of change for the nation's large-scale computing centers. Supercomputing sites have to deal with industry consolidations, decreasing budgets, changing and expanding missions, consolidation of sites and new technical challenges. This tutorial covers the experience of, and many practical lessons from, large-scale computing sites that have gone through dramatic changes. These include understanding the reasons requiring change, the changing roles of supercomputing, organizational decisions, recruiting and staffing, managing client expectations and interactions, setting and measuring new goals, and all the details of creating a new facility.

M4 Introduction to Effective Parallel Computing
Marilynn Livingston, Southern Illinois University; Quentin F. Stout, University of Michigan
Level: 50% Beginner, 50% Intermediate
This tutorial provides a comprehensive overview of parallel computing, focusing on the aspects most relevant to the user. Throughout, the emphasis is on the iterative process of converting a serial program into a correct and increasingly efficient parallel program. The tutorial will help people make intelligent planning decisions concerning parallel computers and help them develop efficient application codes for such systems. It discusses hardware and software, with an emphasis on systems that are now (or soon will be) commercially available. Program design principles such as load balancing, communication reduction and efficient use of cache are illustrated through examples selected from engineering, scientific and database applications.

M5 Parallel I/O on Highly Parallel Systems
Samuel Fineberg, Bill Nitzberg, NASA Ames Research Center
Level: 30% Beginner, 60% Intermediate, 10% Advanced
Typical scientific applications require vast amounts of processing power coupled with significant I/O capacity. Highly parallel computer systems provide floating-point processing power at low cost, but efficiently supporting a scientific workload also requires commensurate I/O performance. To achieve high I/O performance, these systems exploit parallelism in their I/O subsystems, supporting concurrent access to files by multiple nodes of a parallel application and striping files across multiple disks. Obtaining maximum I/O performance can, however, require significant programming effort. This tutorial presents a comprehensive survey of the state of the art in parallel I/O, from basic concepts to recent advances in the research community. Requirements, interfaces, architectures and performance are all illustrated using concrete examples from commercial offerings (Convex Exemplar, Cray T3E, IBM SP2, Intel Paragon, Meiko CS-2 and high-end workstation clusters), as well as academic and research projects (CHARISMA, Panda, PASSION/VIP-FS, PIOUS, PPFS and Vesta) and the emerging MPI-IO standard.
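
As a flavor of that emerging MPI-IO interface, here is a minimal sketch (ours and illustrative only; the file name and sizes are invented) in which each rank writes its own block of a shared file at a disjoint offset, so that all ranks perform I/O in parallel:

    #include <mpi.h>

    #define COUNT 1024

    int main(int argc, char **argv)
    {
        int rank;
        double buf[COUNT];
        MPI_File fh;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (int i = 0; i < COUNT; i++)
            buf[i] = rank;                    /* fill with sample data */

        /* all ranks open the same file collectively */
        MPI_File_open(MPI_COMM_WORLD, "out.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);
        /* each rank writes COUNT doubles at its own disjoint offset */
        MPI_File_write_at(fh, (MPI_Offset)rank * COUNT * sizeof(double),
                          buf, COUNT, MPI_DOUBLE, &status);
        MPI_File_close(&fh);

        MPI_Finalize();
        return 0;
    }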

M6 Hot Chips for High Performance Computing
Subhash Saini, David H. Bailey, NASA Ames Research Center
Level: 25% Beginner, 50% Intermediate, 25% Advanced
We will discuss several current CMOS-based processors: the DEC Alpha 21164 (used in the Cray T3E); the MIPS R10000 (used in the SGI Power Challenge); the Intel Pentium Pro processor (used in the first DOE ASCI system); the PowerPC 604 (used in an IBM SMP system); the HP PA-RISC 7200 (used in the Convex Exemplar SPP1600); an NEC proprietary processor (used in the NEC SX-4); a Fujitsu proprietary processor (used in the new Fujitsu VPP700); and a Hitachi proprietary processor (used in the Hitachi SR2201). The architecture of these microprocessors will be presented, followed by a description of supercomputers based on them. The performance of the various hardware/programming-model combinations supported on these systems (e.g., HPF and MPI) will then be compared, based on the latest NAS Parallel Benchmark results, providing a cross-machine and cross-model comparison. The tutorial will conclude with a discussion of general trends in high performance computing, including future directions in hardware and software technology as we achieve Tflop/s performance levels and press on to Pflop/s levels in the next decade.

M7 Designing and Building Parallel Programs: An Introduction to Parallel Programming
Ian Foster, Argonne National Laboratory; Carl Kesselman, California Institute of Technology; Charles Koelbel, Rice University
Level: 60% Introductory, 40% Intermediate
In this tutorial, we provide a comprehensive introduction to the techniques and tools used to write parallel programs. Our goal is to communicate the practical information required by scientists, engineers and educators who need to write parallel programs or to teach parallel programming. First, we introduce principles of parallel program design, touching upon relevant topics in architecture, algorithms and performance modeling. Then, we describe the parallel programming standards High Performance Fortran and Message Passing Interface and the modern parallel language Compositional C++. Finally, we introduce techniques for coupling HPF and MPI, and the parallel Standard Template Library proposed for HPC++. The tutorial is based on the textbook Designing and Building Parallel Programs (Addison-Wesley, 1995), also available in HTML format on the WWW at http://www.mcs.anl.gov/dbpp.

M8 Applications of Web Technology and HPCC
Geoffrey Fox, Wojtek Furmanski, Nancy McCracken, Syracuse University
Level: 100% Intermediate
We discuss the role of HPCC and Web technologies in several applications, including health care, finance, education and the delivery of computing services. We assume a knowledge of base Web concepts and summarize key features of Java, VRML and JavaScript, but do not give a tutorial on these base technologies. We will illustrate the possibilities of HPCC/Web integration in these real-world applications and the role of base technologies and services.