David M. Beazley
beazley@cs.utah.edu
http://www.cs.utah.edu/~beazley
Peter S. Lomdahl
pxl@lanl.gov
http://bifrost.lanl.gov/~pxl
We present a computational steering approach for controlling, analyzing, and visualizing very large-scale molecular dynamics simulations involving tens to hundreds of millions of atoms. Our approach relies on extensible scripting languages and an easy-to-use tool for building extensions and modules. The system is easy to modify, works with existing C code, is memory efficient, and can be used from inexpensive workstations over standard Internet connections. We demonstrate how we have been able to explore data from production MD simulations involving as many as 104 million atoms running on the CM-5 and Cray T3D. We also show how this approach can be used to integrate common scripting languages (including Python, Tcl/Tk, and Perl), simulation code, user extensions, and commercial data analysis packages.
Prior to 1992, MD simulations were usually performed on relatively small systems (often in 2D) involving fewer than 1 million atoms [8]. However, in a span of only 3 years, MD simulation sizes grew to as many as 1 billion atoms in 3D [2,4,5]. Today, production simulations involving tens to hundreds of millions of atoms are possible, but analyzing the resulting data has proven to be extraordinarily difficult. As a result, most MD simulations remain small even though many researchers agree that large-scale simulations are useful for studying certain types of material properties.
In this paper, we describe our efforts to address the practical problems of working with very large molecular dynamics simulations. We have taken a computational steering approach in which simulation, data analysis, and visualization are combined into a single package [9,10,11,12]. However, we have adopted an approach that is extremely lightweight--that is, memory efficient, simple to use, portable, easily extensible, and not dependent on expensive special-purpose hardware (i.e., graphics workstations or high-speed networking).
While the performance of the SPaSM code has been discussed extensively elsewhere [1,2,13], Table 1 shows some recent results on several different machines. The table is provided primarily to illustrate the simulation sizes that are possible, as well as the computational requirements of such simulations.
Number of atoms    CM-5 (1024 nodes)    T3D (128 nodes)    Power Challenge (8 nodes)
1,000,000          0.39                 0.728              8.68
5,000,000          1.60                 3.86               40.43
10,000,000         2.98                 6.93               80.96
32,000,000         -                    -                  275.60
50,000,000         14.20                33.09              -
75,000,000         -                    46.95              -
150,000,000        41.26                -                  -
300,800,000        90.59                -                  -
600,000,000        241.73 (SP)          -                  -
Table 1: Time for a single MD timestep (in seconds). Atoms interact according to a Lennard-Jones potential and have been arranged in an FCC lattice with a reduced temperature of 0.72 and a density of 0.8442 [4]. The cutoff is 2.5 sigma. All runs were performed in double precision except the one marked (SP), which used single precision.
The images are somewhat misleading. Performing such simulations in practice has proven to be very difficult because of the enormous amount of data that must be analyzed. For example, a single snapshot file from the 38 million atom simulation in Figure 1 was larger than the total memory of the largest SGI Onyx workstation available at LANL at the time (in fact, the single image shown required more than 3.5 hours of rendering time). The 104 million particle simulation generated a collection of forty 1.6-Gbyte data files (containing only particle positions and kinetic energies stored in single precision). None of SPaSM's users have workstations capable of handling such datasets. In fact, most users don't even have enough local disk space to bring one of these datasets to their local machine. Even if they did, users at remote sites would be out of luck--shipping 64 Gbytes of data across the Internet would almost certainly be a nightmare. Unfortunately, the idea of simply offloading data from a supercomputer onto a high-end graphics workstation still runs rampant within the supercomputing community. In practice, however, this only works for relatively small simulations involving no more than a few million data points. This should not come as a surprise: why would anyone expect a workstation to efficiently analyze data from a very large simulation performed on a 512-processor Connection Machine or T3D?
The large dataset problem has not gone unnoticed by the supercomputing community, and some have come to call it the "Data Glut" problem [14]. To visualize some of our very large simulations, we have sometimes used a parallel rendering tool [15]. This is how the 104 million atom picture in Figure 1 was generated (on a 128-processor Cray T3D). Unfortunately, this approach has also proven ineffective. Few users know how to run the rendering code, and it requires several minutes to make a single picture, which makes it difficult to use for data exploration. More problematic, the tool is extremely limited in its data analysis capabilities--a tool that only renders millions of spheres can hardly be called a data analysis package! We really needed a system that could perform visualization coupled with data analysis and feature extraction. We also needed a system that was easy to use and could be run from our existing workstations.
Unfortunately, many proposed solutions to the large-dataset problem have become less practical in recent years. Some have even predicted the "end to batch-processing" [11]. The truth of the matter is that large-scale computing is still difficult, still takes hundreds of hours of computing time on the fastest machines available, and still generates an overwhelming amount of data. Some seem to believe that the data-glut problem can be magically eliminated with very expensive special-purpose hardware [10,11,12,16]. While this may be possible, our experience on the I-WAY at Supercomputing '95 was a complete disaster (unless you consider showing a pre-rendered MPEG movie of an MD simulation to a half dozen people a success) [17]. Even if the real demo had worked, most scientists we know do not have the resources to buy a CAVE, a personal Power Challenge Array, a wall-sized display, and a dedicated OC3 connection to their favorite supercomputing center just to look at their data [17]. Thus, while these efforts may be conceptually interesting, we feel they are of little practical value to scientists who remotely access supercomputing facilities from an ordinary UNIX workstation.
To address these problems, we have adopted the approach of "computational steering," which aims to combine simulation, data analysis, and visualization into a single package. We feel that this combination is important because trying to understand very large MD simulations is more than just a simulation problem, an analysis problem, a visualization problem, or a user-interface problem. It is a combination of all of these things--and the best solution will be achieved when these elements are combined in a balanced manner.
In order to build an effective system for large simulations, we feel that it must support the following features.
Previous efforts in computational steering seem to have focused primarily on interactivity and user interfaces while ignoring many of the issues important for large-scale simulation [9,10,11,15]. As a result, we end up with scientific computing "environments" that are unnecessarily complicated and which continue to rely on high-end graphics workstations and high-bandwidth networks--a solution that is simply not that attractive at this time due to both the high cost and limited performance on large-scale problems.
Our approach incorporates all of the features listed above, but with an emphasis on working with very large simulations and on simplicity at all levels. We are not interested in building a large monolithic steering system--in fact, we feel that the best steering "system" is one that you barely notice you're using! Thus, we primarily focus on issues related to memory efficiency, long-running jobs, extensibility, and building steerable applications from existing code. We do not see our approach as a replacement for other efforts in computational steering. Rather, we see it as an "alternative world view" in which we have tried to build a balanced system that is well suited for large-scale production simulation and basic scientific research. While most of our efforts have focused on MD, we feel that our approach is applicable to many other large-scale computing situations.
In the new system, the control language is used to glue together different modules for simulation, data analysis, and visualization, while the entire system is built on top of a message-passing, parallel I/O, and networking layer that hides hardware-dependent implementation details.
On parallel machines, we originally developed our own scripting language based on a simple YACC parser [20]. This language allows the user to access C functions and variables, but also allows loops, conditionals, user-defined functions, and variables to be created on the fly. In reality, the scripting language is not unlike Tcl/Tk, except that we have written the system to work with parallel I/O and have cleaned up the syntax. Internally, the scripting language uses a SPMD style of programming. Each node executes the same sequences of commands, but on different sets of data. The nodes are only loosely synchronized and may participate in message passing operations. While our scripting language was primarily designed to run nicely with message-passing, it turns out that SPaSM can be compiled to use Tcl, Python, or Perl as well (see the next section).
We would like to emphasize the efficiency and simplicity of this approach. Adding a scripting language requires very little memory, since the parser only needs a small stack for the LALR(1) parsing method used by YACC [20]. As a result, there is little impact on memory usage. Scripting languages are also easily portable and don't require much network bandwidth to operate. Finally, by having a carefully chosen set of commands and defaults, it is easy to perform complex analysis and visualization without making things too complicated.
%module user
%{
#include "SPaSM.h"
%}

extern void ic_crack(int lx, int ly, int lz, int lc,
                     double gapx, double gapy, double gapz,
                     double alpha, double cutoff);

/* Boundary conditions */
extern void set_boundary_periodic();
extern void set_boundary_free();
extern void set_boundary_expand();

extern void apply_strain(double ex, double ey, double ez);
extern void set_initial_strain(double ex, double ey, double ez);
extern void set_strainrate(double exdot0, double eydot0, double ezdot0);
extern void apply_strain_boundary(double ex, double ey, double ez);

Code 1: A SPaSM user interface file. This file is automatically translated into C wrapper functions and can be combined with other modules at compile time.
In order to expand the system, the user writes a normal C function--exactly as would have been done without the command language interface. Its ANSI C prototype declaration is then placed into an interface file. When SPaSM is compiled, the interface file is automatically translated into C wrapper functions, and a new command is created with the same usage as the underlying C function. This match between commands and C functions is important because typical users of SPaSM are writing C code to implement new physical models and initial conditions, and accessing these functions from the command interface is critical. Our automated tool, SWIG [21], provides a direct, easily understandable mapping between the command interface and the underlying C functions. Since our interface file specifications are not specific to any one scripting language, SWIG has been designed to support multiple target languages and can currently build interfaces for Tcl, Python, Perl4, Perl5, Guile, and our own scripting language. Thus, SPaSM can be controlled by any of these languages (although obviously not all scripting languages will work properly on parallel machines).
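As a rough illustration of this mapping, the prototypes declared in Code 1 surface directly as commands in whichever scripting language is chosen. The short Python session below is only a sketch; it assumes the interface was compiled into a module named spasm (as in the later examples), and the argument values are arbitrary.

import spasm                               # SWIG-generated module (name assumed)

spasm.set_boundary_periodic()              # wraps extern void set_boundary_periodic()
spasm.set_initial_strain(0.0, 0.017, 0.0)  # arguments map one-to-one onto the C prototype
spasm.set_strainrate(0.0, 0.0, 0.001)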
%module user
%{
#include "SPaSM.h"
%}

%include initcond.i
%include graphics.i
%include dislocations.i
%include particle.i
%include debug.i

Code 2: SPaSM interface file with modules.
Many files could be placed in a common repository of modules available to all users, but others can be written or customized by the user as needed for a particular simulation. Thus, instead of forcing every user to use the same system, this approach allows each user to customize SPaSM to their individual liking. We feel that this flexibility is critical--especially in an environment where the code is in a constant state of evolution as new physical models and simulations are being developed.
// cull.i.  SPaSM interface file for particle culling
%{
/* Return the next particle with potential energy in [pmin,pmax].
   Passing NULL starts a new scan; a NULL return marks the end. */
Particle *cull_pe(Particle *ptr, double pmin, double pmax) {
    if (!ptr) ptr = Cells[0][0][0].ptr - 1;
    while ((++ptr)->type >= 0) {
        if ((ptr->pe >= pmin) && (ptr->pe <= pmax)) return ptr;
    }
    return NULL;
}
%}
Particle *cull_pe(Particle *ptr, double pmin, double pmax);

Code 3: Simple interface file for culling particles. Note that small C functions can be inlined directly into interface files.
Within a scripting language, we can now write some functions to build and manipulate lists of particles. In this case, we have built SPaSM under the Python scripting language [19].
import spasm                    # SWIG-generated SPaSM module

# Return a list of all particles with pe in [min,max]
def get_pe(min, max):
    plist = []
    p = spasm.cull_pe("NULL", min, max)
    while p != "NULL":
        plist.append(p)
        p = spasm.cull_pe(p, min, max)
    return plist

# Make an image from the particles in a list
def plot_particles(l):
    spasm.clearimage()
    for i in range(0, len(l)):
        spasm.sphere(l[i])
    spasm.display()

Code 4: Python functions for extracting particle data and plotting.
One of the greatest strengths of extensible scripting languages is their ability to easily manipulate structures such as lists and associative arrays. If we wanted to extract two sets of particles with different potential energy ranges and make an image, the user could simply type the following Python commands:
>>> list1 = get_pe(-5.5, -5)
>>> list2 = get_pe(-3.5, -3.25)
>>> plot_particles(list1 + list2)
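Dictionaries (Python's associative arrays) are just as convenient. The following sketch builds on the Code 4 helpers; the class names and energy windows are arbitrary illustration values, not part of SPaSM.

# Sketch: bin particles into named classes with a dictionary.
bins = {"defect": (-5.5, -5.0), "surface": (-3.5, -3.25)}   # arbitrary windows
groups = {}
for name in bins.keys():
    lo, hi = bins[name]
    groups[name] = get_pe(lo, hi)
plot_particles(groups["defect"] + groups["surface"])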
The ability to work with complex objects has been critical for building more complex modules. While a user may only write a relatively simple interface file for their own C functions, other modules may be quite sophisticated--involving large libraries or C++ code. For example, we have used SWIG to build modules out of MATLAB and the entire OpenGL library--both of which can be imported into the SPaSM code if desired. In short, almost any type of C code can be integrated into our steering system (although clearly not all codes will work on all machines).
In the script, the commands directly map onto the underlying C functions given in the interface file. Scripting also provides a rapid prototyping capability, since users can change simulation parameters without recompiling the SPaSM code after every change. Many operations can first be implemented as scripts and recoded in C after they have been sufficiently tested.

#
# Script for strain-rate experiment
#
printlog("Crack experiment.");

# Set up a Morse potential
alpha = 7;
cutoff = 1.7;
init_table_pair();
source("Examples/morse.script");
makemorse(alpha,cutoff,1000);      # Create a Morse lookup table

# Set up initial condition
if (Restart == 0)
    ic_crack(80,40,10,20,5,25.0,5.0, alpha, cutoff);
    set_initial_strain(0,0.017,0);
endif;

# Now set up the boundary conditions
set_strainrate(0,0,0.001);
set_boundary_expand();
output_addtype("pe");

# Run it
timesteps(1000,10,50,100);

Code 5: A sample SPaSM script file. Commands can also be typed interactively.
Run30 === cm5-4 === Sun Apr 28 10:22:23 1996
SPaSM [30] > open_socket("tjaze",34442);
Connecting...
Socket connection opened with host tjaze port 34442
SPaSM [30] > imagesize(512,512);
Image size set to 512 x 512
SPaSM [30] > colormap("cm15");
Colormap read from file cm15
SPaSM [30] > FilePath="/sda/sda1/beazley/backup/backup";
SPaSM [30] > readdat("Dat36.1");
Setting output buffer to 524288 bytes
Reading 11203040 particles.
11203040 particles { x y z ke } read from /sda/sda1/beazley/backup/backup/Dat36.1
SPaSM [30] > range("ke",0,15);
ke range set to (0,15)
SPaSM [30] > image();
Image generation time : 10.1531 seconds
SPaSM [30] > rotu(70);
Image generation time : 10.7456 seconds
SPaSM [30] > rotr(40);
Image generation time : 10.9436 seconds
SPaSM [30] > down(15);
Image generation time : 10.5469 seconds
SPaSM [30] > Spheres=1;
SPaSM [30] > zoom(400);
Image generation time : 19.8765 seconds
SPaSM [30] > clipx(48,52);
Image generation time : 7.29181 seconds
SPaSM [30] >
With a careful choice of parameters, it is easy to move around in a dataset and look at interesting features. Previously defined viewpoints can also be saved and recalled. When we tried to work with this same dataset on an SGI Onyx graphics workstation with 256 Mbytes of RAM, it was virtually impossible: images required as long as 45 minutes to generate, and the machine was simply incapable of dealing with a dataset of this size in an interactive manner [15]. By using our new system, however, it is possible to visualize large simulations in less time than is required to perform a single MD timestep (see Table 1).
While this example is simple, the same approach can be used to interactively set up initial conditions, visualize the data, run the simulation, and perform analysis in real time as the simulation runs. Periodically, the user can stop the simulation, look at the data in more detail, change various parameters, and continue the simulation. All of this is possible without exiting the SPaSM code or loading a separate analysis tool.
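A minimal sketch of such a cycle, written in Python against the commands that appear in Codes 1, 4, and 5 (timesteps() is assumed to be wrapped like the other commands, and the parameter values are arbitrary), might look like this:

spasm.timesteps(500, 10, 50, 100)      # advance the simulation
suspects = get_pe(-3.5, -3.25)         # pull out particles in an interesting energy window
plot_particles(suspects)               # inspect them
spasm.set_strainrate(0, 0, 0.0005)     # adjust a boundary condition...
spasm.timesteps(500, 10, 50, 100)      # ...and continue, without ever leaving SPaSM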
In practice, identifying interesting features in large-scale MD simulations is a hit-and-miss process that depends on a variety of simulation parameters. This is one of the primary reasons why it is so difficult to work with these datasets on a workstation--we must first work with all of the data to effectively identify interesting features. Interestingly enough, by being able to remotely explore large datasets, it is often possible to reduce them to a size that can later be handled on a workstation. For example, in Figure 4a, a single snapshot file is approximately 700 Mbytes, but by removing the bulk it can be reduced to only 10-20 Mbytes--a size that is much more easily handled. The trick is figuring out which 20 Mbytes of data is interesting!
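A data-reduction pass of this kind can be sketched in a few lines of Python. Here get_pe() is the Code 4 helper, the energy windows are arbitrary, and write_particles() is a hypothetical output routine standing in for whatever file writer is actually used:

defects = get_pe(-5.5, -5.0) + get_pe(-3.5, -3.25)   # keep defect/surface atoms, drop the bulk
spasm.write_particles("Dat36.culled", defects)        # hypothetical writer; output is workstation-sized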
It is important to emphasize that everything shown in the image has been combined into a single package using our automatic interface generator, yet the SPaSM code is unchanged from the version run on the CM-5 or Cray T3D. Thus, anything added to the SPaSM core in this environment can also be used on those machines. In practice, we have found that developing on a workstation is considerably easier, more reliable, and less frustrating than trying to develop everything on a parallel machine. By having a highly flexible system, we can utilize a wide variety of analysis tools during the development phase and move up to larger, less interactive, production simulations when we are ready.
While it is true that we sometimes need to run interactively on large numbers of processors, a number of steps can be taken to reduce the amount of time required for analysis and visualization:
Currently, we are developing new graphics and data analysis modules for SPaSM. We are also interested in extending our work with the Python scripting language and exploring extensions such as Numerical Python, which provides high-level mathematical operations on arrays and matrices [22]. We have no plans to build a sophisticated graphical user interface at this time; if one is desired, we could probably use any number of existing systems, such as the SCIRun steering software developed at the University of Utah [9]. Finally, as data analysis and visualization become commonplace, we feel that data management and organization of results will be critical. Therefore, we are quite interested in extending some of our work to scientific databases and data management systems. We feel that this management of data, run parameters, and output will be more critical than simply providing more interactivity.
[2] P.S. Lomdahl, P. Tamayo, N. Gronbech-Jensen, and D.M. Beazley. 50 GFlops Molecular Dynamics on the CM-5. Proceedings of Supercomputing '93, IEEE Computer Society (1993), p. 520-527.
[3] R.C. Giles and P. Tamayo. Proceedings of SHPCC'92, IEEE Computer Society (1992), p. 240.
[4] S. Plimpton. Fast Parallel Algorithms for Short-Range Molecular Dynamics. Journal of Computational Physics, Vol. 117 (March 1995), p. 1-19.
[5] Y. Deng, R. McCoy, R. Marr, R. Peierls, and O. Yasar. Molecular Dynamics on Distributed-Memory MIMD Computers with Load Balancing. Applied Mathematics Letters, Vol. 8, No. 3 (1995), p. 37-41.
[6] M.P. Allen and D.J. Tildesley. Computer Simulation of Liquids. Clarendon Press, Oxford (1987).
[7] MRS Bulletin, Vol. 21, No. 2 (1996): Interatomic Potentials for Atomistic Simulations. This issue provides several articles and an overview of atomic potentials.
[8] A.I. Melcuk, R.C. Giles, and H. Gould. Computers in Physics, May/June 1991, p. 311.
[9] S.G. Parker and C.R. Johnson. SCIRun: A Scientific Programming Environment for Computational Steering. Proceedings of Supercomputing '95, IEEE Computer Society (1995).
[10] G. Eisenhauer, W. Gu, K. Schwan, and N. Mallavarupu. Falcon - Toward Interactive Parallel Programs: The On-line Steering of a Molecular Dynamics Application. Proceedings of the Third International Symposium on High Performance Distributed Computing (HPDC-3), IEEE Computer Society (1994), p. 26-34.
[11] G. Eisenhauer, et al. Opportunities and Tools for Highly Interactive Distributed and Parallel Computing. Proceedings of the Workshop on Debugging and Tuning for Parallel Computer Systems, Chatham, MA (1994), in print.
[12] J.A. Kohl and P.M. Papadopoulos. A Library for Visualization and Steering of Distributed Simulations Using PVM and AVS. High Performance Computing Symposium '95, Montreal, Canada (1995).
[13] D.M. Beazley and P.S. Lomdahl. High Performance Molecular Dynamics Modeling with SPaSM: Performance and Portability Issues. Proceedings of the Workshop on Debugging and Tuning for Parallel Computer Systems, Chatham, MA (1994), in print.
[14] S. Bryson. The Data Glut Revisited. Computers in Physics, Vol. 9, No. 5 (1995), p. 525-530.
[15] C.D. Hansen, M. Krogh, and W. White. Massively Parallel Visualization: Parallel Rendering. Proceedings of the 7th SIAM Conference on Parallel Processing for Scientific Computing (1994), p. 790-795.
[16] C. Cruz-Neira, et al. Scientists in Wonderland: A Report on Visualization Applications in the CAVE Virtual Reality Environment. Proceedings of the IEEE Symposium on Research Frontiers in Virtual Reality (1993), p. 59-66.
[17] H. Korab and M. Brown (eds.). Virtual Environments and Distributed Computing at SC'95: GII Testbed and HPC Challenge Applications on the I-WAY. ACM/IEEE (1995).
[18] J.K. Ousterhout. Tcl and the Tk Toolkit. Addison-Wesley (1994).
[19] G. van Rossum. Python Reference Manual (1995).
[20] J. Levine, T. Mason, and D. Brown. Lex and Yacc. O'Reilly & Associates, Inc. (1992).
[21] D.M. Beazley. SWIG: An Easy to Use Tool for Integrating Scripting Languages with C and C++. Proceedings of the Fourth Annual Tcl/Tk Workshop '96, Monterey, California, July 10-13, 1996. USENIX Association, p. 129-139.
[22] P. Dubois, K. Hinsen, and J. Hugunin. Numerical Python. Computers in Physics (to appear, 1996).
Author Biographies
David M. Beazley
Dave Beazley is a Ph.D. student in the Department of Computer Science at the University of Utah where he is working in the Scientific Computing and Imaging (SCI) group. Since 1990, he has worked at Los Alamos National Laboratory in the Center for Nonlinear Studies and the Condensed Matter and Statistical Physics Group in the Theoretical Division. Beazley received his M.S. in mathematics from the University of Oregon in 1993 and a B.A. in mathematics from Fort Lewis College in 1991. His research interests include parallel computing, high performance computing architecture, languages, and low-level software development tools for large-scale scientific computing.
Peter Lomdahl is a staff member in the Condensed Matter and Statistical Physics Group in the Theoretical Division at Los Alamos National Laboratory where he has worked on computational condensed-matter and materials-science research since 1985. From 1982 to 1985, he was a postdoctoral fellow in the Center for Nonlinear Studies. Lomdahl received his M.S. in electrical engineering and his Ph.D. in mathematical physics from the Technical University of Denmark in 1979 and 1982. His research interests include parallel computing and nonlinear phenomena in condensed-matter physics and materials science.