Tuning High-Performance Scientific Codes: The Use of Performance Models
to Control Resource Usage During Data Migration and I/O
J. Lee, M. Winslett, X. Ma, and S. Yu
In Proceedings of the Fifteenth ACM International Conference on Supercomputing.
Available formats:
postscript (short version)
postscript (long version)
Abstract:
Large-scale parallel simulations are a popular tool for investigating
phenomena ranging from nuclear explosions to protein folding. These codes
produce copious output that must be moved to the workstation where it will
be visualized. Scientists have a variety of tools to help them with this
data movement, and often have several different platforms available to
them for their runs. This raises questions such as: Which
data migration approach is best for a particular code and platform?
Which will provide the best end-to-end response time, or the lowest cost?
Scientists also control how much data is output, and how often. From a
scientific perspective, the more output the better; but from a cost and
response time perspective, how much output is too much? To answer these
questions, we built performance models for data migration approaches and
verified them on parallel and sequential platforms. We use a 3D
hydrodynamics code to show how scientists can use the models to predict
performance and tune the I/O aspects of their codes.
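
As a rough illustration of the kind of question such a model answers, the
sketch below estimates end-to-end time for a run that periodically migrates
output, and the largest number of output dumps that fits a time budget. It is
not the model from the paper: the function names, the parameters, the
fixed-bandwidth-plus-latency cost per dump, and the treatment of overlapped
migration as max(compute, transfer) are all simplifying assumptions made for
this example.

```python
def end_to_end_seconds(compute_s, n_dumps, bytes_per_dump,
                       bandwidth_bps, latency_s=0.0, overlap=False):
    """Toy estimate of total run time including data migration.

    Assumption: each dump costs a fixed latency plus size / bandwidth.
    Assumption: with overlap=True, migration is hidden behind computation,
    so the total is whichever of the two takes longer.
    """
    transfer_s = n_dumps * (latency_s + bytes_per_dump / bandwidth_bps)
    if overlap:
        return max(compute_s, transfer_s)
    return compute_s + transfer_s


def max_dumps_within_budget(compute_s, budget_s, bytes_per_dump,
                            bandwidth_bps, latency_s=0.0):
    """Largest number of non-overlapped dumps that fits in budget_s."""
    per_dump_s = latency_s + bytes_per_dump / bandwidth_bps
    spare_s = budget_s - compute_s
    if spare_s < 0:
        return 0  # the computation alone already exceeds the budget
    return int(spare_s // per_dump_s)


if __name__ == "__main__":
    # Hypothetical numbers: 100 s of computation, ten 1 GB dumps, 100 MB/s.
    print(end_to_end_seconds(100.0, 10, 1e9, 1e8))
    print(end_to_end_seconds(100.0, 10, 1e9, 1e8, overlap=True))
    print(max_dumps_within_budget(100.0, 150.0, 1e9, 1e8))
```

Comparing the overlapped and non-overlapped estimates for the same inputs
gives a first-order sense of which migration approach suits a given platform.
A real model would need measured bandwidths and platform-specific terms.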