Improving MPI-IO Output Performance with Active Buffering Plus Threads
X. Ma and M. Winslett and J. Lee and S. Yu
In Proceedings of the International Parallel and Distributed Processing Symposium, 2003.
Available formats: PostScript, PDF
Abstract:
Efficient collective output of intermediate results to secondary
storage is increasingly important for scientific simulations as
the gap between processing power and interconnect bandwidth on one side
and I/O system bandwidth on the other widens. Dedicated servers can offload I/O from compute
processors and shorten the execution time, but it is not always
possible or easy for an application to use them. We propose
the use of active buffering with threads (ABT) for overlapping
I/O with computation efficiently and flexibly without dedicated I/O
servers. We show that the implementation of ABT in ROMIO, a popular
implementation of MPI-IO, greatly reduces the application-visible cost
of ROMIO's collective write calls, and improves an application's
overall performance by hiding I/O cost and saving implicit
synchronization overhead from collective write operations. Further,
ABT is high-level, platform-independent, and transparent to users,
giving users the benefit of overlapping I/O with other processing
tasks even when the file system or parallel I/O library does not
support asynchronous I/O.
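
The core idea in the abstract, overlapping output with computation by handing buffered data to a background thread, can be illustrated with a minimal sketch. This is not the ROMIO implementation and uses no MPI. The class name `BufferedWriter`, the `max_pending` limit, and the `demo` helper are hypothetical names introduced only for illustration. It assumes a single process writing to a local file.

```python
import os
import queue
import tempfile
import threading


class BufferedWriter:
    """Hypothetical sketch: copy each write into memory and let a
    background thread perform the actual file I/O."""

    _STOP = object()  # sentinel telling the writer thread to exit

    def __init__(self, path, max_pending=64):
        self._file = open(path, "wb")
        # Assumption: a bounded queue, so write() blocks once
        # max_pending buffers are waiting to be written.
        self._pending = queue.Queue(maxsize=max_pending)
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def write(self, offset, data):
        """Buffer a copy of data and return without waiting for the disk."""
        self._pending.put((offset, bytes(data)))

    def _drain(self):
        """Background thread: write buffered chunks in arrival order."""
        while True:
            item = self._pending.get()
            if item is self._STOP:
                break
            offset, data = item
            self._file.seek(offset)
            self._file.write(data)

    def close(self):
        """Flush everything still buffered, then close the file."""
        self._pending.put(self._STOP)
        self._thread.join()
        self._file.close()


def demo():
    """Write three chunks out of order and return the file contents."""
    path = os.path.join(tempfile.mkdtemp(), "abt_demo.bin")
    writer = BufferedWriter(path)
    writer.write(8, b"CCCC")
    writer.write(0, b"AAAA")
    writer.write(4, b"BBBB")
    # The caller would continue computing here while the thread writes.
    writer.close()
    with open(path, "rb") as f:
        return f.read()
```

In this sketch the caller pays only for a memory copy on each `write` call, and the cost of the file system write is hidden as long as the queue has room. Errors raised inside the writer thread are not reported back to the caller, which a real implementation would need to handle.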