Faster Collective Output through Active Buffering
X. Ma and M. Winslett and J. Lee and S. Yu
In Proceedings of the International Parallel and Distributed Processing Symposium, 2002.
Available format:
postscript
Abstract:
Scientific applications often need to write out large arrays and
associated metadata periodically for visualization or restart
purposes. In this paper, we propose active buffering
for collective I/O, in which processors actively organize their idle
memory into a hierarchy of buffers for periodic output data. Active
buffering exploits one-sided communication for I/O processors to fetch
data from compute processors' buffers and performs actual writing in
the background while compute processors are computing. It gracefully
adapts as buffers at different levels of the hierarchy fill and empty,
and as new collective I/O requests arrive. Experimental results with
synthetic benchmarks and a real rocket simulation code on the SGI
Origin 2000 and IBM SP show that active buffering improves the
apparent collective write throughput so that it approaches the local
memory bandwidth or the MPI bandwidth under appropriate conditions.
These speedups are due entirely to increased parallelism during I/O,
and are in addition to any performance improvements that may come from
buffering small requests.
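
The buffering idea described in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' implementation: it assumes one buffer level instead of a hierarchy, uses a Python thread as a stand-in for the I/O processors, and replaces one-sided communication with an in-memory queue. The class name ActiveBuffer and its methods are hypothetical. A write returns as soon as the data has been copied into idle memory; when the buffer has no room, the writer falls back to waiting for the backlog to drain and then writes synchronously.

```python
import io
import queue
import threading


class ActiveBuffer:
    """Toy single-level buffer (hypothetical, for illustration only).

    Writes are copied into spare memory and drained to the sink by a
    background thread. Assumes a single writer thread.
    """

    def __init__(self, sink, capacity_bytes):
        self.sink = sink                # any object with a write(bytes) method
        self.capacity = capacity_bytes  # idle memory available for buffering
        self.used = 0                   # bytes currently held in the buffer
        self.lock = threading.Lock()    # protects self.used
        self.pending = queue.Queue()    # chunks waiting to be written
        self.buffered_writes = 0
        self.direct_writes = 0
        self.worker = threading.Thread(target=self._drain, daemon=True)
        self.worker.start()

    def write(self, data):
        # Reserve space if the chunk fits in the remaining buffer memory.
        with self.lock:
            fits = self.used + len(data) <= self.capacity
            if fits:
                self.used += len(data)
        if fits:
            # Fast path: hand a copy to the background writer and return.
            self.pending.put(bytes(data))
            self.buffered_writes += 1
        else:
            # Overflow: wait for earlier chunks so output order is kept,
            # then write synchronously from the calling thread.
            self.pending.join()
            self.sink.write(data)
            self.direct_writes += 1

    def _drain(self):
        # Background writer: performs the actual output while the
        # caller continues computing.
        while True:
            chunk = self.pending.get()
            if chunk is None:           # shutdown marker
                self.pending.task_done()
                return
            self.sink.write(chunk)
            with self.lock:
                self.used -= len(chunk)
            self.pending.task_done()

    def close(self):
        # Flush everything that is still buffered and stop the writer.
        self.pending.put(None)
        self.worker.join()


# Usage: two small writes are buffered, one oversized write goes direct.
sink = io.BytesIO()
buf = ActiveBuffer(sink, capacity_bytes=8)
buf.write(b"aaaa")
buf.write(b"bbbb")
buf.write(b"c" * 16)
buf.close()
```

In this sketch the caller only pays for a memory copy on the fast path, which is the sense in which apparent write throughput can approach memory bandwidth; the overflow path shows one simple way of adapting when the buffer is full.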