CSC 591E: Topics in Performance Study of Parallel Programs

Fall 2003


Course overview

Course description: This special-topics course is an in-depth study of the performance of parallel applications, with an emphasis on cross-platform performance comparison and, especially, on including I/O operations when predicting the overall performance of large parallel applications. The course covers a variety of topics in parallel performance analysis and prediction, as well as new research issues in grid computing environments. Students are expected to conduct a literature study, give in-class paper presentations, and collaborate on a group research project.

Instructor: Xiaosong Ma

Time and location: Wednesday 1-3pm, 337 EGRC

Prerequisites: Operating systems and computer architecture courses

To register: Please contact the instructor.

Grading policy: written reviews 20%, class participation 30% (10% presentation, 20% in-class discussion), project 50%.

Late policy: Lateness is determined by the time recorded on the assignment email received by the instructor.
RAs (reading assignments): 10% of the credit is deducted for every half hour past the due time (noon).
PAs (project assignments): 10% of the credit is deducted for every hour past the due time (noon).
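
As a rough illustration of the penalty arithmetic in the late policy above, the following Python sketch computes the fraction of credit left for a late submission. It assumes that a partially elapsed half hour (RAs) or hour (PAs) counts as a full period and that credit never drops below zero; the policy states neither detail. The function name `remaining_credit` and the constants are hypothetical.

```python
import math

# Length of one penalty period in minutes, per assignment type (from the late policy).
PERIOD_MINUTES = {"RA": 30, "PA": 60}
PENALTY_PER_PERIOD = 0.10  # 10% of the credit is lost per period


def remaining_credit(kind: str, minutes_late: float) -> float:
    """Return the fraction of credit left for a submission `minutes_late` past noon.

    Assumptions (not stated in the policy): a partially elapsed period
    counts as a full one, and credit is floored at zero.
    """
    if minutes_late <= 0:
        return 1.0  # on time: full credit
    periods = math.ceil(minutes_late / PERIOD_MINUTES[kind])
    return max(0.0, round(1.0 - PENALTY_PER_PERIOD * periods, 2))
```

Under these assumptions, an RA received at 12:45 pm keeps 80% of its credit (`remaining_credit("RA", 45)`), while a PA received at the same time keeps 90% (`remaining_credit("PA", 45)`).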

Course schedule

Week # Date Topics Reading assignment (RA) Project assignment (PA) Deadlines
1 08/20 Course overview, introduction to parallel computing What is the Grid by Ian Foster, 2002
RA1: Matchmaking: Distributed Resource Management for High Throughput Computing by Rajesh Raman, Miron Livny, Marvin Solomon, HPDC 1998.
Get accounts on the course cluster and learn to run parallel programs (get started).
Two sample C programs calculating the value of pi: integral and monte carlo
PA1: test the accuracy and scalability of "integral" and "monte carlo" on the class cluster, using 1, 2, 4, and 8 processors.
2 08/27 Grid resource management, MPI-IO
An MPI-IO tutorial (PPT) by Rajeev Thakur
RA1 Presenter: Jaydeep
RA2: Globus: A Metacomputing Infrastructure Toolkit by Ian Foster and Carl Kesselman, The International Journal of Supercomputer Applications and High Performance Computing, 1996 PA2 RA1, M 08/25
PA1, W 08/27
3 09/03 MPI-IO (continued), parallel I/O performance
RA2 Presenter: Wei
RA3: Data Sieving and Collective I/O in ROMIO by Rajeev Thakur, William Gropp and Ewing Lusk, the Seventh Symposium on the Frontiers of Massively Parallel Computation, 1999 PA3 RA3, M 09/08
PA3, W 09/10
4 09/10
RA3 Presenter: Salil
RA4: Effective File-I/O Bandwidth Benchmark by Rolf Rabenseifner and Alice Koniges, Euro-Par 2000 PA4 RA3, M 09/08
PA3, W 09/10
5 09/17
RA4 Presenter: Salil
RA5: Automated Performance Prediction for Scalable Parallel Computing by Mark Clement and Michael Quinn, Parallel Computing, 23(10), 1997 Preliminary parallel I/O performance modeling RA4, M 09/15
PA4, W 09/17
6 09/24 No lecture RA6: Information and Control in Gray-Box Systems by Andrea Arpaci-Dusseau and Remzi Arpaci-Dusseau, SOSP, 2001    
7 10/01 Portable performance modeling
RA5 Presenter: Vikram
RA6 Presenter: Jaydeep
RA7: Latency Metric: An Experimental Method for Measuring and Evaluating Parallel Program and Architecture Scalability by Xiaodong Zhang, Yong Yan and Keqiang He, Journal of Parallel and Distributed Computing, 22(3), 1994 MPI-I/O benchmarking on multiple clusters RA5-6, M 09/29
PA5, M 09/29
8 10/08 Portable performance modeling (cont.)
RA7 Presenter: Vikram
RA8: A Multi-Storage Resource Architecture and I/O Performance Prediction for Scientific Computing by Xiaohui Shen and Alok Choudhary, HPDC, 2000   RA7, M 10/06
PA7, W 10/08
9 10/?
RA8 Presenter: Jaydeep
RA9: Dynamic Statistical Profiling of Communication Activity in Distributed Applications by Jeffrey Vetter, SIGMETRICS, 2002   RA8, M 10/13
PA8, W 10/15
10 10/22
RA9 Presenter: Salil
RA10: Predicting Application Run Times Using Historical Information by Warren Smith, Ian Foster, and Valerie Taylor, IPPS Workshop on Job Scheduling Strategies for Parallel Processing, 2002   RA9, M 10/20
PA9, W 10/22
11 10/29
RA10 Presenter: Vikram
RA11: Resource Selection Using Execution and Queue Wait Time Predictions, by Warren Smith and Parkson Wong, NAS Technical Report, 2002. PA11 RA10, M 10/27
PA10, W 10/29
12 11/05
RA11 Presenter: Salil
RA12: Using Disk Throughput Data in Predictions of End-to-End Grid Data Transfers, by Sudarshan Vazhkudai and Jennifer Schopf, Grid Workshop, 2002. PA12 RA11, M 11/03
PA11, W 11/05
13 11/12
RA12 Presenter: Vikram
RA13: Parallel Simulation of Parallel File Systems and I/O Programs, by Rajive Bagrodia, Stephen Docy, and Andy Kahn, Supercomputing, 1997. PA13 RA12, M 11/10
PA12, W 11/12
15 11/25
RA13 Presenter: Jaydeep
  PA14 RA13, M 11/24
PA13, W 11/24
16 12/03 Project result discussion and summary
    PA14, W 12/03
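
PA1 in week 1 asks for accuracy and scalability tests of the "integral" and "monte carlo" pi programs on 1, 2, 4, and 8 processors. The course's sample C programs are not reproduced on this page; the serial Python sketch below only illustrates the two numerical methods and the usual speedup and efficiency formulas. The function names, the choice of the midpoint rule on 4/(1+x^2), the seeded random generator, and any timing values passed to `speedup` or `efficiency` are assumptions for illustration, not the course's code or measurements.

```python
import math
import random


def pi_integral(n: int) -> float:
    """Estimate pi as the integral of 4/(1+x^2) over [0, 1], midpoint rule with n strips."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h  # midpoint of strip i
        total += 4.0 / (1.0 + x * x)
    return h * total


def pi_monte_carlo(n: int, seed: int = 0) -> float:
    """Estimate pi from the share of n random points in the unit square
    that fall inside the quarter circle of radius 1."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    hits = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / n


def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup = time on 1 processor / time on p processors."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, procs: int) -> float:
    """Parallel efficiency = speedup / number of processors."""
    return speedup(t_serial, t_parallel) / procs


if __name__ == "__main__":
    for n in (1_000, 100_000):
        print(n, abs(pi_integral(n) - math.pi), abs(pi_monte_carlo(n) - math.pi))
```

For accuracy, the error of each estimate against `math.pi` can be tabulated as the number of strips or samples grows. For scalability, measured wall-clock times on 1, 2, 4, and 8 processors can be passed to `speedup` and `efficiency`.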

Useful resources

MPI Standard 1.1 (Use as MPI message passing reference)
MPI Standard 2.0 (Use as MPI-IO reference)
ROMIO (MPICH's MPI-IO implementation)

Recommended books:
Parallel Computer Architecture: A Hardware/Software Approach
The Grid: Blueprint for a New Computing Infrastructure
Parallel I/O for High Performance Computing

Course clusters:
The OS cluster configuration
NCSU HPC center