CS 420 Spring 2023
CS 420 (CSE 402, ECE 492)
Fundamental issues in design and development of parallel programs for various types of parallel computers. Various programming models according to both machine type and application area. Cost models, debugging, and performance evaluation of parallel programs with actual application examples. 3 undergraduate hours. 3 or 4 graduate hours.
Prerequisite: CS 225
CSE concentration: Option to add CSE concentration to your transcript
Grading
Type | % |
---|---|
HWs | 20% |
MPs | 35% |
Midterm exam | 20% |
Final exam | 25% |
- For 4 credit option, the above determines 75% of the grade and a final project determines 25%.
- All (but the final project) require individual work. You can discuss an MP before starting to program, but you program on your own.
- No credit for late assignment submissions.
- The CS Dept Honor Code: http://cs.illinois.edu/academics/honor-code
Learning Goals
- Acquire basic knowledge of CPU architecture: execution pipeline, dependencies, caches; learn to tune performance by enhancing locality and leveraging compiler optimizations.
- Understand vector instructions and learn to use vectorization.
- Acquire basic knowledge of multicore architectures: cache coherence, true and false sharing and their relevance to parallel performance tuning.
- Learn to program using multithreading, parallel loops, and multitasking using a language such as OpenMP. Learn to avoid concurrency bugs.
- Learn to program using message passing with a library such as MPI.
- Understand simple parallel algorithms and their complexity.
- Learn to program accelerators using a language such as OpenMP.
- Acquire basic understanding of parallel I/O and of frameworks for data analytics, such as map-reduce.
Topic List
- Introduction: Course introduction. Importance of parallel computing with the end of Moore’s Law
- Basic CPU architecture and performance bottlenecks. Tuning for locality and leveraging optimizing compilers.
- Vector instructions and compiler vectorization.
- Basic multicore architecture and performance bottlenecks. False sharing.
- OpenMP: Mutithreading model; parallel sections, parallel loops, tasks and task dependencies. Races and atomicity. Deadlock avoidance
- Basic parallel algorithms: matmul, stencils, sparseMV
- Basic cluster architecture and performance bottlenecks
- MPI: Point-to-point, one-sided, collectives.
- Basic distributed memory algorithms: matmul, stencils, sparseMV, sorting; data distribution.
- Parallel programming patterns: divide-and-conquer, pipeline
- Basic GPU architecture; programming GPUs with OpenMP
- Basics of parallel I/O
- Basics of data analysis using map-reduce