Logistics
Zoom link: will be communicated
Piazza: will be communicated
CSE concentration: option to add the CSE concentration to your transcript

Grading

Type          %
HWs           20%
MPs           35%
Midterm exam  20%
Final exam    25%
  • For the 4-credit option, the components above determine 75% of the grade and a final project determines the remaining 25%.
  • All assignments except the final project require individual work. You may discuss an MP before starting to program, but you must write the code on your own.
  • No credit for late assignment submissions.
  • See the CS Department Honor Code: http://cs.illinois.edu/academics/honor-code

Official Description

Fundamental issues in design and development of parallel programs for various types of parallel computers. Various programming models according to both machine type and application area. Cost models, debugging, and performance evaluation of parallel programs with actual application examples. Course Information: Same as CSE 402 and ECE 492. 3 undergraduate hours. 3 or 4 graduate hours.

Prerequisite: CS 225.

Learning Goals

  1. Prepare non-CS science and engineering students to use parallel computing in support of their work.
  2. Acquire basic knowledge of CPU architecture: execution pipeline, dependencies, caches; learn to tune performance by enhancing locality and leveraging compiler optimizations.
  3. Understand vector instructions and learn to use vectorization.
  4. Acquire basic knowledge of multicore architectures: cache coherence, true and false sharing and their relevance to parallel performance tuning.
  5. Learn to program using multithreading, parallel loops, and tasking with a framework such as OpenMP, and learn to avoid concurrency bugs.
  6. Learn to program using message passing with a library such as MPI.
  7. Understand simple parallel algorithms and their complexity.
  8. Learn to program accelerators using a language such as OpenMP.
  9. Acquire basic understanding of parallel I/O and of frameworks for data analytics, such as map-reduce.
  10. Complete a team project.

Topic List

  1. Introduction: course introduction; the importance of parallel computing with the end of Moore’s Law.
  2. Basic CPU architecture and performance bottlenecks; tuning for locality and leveraging optimizing compilers.
  3. Vector instructions and compiler vectorization.
  4. Basic multicore architecture and performance bottlenecks; false sharing.
  5. OpenMP: multithreading model; parallel sections, parallel loops, tasks and task dependencies; races and atomicity; deadlock avoidance.
  6. Basic parallel algorithms: matmult, stencils, sparseMV.
  7. Basic cluster architecture and performance bottlenecks.
  8. MPI: point-to-point, one-sided, collectives.
  9. Basic distributed-memory algorithms: matmult, stencils, sparseMV, sorting; data distribution.
  10. Parallel programming patterns: divide-and-conquer, pipeline.
  11. Basic GPU architecture; programming GPUs with OpenMP.
  12. Basics of parallel I/O.
  13. Basics of data analysis using map-reduce.
  14. Midterm and final exams.