|CSE concentration||Option to add CSE concentration to your transcript|
|Class notes||Amdahl’s law (notebook)|
|MPs (every 2 weeks)||35%|
- For the 4-credit option, the above determines 75% of the grade and a final project determines the remaining 25%.
- All work except the final project must be done individually. You may discuss an MP with others before you start programming, but you must write the program on your own.
- No credit for late assignment submissions.
- See the CS Department Honor Code at http://cs.illinois.edu/academics/honor-code
Fundamental issues in design and development of parallel programs for various types of parallel computers. Various programming models according to both machine type and application area. Cost models, debugging, and performance evaluation of parallel programs with actual application examples. Course Information: Same as CSE 402 and ECE 492. 3 undergraduate hours. 3 or 4 graduate hours.
Prerequisite: CS 225.
- Prepare non-CS science and engineering students to use parallel computing in support of their work. (1,2,6)
- Acquire basic knowledge of CPU architecture: execution pipeline, dependencies, caches; learn to tune performance by enhancing locality and leveraging compiler optimizations. (2,6)
- Understand vector instructions and learn to use vectorization. (2,6)
- Acquire basic knowledge of multicore architectures: cache coherence, true and false sharing, and their relevance to parallel performance tuning. (2,6)
- Learn to program using multithreading, parallel loops, and multitasking using a language such as OpenMP. Learn to avoid concurrency bugs. (2,6)
- Learn to program using message passing with a library such as MPI. (2,6)
- Understand simple parallel algorithms and their complexity. (1,6)
- Learn to program accelerators using a language such as OpenMP. (2,6)
- Acquire basic understanding of parallel I/O and of frameworks for data analytics, such as map-reduce. (6)
- Team project (1,2,3,5,6)
- Introduction: Course overview. Importance of parallel computing with the end of Moore’s Law.
- Basic CPU architecture and performance bottlenecks. Tuning for locality and leveraging optimizing compilers.
- Vector instructions and compiler vectorization.
- Basic multicore architecture and performance bottlenecks. False sharing.
- OpenMP: Multithreading model; parallel sections, parallel loops, tasks and task dependencies. Races and atomicity. Deadlock avoidance.
- Basic parallel algorithms: matrix multiplication (matmult), stencils, sparse matrix-vector multiplication (sparseMV)
- Basic cluster architecture and performance bottlenecks
- MPI: Point-to-point, one-sided, collectives.
- Basic distributed memory algorithms: matmult, stencils, sparseMV, sorting; data distribution.
- Parallel programming patterns: divide-and-conquer, pipeline
- Basic GPU architecture; programming GPUs with OpenMP
- Basics of parallel I/O
- Basics of data analysis using map-reduce
- Mid-term and final exams
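The class notes listed above include a notebook on Amdahl’s law, which underlies much of the performance evaluation covered in the course. As a minimal sketch only (the function names `amdahl_speedup` and `speedup_limit` are illustrative assumptions, not taken from the course materials), the law bounds the speedup of a program in which a fraction f of the work can be spread over p workers: S(p) = 1 / ((1 - f) + f / p).

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's law for a fixed problem size.

    parallel_fraction: share of the serial run time that can be parallelized (0..1).
    workers: number of processors or threads sharing the parallel part.
    Both names are hypothetical and used only for this illustration.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time; the parallel part is divided among workers.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def speedup_limit(parallel_fraction: float) -> float:
    """Upper bound on speedup as the number of workers grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    if serial_fraction == 0.0:
        return float("inf")
    return 1.0 / serial_fraction


if __name__ == "__main__":
    f = 0.9  # assumed example value: 90% of the run time is parallelizable
    for p in (1, 2, 4, 8, 16, 64, 1024):
        print(f"p = {p:5d}  speedup = {amdahl_speedup(f, p):.3f}")
    print(f"limit as p grows: {speedup_limit(f):.3f}")
```

The model ignores communication, synchronization, and load imbalance, so measured speedups are normally lower than what it predicts. Its main use is to show how quickly the serial fraction comes to dominate as workers are added.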
G. Hager and G. Wellein. Introduction to High Performance Computing for Scientists and Engineers.