3 Cr. (Hrs.:3 Lec.)
Provides an overview of computer hardware, software, and numerical methods that are useful for scientific computing on workstations and high-performance computing (HPC) systems. Topics include HPC architectures, parallel programming, software tools and packages, algorithm design, characteristics of commonly used numerical methods, mapping of solution methods to modern multi-processor systems, and performance analysis. Prerequisites: CSCI 332 and (M 426 or CSCI 477) (2nd)
Course generally offered spring (2nd) semester.
E1. Know how to work in a UNIX/Linux environment to manipulate files, use and integrate existing software packages and libraries, and compile and execute custom programs.
E2. Understand the basics of algorithmic analysis, namely asymptotic Big-O complexity. (CSCI 332)
E3. Understand the formal steps for creating a mathematical or computational model. (M 426 or CSCI 477)
R1. Be familiar with basic computer architecture principles, including the SIMD & MIMD execution models, data caches, shared & distributed memory, multi-core processors, and graphics processing units (GPUs).
R2. Be able to set up a virtual machine and install multiple operating systems on it.
R3. Understand basic concepts of parallel programming, including local vs. shared data, data dependencies, race conditions, multi-threaded programming with OpenMP, multi-process programming with MPI, and GPU computing.
R4. Know how to develop, analyze (Big-O complexity), and code both serial and parallel algorithms to solve scientific problems.
R5. Know how to test and debug both serial and parallel programs.
R6. Know how to submit programs for execution on a multi-user HPC system through a job queuing system.
R7. Understand how to measure, interpret, and report the performance of a program, including its speedup on a multiprocessor system.
R8. Understand basic compiler optimization options and know how to use them to evaluate and improve code performance.
R9. Learn about cloud computing options and the Map/Reduce computational paradigm.
R10. Design and implement a non-trivial serial and parallel program, analyze its algorithmic performance, identify performance barriers such as data contention, bottlenecks, and dependencies, and discuss strategies for overcoming these barriers.
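As an illustration of the speedup measurement named in R7, the following minimal sketch uses the standard definitions: speedup S = T1 / Tp and parallel efficiency E = S / p, where T1 is the serial run time and Tp is the run time on p processors. The function names and the timing values are hypothetical placeholders chosen for illustration; they are not course material or measured data.

```python
def speedup(t_serial, t_parallel):
    """Return the speedup S = T1 / Tp for two wall-clock run times."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Return the parallel efficiency E = S / p."""
    if n_procs < 1:
        raise ValueError("processor count must be at least 1")
    return speedup(t_serial, t_parallel) / n_procs


def report(timings):
    """Build (p, speedup, efficiency) rows from a {p: run_time} mapping.

    The entry for p = 1 is taken as the serial baseline T1.
    """
    t_serial = timings[1]
    return [
        (p, speedup(t_serial, t), efficiency(t_serial, t, p))
        for p, t in sorted(timings.items())
    ]


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds (assumed, not measured).
    example_timings = {1: 120.0, 2: 64.0, 4: 36.0, 8: 24.0}
    for p, s, e in report(example_timings):
        print(f"p={p:2d}  speedup={s:5.2f}  efficiency={e:5.2f}")
```

An efficiency well below 1 at higher processor counts is one sign of the performance barriers named in R10, such as contention or serial dependencies.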