This course focuses on the application of high performance computing (HPC) to bioinformatics analysis. The main aim is to provide a background on how to use HPC clusters effectively for running computationally intensive or data-intensive bioinformatics applications. This includes, for example, how to optimize the use of the available compute nodes and how to adapt an application to the resources available on each node. The course will cover both how to use parallelism efficiently when writing your own programs and how to adapt and wrap existing tools in a manner that efficiently exploits the resources available on parallel architectures.
- The basic structure of HPC clusters, and how to run jobs on a cluster
- How to evaluate the use of resources on a cluster, and how to optimize the use of memory and CPUs
- When to use parallelization and distribution
- How to adapt or write wrappers around existing tools to process large datasets efficiently using parallelization
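
As an illustration of the last topic, here is a minimal sketch of a wrapper that runs an existing command-line tool on several input files at the same time. It is not taken from the course material: the function names `run_tool` and `run_in_parallel`, the file names, and the stand-in command are all hypothetical, and the sketch assumes the tool accepts its input file as the final command-line argument.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_tool(command, input_path):
    """Run one instance of an external tool on a single input file."""
    result = subprocess.run(
        [*command, input_path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the tool exits non-zero
    )
    return input_path, result.stdout


def run_in_parallel(command, input_paths, max_workers=4):
    """Run the tool on many inputs, at most max_workers at a time.

    Threads are sufficient here because the heavy work happens in the
    child processes, not in the Python interpreter.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as input_paths.
        return list(pool.map(lambda path: run_tool(command, path), input_paths))


if __name__ == "__main__":
    # Hypothetical stand-in for a real tool: a Python one-liner that
    # only echoes the file name it was given.
    demo_command = [sys.executable, "-c", "import sys; print(sys.argv[1])"]
    for path, output in run_in_parallel(demo_command, ["a.fastq", "b.fastq"]):
        print(path, "->", output.strip())
```

On a cluster, `max_workers` would typically be set no higher than the number of CPU cores allocated to the job, so that the wrapper does not oversubscribe the compute node.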