13th and 14th February 2020, University of Cambridge, Cambridge, UK

Instructors

Description

Analysis of whole-genome data unearths a multitude of variants of different classes, which need to be filtered, annotated and validated to arrive at a causative variant for a disease. Current short-read sequences, whilst excellent at identifying single nucleotide variants and short insertions/deletions, struggle to correctly map structural variants (SVs). Long-read sequencing technologies offer improvements in the characterisation of genetic variation and of regions that are difficult to assess with short-read sequences. The aim of this course is to familiarise participants with long-read sequencing technologies, their applications and the bioinformatics tools used to assemble this kind of data. Lectures will introduce the technology and provide insight into methods for the analysis of genomic data, while the hands-on sessions will allow participants to run analysis pipelines, focusing on data generated by the Oxford Nanopore Technologies (ONT) platform.

Prerequisites

The course is suitable for complete beginners and assumes no prior programming experience. Basic knowledge of R and Unix would be an advantage. An introduction to the Unix system and shell is available here.

Detailed aims

This course will provide:

Objectives

After this course you should be able to:

Contents

  1. Introduction
  2. Sample preparation
  3. Quality control
  4. Alignment
  5. Variant calling
  6. Haplotype phasing
  7. Methylation

The data used in this course can be found here.

Links to nanopore practicals

Registration

https://www.training.cam.ac.uk/event/3327123

We thank Oxford Nanopore Technologies for providing the reagents and flow cells used in this course.