

BIT 815, Deep Sequencing Data Analysis
Instructor: Dr. Ross Whetten

Course Description
The course BIT 815, Analysis of Deep Sequencing Data, is
designed to introduce biologists to the Linux command-line computing
environment, to cloud computing, and to open-source software for analysis of
next-generation sequencing data.
Class sessions consist of two-hour blocks, each beginning with presentation
and discussion of a specific topic, followed by hands-on cloud computing
exercises using model datasets. A total of 21 two-hour blocks are scheduled
over a seven- to eight-week period, and the course is offered once per calendar
year. Cloud computing is emphasized because analysis of high-throughput
DNA sequencing data places increasing demands on RAM and storage space, and because
cloud computing solutions are cost-effective and flexible.
Applications of sequencing discussed include genome sequencing (both de novo
and resequencing), transcriptome analysis, discovery of sequence and structural variations, ChIP-seq
methods for mapping DNA-protein interactions, and genotyping by sequencing (GBS
and RAD-seq methods). For each application of
sequencing technology, discussion topics include experimental design
strategies, methods for library construction, sources of experimental and
biological variation, and analytical approaches available in open-source
software packages. Computing exercises use the software discussed and
provide participants with the opportunity to analyze sample
datasets using the Amazon Web Services EC2 cloud environment and a CloudBioLinux
machine image customized to provide the
software described during the presentation sections of each class period.
The objective of the course is not to make course participants experts in every
aspect of sequence analysis, but instead to empower participants to learn the
specific skills they need by teaching basic skills in command-line Linux and
cloud computing, and providing an introduction to the literature and online
resources. The course is directed at graduate students, but has also attracted
participation from faculty, post-doctoral researchers, and research technicians
interested in expanding their skills in the area of sequence data analysis.
Course materials for Spring 2013
Course materials for Spring 2012
last modified 14 March 2013 by Ross Whetten