Next Generation Data Analysis Workshop

December 6-10, 2012

Overview
This 5-day workshop is for users who want to acquire the skills required to analyze next generation sequence (NGS) and other large-scale data sets independently and proficiently. Most workshop modules will use the data analysis environment R/Bioconductor, which has become the lingua franca of data-driven research. No prior knowledge of R is required to attend, but beginners should sign up for the introductory sections (on Thu & Fri), which provide the basics required for the applied data analysis sections of this event. In addition, users will learn to run command-line tools under Linux, such as NGS aligners/assemblers, an indispensable skill for analyzing modern genome-scale data sets. The last module (Mon afternoon) will introduce the web-based NGS data analysis environment Galaxy, which requires no special computer knowledge.

Location
Genomics Lecture Hall, UC Riverside [ Map ]
Parking: Non-UCR participants can obtain parking information and permits at the entrance to the UCR campus.

Schedule 

Thu, Dec 6, 2012

10:00 AM-01:00 PM - (1) Introduction to R (Instructor: Thomas Girke)    [ Slides ]   [ Exercises ]   [ Manual ]
01:00-02:00 PM - Lunch Break
02:00-06:00 PM - (2) Programming in R (Instructor: Thomas Girke)     [ Slides ]   [ Exercises ]   [ Manual ]

Fri, Dec 7, 2012

09:00-10:30 AM - (3a) Microarray Analysis with R/Bioconductor (Instructor: Thomas Girke)     [ Slides ]   [ Exercises ]   [ Manual ]
10:30 AM-12:00 PM - (3b) Clustering and Data Mining with R/Bioconductor (Instructor: Thomas Girke)     [ Slides ]   [ Exercises ]   [ Manual ]
12:00-01:00 PM - Lunch Break
01:00-04:00 PM - (4) Linux Part I: Linux Essentials (Instructors: Grant Brady & Thomas Girke)    [ Slides ]   [ Exercises ]   [ Manual ]  
04:00-06:00 PM - (5) Linux Part II: Using IIGB's Linux Cluster (Instructor: Grant Brady)    [ Slides ]   [ Manual ] 

Sat, Dec 8, 2012

09:00 AM-12:30 PM - (6) Basics on Analyzing Next Generation Sequencing Data with R/Bioconductor (Instructor: Thomas Girke)    [ Slides ]   [ Exercises ]   [ Manual ]
12:30-01:30 PM - Lunch Break
01:30-06:00 PM - (7) Analysis of RNA-Seq Data with R/Bioconductor (Instructor: Thomas Girke)     [ Slides ]   [ Exercises ]   [ Manual ]

Sun, Dec 9, 2012

09:00 AM-12:30 PM - (8) Analysis of ChIP-Seq Data with R/Bioconductor (Instructor: Thomas Girke)    [ Slides ]   [ Exercises ]   [ Manual ]
12:30-01:30 PM - Lunch Break
01:30-06:00 PM - (9) Analysis of SNP/Var-Seq Data with R/Bioconductor (Instructor: Rebecca Sun)     [ Slides ]   [ Exercises ]   [ Manual ]

Mon, Dec 10, 2012

08:30-09:30 AM - (10a) Cheminformatics Overview for Chemical Genomics Screens (Instructor: Thomas Girke)     [ Slides ]
09:30-11:30 AM - (10b) Cheminformatics in R for Analyzing Chemical Genomics Screens (Instructor: Tyler Backman)     [ Slides ]   [ Exercises ]   [ Manual ]
01:00-02:00 PM - Lunch Break
02:00-06:00 PM - (11) Web-based Analysis of Next Generation Sequence Data (Instructor: Rebecca Sun)     [ Slides ]   [ Manual ]


Laptop and Software Requirements

Laptop Requirements
    • Users are expected to bring a laptop with a working wireless connection and a recent web browser (e.g. Firefox, Chrome or Safari) preinstalled. Wireless guest accounts will be provided for non-UCR participants. Also, don't forget to bring a power supply so that your laptop can run for an entire day!
    • In addition, please follow the software install instructions for each event as outlined below. If you encounter problems, please arrive 30 minutes early so that we can assist you with the installation.
Software Installs for R Events
    • Install the latest R version (2.15.2) from here: http://www.r-project.org/
    • Next, install RStudio from here: http://rstudio.org/
    • IGV (Integrative Genomics Viewer) will be used in some parts of the NGS analysis sections: http://www.broadinstitute.org/igv/
    • To install the R libraries required for the different course modules, copy & paste the following commands into the RStudio (or the R) console and execute them with the enter key:
      • Modules: Introduction to R and Programming in R
install.packages(c("ggplot2", "lattice"))
      • Modules: Microarray Analysis and Clustering with R/Bioconductor
source("http://bioconductor.org/biocLite.R")
biocLite(c("affy", "limma", "affycoretools", "arrayQualityMetrics", "affyQCReport", "GOstats", "GO.db", "Ruuid", "graph", "Category", "plier", "ath1121501.db", "ath1121501cdf", "ath1121501probe"))
biocLite("BiocUpgrade")
install.packages(c("ape", "pvclust", "biclust", "modeltools", "som", "flexclust", "cluster", "scatterplot3d", "gplots", "e1071", "kernlab"))
      • Modules: Basic NGS, RNA-Seq, ChIP-Seq, SNP-Seq, and Cheminformatics
source("http://bioconductor.org/biocLite.R")
biocLite(c("ShortRead", "Biostrings", "IRanges", "BSgenome", "rtracklayer", "biomaRt", "chipseq", "ChIPpeakAnno", "Rsamtools", "BayesPeak", "PICS", "GenomicRanges", "DESeq", "edgeR", "leeBamViews", "GenomicFeatures", "BSgenome.Celegans.UCSC.ce2", "BSgenome.Athaliana.TAIR.TAIR9", "DEXSeq", "BCRANK", "ChemmineR", "fmcsR", "VariantAnnotation"))
biocLite("BiocUpgrade")
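
After running the install commands, it can be useful to confirm that nothing failed silently. The following is only a minimal sketch and not part of the official workshop materials: the helper name missing_pkgs is hypothetical, and the package list shown is just the one from the first pair of modules. Substitute the list for the modules you plan to attend.

```r
# Hypothetical helper (an assumption, not from the workshop materials):
# returns the names of requested packages that are not yet installed.
missing_pkgs <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# TRUE if the running R version is at least the one named above.
getRversion() >= "2.15.2"

# Packages for the Introduction to R and Programming in R modules.
# An empty result, character(0), means everything requested is installed.
missing_pkgs(c("ggplot2", "lattice"))
```

Any names that are returned can be passed again to install.packages() or biocLite(), whichever command was used for them above.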

Software Installs for Linux Events
    • On Windows Laptops
      • Terminal application PuTTY (putty.exe)
      • WinSCP software for file exchange
    • On Linux Laptops
      • Please make sure you have an ssh client installed, which is usually the case.
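
Since R will already be installed for the other modules, one way to check for an ssh client is from the R console. This is a small sketch under the assumption that the client executable is named ssh and is on the search path; the variable names are illustrative only.

```r
# Sys.which() returns the full path of an executable found on the PATH,
# or an empty string if none is found.
ssh_path <- Sys.which("ssh")
has_ssh <- nzchar(ssh_path)[[1]]

if (has_ssh) {
  message("ssh client found at: ", ssh_path)
} else {
  message("No ssh client found on the PATH; please install one before the Linux modules.")
}
```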