
Workshops


Next Generation Data Analysis Workshop (Dec 5-8, 2014) 

!!! This event is booked out !!!
http://manuals.bioinformatics.ucr.edu/workshops/dec-5-8-2014

General Information: This 4-day workshop is for users who want to acquire the skills required to analyze Next-Generation Sequencing (NGS) and other large-scale data sets independently and proficiently. The event is open to internal (UCR) as well as external (non-UCR) participants. It contains 8 modules, and participants can choose to attend any combination of them. Most workshop modules will use the data analysis environment R/Bioconductor, which is currently the lingua franca of data-intensive research. No prior knowledge of R is required for attending this workshop, but beginners should sign up for the introductory sections (on Friday and Saturday morning), which will provide the basics required for the applied data analysis sections of this event. The last module on Monday afternoon will introduce the web-based NGS data analysis environment Galaxy, which requires no special computer knowledge.

Sign-up: Registration for this event was handled through the workshop sign-up page. Since the event is now entirely booked out, the sign-up page has been disabled.
http://manuals.bioinformatics.ucr.edu/workshops
Directions to UCR: For participants from external institutions, the airport closest to UC Riverside is Ontario International Airport and nearby hotels can be found on this map. Some hotels offer UCR discounts.
Organizers: Thomas Girke, Rakesh Kaundal, Neerja Katiyar, Jordan Hayes and Viet Pham. For questions, please send an e-mail.

Workshop Topics:

Introduction to R (December 5, 2014)

Date: Fri, Dec 5, 2014 (9:00am - 12:00pm)
Location: Genomics Lecture Hall, UC Riverside
Instructor: Thomas Girke (UCR)
Description: R (http://www.r-project.org) is a versatile data analysis environment that has a broad application spectrum in all experimental and quantitative scientific areas. The associated Bioconductor project provides access to over 900 R extension packages for the analysis of modern biological and biomedical data sets, such as next generation sequences, comparative genomics, network modeling and statistical analysis. The R software is free and runs on all common operating systems. This workshop module provides an elementary-level introduction into the R environment to equip users with the knowledge required for the subsequent events of this R workshop series. The following topics will be covered: (1) command syntax, (2) basic functions, (3) data import/export, (4) data/object types, (5) graphical display, (6) usage of R packages/libraries (e.g. Bioconductor) and (7) using R for basic data analysis operations.
Maximum number of participants: 75 
Schedule and Teaching Material
How to sign up: Sorry, this event is booked out!
Registration fee*: no charge for participants from registered labs, $23 for UC members, $75 for participants from external academic institutions, and $101.59 for participants from commercial organizations and academic institutions outside of the US.
Laptop requirements: Participants will work on their own laptops (Windows, Mac or Linux) with a functional wireless connection during the course. The laptops should have a recent R version pre-installed. Installation instructions for the R software are provided at the end of this page.

Programming in R (December 5, 2014)

Date: Fri, Dec 5, 2014 (1:00pm - 6:00pm)
Location: Genomics Lecture Hall, UC Riverside
Instructor: Thomas Girke (UCR)
Description: In recent years the R language has become the lingua franca of data-intensive research, and is now by far the most widely used data analysis programming language in bioinformatics. One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any data type. This workshop provides an overview of the basic knowledge for writing beginner-level functions and programs in R. The following topics will be introduced: (1) conditional executions, (2) loops, (3) writing custom functions, (4) calling external software, (5) running and debugging R programs, and (6) building custom R packages. Basic knowledge of the R software, as introduced in the "Introduction to R" tutorial, will be expected in this workshop.
Maximum number of participants: 75 
Schedule and Teaching Material
How to sign up: Sorry, this event is booked out!
Registration fee*: no charge for participants from registered labs, $23 for UC members, $75 for participants from external academic institutions, and $101.59 for participants from commercial organizations and academic institutions outside of the US.
Laptop requirements: Participants will work on their own laptops (Windows, Mac or Linux) with a functional wireless connection during the course. The laptops should have a recent R version pre-installed. Installation instructions for the R software are provided at the end of this page.

Basics on Analyzing Next Generation Sequencing Data with R/Bioconductor (December 6, 2014)

Date: Sat, Dec 6, 2014 (9:00am - 12:30pm)
Location: Genomics Lecture Hall, UC Riverside
Instructor: Thomas Girke (UCR)
Description: R and Bioconductor provide extensive utilities for analyzing sequence data from traditional and next generation sequencing technologies (e.g. Sanger or Illumina). This workshop module will cover the following topics: (1) basics on handling sequence, base call quality and annotation ranges in R; (2) demultiplexing of pooled samples; (3) generation of detailed quality reports of FASTQ files; (4) quality/adaptor trimming and filtering of reads; (5) parsing sequences by annotation ranges; (6) handling alignment and coverage objects such as SAM/BAM files in R; (7) interacting with external programs from R, such as short read aligners and peak callers; (8) read count and density analysis; and (9) genome visualization routines. Basic knowledge of the R software, as introduced in the "Introduction to R" tutorial, will be expected in this workshop. 
Maximum number of participants: 75
Schedule and Teaching Material
How to sign up: Sorry, this event is booked out!
Registration fee*: no charge for participants from registered labs, $23 for UC members, $75 for participants from external academic institutions, and $101.59 for participants from commercial organizations and academic institutions outside of the US.
Laptop requirements: Participants will work on their own laptops (Windows, Mac or Linux) with a functional wireless connection during the course. The laptops should have a recent R version pre-installed. Installation instructions for the R software are provided at the end of this page.

Analysis of RNA-Seq Data with R/Bioconductor (December 6, 2014)

Date: Sat, Dec 6, 2014 (1:30pm - 6:00pm)
Location: Genomics Lecture Hall, UC Riverside
Instructor: Thomas Girke (UCR)
Description: This workshop will cover the most common RNA-Seq data analysis routines. It will include the following topics: (1) read mapping with intron/splice junction aware aligners; (2) generation of read count data for genes, exons or other genome annotation ranges; (3) normalization methods of read count data; (4) statistical tests for identifying differentially expressed genes (DEGs); (5) enrichment analysis of GO terms; and (6) visualization of read pileups using R graphics and the IGV genome browser. Basic knowledge of the R software including sequence handling routines, as introduced in the "Introduction to R" and Basic NGS tutorials, will be expected in this workshop.
Maximum number of participants: 75 
Schedule and Teaching Material
How to sign up: Sorry, this event is booked out!
Registration fee*: no charge for participants from registered labs, $23 for UC members, $75 for participants from external academic institutions, and $101.59 for participants from commercial organizations and academic institutions outside of the US.
Laptop requirements: Participants will work on their own laptops (Windows, Mac or Linux) with a functional wireless connection during the course. The laptops should have a recent R version pre-installed. Installation instructions for the R software are provided at the end of this page.

Analysis of ChIP-Seq Data with R/Bioconductor (December 7, 2014) 

Date: Sun, Dec 7, 2014 (9:00am - 12:30pm)
Location: Genomics Lecture Hall, UC Riverside
Instructor: Thomas Girke (UCR)
Description: This workshop will cover the most common ChIP-Seq data analysis routines. It will include the following topics: (1) short read alignment against reference genomes; (2) efficient handling of read coverage data; (3) peak calling with various algorithms implemented in R; (4) integration of data from external peak callers; (5) annotating peaks with genomic context information; (6) statistical analysis of differential binding; (7) peak viewing using R graphics, Gviz and the IGV genome browser; and (8) identification of enriched DNA motifs in peak sequences. Basic knowledge of the R software including sequence handling routines, as introduced in the "Introduction to R" and Basic NGS tutorials, will be expected in this workshop.
Maximum number of participants: 75 
Schedule and Teaching Material
How to sign up: Sorry, this event is booked out!
Registration fee*: no charge for participants from registered labs, $23 for UC members, $75 for participants from external academic institutions, and $101.59 for participants from commercial organizations and academic institutions outside of the US.
Laptop requirements: Participants will work on their own laptops (Windows, Mac or Linux) with a functional wireless connection during the course. The laptops should have a recent R version pre-installed. Installation instructions for the R software are provided at the end of this page.

Analysis of SNP/Var-Seq Data with R/Bioconductor (December 7, 2014) 

Date: Sun, Dec 7, 2014 (1:30pm - 6:00pm)
Location: Genomics Lecture Hall, UC Riverside
Instructor: Neerja Katiyar (UCR)
Description: This workshop will cover the most common SNP/Var-Seq data analysis routines. It will include the following topics: (1) read mapping with variant-aware aligners (e.g. BWA, GSNAP); (2) SNP/indel calling (VariantTools, GATK, BCFtools); (3) handling of standard variant data formats such as VCF; (4) annotating variants with genomic context information, including variant mapping to genes and intergenic regions; (5) identification of synonymous/non-synonymous SNPs; and (6) variant viewing using R graphics (ggbio) and the IGV genome browser. Basic knowledge of the R software including sequence handling routines, as introduced in the "Introduction to R" and Basic NGS tutorials, will be expected in this workshop.
Maximum number of participants: 75 
Schedule and Teaching Material
How to sign up: Sorry, this event is booked out!
Registration fee*: no charge for participants from registered labs, $23 for UC members, $75 for participants from external academic institutions, and $101.59 for participants from commercial organizations and academic institutions outside of the US.
Laptop requirements: Participants will work on their own laptops (Windows, Mac or Linux) with a functional wireless connection during the course. The laptops should have a recent R version pre-installed. Installation instructions for the R software are provided at the end of this page.

Analysis of Drug-like Small Molecules and High-Throughput Screens with R/Bioconductor (December 8, 2014)

Date: Mon, Dec 8, 2014 (9:00am - 12:00pm)
Location: Genomics Lecture Hall, UC Riverside
Instructors: Tyler Backman & Thomas Girke
Description: This workshop introduces various R packages useful for analyzing drug-like compound and screening data sets. This includes the packages ChemmineR, ChemmineOB, fmcsR, eiR and bioassayR. Efficient R functions will be introduced for handling/analyzing SDF/MOL files, interfacing with PubChem, structural similarity searching, clustering of compound libraries with a wide spectrum of algorithms and utilities for managing complex compound data sets. In addition, visualization functions for compound clusters and chemical structures will be introduced. The last part will cover the analysis of large publicly available high-throughput screening data sets like PubChem BioAssay. Basic knowledge of the R software, as introduced in the "Introduction to R" tutorial, will be expected in this workshop.
Maximum number of participants: 75
Schedule and Teaching Material
How to sign up: Sorry, this event is booked out!
Registration fee*: no charge for participants from registered labs, $23 for UC members, $75 for participants from external academic institutions, and $101.59 for participants from commercial organizations and academic institutions outside of the US.
Laptop requirements: Participants will work on their own laptops (Windows, Mac or Linux) with a functional wireless connection during the course. The laptops should have a recent R version pre-installed. Installation instructions for the R software are provided at the end of this page.

Web-based Analysis of Next Generation Sequence Data (December 8, 2014)

Date: Mon, Dec 8, 2014 (1:00pm - 5:00pm)
Location: Genomics Lecture Hall, UC Riverside
Instructor: Rakesh Kaundal (UCR)
Description: This workshop will introduce basic NGS data analysis routines using the web-based Galaxy environment. It will include the following topics: (1) assessment of read qualities; (2) read trimming and filtering routines; (3) aligning reads to reference genomes; (4) generation of read counts for RNA-Seq data; (5) peak calling for ChIP-Seq experiments; and (6) visualization routines of read pileups along with annotation information using the freely available and very easy-to-use IGV genome browser from the Broad Institute. The material will be useful for both complete beginners and intermediate users (e.g. those who attended a previous R workshop on NGS data analysis). No special computer knowledge is required for this workshop.
Maximum number of participants: 75
Schedule and Teaching Material
How to sign up: Sorry, this event is booked out!
Registration fee*: no charge for participants from registered labs, $23 for UC members, $75 for participants from external academic institutions, and $101.59 for participants from commercial organizations and academic institutions outside of the US.
Laptop requirements: Users are expected to bring a laptop with a functional wireless connection and a recent internet browser version (e.g. Firefox, Chrome or Safari) preinstalled. 


*Question: What are registered labs?
Answer: These are research groups that pay an annual subscription fee to gain access to our high performance compute infrastructure including all workshop events.

Archive of Past Workshops

Date | Time | Location | Instructors | Institution | Title and Description
Dec 12, 2013 | 9:00am - 12:00pm | Genomics Lecture Hall | Thomas Girke | UCR | Introduction to R
Description: R (http://www.r-project.org) is a versatile data analysis environment that has a broad application spectrum in all experimental and quantitative scientific areas. The associated Bioconductor project provides access to over 700 R extension packages for the analysis of modern biological and biomedical data sets, such as next generation sequences, comparative genomics, network modeling and statistical analysis. The R software is free and runs on all common operating systems. This workshop module provides an elementary-level introduction into the R environment to equip users with the knowledge required for the subsequent events of this R workshop series. The following topics will be covered: (1) command syntax, (2) basic functions, (3) data import/export, (4) data/object types, (5) graphical display, (6) usage of R packages/libraries (e.g. Bioconductor) and (7) using R for basic data analysis operations. 
Maximum number of participants: 75
Dec 12, 2013 | 1:00pm - 6:00pm | Genomics Lecture Hall | Thomas Girke | UCR | Programming in R
Description: In recent years the R language has become the lingua franca of data-intensive research, and is now by far the most widely used data analysis programming language in bioinformatics. One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any data type. This workshop provides an overview of the basic knowledge for writing beginner-level functions and programs in R. The following topics will be introduced: (1) conditional executions, (2) loops, (3) writing custom functions, (4) calling external software, (5) running and debugging R programs, and (6) building custom R packages. Basic knowledge of the R software, as introduced in the "Introduction to R" tutorial, will be expected in this workshop.
Maximum number of participants: 75
Dec 13, 2013 | 9:00am - 12:00pm | Genomics Lecture Hall | Thomas Girke | UCR | Visualizing and Clustering High-Throughput Data with R/Bioconductor
Description: R is one of the most powerful environments for visualizing and clustering scientific data and creating beautiful publication quality graphics in a programmable and highly reproducible manner. This workshop will give an overview of the following topics: (1) introduction to R's base and grid graphics; (2) usage of high-level graphics libraries including lattice and ggplot2; (3) data pre-processing for efficient visualization; (4) writing of functions to generate customized graphics and automating image outputs; (5) visualization of genome and next generation sequencing data with ggbio and Gviz; (6) overview of the most common clustering algorithms used for profiling data, such as hierarchical clustering, fuzzy K-means clustering, principal component analysis, multidimensional scaling, biclustering and quality assessment of clustering results.
Maximum number of participants: 75 
Dec 13, 2013 | 1:00pm - 4:00pm | Genomics Lecture Hall | Jordan Hayes & Thomas Girke | UCR | Linux Part I: Linux Essentials
Description: The majority of bioinformatics software, especially in the next generation sequence analysis field, is only available for Unix/Linux-based operating systems. Basic knowledge about its usage provides free access to the most powerful and up-to-date applications for high-throughput data analysis. The workshop will teach beginners the basic command-line syntax for running applications on large data sets on Linux systems. During the workshop users will work on IIGB's Linux cluster by logging in remotely from their laptops. The following topics will be covered: (1) overview of the Linux operating system, (2) file system organization, (3) getting around, (4) basic Shell commands and scripts, (5) available software, (6) running software like Bowtie, BWA, BLAST, HMMER, PHYLIP, EMBOSS, etc.
Maximum number of participants: 75
Dec 13, 2013 | 4:00pm - 6:00pm | Genomics Lecture Hall | Jordan Hayes & Thomas Girke | UCR | Linux Part II: Using IIGB's Linux Cluster
Description: This seminar-style presentation will provide an introduction into the usage of the different load balancing and parallel computing tools available on IIGB's Linux cluster. A discussion will follow to determine the need for future hardware and software upgrades.   
Maximum number of participants: 75
Dec 14, 2013 | 9:00am - 12:30pm | Genomics Lecture Hall | Thomas Girke | UCR | Basics on Analyzing Next Generation Sequencing Data with R/Bioconductor
Description: R and Bioconductor provide extensive utilities for analyzing sequence data from traditional and next generation sequencing technologies (e.g. Sanger or Illumina). This workshop module will cover the following topics: (1) basics on handling sequence, base call quality and annotation ranges in R; (2) demultiplexing of pooled samples; (3) generation of detailed quality reports of FASTQ files; (4) quality/adaptor trimming and filtering of reads; (5) parsing sequences by annotation ranges; (6) handling alignment and coverage objects such as SAM/BAM files in R; (7) interacting with external programs from R, such as short read aligners and peak callers; (8) read count and density analysis; and (9) genome visualization routines. Basic knowledge of the R software, as introduced in the "Introduction to R" tutorial, will be expected in this workshop.
Maximum number of participants: 75
Dec 14, 2013 | 1:30pm - 6:00pm | Genomics Lecture Hall | Thomas Girke | UCR | Analysis of RNA-Seq Data with R/Bioconductor
Description: This workshop will cover the most common RNA-Seq data analysis routines. It will include the following topics: (1) read mapping with intron/splice junction aware aligners; (2) generation of read count data for genes, exons or other genome annotation ranges; (3) normalization methods of read count data; (4) statistical tests for identifying differentially expressed genes (DEGs); (5) enrichment analysis of GO terms; and (6) visualization of read pileups using R graphics and the IGV genome browser. Basic knowledge of the R software including sequence handling routines, as introduced in the "Introduction to R" and Basic NGS tutorials, will be expected in this workshop.
Schedule and Teaching Material
Maximum number of participants: 75
Dec 15, 2013 | 9:00am - 12:30pm | Genomics Lecture Hall | Thomas Girke | UCR | Analysis of ChIP-Seq Data with R/Bioconductor
Description: This workshop will cover the most common ChIP-Seq data analysis routines. It will include the following topics: (1) short read alignment against reference genomes; (2) efficient handling of read coverage data; (3) peak calling with various algorithms implemented in R; (4) integration of data from external peak callers; (5) annotating peaks with genomic context information; (6) statistical analysis of differential binding; (7) peak viewing using R graphics, Gviz and the IGV genome browser; and (8) identification of enriched DNA motifs in peak sequences. Basic knowledge of the R software including sequence handling routines, as introduced in the "Introduction to R" and Basic NGS tutorials, will be expected in this workshop.
Schedule and Teaching Material
Maximum number of participants: 75 
Dec 15, 2013 | 1:30pm - 6:00pm | Genomics Lecture Hall | Thomas Girke | UCR | Analysis of SNP/Var-Seq Data with R/Bioconductor
Description: This workshop will cover the most common SNP/Var-Seq data analysis routines. It will include the following topics: (1) read mapping with variant-aware aligners (e.g. BWA, GSNAP); (2) SNP/indel calling (VariantTools, GATK, BCFtools); (3) handling of standard variant data formats such as VCF; (4) annotating variants with genomic context information, including variant mapping to genes and intergenic regions; (5) identification of synonymous/non-synonymous SNPs; and (6) variant viewing using R graphics (ggbio) and the IGV genome browser. Basic knowledge of the R software including sequence handling routines, as introduced in the "Introduction to R" and Basic NGS tutorials, will be expected in this workshop.
Schedule and Teaching Material
Maximum number of participants: 75 
Dec 16, 2013 | 9:00am - 12:00pm | Genomics Lecture Hall | Tyler Backman & Thomas Girke | UCR | Analysis of Drug-like Small Molecules and High-Throughput Screens with R/Bioconductor
Description: This workshop introduces various R packages useful for analyzing drug-like compound and screening data sets. This includes the packages ChemmineR, ChemmineOB, fmcsR, eiR and bioassayR. Efficient R functions will be introduced for handling/analyzing SDF/MOL files, interfacing with PubChem, structural similarity searching, clustering of compound libraries with a wide spectrum of algorithms and utilities for managing complex compound data sets. In addition, visualization functions for compound clusters and chemical structures will be introduced. The last part will cover the analysis of large publicly available high-throughput screening data sets like PubChem BioAssay. Basic knowledge of the R software, as introduced in the "Introduction to R" tutorial, will be expected in this workshop.
Schedule and Teaching Material
Maximum number of participants: 75 
Dec 16, 2013 | 1:30pm - 5:30pm | Genomics Lecture Hall | Neerja Katiyar & Thomas Girke | UCR | Web-based Analysis of Next Generation Sequence Data
Description: This workshop will introduce basic NGS data analysis routines using the web-based Galaxy environment. It will include the following topics: (1) assessment of read qualities; (2) read trimming and filtering routines; (3) aligning reads to reference genomes; (4) generation of read counts for RNA-Seq data; (5) peak calling for ChIP-Seq experiments; and (6) visualization routines of read pileups along with annotation information using the freely available and very easy-to-use IGV genome browser from the Broad Institute. The material will be useful for both complete beginners and intermediate users (e.g. those who attended a previous R workshop on NGS data analysis). No special computer knowledge is required for this workshop.
Teaching Material
Maximum number of participants: 75 
July 18, 2013 | 2:00pm - 5:00pm | Fred Hutchinson Cancer Research Center - Seattle, WA | Thomas Girke | UCR | Cheminformatics of Drug-like Small Molecules
Description: This lab session will introduce several Bioconductor packages (ChemmineR, fmcsR and eiR) for analyzing drug-like small molecule and high-throughput screening data in R. These packages contain utilities for efficient processing of large numbers of molecules, physicochemical/structural property predictions, structural similarity searching, classification and clustering of compound screening libraries, and bioactivity data with a wide spectrum of algorithms. In addition, they offer visualization functions for compound clusters and chemical structures. 
Teaching Material
Maximum number of participants: 75
Dec 6, 2012 | 10:00am - 1:00pm | Genomics Lecture Hall | Thomas Girke | UCR | Introduction to R
Description: R (http://www.r-project.org) is a versatile data analysis environment that has a broad application spectrum in all experimental and quantitative scientific areas. The associated Bioconductor project provides access to over 600 R extension packages for the analysis of modern biological and biomedical data sets, such as next generation sequences, comparative genomics, network modeling and statistical analysis. The R software is free and runs on all common operating systems. This workshop module provides an elementary-level introduction into the R environment to equip users with the knowledge required for the subsequent events of this R workshop series. The following topics will be covered: (1) command syntax, (2) basic functions, (3) data import/export, (4) data/object types, (5) graphical display, (6) usage of R packages/libraries (e.g. Bioconductor) and (7) using R for basic data analysis operations.
Schedule and Teaching Material
Maximum number of participants: 75
Dec 6, 2012 | 2:00pm - 6:00pm | Genomics Lecture Hall | Thomas Girke | UCR | Programming in R
Description: In recent years the R language has become the lingua franca of data-intensive research, and is now by far the most widely used data analysis programming language in bioinformatics. One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any data type. This workshop provides an overview of the basic knowledge for writing beginner-level functions and programs in R. The following topics will be introduced: (1) conditional executions, (2) loops, (3) writing custom functions, (4) calling external software, (5) running and debugging R programs, and (6) building custom R packages. Basic knowledge of the R software, as introduced in the "Introduction to R" tutorial, will be expected in this workshop.
Schedule and Teaching Material
Maximum number of participants: 75
Dec 7, 2012 | 9:00am - 12:00pm | Genomics Lecture Hall | Thomas Girke | UCR | Microarray Analysis and Clustering with R/Bioconductor
Description: The statistics software R and the associated Bioconductor project have become the "gold standard" for microarray data analysis. The environment integrates the most advanced analysis tools that are currently available for this task. In addition, R contains a comprehensive set of functions and libraries for clustering large multidimensional data sets. This workshop will cover the following topics: (1) microarray data import/export using Affymetrix data as sample sets, (2) background correction and normalization procedures, (3) array quality assessment, (4) identification of differentially expressed genes, (5) enrichment analysis of Gene Ontology (GO) terms, and (6) an overview of the usage of the most common clustering algorithms used for profiling data, such as hierarchical clustering, fuzzy K-means clustering, principal component analysis, multidimensional scaling, biclustering and quality assessment of clustering results. Basic knowledge of the R software, as introduced in the "Introduction to R" tutorial, will be expected in this workshop.
Schedule and Teaching Material
Maximum number of participants: 75
Dec 7, 2012 | 1:00pm - 4:00pm | Genomics Lecture Hall | Grant Brady & Thomas Girke | UCR | Linux Part I: Linux Essentials
Description: The majority of bioinformatics software, especially in the next generation sequence analysis field, is only available for Unix/Linux-based operating systems. Basic knowledge about its usage provides free access to the most powerful and up-to-date applications for high-throughput data analysis. The workshop will teach beginners the basic command-line syntax for running applications on large data sets on Linux systems. During the workshop users will work on IIGB's Linux cluster by logging in remotely from their laptops. The following topics will be covered: (1) overview of the Linux operating system, (2) file system organization, (3) getting around, (4) basic Shell commands and scripts, (5) available software, (6) running software like Bowtie, BWA, BLAST, HMMER, PHYLIP, EMBOSS, etc.
Schedule and Teaching Material
Maximum number of participants: 75
Dec 7, 2012 | 4:00pm - 6:00pm | Genomics Lecture Hall | Grant Brady & Thomas Girke | UCR | Linux Part II: Using IIGB's Linux Cluster
Description: This seminar-style presentation will provide an introduction into the usage of the different load balancing and parallel computing tools available on IIGB's Linux cluster. A discussion will follow to determine the need for future hardware and software upgrades. 
Schedule and Teaching Material
Maximum number of participants: 75
Dec 8, 2012 | 9:00am - 12:30pm | Genomics Lecture Hall | Thomas Girke | UCR | Basics on Analyzing Next Generation Sequencing Data with R/Bioconductor
Description: R and Bioconductor provide extensive utilities for analyzing sequence data from traditional and next generation sequencing technologies (e.g. Sanger or Illumina). This workshop module will cover the following topics: (1) basics on handling sequence, base call quality and annotation ranges in R; (2) demultiplexing of pooled samples; (3) generation of detailed quality reports of FASTQ files; (4) quality/adaptor trimming and filtering of reads; (5) parsing sequences by annotation ranges; (6) handling alignment and coverage objects such as SAM/BAM files in R; (7) interacting with external programs from R, such as short read aligners and peak callers; (8) read count and density analysis; and (9) genome visualization routines. Basic knowledge of the R software, as introduced in the "Introduction to R" tutorial, will be expected in this workshop. 
Schedule and Teaching Material
Maximum number of participants: 75
Dec 8, 2012 | 1:30pm - 6:00pm | Genomics Lecture Hall | Thomas Girke | UCR | Analysis of RNA-Seq Data with R/Bioconductor
Description: This workshop will cover the most common RNA-Seq data analysis routines. It will include the following topics: (1) read mapping with intron/splice junction aware aligners; (2) generation of read count data for genes, exons or other genome annotation ranges; (3) normalization methods of read count data; (4) statistical tests for identifying differentially expressed genes (DEGs); (5) enrichment analysis of GO terms; and (6) visualization of read pileups using R graphics and the IGV genome browser. Basic knowledge of the R software including sequence handling routines, as introduced in the "Introduction to R" and Basic NGS tutorials, will be expected in this workshop.
Schedule and Teaching Material 
Maximum number of participants: 40
Dec 9, 2012
9:00am - 12:30pm
Genomics Lecture Hall
Thomas Girke
UCR
Analysis of ChIP-Seq Data with R/Bioconductor
Description: This workshop will cover the most common ChIP-Seq data analysis routines. It will include the following topics: (1) short read alignment against reference genomes; (2) efficient handling of read coverage data; (3) peak calling with various algorithms implemented in R; (4) integration of data from external peak callers; (5) annotating peaks with genomic context information; (6) statistical analysis of differential binding; (7) peak viewing using R graphics and the IGV genome browser; and (8) identification of enriched DNA motifs in peak sequences. Basic knowledge of the R software, including sequence handling routines as introduced in the "Introduction to R" and Basic NGS tutorials, will be expected in this workshop.
Schedule and Teaching Material 
Maximum number of participants: 75
Dec 9, 2012
1:30pm - 6:00pm
Genomics Lecture Hall
Rebecca Sun
UCR
Analysis of SNP/Var-Seq Data with R/Bioconductor
Description: This workshop will cover the most common SNP/Var-Seq data analysis routines. It will include the following topics: (1) read mapping with variant aware aligners; (2) SNP/indel calling; (3) handling of standard variant data formats such as VCF; (4) annotating variants with genomic context information, including variant mapping to genes and intergenic regions; (5) identification of synonymous/non-synonymous SNPs; (6) injecting identified variants into reference genome/proteome; and (7) variant viewing using R graphics and the IGV genome browser. Basic knowledge of the R software, including sequence handling routines as introduced in the "Introduction to R" and Basic NGS tutorials, will be expected in this workshop.
Schedule and Teaching Material 
Maximum number of participants: 75
Dec 10, 2012
8:30am - 11:30am
Genomics Lecture Hall
Tyler Backman & Thomas Girke
UCR
Cheminformatics in R for Analyzing Chemical Genomics High-Throughput Screens
Description: This workshop introduces various R packages useful for analyzing drug-like compound and screening data sets. These include the packages ChemmineR, fmcsR, eiR and bioassayR. Efficient R functions will be introduced for handling/analyzing SDF/MOL files, interfacing with PubChem, structural similarity searching, clustering of compound libraries with a wide spectrum of algorithms and utilities for managing complex compound data sets. In addition, visualization functions for compound clusters and chemical structures will be introduced. The last part will cover the analysis of large publicly available high-throughput screening data sets like PubChem BioAssay. Basic knowledge of the R software, as introduced in the "Introduction to R" tutorial, will be expected in this workshop.
Schedule and Teaching Material 
Maximum number of participants: 75
Dec 10, 2012
2:00pm - 6:00pm
Genomics Lecture Hall
Rebecca Sun & Thomas Girke
UCR
Web-based Analysis of Next Generation Sequence Data
Description: This workshop will introduce basic NGS data analysis routines using the web-based Galaxy environment. It will include the following topics: (1) assessment of read qualities; (2) read trimming and filtering routines; (3) aligning reads to reference genomes; (4) generation of read counts for RNA-Seq data; (5) peak calling for ChIP-Seq experiments; and (6) visualization routines of read pileups along with annotation information using the free and very easy-to-use IGV genome browser from the Broad Institute. The material will be useful for both complete beginners and intermediate users (e.g. those who attended a previous R workshop on NGS data analysis). No special computer knowledge is required for this workshop.
Schedule and Teaching Material 
Maximum number of participants: 75
July 23-25, 2012
8:00am-6:00pm
NM-AIST
Thomas Girke
The Nelson Mandela African Institute of Science and Technology (NM-AIST), Tanzania, Africa
Introduction to Bioinformatics
Description: This workshop provides an introductory overview of important bioinformatics data analysis concepts related to genome sequencing, database techniques, structural biology, comparative genomics, next generation sequencing, such as RNA-Seq profiling, and small molecule/drug discovery.
Schedule, Slides and Exercises 
Maximum number of participants: 40
Feb 27, 2012
12:00pm-2:30pm
FHCRC
Thomas Girke
Fred Hutchinson Cancer Research Center - Seattle, WA
ChIP-Seq Analysis with R and Bioconductor
Description: This workshop provides an introduction to ChIP-Seq analysis in R including short read alignments, handling of coverage data, peak calling routines, peak annotations, differential peak analysis, peak viewing in a genome browser, and analysis of enriched binding motifs. 
Manual, Slides and Exercises 
Dec 12, 2011
10:00am-1:00pm
Neill Campbell Science Learning Laboratory, UC Riverside
Tyler Backman & Thomas Girke
UCR
Analysis of Small Molecule Data with R and Bioconductor
Description: This workshop introduces the ChemmineR package for mining drug-like compound and screening data sets. The new version of this R package contains functions for handling/analyzing SDF/MOL files, interfacing with PubChem, structural similarity searching, clustering of compound libraries with a wide spectrum of algorithms and utilities for managing complex compound data sets. In addition, it offers visualization functions for compound clusters and chemical structures. The package is well integrated with the online ChemMine Tools service and allows bidirectional communications  between the two services. The integration of chemoinformatic tools with the R programming environment has many advantages, such as easy access to a wide spectrum of statistical methods, machine learning algorithms and graphic utilities. Knowledge of the R software, as introduced in the "Introduction to R" course, will be required for attending this workshop. 
Manual for this workshop
Maximum number of participants: 40
Dec 11, 2011
1:30pm-6:00pm
Genomics Lecture Hall
Tyler Backman, Rebecca Sun & Thomas Girke
UCR
Analysis of RNA-Seq, ChIP-Seq and SNP-Seq Data with R/Bioconductor
Description: This workshop will apply the knowledge covered in the basic sequence analysis event to the most common applications in the NGS field, including RNA-Seq, ChIP-Seq and SNP-Seq. This includes normalization methods and statistical tests for identifying differentially expressed genes (DEGs), peak calling methods and SNP/Indel calling methods. Basic knowledge of the R software and sequence analysis routines, as introduced in the previous tutorials, will be expected in this workshop. Due to the computational demands of analyzing next generation sequencing data with reasonable speed, users will work during this workshop on a Linux cluster.
Manual for this workshop 
Maximum number of participants: 40
Dec 11, 2011
9:00am-12:30pm
Genomics Lecture Hall
Tyler Backman, Rebecca Sun & Thomas Girke
UCR
Basics on Analyzing Next Generation Sequencing Data with R/Bioconductor
Description: R and Bioconductor provide extensive utilities for analyzing sequence data from traditional and next generation sequencing technologies (e.g. Sanger or Illumina). This workshop will cover the following topics: (1) basic sequence and string handling; (2) sequence quality assessment/filtering utilities; (3) adaptor trimming; (4) parsing sequences by location; (5) pairwise and multiple sequence alignments; (6) interacting with external short read alignment programs from R, e.g. BWA, Bowtie; (7) range operations; (8) read density analysis; and (9) visualization routines. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be expected in this workshop. Due to the computational demands of analyzing next generation sequencing data with reasonable speed, users will work during this workshop on a Linux cluster.
Manual for this workshop 
Maximum number of participants: 40
Dec 10, 2011
4:00pm-5:00pm
Genomics Lecture Hall
Chris Webber & Thomas Girke
UCR
Linux Part II: Using IIGB's Linux Cluster
Description: This seminar-style presentation will provide an introduction into the usage of the different load balancing and parallel computing tools available on IIGB's Linux cluster. A discussion will follow to determine the need for future hardware and software upgrades. PIs and users from UCR are invited to attend this event.
Manual for this workshop 
Maximum number of participants: 40 
Dec 10, 2011
1:00pm-4:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Linux Part I: Linux Essentials
Description: The majority of freely available bioinformatics software is designed for Unix/Linux-based operating systems. Basic knowledge of these systems provides free access to the most powerful and up-to-date applications in the field. The workshop will teach beginners the basic command-line syntax for running applications on large data sets on our Linux servers and clusters from a local Windows, Mac or Linux computer. The following topics will be covered: (1) overview of the Linux operating system, (2) file system organization, (3) getting around, (4) basic shell commands and scripts, (5) available software, and (6) running software such as Bowtie, BWA, BLAST, HMMER, PHYLIP, EMBOSS, etc.
Manual for this workshop 
Maximum number of participants: 40
Dec 10, 2011
9:00am-12:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Clustering and Data Mining in R
Description: R contains a comprehensive set of functions and libraries for clustering large multidimensional data sets. This course will provide an overview on the usage of the most common clustering techniques in R, such as hierarchical clustering with bootstrap, K-means, PAM, fuzzy clustering, QT, SOM, principal component analysis, multidimensional scaling, biclustering, quality assessment of clustering results, etc. Knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be required for attending this workshop.
Manual for this workshop 
Maximum number of participants: 40 
Dec 9, 2011
1:00pm-6:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Microarray Analysis with R & Bioconductor
Description: The statistics software R and the associated Bioconductor project have become the "gold standard" for the analysis of dual color microarrays and Affymetrix chips. The environment integrates the most advanced analysis tools that are currently available for profiling data. All software components are freely available for all operating systems. This workshop will cover the following topics: (1) data import/export, (2) background correction and normalization procedures for Affymetrix and cDNA arrays, (3) array quality inspection, (4) identification of differentially expressed genes, (5) visualization of genomic information, (6) overview of clustering methods, and (7) Gene Ontology (GO) analysis. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be expected in this workshop.
Manual for this workshop
Maximum number of participants: 40
Dec 9, 2011
9:00am-12:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Graphics and Data Visualization in R
Description: R is one of the most powerful environments for visualizing scientific data and creating beautiful publication quality graphics in a programmable and highly reproducible manner. This workshop will give an overview of the following topics: (1) introduction to R's base and grid graphics; (2) usage of high-level graphics libraries including lattice and ggplot2; (3) data pre-processing for efficient visualization; and (4) writing of functions to generate customized graphics and automating image outputs.  
Manual for this workshop 
Maximum number of participants: 40 
Dec 8, 2011
2:00pm-6:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Programming in R
Description: One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any type of data. This workshop provides an overview of the basic knowledge for writing beginner level programs in R. The following topics will be introduced: (1) conditional executions, (2) loops, (3) writing functions, (4) techniques for improving speed/memory performance, (5) calling external software, (6) running and debugging R programs, (7) object-oriented programming in R and (8) how to build R packages. Knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be required for attending this workshop. 
Manual for this workshop 
Maximum number of participants: 40 
Dec 8, 2011
10:00am-1:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Introduction to R
Description: R (http://www.r-project.org) is a versatile data analysis environment that has a broad application spectrum in all science areas. The associated Bioconductor project provides access to hundreds of additional R packages for the analysis of modern biological and biomedical data sets, such as microarrays, next generation sequencing data, genome annotations, networks, etc. The R software is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction into the R environment to prepare users with the knowledge required for the subsequent events of this R workshop series. The following topics will be covered by this R introduction: (1) command syntax, (2) basic functions, (3) data import/export, (4) data/object types, (5) graphical display, (6) usage of R packages/libraries (e.g. Bioconductor) and (7) using R for basic data analysis operations.
Manual for this workshop 
Maximum number of participants: 40
May 29, 2011
2:00pm-6:00pm
Genomics Lecture Hall
Tyler Backman & Thomas Girke
UCR
Analysis of Small Molecule Data with R and Bioconductor
Description: This workshop introduces the ChemmineR package for mining drug-like compound and screening data sets. The new version of this R package contains functions for handling/analyzing SDF/MOL files, interfacing with PubChem, structural similarity searching, clustering of compound libraries with a wide spectrum of algorithms and utilities for managing complex compound data sets. In addition, it offers visualization functions for compound clusters and chemical structures. The package is well integrated with the online ChemMine Tools service and allows bidirectional communications between the two services. The integration of chemoinformatic tools with the R programming environment has many advantages, such as easy access to a wide spectrum of statistical methods, machine learning algorithms and graphic utilities. Knowledge of the R software, as introduced in the "Introduction to R" course, will be required for attending this workshop.
Manual for this workshop
Maximum number of participants: 40
May 29, 2011
10:00am-1:00pm
Genomics Lecture Hall
Tyler Backman, Rebecca Sun & Thomas Girke
UCR
GUI-based Exploration and Visualization of Next Generation Sequence Data
Description: This workshop will introduce the basics of aligning next generation sequence (NGS) data to reference genomes/transcriptomes using cloud/web-based applications from the Galaxy service. Analysis and visualization of the read pileups along with annotation information will be performed in the free and very easy-to-use IGV genome browser from the Broad Institute. The material will be useful for both complete beginners and intermediate users (e.g. attended previous R workshop on NGS data analysis). No special computer knowledge is required for this workshop.
Manual for this workshop
Maximum number of participants: 40
May 28, 2011
1:00pm-5:00pm
Genomics Lecture Hall
Tyler Backman, Rebecca Sun & Thomas Girke
UCR
Analysis of RNA-Seq, ChIP-Seq and SNP-Seq Data with R/Bioconductor
Description: This workshop will apply the knowledge covered in the basic sequence analysis event to the most common applications in the NGS field, including RNA-Seq, ChIP-Seq and SNP-Seq. This includes normalization methods and statistical tests for identifying differentially expressed genes (DEGs), peak calling methods and SNP/Indel calling methods. Basic knowledge of the R software and sequence analysis routines, as introduced in the previous tutorials, will be expected in this workshop. Due to the computational demands of analyzing next generation sequencing data with reasonable speed, users will work during this workshop on a Linux cluster.
Manual for this workshop
Maximum number of participants: 40
May 28, 2011
9:00am-12:00pm
Genomics Lecture Hall
Tyler Backman, Rebecca Sun & Thomas Girke
UCR
Basic Analysis of Next Generation Sequencing Data with R/Bioconductor
Description: R and Bioconductor provide extensive utilities for analyzing sequence data from traditional and next generation sequencing technologies (e.g. Sanger or Illumina). This workshop will cover the following topics: (1) basic sequence and string handling; (2) sequence quality assessment/filtering utilities; (3) adaptor trimming; (4) parsing sequences by location; (5) pairwise and multiple sequence alignments; (6) interacting with external short read alignment programs from R, e.g. BWA, Bowtie; (7) range operations; (8) read density analysis; and (9) visualization routines. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be expected in this workshop. Due to the computational demands of analyzing next generation sequencing data with reasonable speed, users will work during this workshop on a Linux cluster.
Manual for this workshop
Maximum number of participants: 40
May 27, 2011
4:00pm-5:00pm
Genomics Lecture Hall
Alex Levchuk, Tyler Backman & Thomas Girke
UCR
Linux Part II: Using IIGB's Linux Cluster
Description: This seminar-style presentation will provide an introduction into the usage of the different load balancing and parallel computing tools available on IIGB's Linux cluster. A discussion will follow to determine the need for future hardware and software upgrades. PIs and users from UCR are invited to attend this event.
Manual for this workshop
Maximum number of participants: 40
May 27, 2011
1:00pm-4:00pm
Alex Levchuk, Tyler Backman & Thomas Girke
UCR
Linux Part I: Linux Essentials
Description: The majority of freely available bioinformatics software is designed for Unix/Linux-based operating systems. Basic knowledge of these systems provides free access to the most powerful and up-to-date applications in the field. The workshop will teach beginners the basic command-line syntax for running applications on large data sets on our Linux servers and clusters from a local Windows, Mac or Linux computer. The following topics will be covered: (1) overview of the Linux operating system, (2) file system organization, (3) getting around, (4) basic shell commands and scripts, (5) available software, and (6) running software such as Bowtie, BWA, BLAST, HMMER, PHYLIP, EMBOSS, etc.
Manual for this workshop
Maximum number of participants: 40
May 27, 2011
10:00am-12:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Introduction to R
Description: R (http://www.r-project.org) is a versatile data analysis environment that has a broad application spectrum in all science areas. The associated Bioconductor project provides access to hundreds of additional R packages for the analysis of modern biological and biomedical data sets, such as microarrays, next generation sequencing data, genome annotations, networks, etc. The R software is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction into the R environment to prepare users with the knowledge required for the subsequent events of this R workshop series. The following topics will be covered by this R introduction: (1) command syntax, (2) basic functions, (3) data import/export, (4) data/object types, (5) graphical display, (6) usage of R packages/libraries (e.g. Bioconductor) and (7) using R for basic data analysis operations.
Manual for this workshop
Maximum number of participants: 40
May 15, 2011
2:00pm-5:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Clustering and Data Mining in R
Description: R contains a comprehensive set of functions and libraries for clustering large multidimensional data sets. This course will provide an overview on the usage of the most common clustering techniques in R, such as hierarchical clustering with bootstrap, K-means, PAM, fuzzy clustering, QT, SOM, principal component analysis, multidimensional scaling, biclustering, quality assessment of clustering results, etc. Knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be required for attending this workshop.
Manual for this workshop
Maximum number of participants: 40
May 15, 2011
10:00am-1:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Microarray Analysis with R & Bioconductor
Description: The statistics software R and the associated Bioconductor project have become the "gold standard" for the analysis of dual color microarrays and Affymetrix chips. The environment integrates the most advanced analysis tools that are currently available for profiling data. All software components are freely available for all operating systems. This workshop will cover the following topics: (1) data import/export, (2) background correction and normalization procedures for Affymetrix and cDNA arrays, (3) array quality inspection, (4) identification of differentially expressed genes, (5) visualization of genomic information, (6) overview of clustering methods, and (7) Gene Ontology (GO) analysis. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be expected in this workshop.
Manual for this workshop
Maximum number of participants: 40
May 14, 2011
2:00pm-5:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Graphics and Data Visualization in R
Description: R is one of the most powerful environments for visualizing scientific data and creating beautiful publication quality graphics in a programmable and highly reproducible manner. This workshop will give an overview of the following topics: (1) introduction to R's base and grid graphics; (2) usage of high-level graphics libraries including lattice and ggplot2; (3) data pre-processing for efficient visualization; and (4) writing of functions to generate customized graphics and automating image outputs. 
Manual for this workshop
Maximum number of participants: 40
May 14, 2011
10:00am-1:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Programming in R
Description: One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any type of data. This workshop provides an overview of the basic knowledge for writing beginner level programs in R. The following topics will be introduced: (1) conditional executions, (2) loops, (3) writing functions, (4) techniques for improving speed/memory performance, (5) calling external software, (6) running and debugging R programs, (7) object-oriented programming in R and (8) how to build R packages. Knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be required for attending this workshop.
Manual for this workshop
Maximum number of participants: 40
May 13, 2011
2:00pm-6:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Introduction to R
Description: R (http://www.r-project.org) is a versatile data analysis environment that has a broad application spectrum in all science areas. The associated Bioconductor project provides access to hundreds of additional R packages for the analysis of modern biological and biomedical data sets, such as microarrays, next generation sequencing data, genome annotations, networks, etc. The R software is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction into the R environment to prepare users with the knowledge required for the subsequent events of this R workshop series. The following topics will be covered by this R introduction: (1) command syntax, (2) basic functions, (3) data import/export, (4) data/object types, (5) graphical display, (6) usage of R packages/libraries (e.g. Bioconductor) and (7) using R for basic data analysis operations.
Manual for this workshop
Maximum number of participants: 40
Nov 20, 2010
2:00-5:00pm
Genomics Lecture Hall
Tyler Backman & Thomas Girke
UCR
Analysis of Small Molecule Data with R and Bioconductor
Manual for this workshop
Description: This workshop introduces the ChemmineR package for mining drug-like compound and screening data sets. The new version of this R package contains functions for handling/analyzing SDF/MOL files, bioactivity data from PubChem, structural similarity searching, clustering of compound libraries with a wide spectrum of algorithms and utilities for managing complex compound data sets. In addition, it offers visualization functions for compound clusters and chemical structures. The package is well integrated with the online ChemMine Tools service and allows bidirectional communications between the two services. The integration of chemoinformatic tools with the R programming environment has many advantages, such as easy access to a wide spectrum of statistical methods, machine learning algorithms and graphic utilities. Knowledge of the R software, as introduced in the "Introduction to R" course, will be required for attending this workshop.
Maximum number of participants: 40
Nov 20, 2010
10:00am-1:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Introduction to R
Manual for this workshop
Description: R (http://www.r-project.org) is a versatile data analysis environment that has a broad application spectrum in all science areas. The associated Bioconductor project provides access to hundreds of additional R packages for the analysis of modern biological and biomedical data sets, such as microarrays, next generation sequencing data, genome annotations, networks, etc. The R software is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction into the R environment to prepare users with the knowledge required for the subsequent events of this R workshop series. The following topics will be covered by this R introduction: (1) command syntax, (2) basic functions, (3) data import/export, (4) data/object types, (5) graphical display, (6) usage of R packages/libraries (e.g. Bioconductor) and (7) using R for basic data mining operations.
Maximum number of participants: 40
Oct 31, 2010
10:00am-1:00pm
Genomics Lecture Hall
Tyler Backman, Rebecca Sun & Thomas Girke
UCR
GUI-based Exploration and Visualization of Next Generation Sequence Data
Manual for this workshop
Description: This workshop will introduce the basics of aligning next generation sequence (NGS) data to reference genomes/transcriptomes using cloud/web-based applications. Analysis and visualization of the read pileups along with annotation information will be performed in the free and very easy-to-use IGV genome browser from the Broad Institute. The material will be useful for both complete beginners and intermediate users (e.g. attended previous R workshop on NGS data analysis). No special computer knowledge is required for this workshop.
Maximum number of participants: 40
Oct 30, 2010
10:00am-4:00pm
Genomics Lecture Hall
Tyler Backman, Rebecca Sun & Thomas Girke
UCR
Analysis of Next Generation Sequencing Data with R and Bioconductor
Manual for this workshop
Description: R and Bioconductor provide extensive utilities for analyzing next generation sequence (NGS) data from technologies such as Illumina (Solexa). This workshop will cover the following topics: (1) basic sequence and string handling; (2) quality assessment utilities; (3) quality filtering; (4) adaptor trimming; (5) sequence alignments; (6) interacting with external alignment programs from R, e.g. SOAP, Maq, Bowtie; (7) read density and SNP analysis; and (8) visualization of genome-scale mapping data. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be expected in this workshop. Due to the computational demands of analyzing next generation sequencing data with reasonable speed, users will work during this workshop on a Linux cluster.
Maximum number of participants: 40
Oct 29, 2010
4:00-5:00pm
Genomics Lecture Hall
Alex Levchuk, Tyler Backman & Thomas Girke
UCR
Linux Part II: Using IIGB's Linux Cluster
Manual for this workshop
Instructors: Alex Levchuk, Tyler Backman & Thomas Girke (UCR)
Description: This seminar-style presentation will provide an introduction into the usage of the different load balancing and parallel computing tools available on IIGB's Linux cluster. A discussion will follow to determine the need for future hardware and software upgrades. PIs and users from UCR are invited to attend this event.
Maximum number of participants: 40
Oct 29, 2010
1:00-4:00pm
Genomics Lecture Hall
Alex Levchuk, Tyler Backman & Thomas Girke
UCR
Linux Part I: Linux Essentials
Manual for this workshop
Description: The majority of freely available bioinformatics software is designed for Unix/Linux-based operating systems. Basic knowledge of these systems provides free access to the most powerful and up-to-date applications in the field. The workshop will teach beginners the basic command-line syntax for running applications on large data sets on our Linux servers and clusters from a local Windows, Mac or Linux computer. The following topics will be covered: (1) overview of the Linux operating system, (2) file system organization, (3) getting around, (4) basic shell commands and scripts, (5) available software, and (6) running software such as Bowtie, SOAP, BLAST, HMMER, PHYLIP, EMBOSS, etc.
Maximum number of participants: 40
Oct 3, 2010
2:00-5:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Clustering and Data Mining in R
Manual for this workshop
Description: R contains a comprehensive set of functions and libraries for clustering large multidimensional data sets. This course will provide an overview on the usage of the most common clustering techniques in R, such as hierarchical clustering with bootstrap, K-means, PAM, fuzzy clustering, QT, SOM, principal component analysis, multidimensional scaling, biclustering, etc. Knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be required for attending this workshop.
Maximum number of participants: 40
Oct 3, 2010
10:00am-1:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Microarray Analysis with R & Bioconductor
Manual for this workshop
Description: The statistics software R and the associated Bioconductor project have become the "gold standard" for the analysis of dual color microarrays and Affymetrix chips. The environment integrates the most advanced analysis tools that are currently available for profiling data. All software components are freely available for all operating systems. This workshop will cover the following topics: (1) data import/export, (2) background correction and normalization procedures for Affymetrix and cDNA arrays, (3) array quality inspection, (4) identification of differentially expressed genes, (5) visualization of genomic information, (6) overview of clustering methods, and (7) Gene Ontology (GO) analysis. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be expected in this workshop.
Maximum number of participants: 40
Oct 2, 2010
10:00am-3:00pm
Genomics Lecture Hall
Thomas Girke
UCR
Programming in R
Manual for this workshop
Description: One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any type of data. This workshop provides an overview of the basic knowledge for writing beginner level programs in R. The following topics will be introduced: (1) conditional executions, (2) loops, (3) writing functions, (4) techniques for improving speed/memory performance, (5) calling external software, (6) running and debugging R programs, and (7) object-oriented programming in R. Knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be required for attending this workshop.
Maximum number of participants: 40
Oct 1, 2010
2:00-6:00pm Genomics Lecture Hall
Thomas Girke
UCR
Introduction to R
Manual for this workshop
Description: R (http://www.r-project.org) is a versatile data analysis environment that has a broad application spectrum in all science areas. The associated Bioconductor project provides access to hundreds of additional R packages for the analysis of modern biological and biomedical data sets, such as microarrays, next generation sequencing data, genome annotations, networks, etc. The R software is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction into the R environment to prepare users with the knowledge required for the subsequent events of this R workshop series. The following topics will be covered by this R introduction: (1) command syntax, (2) basic functions, (3) data import/export, (4) data/object types, (5) graphical display, (6) usage of R packages/libraries (e.g. Bioconductor) and (7) using R for basic data mining operations.
Maximum number of participants: 40
Mar 6, 2010
2:00-5:00pm
Genomics Lecture Hall Thomas Girke UCR
Searching and Clustering Drug-like Compounds in R
Manual for this workshop
Description: This workshop introduces the ChemmineR package for mining drug-like compound and screening data sets. The R package contains functions for structural similarity searching, clustering of compound libraries with a wide spectrum of algorithms and utilities for managing complex compound data sets. In addition, it offers visualization functions for compound clusters and chemical structures. The package is well integrated with the online ChemMine database and allows bidirectional communications between the two services. The integration of chemoinformatic tools with the R programming environment has many advantages, such as easy access to a wide spectrum of statistical methods, machine learning algorithms and graphic utilities. Knowledge of the R software, as introduced in the "Introduction into R" course, will be required for attending this workshop.
Maximum number of participants: 40
Mar 6, 2010
10:00-12:00pm Genomics Lecture Hall Thomas Girke
UCR
Clustering and Data Mining in R
Manual for this workshop
Description: R contains a comprehensive set of functions and libraries for clustering large multidimensional data sets. This course will provide an overview on the usage of the most common clustering techniques in R, such as hierarchical clustering with bootstrap, K-means, PAM, fuzzy clustering, QT, SOM, principal component analysis, multidimensional scaling, biclustering, etc. Knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be required for attending this workshop.
Maximum number of participants: 40
Mar 5, 2010 2:00-5:00pm HMNSS1500 (Humanities) Thomas Girke
UCR
Microarray Analysis with R & Bioconductor
Manual for this workshop
Description: The statistics software R and the associated BioConductor project have become the "gold standard" for the analysis of dual color microarrays and Affymetrix chips. The environment integrates the most advanced analysis tools that are currently available for profiling data. All software components are freely available for all operating systems. This workshop will cover the following topics: (1) data import/export, (2) background correction and normalization procedures for Affymetrix and cDNA arrays, (3) array quality inspection, (4) identification of differentially expressed genes, (5) visualization of genomic information, (6) overview on clustering methods and (7) Gene Ontology (GO) analysis. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be expected in this workshop.
Maximum number of participants: 40
Mar 5, 2010 9:00-12:00pm Genomics Lecture Hall
Thomas Girke
UCR
Introduction to R
Manual for this workshop
Description: R (http://www.r-project.org) is a versatile data analysis environment that has a broad application spectrum in all science areas. The associated Bioconductor project provides access to hundreds of additional R packages for the analysis of modern biological and biomedical data sets, such as microarrays, next generation sequencing data, genome annotations, networks, etc. The R software is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction into the R environment to prepare users with the knowledge required for the subsequent events of this R workshop series. The following topics will be covered by this R introduction: (1) command syntax, (2) basic functions, (3) data import/export, (4) data/object types, (5) graphical display, (6) usage of R packages/libraries (e.g. Bioconductor) and (7) using R for basic data mining operations.
Maximum number of participants: 40
Feb 26, 2010 2:00-6:00pm
Genomics Lecture Hall Tyler Backman, Rebecca Sun & Thomas Girke UCR
Analysis of HT Sequencing Data with R and Bioconductor
Manual for this workshop
Description: R and Bioconductor provide extensive utilities for analyzing high-throughput sequencing data from next generation technologies, such as Illumina (Solexa). This workshop will cover the following topics: (1) basic sequence and string handling; (2) quality assessment utilities; (3) quality filtering; (4) adaptor trimming; (5) sequence alignments; (6) interacting with external alignment programs from R, e.g. SOAP, Maq, Bowtie; (7) read density and SNP analysis; and (8) visualization of genome-scale mapping data. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be expected in this workshop. Due to the computational demands of analyzing next generation sequencing data with reasonable speed, users will work during this workshop on a Linux cluster.
Maximum number of participants: 40
Feb 19, 2010
4:00-5:00pm
Genomics Lecture Hall Alex Levchuk, Tyler Backman, Thomas Girke UCR
Linux Part II: Using IIGB's Linux Cluster
Manual for this workshop
Description: This seminar-style presentation will provide an introduction into the usage of the different load balancing and parallel computing tools available on IIGB's Linux cluster. A discussion will follow to determine the need for future hardware and software upgrades. PIs and users from UCR are invited to attend this event.
Maximum number of participants: 40
Feb 19, 2010
1:00-4:00pm Genomics Lecture Hall
Alex Levchuk, Tyler Backman, Thomas Girke
UCR
Linux Part I: Linux Essentials
Manual for this workshop
Description: The majority of freely available bioinformatics software is designed for Unix/Linux-based operating systems. Basic knowledge about its usage provides free access to the most powerful and up-to-date applications in the field. The workshop will teach beginners the basic command-line syntax for running applications on large data sets on our Linux servers and clusters from a local Windows, Mac or Linux computer. The following topics will be covered: (1) overview of the Linux operating system, (2) file system organization, (3) getting around, (4) basic Shell commands and scripts, (5) available software and (6) running software like Bowtie, SOAP, BLAST, HMMER, PHYLIP, EMBOSS, etc.
Maximum number of participants: 40
Jan 30, 2010 10:00-3:00pm
Genomics Lecture Hall Thomas Girke
UCR
Programming in R
Manual for this workshop
Description: One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any type of data. This workshop provides an overview of the basic knowledge for writing beginner-level programs in R. The following topics will be introduced: (1) conditional executions, (2) loops, (3) writing functions, (4) techniques for improving speed/memory performance, (5) calling external software, (6) running and debugging R programs. Knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be required for attending this workshop.
Maximum number of participants: 40
Jan 29, 2010 2:00-6:00pm Genomics Lecture Hall
Thomas Girke
UCR
Introduction to R
Manual for this workshop
Description: R (http://www.r-project.org) is a versatile data analysis environment that has a broad application spectrum in all science areas. The associated Bioconductor project provides access to hundreds of additional R packages for the analysis of modern biological and biomedical data sets, such as microarrays, next generation sequencing data, genome annotations, networks, etc. The R software is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction into the R environment to prepare users with the knowledge required for the subsequent events of this R workshop series. The following topics will be covered by this R introduction: (1) command syntax, (2) basic functions, (3) data import/export, (4) data/object types, (5) graphical display, (6) usage of R packages/libraries (e.g. Bioconductor) and (7) using R for basic data mining operations.
Maximum number of participants: 40
July 23, 2009 2:30-7:00pm
1104 Batchelor Hall Tyler Backman & Thomas Girke UCR
Analysis of High-Throughput Sequencing Data with R and Bioconductor
Manual for this workshop
Description: R and Bioconductor provide extensive utilities for analyzing high-throughput sequencing data from next generation technologies, such as Illumina (Solexa). This workshop will cover the following topics: (1) basic sequence and string handling; (2) quality assessment utilities; (3) quality filtering; (4) adaptor trimming; (5) sequence alignments; (6) interacting with external alignment programs from R, e.g. SOAP, Maq, Bowtie; (7) read density and SNP analysis; and (8) visualization of genome-scale mapping data. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be expected in this workshop. Due to the computational demands of analyzing next generation sequencing data with reasonable speed, users will work during this workshop on a Linux cluster.
Maximum number of participants: 20
June 25, 2009  2:30-7:00pm 1104 Batchelor Hall
Thomas Girke
UCR
Clustering and Data Mining in R
Manual for this workshop
Description: R contains a comprehensive set of functions and libraries for clustering large multidimensional data sets. This course will provide an overview on the usage of the most common clustering techniques in R, such as hierarchical clustering with bootstrap, K-means, PAM, fuzzy clustering, QT, SOM, principal component analysis, multidimensional scaling, biclustering, etc. Knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be required for attending this workshop.
Maximum number of participants: 20
 June 5, 2009  2:30-7:00pm 1104 Batchelor Hall  Thomas Girke
UCR
Programming in R
Manual for this workshop
Description: One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any type of data. This workshop provides an overview of the basic knowledge for writing beginner-level programs in R. The following topics will be introduced: (1) conditional executions, (2) loops, (3) writing functions, (4) techniques for improving speed/memory performance, (5) calling external software, (6) running and debugging R programs. Knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be required for attending this workshop.
Maximum number of participants: 20
April 24, 2009 2:30-7:00pm 1104 Batchelor Hall Thomas Girke UCR Microarray Analysis with R & BioConductor
Manual for this workshop
Description: The statistics software R and the associated BioConductor project have become the "gold standard" for the analysis of dual color microarrays and Affymetrix chips. The environment integrates the most advanced analysis tools that are currently available for profiling data. All software components are freely available for all operating systems. This workshop will cover the following topics: (1) data import/export, (2) background correction and normalization procedures for Affymetrix and cDNA arrays, (3) array quality inspection, (4) identification of differentially expressed genes, (5) visualization of genomic information, (6) overview on clustering methods and (7) Gene Ontology (GO) analysis. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be expected in this workshop.
Maximum number of participants: 20
March 26, 2009 2:30-7:00pm 1104 Batchelor Hall Thomas Girke UCR Analysis of High-Throughput Sequencing Data with R and Bioconductor
Manual for this workshop
Description: R and Bioconductor provide extensive utilities for analyzing high-throughput sequencing data from next generation technologies, such as Illumina (Solexa). This workshop will cover the following topics: (1) basic sequence and string handling; (2) quality assessment utilities; (3) quality filtering; (4) adaptor trimming; (5) sequence alignments; (6) interacting with external alignment programs from R, e.g. SOAP, Maq, Bowtie; (7) read density and SNP analysis; and (8) visualization of genome-scale mapping data. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be expected in this workshop. Due to the computational demands of analyzing next generation sequencing data with reasonable speed, users will work during this workshop on a Linux cluster.
Maximum number of participants: 20
Feb 24, 2009 2:30-7:00pm 1104 Batchelor Hall Thomas Girke UCR Introduction to R
Manual for this workshop
Description: R (http://www.r-project.org) is a versatile data analysis environment that has a broad application spectrum in all science areas. The associated Bioconductor project provides access to hundreds of additional R packages for the analysis of modern biological and biomedical data sets, such as microarrays, next generation sequencing data, genome annotations, networks, etc. The R software is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction into the R environment to prepare users with the knowledge required for the subsequent events of this R workshop series. The following topics will be covered by this R introduction: (1) command syntax, (2) basic functions, (3) data import/export, (4) data/object types, (5) graphical display, (6) usage of R packages/libraries (e.g. Bioconductor) and (7) using R for basic data mining operations.
Maximum number of participants: 20
Dec 18, 2008 2:30-7:00pm 1104 Batchelor Hall Thomas Girke UCR Searching and Clustering Drug-like Compounds in R
Manual for this workshop
Description: This workshop introduces the ChemmineR package for mining drug-like compound and screening data sets. The R package contains functions for structural similarity searching, clustering of compound libraries with a wide spectrum of algorithms and utilities for managing complex compound data sets. In addition, it offers visualization functions for compound clusters and chemical structures. The package is well integrated with the online ChemMine database and allows bidirectional communications between the two services. The integration of chemoinformatic tools with the R programming environment has many advantages, such as easy access to a wide spectrum of statistical methods, machine learning algorithms and graphic utilities. Knowledge of the R software, as introduced in the "Introduction into R" course, will be required for attending this workshop.
Maximum number of participants: 20
Nov 25, 2008 2:30-7:00pm Commons Rm 379 (Map) Thomas Girke UCR Clustering and Data Mining in R
Manual for this workshop
Description: R contains a comprehensive set of functions and libraries for clustering large multidimensional data sets. This course will provide an overview on the usage of the most common clustering techniques in R, such as hierarchical clustering with bootstrap, K-means, PAM, fuzzy clustering, QT, SOM, principal component analysis, multidimensional scaling, biclustering, etc. Knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be required for attending this workshop.
Maximum number of participants: 20
Sept 25, 2008 2:30-7:00pm 1104 Batchelor Hall Thomas Girke UCR Programming in R
Manual for this workshop
Description: One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any type of data. This workshop provides an overview of the basic knowledge for writing beginner-level programs in R. The following topics will be introduced: (1) basic syntax, (2) executing R programs, (3) calling external software, (4) regular expressions, (5) writing functions, (6) control structures (loops) and (7) running R programs on Linux clusters. Knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be required for attending this workshop.
Maximum number of participants: 20
Aug 28, 2008 2:30-7:00pm 1104 Batchelor Hall Thomas Girke UCR Microarray Analysis with R & BioConductor
Manual for this workshop
Description: The statistics software R and the associated BioConductor project have become the "gold standard" for the analysis of dual color microarrays and Affymetrix chips. The environment integrates the most advanced analysis tools that are currently available for profiling data. All software components are freely available for all operating systems. This workshop will cover the following topics: (1) data import/export, (2) background correction and normalization procedures for Affymetrix and cDNA arrays, (3) array quality inspection, (4) identification of differentially expressed genes, (5) visualization of genomic information, (6) overview on clustering methods and (7) Gene Ontology (GO) analysis. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be expected in this workshop.
Maximum number of participants: 20
July 30, 2008 2:30-7:00pm 1104 Batchelor Hall Thomas Girke UCR Introduction to R
Manual for this workshop
Description: The open source software R (http://www.r-project.org) has revolutionized the statistical data analysis for most bioscience and chemistry disciplines. The required time to learn the R software is well invested, since the R environment covers an unmatched spectrum of statistical tools including an efficient programming language for automating time-consuming analysis routines. The fully integrated BioConductor project contains many additional R packages, in particular for the analysis of functional genomics and microarray data. Due to their popularity, R and BioConductor are continuously updated and extended with the latest analysis tools that are available in the different research fields. The R environment is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction into the R environment covering the following topics: (1) command syntax, (2) basic functions, (3) data import/export, (4) data types, (5) using R for data mining, (6) graphical display and (7) usage of R packages and libraries (e.g. BioConductor).
Maximum number of participants: 20
Feb 28, 2008 2:30-7:00pm 1104 Batchelor Hall Thomas Girke UCR Searching and Clustering Drug-like Compounds in R
Manual for this workshop
Description: This workshop introduces the ChemmineR package for mining drug-like compound and screening data sets. The R package contains functions for structural similarity searching, clustering of compound libraries with a wide spectrum of algorithms and utilities for managing complex compound data sets. In addition, it offers visualization functions for compound clusters and chemical structures. The package is well integrated with the online ChemMine database and allows bidirectional communications between the two services. The integration of chemoinformatic tools with the R programming environment has many advantages, such as easy access to a wide spectrum of statistical methods, machine learning algorithms and graphic utilities. Knowledge of the R software, as introduced in the "Introduction into R" course, will be required for attending this workshop.
Maximum number of participants: 20
Jan 31, 2008 2:30-7:00pm 1104 Batchelor Hall Thomas Girke UCR Clustering and Data Mining in R
Manual for this workshop
Description: R contains a comprehensive set of functions and libraries for clustering large multidimensional data sets. This course will provide an overview on the usage of the most common clustering techniques in R, such as hierarchical clustering, bootstrap, K-means, PAM, QT, SOM, principal component analysis, multidimensional scaling, etc. Knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be required for attending this workshop.
Maximum number of participants: 20
Nov 29, 2007 2:30-7:00pm 1104 Batchelor Hall Thomas Girke UCR Programming in R
Manual for this workshop
Description: One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any type of numeric data. This workshop provides an overview of the basic knowledge for writing beginner-level programs in R. The following topics will be introduced: (1) basic syntax, (2) executing R programs, (3) calling external software, (4) regular expressions, (5) writing functions, (6) control structures (loops) and (7) running R programs on our Linux cluster. Knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be required for attending this workshop.
Maximum number of participants: 20
Oct 25, 2007 2:30-7:00pm 1104 Batchelor Hall Thomas Girke UCR Microarray Analysis with R & BioConductor
Manual for this workshop
Description: The statistics software R and the associated BioConductor project have become the "gold standard" for the analysis of dual color microarrays and Affymetrix chips. The environment integrates the most advanced analysis tools that are currently available for profiling data. All software components are freely available for all operating systems. This workshop will cover the following topics: (1) data import/export, (2) background correction and normalization procedures for Affymetrix and cDNA arrays, (3) array quality inspection, (4) identification of differentially expressed genes, (5) visualization of genomic information, (6) overview on clustering methods and (7) Gene Ontology (GO) analysis. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be expected in this workshop.
Maximum number of participants: 20
Sept 27, 2007 2:00-6:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Introduction to R
Manual for this workshop
Description: The open source software R (http://www.r-project.org) has revolutionized the statistical data analysis for most bioscience and chemistry disciplines. The required time to learn the R software is well invested, since the R environment covers an unmatched spectrum of statistical tools including an efficient programming language for automating time-consuming analysis routines. The fully integrated BioConductor project contains many additional R packages, in particular for the analysis of functional genomics and microarray data. Due to their popularity, R and BioConductor are continuously updated and extended with the latest analysis tools that are available in the different research fields. The R environment is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction into the R environment covering the following topics: (1) command syntax, (2) basic functions, (3) data import/export, (4) data types, (5) using R for data mining, (6) graphical display and (7) usage of R packages and libraries (e.g. BioConductor).
Maximum number of participants: 20
Sept 27, 2007 9:00-1:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Introduction to R
Manual for this workshop
Description: The open source software R (http://www.r-project.org) has revolutionized the statistical data analysis for most bioscience and chemistry disciplines. The required time to learn the R software is well invested, since the R environment covers an unmatched spectrum of statistical tools including an efficient programming language for automating time-consuming analysis routines. The fully integrated BioConductor project contains many additional R packages, in particular for the analysis of functional genomics and microarray data. Due to their popularity, R and BioConductor are continuously updated and extended with the latest analysis tools that are available in the different research fields. The R environment is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction into the R environment covering the following topics: (1) command syntax, (2) basic functions, (3) data import/export, (4) data types, (5) using R for data mining, (6) graphical display and (7) usage of R packages and libraries (e.g. BioConductor).
Maximum number of participants: 20
Apr 25, 2007 2:00-7:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Clustering and Data Mining in R
Manual for this workshop
Description: R contains a comprehensive set of functions and libraries for clustering large multidimensional data sets. This course will provide an overview on the usage of the most common clustering techniques in R, such as hierarchical clustering, bootstrap, K-means, PAM, QT, SOM, principal component analysis, multidimensional scaling, etc. Knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be required for attending this workshop.
Maximum number of participants: 12
Jan 25, 2007 2:00-7:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Programming in R
Manual for this workshop
Description: One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any type of numeric data. This workshop provides an overview of the basic knowledge for writing beginner-level programs in R. The following topics will be introduced: (1) basic syntax, (2) executing R programs, (3) calling external software, (4) regular expressions, (5) writing functions, (6) control structures (loops) and (7) running R programs on our Linux cluster. Knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be required for attending this workshop.
Maximum number of participants: 12
Nov 16, 2006 2:00-7:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Microarray Analysis with R & BioConductor
Manual for this workshop
Description: The statistics software R and the associated BioConductor project have become the "gold standard" for the analysis of dual color microarrays and Affymetrix chips. The environment integrates the most advanced analysis tools that are currently available for profiling data. All software components are freely available for all operating systems. This workshop will cover the following topics: (1) data import/export, (2) background correction and normalization procedures for Affymetrix and cDNA arrays, (3) array quality inspection, (4) identification of differentially expressed genes, (5) visualization of genomic information, (6) overview on clustering methods and (7) Gene Ontology (GO) analysis. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be expected in this workshop.
Maximum number of participants: 12
Oct 26, 2006 2:00-7:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Introduction to R
Manual for this workshop
Description: The open source software R (http://www.r-project.org) has revolutionized the statistical data analysis for most bioscience and chemistry disciplines. The required time to learn the R software is well invested, since the R environment covers an unmatched spectrum of statistical tools including an efficient programming language for automating time-consuming analysis routines. The fully integrated BioConductor project contains many additional R packages, in particular for the analysis of functional genomics and microarray data. Due to their popularity, R and BioConductor are continuously updated and extended with the latest analysis tools that are available in the different research fields. The R environment is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction into the R environment covering the following topics: (1) command syntax, (2) basic functions, (3) data import/export, (4) data types, (5) using R for data mining, (6) graphical display and (7) usage of R packages and libraries (e.g. BioConductor).
Maximum number of participants: 12
July 27, 2006 2:00-7:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Programming in R
Manual for this workshop
Description: One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any type of numeric data. This workshop provides an overview of the basic knowledge for writing beginner-level programs in R. The following topics will be introduced: (1) basic syntax, (2) executing R programs, (3) calling external software, (4) regular expressions, (5) writing functions, (6) control structures (loops) and (7) running R programs on our Linux cluster. Knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be required for attending this workshop.
Maximum number of participants: 12
June 8, 2006 2:00-7:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Microarray Analysis with R & BioConductor
Manual for this workshop
Description: The statistics software R and the associated BioConductor project have become the "gold standard" for the analysis of dual color microarrays and Affymetrix chips. The environment integrates the most advanced analysis tools that are currently available for profiling data. All software components are freely available for all operating systems. This workshop will cover the following topics: (1) data import/export, (2) background correction and normalization procedures for Affymetrix and cDNA arrays, (3) array quality inspection, (4) identification of differentially expressed genes, (5) visualization of genomic information, (6) overview on clustering methods and (7) Gene Ontology (GO) analysis. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction into R", will be expected in this workshop.
Maximum number of participants: 12
May 25, 2006 2:00-7:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Introduction to R
Manual for this workshop
Description: The open source software R (http://www.r-project.org) has revolutionized the statistical data analysis for most bioscience and chemistry disciplines. The required time to learn the R software is well invested, since the R environment covers an unmatched spectrum of statistical tools including an efficient programming language for automating time-consuming analysis routines. The fully integrated BioConductor project contains many additional R packages, in particular for the analysis of functional genomics and microarray data. Due to their popularity, R and BioConductor are continuously updated and extended with the latest analysis tools that are available in the different research fields. The R environment is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction into the R environment covering the following topics: (1) command syntax, (2) basic functions, (3) data import/export, (4) data types, (5) using R for data mining, (6) graphical display and (7) usage of R packages and libraries (e.g. BioConductor).
Maximum number of participants: 12
April 18, 2006 8:00-2:00pm Loma Linda University Bioinformatic Specialists NCBI NCBI Mini-Course
Flyer for this workshop
Description: (A) Making Sense of DNA & Protein Sequence: Participants will find a gene within a eukaryotic DNA sequence, predict the function of the implied protein product, and find a 3D modeling template for this protein sequence using NCBI resources. (B) BLAST QuickStart: A practical introduction to the BLAST family of sequence-similarity search programs. Participants will perform simple and specialized searches and learn creative uses of BLAST programs.
Organizer: Aileen Gonzales (LLU)
Maximum number of participants: 12
Mar 9, 2006 3:00-7:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Programming in R
Manual for this workshop
Description: One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any type of numeric data. This workshop provides an overview of the basic knowledge for writing beginner-level programs in R. The following topics will be introduced: (1) basic syntax, (2) executing R programs, (3) calling external software, (4) regular expressions, (5) writing functions, (6) control structures (loops) and (7) running R programs on our Linux cluster. Knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be required for attending this workshop.
Maximum number of participants: 12
Jan 26, 2006 3:00-7:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Microarray Analysis with R & BioConductor
Manual for this workshop
Description: The statistics software R and the associated BioConductor project have become the "gold standard" for the analysis of dual-color microarrays and Affymetrix chips. The environment integrates the most advanced analysis tools that are currently available for profiling data. All software components are free and available for all operating systems. This workshop will cover the following topics: (1) data import/export, (2) background correction and normalization procedures for Affymetrix and cDNA arrays, (3) array quality inspection, (4) identification of differentially expressed genes, (5) visualization of genomic information, (6) overview of clustering methods and (7) Gene Ontology (GO) analysis. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be expected in this workshop.
Maximum number of participants: 12
Dec 22, 2005 3:00-7:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Introduction to R
Manual for this workshop
Description: The open source software R (http://www.r-project.org) has revolutionized statistical data analysis in most bioscience and chemistry disciplines. The time required to learn the R software is well invested, since the R environment covers an unmatched spectrum of statistical tools including an efficient programming language for automating time-consuming analysis routines. The fully integrated BioConductor project contains many additional R packages, in particular for the analysis of functional genomics and microarray data. Due to their popularity, R and BioConductor are continuously updated and extended with the latest analysis tools that are available in the different research fields. The R environment is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction to the R environment covering the following topics: (1) command syntax, (2) basic functions, (3) data import/export, (4) data types, (5) using R for data mining, (6) graphical display and (7) usage of R packages and libraries (e.g. BioConductor).
Maximum number of participants: 12
Nov 22, 2005 3:00-7:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Programming in R
Manual for this workshop
Description: One of the outstanding strengths of the R language is the ease of programming extensions to automate the analysis and mining of almost any type of numeric data. This workshop provides an overview of the basic knowledge for writing beginner-level programs in R. The following topics will be introduced: (1) basic syntax, (2) executing R programs, (3) calling external software, (4) regular expressions, (5) writing functions, (6) control structures (loops) and (7) running R programs on our Linux cluster. Knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be required for attending this workshop.
Maximum number of participants: 12
Oct 27, 2005 3:00-7:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Microarray Analysis with R & BioConductor
Manual for this workshop
Description: The statistics software R and the associated BioConductor project have become the "gold standard" for the analysis of dual-color microarrays and Affymetrix chips. The environment integrates the most advanced analysis tools that are currently available for profiling data. All software components are free and available for all operating systems. This workshop will cover the following topics: (1) data import/export, (2) background correction and normalization procedures for Affymetrix and cDNA arrays, (3) array quality inspection, (4) identification of differentially expressed genes, (5) visualization of genomic information, (6) overview of clustering methods and (7) Gene Ontology (GO) analysis. Basic knowledge of the R software, as introduced in the previous tutorial "Introduction to R", will be expected in this workshop.
Maximum number of participants: 12
Sept 29, 2005 3:00-6:30pm 1007 Noel T. Keen Hall Thomas Girke UCR Introduction to R
Manual for this workshop
Description: The open source software R (http://www.r-project.org) has revolutionized statistical data analysis in most bioscience and chemistry disciplines. The time required to learn the R software is well invested, since the R environment covers an unmatched spectrum of statistical tools including an efficient programming language for automating time-consuming analysis routines. The fully integrated BioConductor project contains many additional R packages, in particular for the analysis of functional genomics and microarray data. Due to their popularity, R and BioConductor are continuously updated and extended with the latest analysis tools that are available in the different research fields. The R environment is completely free and runs on all common operating systems. This workshop provides an elementary-level introduction to the R environment covering the following topics: (1) command syntax, (2) basic functions, (3) data import/export, (4) data types, (5) using R for data mining, (6) graphical display and (7) usage of R packages and libraries (e.g. BioConductor).
Maximum number of participants: 12
Jul 7, 2005 3:00-6:30pm 1007 Noel T. Keen Hall Thomas Girke UCR Introduction into EMBOSS: A Free Open Source Sequence Analysis Package
Manual for this workshop
Description: EMBOSS is the only free and comprehensive sequence analysis package. It contains over 150 very useful command-line tools for analyzing DNA and protein sequences including pattern searching, phylogenetic analysis, data management, feature predictions, proteomics and more. A detailed description of all its applications can be found on this page. The workshop will provide an introduction to the functionality and usage of the different EMBOSS modules. Knowledge of the basic UNIX commands, as introduced in our 'LINUX Essentials' course, is required for attending this workshop.
Maximum number of participants: 12
Jun 23, 2005 3:00-6:30pm 1007 Noel T. Keen Hall Thomas Girke UCR LINUX Essentials
Manual for this workshop
Description: The majority of freely available bioinformatics software is designed for UNIX/LINUX-based operating systems. Basic knowledge of how to use these systems provides free access to the most powerful and up-to-date applications in this field. The workshop will teach beginners the basic command-line syntax for running applications on large data sets on our LINUX servers and clusters from a local PC, Mac or LINUX computer. The following topics will be covered: (1) the power of UNIX/LINUX, (2) file system organization, (3) getting around, (4) the Shell, (5) available software, (6) how to run software like BLAST, HMMER, PHYLIP, EMBOSS, etc.
Maximum number of participants: 12
May 31, 2005 3:00-6:30pm 1007 Noel T. Keen Hall Thomas Girke UCR Fast and User Friendly Chip Analysis with R and BioConductor
Manual for this workshop
Description: The statistics software R and the associated BioConductor project simplify and standardize the analysis of dual-color microarrays and Affymetrix chips by integrating the different analysis levels into one environment. The software is free and available for all operating systems. This workshop will cover the following topics: (1) a brief introduction to R, (2) background correction and normalization procedures for Affymetrix and cDNA arrays, (3) identification of differentially expressed genes, (4) visualization of genomic information, (5) hierarchical clustering and (6) Gene Ontology (GO) analysis.
Maximum number of participants: 12
Apr 28, 2005 1:00-4:30pm 1007 Noel T. Keen Hall Thomas Girke UCR Expression Profiling Analysis with R and Bioconductor
Manual for this workshop
R (http://www.r-project.org) is a complete statistical software package and programming language for data manipulation, calculation and professional graphical display. The fully integrated Bioconductor project covers many additional R packages for statistical data analysis in biosciences, such as tools for the analysis of SNP and transcriptional profiling data derived from SAGE, cDNA microarrays, Affymetrix chips, etc. This workshop will be divided into two sections: the first part will provide a short introduction into R and the second part will focus on the usage of Bioconductor packages for the analysis of Affymetrix chips and dual-color microarrays (RMA, GCRMA, LIMMA, SAM, etc).
Maximum number of participants: 12
Mar 29-30, 2005 1:00-4:00pm Loma Linda University Bioinformatic Specialists NCBI NCBI Field Guide Workshop
Schedule: Download Flyer
Directions to lecture and lab rooms
Maximum number of participants: 12
Mar 17, 2005 2:00-5:30pm 1007 Noel T. Keen Hall Thomas Girke UCR Expression Profiling Analysis with R and Bioconductor
Manual for this workshop
R (http://www.r-project.org) is a complete statistical software package and programming language for data manipulation, calculation and professional graphical display. The fully integrated Bioconductor project covers many additional R packages for statistical data analysis in biosciences, such as tools for the analysis of SNP and transcriptional profiling data derived from SAGE, cDNA microarrays, Affymetrix chips, etc. This workshop will be divided into two sections: the first part will provide a short introduction into R and the second part will focus on the usage of Bioconductor packages for the analysis of Affymetrix chips and dual-color microarrays (RMA, GCRMA, LIMMA, SAM, etc).
Maximum number of participants: 12
Feb 22-23, 2005 9:00-5:00pm City of Hope Bioinformatic Specialists NCBI NCBI Workshops
First day: "A Field Guide to GenBank & NCBI Molecular Biology Resources"
Second day: "Exploring 3D Molecular Structures Using NCBI Tools & NCBI QuickScripts"
Sign up fee per computer lab session: $25
Maximum number of UCR participants: 15
Oct 29, 2004 2:00-5:30pm 1007 Noel T. Keen Hall Thomas Girke UCR Introduction into R and Bioconductor
Manual for this workshop
R (http://www.r-project.org) is a complete statistical software package and programming language for data manipulation, calculation and professional graphical display. The fully integrated Bioconductor project covers many additional R packages for statistical data analysis in biosciences, such as tools for the analysis of SNP and transcriptional profiling data derived from SAGE, cDNA microarrays, Affymetrix chips, etc. This workshop will be divided into two sections: the first part will provide an introduction to the basic R commands under Linux and the second part will focus on the usage of Bioconductor packages for Affymetrix chip analysis (RMA, GCRMA, QC display, SAM, etc). Knowledge of the basic UNIX commands, as introduced in our 'LINUX Essentials' course, is required for attending this workshop.
Maximum number of participants: 12
July 22, 2004 2:00-5:30pm 1007 Noel T. Keen Hall Thomas Girke & Josh Lauricha UCR Large-Scale Computing on our Bioinfo LINUX Cluster
Manual for this workshop
Our facility currently maintains a 64-CPU LINUX cluster to significantly reduce the running time of computationally expensive bioinformatics applications. For instance, HMM searches of 100,000 protein sequences against the Pfam database can be finished on this cluster within 3-4 days, as opposed to an impractical ~200 days on a single-processor machine! This seminar will provide an introduction to the usage of the different load balancing and parallel computing systems that are available on our cluster. A discussion will follow to determine the need for future hardware growth and software requirements in this area.
Maximum number of participants: 12
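The running times quoted in the cluster description above can be sanity-checked with a small calculation. The Python sketch below is only an illustration and is not part of the workshop material: it assumes the search splits evenly across all CPUs with no scheduling or I/O overhead (ideal linear speedup), and the function name ideal_parallel_days is hypothetical.

```python
# Back-of-the-envelope check of the cluster timing quoted in the description.
# Assumption (not from the workshop): ideal linear speedup, i.e. the work is
# divided evenly over all CPUs and there is no scheduling or I/O overhead.

def ideal_parallel_days(serial_days: float, n_cpus: int) -> float:
    """Return the running time in days under ideal linear speedup."""
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    return serial_days / n_cpus

SERIAL_DAYS = 200  # ~200 days on a single processor (figure from the description)
N_CPUS = 64        # cluster size (figure from the description)

estimate = ideal_parallel_days(SERIAL_DAYS, N_CPUS)
print(f"Ideal running time on {N_CPUS} CPUs: {estimate:.1f} days")
```

Under these assumptions the estimate is about 3.1 days, which is in line with the quoted 3-4 days once some real-world overhead is allowed for.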
July 8, 2004 2:00-5:30pm 1007 Noel T. Keen Hall Thomas Girke UCR Introduction into EMBOSS: A Free Open Source Sequence Analysis Package
Manual for this workshop
EMBOSS is the only free and comprehensive sequence analysis package. It contains over 150 very useful command-line tools for analyzing DNA and protein sequences including pattern searching, phylogenetic analysis, data management, feature predictions, proteomics and more. A detailed description of all its applications can be found on this page. The workshop will provide an introduction to the functionality and usage of the different EMBOSS modules. Knowledge of the basic UNIX commands, as introduced in our 'LINUX Essentials' course, is required for attending this workshop.
Maximum number of participants: 12
June 24, 2004 2:00-5:30pm 1007 Noel T. Keen Hall Thomas Girke UCR LINUX Essentials
Manual for this workshop
The majority of freely available bioinformatics software is designed for UNIX/LINUX-based operating systems. Basic knowledge of how to use these systems provides free access to the most powerful and up-to-date applications in this field. The workshop will teach beginners the basic command-line syntax for running applications on large data sets on our LINUX servers and clusters from a local PC, Mac or LINUX computer. The following topics will be covered:
  • The power of UNIX/LINUX
  • File system organization
  • Getting around
  • The Shell
  • Available software
  • How to run software like BLAST, HMMER, PHYLIP, EMBOSS, etc.
Maximum number of participants: 12
April 15, 2004 2:00-3:30pm 1007 Noel T. Keen Hall Jennifer Le Page ChemBridge ChemBridge & CRL: Integrative Chemistry Solutions for Drug Discovery
Topics of presentation:
  • ChemBridge and CRL Corporate Overview
  • Discovery Chemistry Products
  • GPCR and Kinase Targeted Libraries
  • Discovery Chemistry Services
  • Models for Strategic Collaborations
Maximum number of participants: 12
Mar 25, 2004
Apr 5, 2004
2:00-5:30pm 1007 Noel T. Keen Hall Thomas Girke UCR Large Scale Data Management for Biologists with the Database Software MS Access
Manual and exercises for this workshop
In response to high demand, the bioinformatics facility is offering a workshop for biologists interested in learning user-friendly database software for managing complex data sets from DNA array, proteomic, large-scale sequencing and other high-throughput technologies. MS Access provides a simple but efficient database environment for organizing large data sets without knowing any programming languages. It is also a very useful tool to pre-design data structures for future import into more powerful database engines like MySQL, PostgreSQL or Oracle. The software is usually preinstalled on every Windows computer with MS Office. A Mac version is not available yet. This introductory workshop will cover the following topics: data import/export, interoperability with spreadsheet programs like Excel, table relationships, filters, queries, calculations, table joining, duplicate removal, reports and forms. Active Server Pages (ASP) for designing web interfaces will not be covered.
Maximum number of participants: 12
Mar 4, 2004 2:00-4:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Chemical Genetics Seminar
Discussion of future screens and introduction into our compound database.
Jan 7, 2004 1:00-3:00pm 1007 Noel T. Keen Hall Azhar Alavi Silicon Genetics Introduction into GeneSpring
This seminar provides a general introduction into GeneSpring, a data mining package for expression profiling. New features of the latest version 6.1 will be introduced which include new filtering tools and clustering techniques.
Oct 30, 2003
Nov 20, 2003
2:00-5:30pm 1007 Noel T. Keen Hall Thomas Girke UCR GCG Basics (SeqWeb, SeqLab & Command Line)
Manual & Exercises for this workshop
These two workshops cover the basics on how to access GCG from SeqWeb, SeqLab and the command line. Since SeqLab is the most powerful and user-friendly GCG environment, the workshop focuses on this interface and gives an overview on the various sequence analysis tools which are available in GCG. This includes pattern searches, multiple alignments, phylogenetic trees, remote homology detection with HMMER, high-throughput sequence analysis such as BLAST searches in batch mode and how to make personal BLASTable sequence databases. Participants are encouraged to provide their own real-life problems during the exercise section.
Maximum number of participants: 12
Oct 21-22, 2003 9:00-4:00pm City of Hope Bioinformaticians from NCBI NCBI NCBI Workshop
Schedule
Oct 2-3, 2003 9:00-4:00pm City of Hope Bioinformaticians from NCBI NCBI NCBI Minicourses
Schedule
Wed, July 16, 2003 2:00-5:00pm 1007 Noel T. Keen Hall Thomas Girke UCR Phred/Phrap/Consed
Manual for this workshop
April 17 & May 1, 2003 3:00-5:30pm 1007 Noel T. Keen Hall Thomas Girke UCR UNIX/LINUX Essentials for Beginners
Manual for this workshop
Topics: The Power of Unix, File System Organization, Getting Around, The Shell, Running Applications, Text Editors
March 7, 2003 Seminar: 10:30am-12:00pm
Workshop: 1:00-4:00pm
Seminar: Science Library, Rm 240
Workshop: Sproul Hall, Rm 2225
Richard Hughey UC Santa Cruz An Introduction to Hidden Markov Models (seminar)
Since their introduction to the biological sequence analysis community, profile hidden Markov models have become a standard for high-performance sequence search, classification, and alignment. These basic functions can also form core components of protein structure prediction and genome analysis. This tutorial presents an introduction to the process of creating and using profile hidden Markov models, followed by a discussion of the iterative search methods that enable particularly distant remote homology detection. The emphasis will be on gaining a qualitative understanding of the underlying technology, with only a small amount of mathematics.

The SAM HMM Software System (hands-on workshop)
The Sequence Alignment and Modeling Software System (SAM) is the HMM system developed at UCSC in Haussler and Krogh's seminal work. It has been continuously improved since the introduction of profile HMMs, and now forms the core of our protein structure prediction efforts, and is used by many other research sites. In this workshop, we will use the SAM programs to create and examine HMMs, align sequences, and search databases. We will also use the SAM web servers for protein structure prediction. Attendees are encouraged to bring their own sequence or sequences of interest for building and using HMMs (web version).
Dec 4, 2002 9am-4pm Watkins, Rm 2101 Mike Troutman Affymetrix Affymetrix Hands-On Training (limited to 12 participants)
This workshop will cover the Affymetrix image analysis tool MAS 5.0, the data mining software DMT and the database MicroDB.
Nov 8-9, 2002 9am-4pm UCR Extension Lukasz Jaroszewski & Dimitrios Morikis UCSD & UCR Protein Modeling Workshop:
Day 1: "Protein structure, domain databases, sequence and structure analysis"
Day 2: "Homology modeling and rational drug design"
Oct 8-10, 2002 8am-2pm City of Hope Bioinformaticians from NCBI NCBI NCBI Workshop (Summary):
Oct 8th: "A Field Guide to GenBank and NCBI Molecular Biology Resources"
Oct 9th: "Have a BLAST! A Practical Course on the Basic Local Alignment Search Tool (BLAST) from the NCBI"
Oct 10th: "Making Sense of DNA and Protein Sequences"
Oct 2, 2002 10am-12pm Science Library, Rm 240 Kyle O'Connor Affymetrix A Demonstration of the Affymetrix Data Analysis Software
July 24, 2002 9am-3pm City of Hope Accelrys Accelrys Accelrys Biosequence Analysis Workshop
DS Gene and SeqWeb (GCG)
July 22, 2002 10am-4pm Sproul Hall, Rm 2225 InforMax InforMax Vector NTI Training
Vector NTI database structure & organization, molecule reports, maps, features, primer design, back translation, vector design & construction, import, export, BioPlot, AlignX, Contig Express.
May 29, 2002
May 22, 2002
9am - 12pm Sproul Hall, Rm 2225 Thomas Girke Center for Plant Cell Biology, UCR GCG Workshop
Manual for this workshop
Covers the basics on how to access GCG from SeqWeb, SeqLab and the command line. Since SeqLab is the most powerful and user-friendly GCG environment, the workshop will focus on this application and give an overview on the various sequence analysis tools which are available in GCG. This will include high-throughput sequence analysis such as BLAST searches in batch mode and how to make personal BLASTable sequence databases.
April 9, 2002 10am - 12pm Surge Bldg, Rm 284 Michael Gribskov San Diego Supercomputer Center Genomic analysis of plant protein kinases