

Machine Learning for Transcriptomics Data Workshop
Part of the "Modeling Precision Medicine Treatment Selection for Patients Based on Multi-Omics Biomarkers" series.
In collaboration with the Georgetown University Systems Medicine Program. Presented by Dr. Vladimir Galatenko, with support from the Tauber Bioinformatics Research Center and Pine Biotech, Inc.
 

During the workshop, participants will work with gene expression data and learn to process and analyze it in the context of biomedical research. We will use unsupervised methods (PCA, k-means clustering, hierarchical clustering) to detect patterns in the dataset and supervised methods (SVM, swLDA, Random Forest) to train a classification model. The workshop is designed to last between 1.5 and 2 hours.
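To give a feel for the unsupervised part, here is a minimal sketch of k-means clustering on a toy expression matrix, written in plain Python. It is not the workshop's pipeline and does not use the workshop dataset: the data are simulated, the two "subtypes" are hypothetical, and the function names (`make_toy_expression`, `kmeans`) are made up for illustration. The farthest-point initialization is a simplification chosen to keep the example deterministic.

```python
import math
import random


def make_toy_expression(n_per_group=10, n_genes=5, seed=0):
    """Simulate log-scale expression values for two hypothetical subtypes."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label, shift in ((0, 0.0), (1, 4.0)):
        for _ in range(n_per_group):
            # Assumption: only the first two genes differ between subtypes.
            row = [rng.gauss(5.0 + (shift if g < 2 else 0.0), 0.3)
                   for g in range(n_genes)]
            samples.append(row)
            labels.append(label)
    return samples, labels


def kmeans(samples, k=2, n_iter=20):
    """Basic k-means with Euclidean distance (rows = samples, columns = genes)."""
    # Farthest-point initialization: start from the first sample, then
    # repeatedly add the sample farthest from the centroids chosen so far.
    centroids = [list(samples[0])]
    while len(centroids) < k:
        farthest = max(samples,
                       key=lambda s: min(math.dist(s, c) for c in centroids))
        centroids.append(list(farthest))

    assignments = [0] * len(samples)
    for _ in range(n_iter):
        # Assignment step: each sample joins its nearest centroid.
        assignments = [min(range(k), key=lambda c: math.dist(s, centroids[c]))
                       for s in samples]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [s for s, a in zip(samples, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


samples, labels = make_toy_expression()
clusters, _ = kmeans(samples, k=2)
# Cluster numbering is arbitrary, so compare up to a label swap.
matches = sum(c == l for c, l in zip(clusters, labels))
agreement = max(matches, len(labels) - matches) / len(labels)
print(f"cluster/subtype agreement: {agreement:.2f}")
```

On real RNA-seq data, the values would typically be normalized and log-transformed first, and a tested library implementation would be preferable to hand-written code like this.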

 
For this workshop, we will use a “precision medicine” dataset in which cancer subtypes are categorized using multi-omics data (Daemen et al., 2013, “Modeling precision treatment of breast cancer”). The study analyzed over 70 breast cancer cell lines and over 90 therapeutic agents. It included SNP array data (a type of microarray), RNA-seq (which profiles the whole transcriptome, that is, the genes expressed at a given point in time), exome-seq (exome capture, which sequences the protein-coding regions of the genome), and genome-wide methylation (epigenetic) data. This workshop will focus on the RNA-seq data. It is intended for students and faculty who are interested in omics data and bioinformatics.