
RNA-SEQ ANALYSIS WITH ARABIDOPSIS THALIANA – PART 2, MACHINE LEARNING


For information about how to obtain the data needed to run machine learning, please view Part 1. In this video, we go through several machine learning options on the T-BioInfo Platform: Quantile Normalization, Factor Analysis, and PCA. Before normalization, the samples need to be organized by factor, so I'd like to walk you through determining the factors, mapping them out, and organizing the data.

The data we're analyzing is from GSE57953, "Direct roles of SPEECHLESS in the specification of stomatal self-renewing cells [RNA-seq]". Looking at the GEO database entry, we can see the two factors and the levels of each: genotype (Wild Type and iSPCH) and time in beta-estradiol (0 hours, 6 hours, and 8 hours). We can also see that most groups have two replicates. The data we're using is the output of our RNA-Seq pipeline. Using the SRA Run Selector for GSE57953, we can rename the columns from their SRR run names to something more meaningful. I used the following column names and organization, and you can download the file, already renamed and organized by factor, for input into Quantile Normalization.

Col-0h	SPCH-0h-1	SPCH-0h-2	Col-6h-1	Col-6h-2	SPCH-6h-1	SPCH-6h-2	Col-8h-1	Col-8h-2	SPCH-8h-1	SPCH-8h-2
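As a quick check on the layout above, the factor levels and replicate counts can be recovered from the column names themselves. This is only a sketch in plain Python, not part of the T-BioInfo Platform; the helper `parse_sample` is a hypothetical name, and it assumes the `genotype-time-replicate` naming pattern shown above, with `Col-0h` treated as a single replicate because it has no suffix.

```python
from collections import Counter

# Column names as listed above (tab-separated in the input file).
columns = (
    "Col-0h SPCH-0h-1 SPCH-0h-2 Col-6h-1 Col-6h-2 SPCH-6h-1 SPCH-6h-2 "
    "Col-8h-1 Col-8h-2 SPCH-8h-1 SPCH-8h-2"
).split()

def parse_sample(name):
    """Split a name like 'SPCH-6h-2' into (genotype, time, replicate).

    Hypothetical helper for illustration only.
    """
    parts = name.split("-")
    genotype, time = parts[0], parts[1]
    # 'Col-0h' carries no replicate suffix, so treat it as replicate 1.
    replicate = int(parts[2]) if len(parts) > 2 else 1
    return genotype, time, replicate

design = [parse_sample(c) for c in columns]

# Count samples per (time, genotype) group, in order of first appearance.
group_sizes = Counter((time, genotype) for genotype, time, _ in design)
sizes = list(group_sizes.values())

for (time, genotype), n in group_sizes.items():
    print(f"{genotype:5s} {time:3s} replicates={n}")
print(sizes)  # [1, 2, 2, 2, 2, 2]
```

The six group sizes come out in the same left-to-right order as the columns, which is the order the groups are entered in for Factor Analysis.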

When inputting the file into Quantile Normalization, I use a threshold of 12.0. Using the results of Quantile Normalization, I can run Factor Analysis. As we have two factors, one with two levels and one with three levels, we will create six groups with 1, 2, 2, 2, 2, and 2 replicates, respectively. I also tell the system that we have three levels of factor A (0h, 6h, 8h) and two levels of factor B (WT, iSPCH), and that we are using columns 2-12. The resulting Factor Analysis output can then be viewed.

Separately, using the results of Quantile Normalization again, I can run PCA on columns 2-12, which hold my 11 samples. I can take the result file and input it into our PCA viewer, and so can you: just download the PCA result file and upload it to the viewer.
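For readers who want to see what these two steps do outside the platform, here is a minimal NumPy sketch of textbook quantile normalization followed by PCA. It is not the T-BioInfo implementation: the function names are hypothetical, the expression matrix is made-up random data standing in for the 11 sample columns, and the 12.0 threshold is not modeled because its exact meaning on the platform is not described here.

```python
import numpy as np

def quantile_normalize(x):
    """Give every column (sample) the same distribution.

    Each value is replaced by the mean, across samples, of the values
    that share its within-sample rank. Ties are broken by position
    rather than averaged, which is a simplification.
    """
    order = np.argsort(x, axis=0, kind="stable")      # row order that sorts each column
    ranks = np.argsort(order, axis=0, kind="stable")  # rank of each value in its column
    reference = np.sort(x, axis=0).mean(axis=1)       # mean value at each rank
    return reference[ranks]

def pca_scores(x, n_components=2):
    """Project samples (columns of x) onto their top principal components."""
    samples = x.T                                     # rows = samples, columns = genes
    centered = samples - samples.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)                   # fraction of variance per component
    return u[:, :n_components] * s[:n_components], explained[:n_components]

# Made-up data: 500 genes x 11 samples (an assumption, not the GSE57953 values).
rng = np.random.default_rng(0)
expr = rng.gamma(shape=2.0, scale=50.0, size=(500, 11))

normalized = quantile_normalize(expr)
scores, explained = pca_scores(normalized)

print(scores.shape)  # (11, 2)
```

With real data, each row of `scores` is one sample, so plotting the first component against the second shows whether samples cluster by genotype or by time point.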