RNA-Seq Data Analysis – Grosmannia clavigera

What is RNA-seq?

RNA sequencing, also known as RNA-seq, is a method that uses Next Generation Sequencing (NGS) to measure which RNA molecules are present in a sample and in what quantity. RNA is transcribed when genes are expressed, so RNA-seq can help identify the factors that cause genes to be more strongly expressed at certain timepoints or in certain environments. For example, comparing the RNA expression levels of an organism in two different environments can reveal how the organism reacts to each of them. Another approach is to take snapshots of an organism’s gene expression levels at different timepoints.

Extracted RNA samples are loaded into an Illumina sequencer, which reads the RNA sequences and records them as FastQ files. FastQ files are plain ASCII text that store the sequence of each read together with quality information for that read. Some reads contain errors introduced by the Illumina sequencing method, so the first step of the analysis is error correction: reads are aligned, and the errors are identified and corrected. The reads are then mapped onto the genome, and the boundaries of the expressed regions of genes (exons) are detected. A single gene can give rise to several transcript variants called isoforms; these are reconstructed by detecting paths across the links that the reads provide. After the reads are mapped, the expression levels of individual isoforms are measured. In some cases, a differential expression analysis follows, measuring how expression levels change between conditions or over time. Finally, the results can be analyzed and interpreted using data mining techniques such as PCA and clustering.
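To make the FastQ format and its quality information more concrete, here is a minimal sketch in Python. It assumes the common four-lines-per-record layout and the Phred+33 quality encoding used by current Illumina instruments. The function names, the two toy reads, and the quality cutoff of 30 are illustrative assumptions; this is not the error-correction procedure used on T-BioInfo.

```python
import io
from statistics import mean


def parse_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a FastQ stream.

    Assumes each record spans exactly four lines:
    '@id', sequence, '+', quality string.
    """
    while True:
        header = handle.readline().strip()
        if not header:
            return  # end of file
        sequence = handle.readline().strip()
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().strip()
        yield header[1:], sequence, quality


def phred_scores(quality, offset=33):
    """Decode a quality string into per-base Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def mean_quality(quality, offset=33):
    """Average Phred score of one read."""
    return mean(phred_scores(quality, offset))


# Two made-up reads: 'I' encodes Phred 40 (high quality), '!' encodes Phred 0.
example = io.StringIO(
    "@read_1\nACGT\n+\nIIII\n"
    "@read_2\nACGT\n+\n!!II\n"
)

records = list(parse_fastq(example))

# Keep reads whose mean quality reaches a hypothetical cutoff of 30.
kept = [read_id for read_id, _seq, qual in records if mean_quality(qual) >= 30]
print(kept)
```

A simple mean-quality filter like this only discards poor reads; real error correction, as described above, compares overlapping reads to repair individual bases.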

What is Grosmannia clavigera?

Grosmannia clavigera is a fungus carried by mountain pine beetles. When a mountain pine beetle infests a pine tree, it brings the fungus in beneath the bark. The fungus colonizes the stem, eventually causing the tree to die. Grosmannia clavigera has a symbiotic relationship with the beetle and with the larvae that hatch from the eggs it lays underneath the bark: the fungus converts monoterpenes, which are toxic to most organisms, into a carbon source for the larvae that is essential for their survival.

What did the authors of the paper do, and what were their results?

The paper "A specialized ABC efflux transporter GcABC-G1 confers monoterpene resistance to Grosmannia clavigera, a bark beetle-associated fungal pathogen of pine trees" showed that a fungal ABC efflux transporter (one of a group of proteins in the cell membrane that expel toxic substances from the cell) is a major mechanism by which Grosmannia clavigera copes with monoterpenes entering its cells. In a follow-up paper, "Gene Discovery for Enzymes Involved in Limonene Modification or Utilization by the Mountain Pine Beetle-Associated Pathogen Grosmannia clavigera", the same authors describe a second mechanism that allows Grosmannia clavigera to deal with monoterpenes: modifying or degrading them. They also suggested that the initial step in the degradation of limonene, one of the monoterpenes in pine oleoresin, might be carried out by a cytochrome P450. The authors made their data publicly available in the Sequence Read Archive (SRA) database.

What did we do with T-BioInfo to get results?

All RNA-seq analyses follow the same steps, but on T-BioInfo a user can combine existing and newly developed algorithms into flexible pipelines, which allows for greater accuracy in error correction, mapping, and genome annotation. Especially important are the machine learning algorithms on the T-BioInfo platform that help to compress the resulting data. Using the data mining procedures of T-BioInfo, we generated a network of associations between genes that contains strongly intra-linked sub-networks. Gene ontology and pathway annotation showed that the adaptation allowing Grosmannia clavigera to process and thrive in monoterpenes is based on the coordination of four cellular processes: specific stress response, intensive membrane remodeling and lipid biosynthesis, fatty acid catabolism, and active efflux of toxic compounds.
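The idea of a gene association network with strongly intra-linked sub-networks can be illustrated with a small Python sketch. The text does not say how T-BioInfo measures association, so this example makes its own assumptions: two genes are linked when the absolute Pearson correlation of their expression profiles reaches a threshold of 0.9, and sub-networks are taken to be the connected components of the resulting graph. The gene names and expression values are invented for illustration only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def build_network(expression, threshold=0.9):
    """Link two genes when |correlation| >= threshold (an assumed criterion)."""
    edges = {gene: set() for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            edges[g1].add(g2)
            edges[g2].add(g1)
    return edges


def connected_components(edges):
    """Group genes into sub-networks using a depth-first search."""
    seen, components = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(edges[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical expression levels of four genes across four samples.
expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.1, 6.0, 8.2],
    "gene_c": [5.0, 1.0, 4.0, 2.0],
    "gene_d": [4.9, 1.2, 4.1, 2.1],
}

network = build_network(expression)
components = connected_components(network)
print(components)
```

In this toy data set, gene_a and gene_b rise together and gene_c and gene_d share a second pattern, so the graph splits into two sub-networks. Annotating the genes in each sub-network is what would then point to shared cellular processes.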

What is the importance of the results we obtained compared with those of the authors?

Our results are important because our methods identified more genes relevant to the observed response and, by annotating these genes, we identified processes that were not mentioned in the paper. For example, we did not find that cytochrome P450 monooxygenases played a significant role in monoterpene catabolism (catabolism is the metabolic breakdown of complex molecules into simpler ones, often with a release of energy). Our analysis showed higher expression levels of acyltransferases in the presence of monoterpenes. Treatment with monoterpenes activates the oxysterol-binding protein genes CMQ_2241 and CMQ_4678. The F0X7A7 gene shows an association with monoterpenes, and it was demonstrated that mutation of the gene lowers the growth rate of Grosmannia clavigera. We can conclude from this information that G. clavigera responds to monoterpenes with a wide range of enzymes and biochemical pathways. Activation of these pathways helps the fungus keep monoterpene concentrations in its cells safely low and, in addition, gain energy from them.

If you're looking for more information about how we obtained these exciting results, click the link below.

Biological Relevance