This internship offers an in-depth, hands-on introduction to the processing and interpretation of RNA-Seq data using R and the Bioconductor ecosystem. Participants will learn how to handle raw RNA sequencing data, normalize gene expression values, and identify differentially expressed genes (DEGs). The internship emphasizes real-world analysis using statistical pipelines, visualizations, and functional annotation tools.
🎯 Learning Objectives
By the end of this internship, participants will:
- Understand the RNA-Seq workflow from raw data to biological interpretation
- Import, preprocess, and normalize RNA-Seq count data
- Perform differential gene expression analysis using DESeq2 and edgeR
- Visualize expression patterns using heatmaps, volcano plots, and PCA
- Conduct functional enrichment analysis using GO and KEGG databases
- Use Bioconductor packages effectively for reproducible analysis
🧩 Program Structure
Introduction to RNA-Seq and R Environment
- What is RNA-Seq and its biological applications
- Overview of RNA-Seq data formats
- Setting up R and RStudio for bioinformatics
- Introduction to Bioconductor
Importing and Exploring RNA-Seq Data
- Count matrix vs raw FASTQ
- Input formats: GTF, CSV, TXT
- Importing and visualizing count data using readr and tximport
Quality Control & Normalization
- Filtering low-count genes
- Normalization methods: TPM, RPKM, TMM, DESeq2’s size factor
- Tools: edgeR, DESeq2, limma
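To give a feel for what size-factor normalization does, here is a minimal sketch of the median-of-ratios idea. It is written in plain Python only to keep it dependency-free; the internship itself works in R. The function names and the toy count matrix are illustrative assumptions, and the sketch is not DESeq2's actual implementation.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of gene rows, each a list of raw counts (one per sample).
    """
    n_samples = len(counts[0])
    # Keep only genes with a nonzero count in every sample (log(0) is undefined).
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of each gene's geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Ratio of sample j to the per-gene reference, on the log scale.
        log_ratios = [math.log(row[j]) - lgm for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts, factors):
    """Divide each sample's counts by that sample's size factor."""
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Hypothetical toy matrix: 4 genes x 2 samples; sample 2 is sequenced twice as deeply.
toy_counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 4],
]
factors = size_factors(toy_counts)
normalized = normalize(toy_counts, factors)
```

In this toy case the second sample's size factor comes out twice the first, so after normalization the first three genes have equal values in both samples.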
Differential Expression Analysis
- Experimental design and statistical models
- Running DE analysis using DESeq2
- Extracting significant DEGs with adjusted p-values (FDR)
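The adjusted p-values mentioned above are typically Benjamini-Hochberg corrected. The sketch below shows that adjustment in plain Python for illustration only; the gene names, p-values, and the 0.05 cutoff are made-up assumptions, and in the course the adjustment is handled inside the R packages.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five genes.
toy = {"geneA": 0.001, "geneB": 0.008, "geneC": 0.039, "geneD": 0.041, "geneE": 0.6}
padj = dict(zip(toy, benjamini_hochberg(list(toy.values()))))

# Keep genes whose adjusted p-value is below an assumed FDR cutoff of 0.05.
significant = [gene for gene, q in padj.items() if q < 0.05]
```

Note that geneC and geneD have raw p-values below 0.05 but fall just above the cutoff after adjustment, which is why filtering on adjusted rather than raw p-values matters.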
Data Visualization Techniques
- MA plot, volcano plot, heatmap, PCA plot
- Libraries: ggplot2, pheatmap, EnhancedVolcano
- Practice with real dataset (e.g., cancer, COVID-19 RNA-Seq)
Gene Annotation and Enrichment Analysis
- Mapping gene IDs to gene symbols
- Functional enrichment using clusterProfiler, biomaRt, GOstats
- Pathway analysis with KEGG and Reactome
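Functional enrichment of this kind is commonly framed as an over-representation test: given a list of DEGs, is a GO term or pathway hit more often than chance would predict? A minimal sketch using a one-sided hypergeometric test follows, in plain Python for illustration only. The function name and all numbers are hypothetical, and the enrichment packages listed above may differ in details such as how the background gene set is defined.

```python
from math import comb


def enrichment_pvalue(universe, annotated, selected, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    universe:  number of background genes
    annotated: background genes annotated to the term or pathway
    selected:  number of DEGs drawn from the background
    overlap:   DEGs that are annotated to the term or pathway
    """
    total = comb(universe, selected)
    upper = min(annotated, selected)
    # Sum the probability of seeing `overlap` or more annotated genes by chance.
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 20 background genes, 5 in the pathway,
# 4 DEGs of which 3 fall in the pathway.
p = enrichment_pvalue(universe=20, annotated=5, selected=4, overlap=3)
```

When many terms are tested at once, the resulting p-values would themselves need multiple-testing correction before interpretation.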
Case Study & Biological Interpretation
- Case Study: Differential expression in cancer (e.g., BRCA vs Normal)
- Annotate top DEGs and connect to biological function
Automating Workflows & Reproducibility
- Writing R scripts for RNA-Seq pipelines
- R Markdown for report generation
Final Project & Presentation
- Participants analyze a real dataset from GEO or TCGA
- Submit annotated result file, interpretation summary, and visualizations
- Present findings in a short presentation or PDF report
Tools & Packages Used
- R & RStudio
- DESeq2, edgeR, limma – Differential analysis
- tximport, biomaRt, clusterProfiler, ggplot2, pheatmap, EnhancedVolcano
- GEOquery, org.Hs.eg.db, KEGG.db, AnnotationHub
📂 Assignments & Final Project
- Analyze and interpret a real-world RNA-Seq dataset
- Identify DEGs and create publication-ready plots
- Annotate genes and perform enrichment analysis
- Submit a short bioinformatics report
📜 Certification
Participants will receive:
- Certificate of Completion
- Project Experience Letter upon successful submission of the project
👥 Target Audience
- Postgraduate/PhD students in biotechnology, molecular biology, genomics
- Bioinformatics and computational biology learners
- Researchers working with transcriptomics or gene expression profiling