Aim
A Hands-On Course for Genome Data Analysis teaches practical workflows for analyzing genome data, from raw reads to results. Participants learn read quality control (QC), alignment, variant calling, annotation, basic visualization, and reproducible reporting using real datasets.
Program Objectives
- Genomics Basics: reads, reference genomes, file formats (FASTQ, BAM, VCF).
- Quality Control: read QC, trimming, contamination checks (workflow).
- Alignment: mapping concepts and post-alignment processing.
- Variant Calling: SNP/indel calling pipeline (concept + practice).
- Annotation: basic functional interpretation and filtering.
- Comparisons: cohort summaries and variant prioritization (intro).
- Visualization: QC plots and variant summaries.
- Capstone: complete an end-to-end genome analysis report.
Program Structure
Module 1: Genome Data + Tools Setup
- Genome analysis overview: raw reads → variants → interpretation.
- Compute setup: Linux basics, conda, and project folders.
- File formats: FASTQ, FASTA, SAM/BAM, VCF, BED.
- Reproducibility: logs, metadata, and naming conventions.
Module 2: Read Quality Control (QC)
- QC metrics: quality scores, GC content, adapters.
- Trimming and filtering: when and why.
- Post-QC checks and summary reporting.
- Common issues: low quality, overrepresented sequences.
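The QC metrics listed above can be illustrated with a minimal sketch. It is not the course's own tooling: it assumes a plain four-line-per-record FASTQ with Phred+33 quality encoding, and the names `read_fastq`, `mean_phred`, `gc_fraction`, `summarize`, and `EXAMPLE_FASTQ` are hypothetical, chosen only for this illustration.

```python
from io import StringIO

def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:
            return  # end of input (or a blank line) stops the reader
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().strip()
        yield header[1:], seq, qual

def mean_phred(qual, offset=33):
    """Mean Phred score of one read, assuming Phred+33 encoding."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)

def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def summarize(handle):
    """Per-read QC summary: name, mean quality, GC fraction."""
    return [
        {"name": name, "mean_q": mean_phred(qual), "gc": gc_fraction(seq)}
        for name, seq, qual in read_fastq(handle)
    ]

# Made-up two-read example: 'I' encodes Q40 and '!' encodes Q0 in Phred+33.
EXAMPLE_FASTQ = "@read1\nGGCC\n+\nIIII\n@read2\nATGC\n+\n!!!!\n"
```

Calling `summarize(StringIO(EXAMPLE_FASTQ))` returns one dictionary per read. A real QC step would normally rely on a dedicated tool, which also reports adapter content and overrepresented sequences.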
Module 3: Alignment to a Reference
- Alignment concepts: mapping rate, coverage, duplicates.
- SAM/BAM processing: sort, index, mark duplicates.
- Coverage basics and depth summaries.
- QC after alignment.
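The coverage and depth ideas in this module can be sketched in a few lines. The example assumes alignments have already been reduced to 0-based, half-open `(start, end)` intervals on a single reference sequence. `per_base_depth` and `coverage_summary` are hypothetical helper names, not functions from any alignment toolkit.

```python
def per_base_depth(ref_length, alignments):
    """Depth at each reference position from (start, end) half-open intervals."""
    diff = [0] * (ref_length + 1)
    for start, end in alignments:
        start = max(0, start)
        end = min(ref_length, end)
        if start >= end:
            continue  # interval lies outside the reference
        diff[start] += 1  # coverage begins here
        diff[end] -= 1    # coverage ends just before this position
    depth = []
    running = 0
    for i in range(ref_length):
        running += diff[i]
        depth.append(running)
    return depth

def coverage_summary(depth, min_depth=1):
    """Mean depth and breadth (fraction of bases with depth >= min_depth)."""
    if not depth:
        return {"mean_depth": 0.0, "breadth": 0.0}
    covered = sum(1 for d in depth if d >= min_depth)
    return {
        "mean_depth": sum(depth) / len(depth),
        "breadth": covered / len(depth),
    }
```

For a 10-base reference with alignments `[(0, 5), (3, 8)]`, the depth profile is `[1, 1, 1, 2, 2, 1, 1, 1, 0, 0]`, giving a mean depth of 1.0 and a breadth of 0.8 at 1x.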
Module 4: Variant Calling Workflow (SNPs/Indels)
- Variant calling logic: pileups and confidence.
- Calling SNPs/indels (workflow view).
- Filtering: quality thresholds and hard filters (intro).
- VCF basics: fields, genotypes, and variant stats.
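To make the VCF fields and the idea of a hard quality filter concrete, here is a minimal sketch that parses VCF text by hand. It relies only on the standard fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, then FORMAT and per-sample columns). The function names, the example records, and the length-based variant typing are assumptions made for illustration; symbolic alleles such as `<DEL>` are not handled specially.

```python
def parse_vcf(text):
    """Parse VCF text into a list of record dictionaries."""
    samples, records = [], []
    for line in text.splitlines():
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos, _id, ref, alt, qual, flt, info = fields[:8]
        fmt_keys = fields[8].split(":") if len(fields) > 8 else []
        genotypes = {
            name: dict(zip(fmt_keys, value.split(":")))
            for name, value in zip(samples, fields[9:])
        }
        records.append({
            "chrom": chrom, "pos": int(pos), "ref": ref,
            "alts": alt.split(","),
            "qual": None if qual == "." else float(qual),
            "filter": flt, "info": info, "genotypes": genotypes,
        })
    return records

def variant_type(record):
    """Simplified typing by allele length: snp, mnp, or indel."""
    ref, alts = record["ref"], record["alts"]
    if all(len(a) == len(ref) for a in alts):
        return "snp" if len(ref) == 1 else "mnp"
    return "indel"

def count_types(records, min_qual=None):
    """Count variant types, optionally keeping only QUAL >= min_qual."""
    counts = {}
    for rec in records:
        if min_qual is not None and (rec["qual"] is None or rec["qual"] < min_qual):
            continue  # hard filter on QUAL
        kind = variant_type(rec)
        counts[kind] = counts.get(kind, 0) + 1
    return counts

# Made-up example with one sample; values are illustrative only.
EXAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT", "sampleA"]),
    "\t".join(["chr1", "100", ".", "A", "G", "50", "PASS", "DP=20", "GT:DP", "0/1:20"]),
    "\t".join(["chr1", "200", ".", "AT", "A", "12", "PASS", "DP=8", "GT:DP", "1/1:8"]),
    "\t".join(["chr1", "300", ".", "C", "T", "99", "PASS", "DP=30", "GT:DP", "0/0:30"]),
])
```

With this example, `count_types(parse_vcf(EXAMPLE_VCF))` counts two SNPs and one indel, and adding `min_qual=20` drops the low-quality indel. The threshold of 20 is arbitrary here; appropriate filters depend on the caller and the data.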
Module 5: Variant Annotation and Prioritization
- Annotation concepts: gene effect, coding impact, known variants (overview).
- Filtering workflow: frequency, impact, and evidence.
- Comparing samples: shared vs unique variants.
- Prioritization checklist for research use.
Module 6: Population/Cohort Summaries (Intro)
- Basic cohort stats: allele counts, missingness, heterozygosity (intro).
- PCA concept (overview) and sample relatedness (intro).
- Variant burden summaries per sample/group (intro).
- Batch effects and QC flags.
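The basic cohort statistics named above (allele counts, missingness, heterozygosity) can be computed directly from genotype strings. This is a minimal sketch that assumes VCF-style `GT` values such as `0/1`, `1|1`, or `./.`. All function names are hypothetical, and a genotype with any missing allele is treated as fully missing.

```python
def split_gt(gt):
    """Split a GT string into alleles, treating phased and unphased alike."""
    return gt.replace("|", "/").split("/")

def variant_stats(genotypes):
    """Allele counts and missingness for one variant across samples."""
    if not genotypes:
        return {"alt_count": 0, "allele_number": 0, "missingness": 0.0}
    called_alleles = []
    missing = 0
    for gt in genotypes:
        alleles = split_gt(gt)
        if "." in alleles:
            missing += 1  # partially or fully missing genotype
        else:
            called_alleles.extend(alleles)
    return {
        "alt_count": sum(1 for a in called_alleles if a != "0"),
        "allele_number": len(called_alleles),
        "missingness": missing / len(genotypes),
    }

def sample_heterozygosity(genotypes):
    """Fraction of one sample's called genotypes that are heterozygous."""
    called = [split_gt(gt) for gt in genotypes if "." not in split_gt(gt)]
    if not called:
        return 0.0
    het = sum(1 for alleles in called if len(set(alleles)) > 1)
    return het / len(called)
```

For one variant with genotypes `["0/1", "1/1", "./."]`, `variant_stats` reports 3 alternate alleles out of 4 called alleles, with one of the three samples missing. Per-sample heterozygosity computed this way depends heavily on which sites are included, so it is better used as a QC flag than as a population-genetic estimate.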
Module 7: Visualization and Reporting
- QC plots: read quality, coverage distribution, variant counts.
- Genome browser concept (overview) for locus inspection.
- Creating tables: top variants and summary metrics.
- Writing a clear methods + results report.
Module 8: Workflow Automation (Intro)
- Pipeline thinking: inputs/outputs and checkpoints.
- Simple automation using bash scripts (intro).
- Workflow tools concept: Snakemake/Nextflow overview.
- Versioning and sharing results.
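The "inputs/outputs and checkpoints" idea can be sketched as a step runner that checks its inputs, skips work whose output already exists, and logs what it did. The module itself introduces automation with bash scripts; Python is used here only as an illustration, and `run_step` and `count_lines` are hypothetical names. Workflow tools such as Snakemake and Nextflow track this bookkeeping far more robustly.

```python
from pathlib import Path
import tempfile

def run_step(name, inputs, output, action, log):
    """Run one pipeline step unless its output already exists.

    Returns True if the step ran and False if it was skipped.
    """
    output = Path(output)
    missing = [str(p) for p in inputs if not Path(p).exists()]
    if missing:
        raise FileNotFoundError(f"{name}: missing inputs {missing}")
    if output.exists():
        log.append(f"skip {name}")  # checkpoint hit: output already present
        return False
    action(inputs, output)
    log.append(f"done {name}")
    return True

def count_lines(inputs, output):
    """Toy action: write the total line count of all inputs to the output."""
    total = sum(len(Path(p).read_text().splitlines()) for p in inputs)
    Path(output).write_text(str(total))
```

Running the same step twice executes `action` only once, and the second call is logged as skipped. This sketch checks only that the output exists. It does not detect inputs that changed after the output was written.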
Final Project
- Analyze a provided dataset (single sample or small cohort).
- Deliverables: QC report + BAM/VCF summaries + annotated shortlist + final write-up.
- Include: assumptions, filters used, and limitations.
Participant Eligibility
- Undergraduate and postgraduate students in Biotechnology, Genetics, Bioinformatics, or Life Sciences
- Researchers working with sequencing data
- Basic biology knowledge required; Linux basics helpful
Program Outcomes
- Work confidently with common genomics file formats.
- Run QC and alignment workflows and interpret outputs.
- Perform variant calling and basic annotation workflows.
- Prepare a reproducible genome analysis report.
Program Deliverables
- e-LMS Access: lessons, datasets, scripts.
- Toolkit: QC checklist, variant filtering template, reporting outline.
- Capstone Support: guidance on workflow and interpretation.
- Assessment: certification after capstone submission.
- e-Certification and e-Marksheet: digital credentials on completion.
Future Career Prospects
- Genomics Data Analyst (Entry-level)
- Bioinformatics Analyst, NGS (Entry-level)
- Research Assistant (Genomics)
- Clinical Genomics Trainee (non-diagnostic analytics)
Job Opportunities
- Genomics Labs/CROs: sequencing data QC, analysis, reporting support.
- Hospitals/Research Centers: research genomics analytics support.
- Biotech/Pharma: biomarker and genomics data processing teams.
- Universities: genomics research projects and pipeline support.






