
Satellite Image Analysis: A Hands-On Workshop
International Workshop on Deep Learning for Satellite Imagery and Earth Observation
About Program:
Vision Transformers for Remote-Sensing Images is an international workshop that teaches participants how to apply current Vision Transformer architectures to satellite and aerial imagery. Transformers are increasingly used for geospatial image analysis in climate research, defense, agriculture, and urban planning.
This hands-on program introduces the theory of transformers, their adaptation to vision tasks (e.g., ViT, Swin Transformer), and the ways they can outperform traditional CNNs by capturing long-range dependencies and spatial relationships in high-resolution imagery. Participants will work on real datasets (Sentinel, Landsat, DOTA, etc.) using tools such as PyTorch, Hugging Face, and timm.
Aim:
To equip participants with the skills to apply Vision Transformers (ViTs) to remote-sensing image analysis, focusing on tasks like land cover classification, object detection, climate pattern recognition, and disaster mapping using advanced deep learning methods.
Program Objectives:
- Introduce Vision Transformers and their application in satellite image analysis
- Enable hands-on experimentation with publicly available geospatial datasets
- Teach model customization and fine-tuning techniques
- Promote responsible AI usage in environmental and humanitarian applications
- Foster interdisciplinary innovation at the intersection of AI and Earth science
What will you learn?
Day 1: Transformers vs CNNs in Remote Sensing
Beyond CNNs: Vision Transformers for Scene Classification
🔹 Topics:
- Review of CNN architectures in remote sensing (ResNet, U-Net, etc.)
- Introduction to Vision Transformers (ViT): How they work
- Why ViTs are suited for remote-sensing imagery (large context, less inductive bias)
- Comparison: ViT vs CNN in scene classification
🔹 Hands-on/Demo:
- Colab demo comparing a pretrained ViT and a pretrained CNN on a sample land-scene classification task with the EuroSAT or BigEarthNet dataset
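To make the "how they work" and "large context" topics concrete, the arithmetic below is a minimal sketch of how a ViT turns an image into a token sequence. It assumes non-overlapping square patches and a single class token, as in the standard ViT design. The helper name `vit_token_stats` is illustrative and not part of any library. EuroSAT RGB tiles are 64 × 64 pixels, and 224 × 224 is the usual input size for pretrained ViT-B/16 checkpoints.

```python
def vit_token_stats(image_size: int, patch_size: int, channels: int = 3) -> dict:
    """Sequence-length arithmetic for a plain ViT (hypothetical helper)."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    seq_len = num_patches + 1  # +1 for the class token
    return {
        "num_patches": num_patches,
        "seq_len": seq_len,
        # Length of one flattened patch before the linear projection.
        "patch_dim": patch_size * patch_size * channels,
        # Self-attention compares every token with every other token.
        "attention_entries": seq_len ** 2,
    }


for size in (64, 224, 448):
    print(size, vit_token_stats(size, patch_size=16))
```

Doubling the image side from 224 to 448 pixels quadruples the number of patches (196 to 784), and the attention matrix grows by roughly 16 times. This is one reason large satellite scenes are usually tiled before being passed to a ViT.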
Day 2: Land-Cover Change Detection Using Transformers
Tracking the Earth: Transformers for Change Detection
🔹 Topics:
- Problem of land-cover change detection (LCCD) and its applications (urbanization, deforestation)
- Architectures adapted for temporal change detection (Siamese ViTs, TimeSformer)
- Pipeline: Preprocessing → Patch Embedding → Transformer Blocks → Classification head
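The pipeline above can be sketched in a few lines of plain Python. This is a toy illustration of the Siamese idea only: the same encoder embeds the "before" and "after" images, and a per-patch distance between the two embeddings drives the decision. Everything here is an assumption made for illustration. `toy_encoder` (patch mean and standard deviation) stands in for the patch embedding and transformer blocks, and the fixed threshold stands in for a learned classification head.

```python
import math


def split_patches(image, patch):
    """Split a 2-D list of pixels into non-overlapping patch x patch tiles."""
    height, width = len(image), len(image[0])
    return [
        [
            [image[y][x] for y in range(r, r + patch) for x in range(c, c + patch)]
            for c in range(0, width, patch)
        ]
        for r in range(0, height, patch)
    ]


def toy_encoder(pixels):
    """Stand-in for a ViT encoder: embeds a patch as (mean, std)."""
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return (mean, math.sqrt(variance))


def change_heatmap(before, after, patch=2):
    """Siamese comparison: one shared encoder, one distance per patch."""
    tiles_before = split_patches(before, patch)
    tiles_after = split_patches(after, patch)
    return [
        [math.dist(toy_encoder(b), toy_encoder(a)) for b, a in zip(row_b, row_a)]
        for row_b, row_a in zip(tiles_before, tiles_after)
    ]


def classify(heatmap, threshold):
    """Stand-in for the classification head: threshold the distances."""
    return [[score > threshold for score in row] for row in heatmap]


# Two 4x4 single-band "images"; only the bottom-right 2x2 patch changes.
before = [[0.1] * 4 for _ in range(4)]
after = [row[:] for row in before]
for y in (2, 3):
    for x in (2, 3):
        after[y][x] = 0.9

heat = change_heatmap(before, after)
mask = classify(heat, threshold=0.5)
print(heat)
print(mask)
```

In a real model the encoder weights are shared between the two dates in the same way, but the embeddings come from transformer blocks and the decision is learned from labeled image pairs rather than set by hand.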
🔹 Hands-on/Case Study:
- Visual result comparison (before/after images and heatmaps)
Day 3: Fine-Tuning Vision Transformers on Small Labeled Sets
Efficient Learning: Adapting ViTs with Limited Data
🔹 Topics:
- Challenges of training ViTs on small labeled datasets
- Strategies: Transfer learning, self-supervised learning (DINO, MAE), adapter layers
- Case studies in remote sensing: Agriculture crop mapping, disaster response
🔹 Hands-on:
- Colab demo: Fine-tuning a ViT model on a small custom dataset
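One reason transfer learning suits small labeled sets is that only a small part of the model needs to be trained. The sketch below counts parameters for a standard ViT-Base/16 backbone (hidden size 768, 12 layers, MLP size 3072, 224 × 224 input) and for a new linear head. It assumes the usual layout with biases on every linear layer, and a 10-class task chosen purely for illustration. The function names are hypothetical, and real checkpoints may differ slightly.

```python
def vit_backbone_params(hidden=768, depth=12, mlp_dim=3072,
                        patch=16, channels=3, image=224):
    """Parameter count of a plain ViT encoder without its classification head."""
    tokens = (image // patch) ** 2 + 1                     # patches + class token
    patch_embed = patch * patch * channels * hidden + hidden
    cls_and_pos = hidden + tokens * hidden                 # class token + position embeddings
    attention = (hidden * 3 * hidden + 3 * hidden) + (hidden * hidden + hidden)
    mlp = (hidden * mlp_dim + mlp_dim) + (mlp_dim * hidden + hidden)
    layer_norms = 2 * (2 * hidden)                         # two LayerNorms per block
    final_norm = 2 * hidden
    return patch_embed + cls_and_pos + depth * (attention + mlp + layer_norms) + final_norm


def linear_head_params(hidden, num_classes):
    """Weights plus biases of a single linear classification layer."""
    return hidden * num_classes + num_classes


backbone = vit_backbone_params()
head = linear_head_params(768, num_classes=10)  # 10 classes is an assumption
share = head / (backbone + head)
print(f"backbone: {backbone:,}  head: {head:,}  trainable share: {share:.4%}")
```

With the backbone frozen, the trainable head is under 0.01% of the roughly 85.8 million total parameters. Adapter layers and partial unfreezing sit between this extreme and full fine-tuning.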
Get an e-Certificate of Participation!

Intended For:
- Geospatial and remote-sensing professionals
- AI/ML engineers and computer vision researchers
- Earth scientists, environmental engineers, and urban planners
- Students and researchers in space science, climate, or deep learning
- Government/NGO professionals working with Earth observation data
Program Outcomes
- Understand the fundamentals of Vision Transformers and how they compare to CNNs
- Process and analyze high-resolution satellite imagery using deep learning
- Train and fine-tune ViTs for various geospatial applications
- Build a portfolio project on remote sensing with ViT-based models
- Receive a certificate demonstrating proficiency in AI + remote sensing
