Registration: 20.02.2026

Pratik Shrestha

Specialization: Computer Vision
● Software Coordinator with experience managing large-scale technical events and leading open-source projects.
● Computer Engineering graduate specializing in Computer Vision and Deep Learning, with a strong research focus on 3D Reconstruction, Multimodal AI, and Medical Image Analysis.
● Proven track record of winning international and national AI competitions, contributing to journal publications, and developing end-to-end AI systems from research to deployment.
Honors and awards:
● 2nd Position: SAGES CVS Lighthouse Challenge (Sub-Challenge C) - International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2025), Sep 2025.
● 3rd Position: SAGES CVS Lighthouse Challenge (Sub-Challenge A) - International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2025), Sep 2025.
● Best students: Applied Data Science Elective Course - Samsung and Pulchowk Campus, Jan 2025.
● Winner: LLM Category, Dataverse - LOCUS, Feb 2024.
● Winner: Data Insights Category - Jan 2022.
● Winner: Climate Change Category - Nov 2021.

Skills

Python
PyTorch
NumPy

Work experience

Research Assistant
05.2025 - present |Nepal Applied Mathematics and Informatics Institute for Research (NAAMII)
PyTorch, OpenCV, MMCV
● Research and apply state-of-the-art AI techniques to medical imaging, specifically medical image segmentation, out-of-distribution detection, and federated learning.
Computer Vision Research Engineer
06.2025 - present |Redev AI
PyTorch, OpenCV, Hugging Face
● Experiment with state-of-the-art VLMs for automatic annotation and for training detection and classification models.
Software Coordinator
06.2024 - present |LOCUS
CRM
● Managed software-related events for LOCUS 2025, including the Software Fellowship, Hackathons, and AI Competitions.
● Oversaw and led various software projects under LOCUS and the LOCUS Open Source Team (LOST).
● Screened and guided software projects for the exhibition at LOCUS, Nepal's largest technological festival.
Vice President
08.2019 - 02.2020 |St. Xavier's College Computer Club
CRM
● Organized club events and competitions, including the CSP Olympiad, Web Development Competition, and the SXC Computer Festival.
Computer Vision
Projects: Reconstruction of Heritage Structures
Python, PyTorch, 3D Gaussian Splatting, gsplat, Computer Vision, 3D Reconstruction
● Researched dataset capturing methodologies for the 3D reconstruction of large heritage structures.
● Utilized state-of-the-art Gaussian Splatting techniques (Hierarchical Gaussian Splatting, CityGaussian, 3DGS, gsplat) to reconstruct and compare 3D models for their effectiveness in this task.
● Investigated the impact of using masks and bilateral grids on the reconstruction quality of 3D models.
● Researched and worked on methods to build virtual walkthroughs of heritage structures.
Computer Vision
Projects: Virtual Try-On of Shoes
Python, PyTorch, SAM (Segment Anything Model), Gaussian Splatting, Snapchat LensStudio, AR, Image Segmentation, Dataset Curation
● Collected over 100 videos of shoes and sampled 30 images per video to create a dataset of 3000 shoe images from various orientations, resolutions, lighting conditions, and backgrounds.
● Created an automatic annotation pipeline with SAM to produce a large-scale shoe segmentation dataset.
● Trained different segmentation models on the created dataset.
● Utilized Gaussian Splatting to generate 3D models of the shoes from segmented images.
● Built an AR application using Snapchat's LensStudio to overlay the generated 3D shoe models onto a user's foot.
Computer Vision
Projects: Cattle Muzzle Pattern Matching
Python, PyTorch, Siamese Networks, ViT, ResNet, EfficientNet, Grad-CAM
● Researched existing methodologies, pre-processing techniques, models, and metrics for pattern matching.
● Trained a Siamese Model with various backbones (ViT, ResNet, EfficientNet) and compared their performance.
● Utilized Grad-CAM visualizations to interpret model results.
● Experimented with different datasets, models, and pre-processing techniques, performing ablation studies.
Computer Vision
Projects: SCOPE - Semantic Captioning for Optimized Photo Exploration
Python, PyTorch, Image Captioning, ViT, DeiT, GPT-2, BERT, Model Quantization, Knowledge Distillation, Embeddings
● Researched deep learning methods for Image Captioning.
● Experimented with combinations of different Vision Encoders (ViT, DeiT) and Text Decoders (GPT-2, distilGPT).
● Evaluated the performance of large and small models under time-constrained training scenarios.
● Researched downsizing methods for large models (Knowledge Distillation, Quantization) to enable edge-device deployment.
● Quantized GPT-2, BERT, and ViT models to reduce their space and compute requirements.
● Developed a system that generates image captions, processes them through BERT to create embeddings, and saves them as metadata to enable text-based image searches.
Computer Vision
Project
Python, PyTorch, CNNs, Medical Image Analysis, Image Classification, Dataset Curation, Transformers, Deep Learning Fundamentals
Personal Project:
● Trained classification models with different CNN backbones to classify fundus images into normal, three disease categories (Diabetes, Glaucoma, Cataract), or other.
● Analyzed model performance and dataset, proposing measures to increase accuracy.
● Searched for additional data and wrote transformation scripts to align the new dataset with the old one.
● Identified and corrected inconsistencies in the dataset labels by writing correction scripts.
Other Projects:
● Mahakabi: Explored generative models for creative text generation.
● Vision Transformer Implementation: Built a Vision Transformer model from scratch.
● Anode: A novel project concept and implementation.
Teaching Assistant
01.2025 - 01.2025 |NAAMII
CRM
● Volunteered as a Teaching Assistant, supporting the Foundational Models lab and guiding participants through a project on SimCLR.
Lead, Children in Technology / Lead, Software Fellowship
07.2024 - 11.2024 |LOCUS
CRM
● Led a team of six to educate high school students on internet safety and technology.
● Conducted sessions in eight schools across four districts of Nepal.
● Organized a 10-day workshop on software fundamentals for over 250 beginner IT students.
● Led a team of over 30 volunteers in designing the syllabus and preparing teaching materials covering Web Design, Frontend, Backend, Deployment, and Data Science.
● Oversaw sponsorships, scheduling, and logistics for the event.
Assistant, Seminar on Generative AI
06.2023 - 06.2023 |IT Club
CRM
● Assisted participants in understanding generative model concepts and guided them through coding and executing the examples.

Educational background

Computer Engineering (Bachelor’s Degree)
2021 - 2025
Pulchowk Campus, Tribhuvan University

Languages

English: Advanced
Nepali: Native