Research Assistant
05.2025 - Present |Nepal Applied Mathematics and Informatics Institute for research
PyTorch, OpenCV, MMCV
● Research and apply state-of-the-art AI techniques to medical imaging, specifically medical image segmentation, out-of-distribution detection, and federated learning.
Computer Vision Research Engineer
06.2025 - Present |Redev AI
PyTorch, OpenCV, Hugging Face
● Experiment with state-of-the-art VLMs for automatic annotation and train detection and classification models.
Software Coordinator
06.2024 - Present |LOCUS
CRM
● Managed software-related events for LOCUS 2025, including the Software Fellowship, Hackathons, and AI Competitions.
● Oversaw and led various software projects under LOCUS and the LOCUS Open Source Team (LOST).
● Screened and guided software projects for the exhibition at LOCUS, Nepal's largest technological festival.
Vice President
08.2019 - 02.2020 |St. Xavier's College Computer Club
CRM
● Organized club events and competitions, including the CSP Olympiad, Web Development Competition, and the SXC Computer Festival.
Computer Vision
Projects: Reconstruction of Heritage Structures
Python, PyTorch, 3D Gaussian Splatting, gsplat, Computer Vision, 3D Reconstruction
● Researched dataset capturing methodologies for the 3D reconstruction of large heritage structures.
● Reconstructed 3D models with state-of-the-art Gaussian Splatting methods (Hierarchical Gaussian Splatting, CityGaussian, 3DGS, gsplat) and compared their effectiveness for this task.
● Investigated the impact of using masks and bilateral grids on the reconstruction quality of 3D models.
● Researched and worked on methods to build virtual walkthroughs of heritage structures.
Computer Vision
Projects: Virtual Try-On of Shoes
Python, PyTorch, SAM (Segment Anything Model), Gaussian Splatting, Snapchat LensStudio, AR, Image Segmentation, Dataset Curation
● Collected over 100 videos of shoes and sampled 30 images per video to create a dataset of roughly 3,000 shoe images covering various orientations, resolutions, lighting conditions, and backgrounds.
● Created an automatic annotation pipeline with SAM to produce a large-scale shoe segmentation dataset.
● Trained different segmentation models on the created dataset.
● Utilized Gaussian Splatting to generate 3D models of the shoes from segmented images.
● Built an AR application using Snapchat's LensStudio to overlay the generated 3D shoe models onto a user's foot.
Computer Vision
Projects: Cattle Muzzle Pattern Matching
Python, PyTorch, Siamese Networks, ViT, ResNet, EfficientNet, Grad-CAM
● Researched existing methodologies, pre-processing techniques, models, and metrics for pattern matching.
● Trained a Siamese Model with various backbones (ViT, ResNet, EfficientNet) and compared their performance.
● Utilized Grad-CAM visualizations to interpret model results.
● Experimented with different datasets, models, and pre-processing techniques, performing ablation studies.
Computer Vision
Projects: SCOPE - Semantic Captioning for Optimized Photo Exploration
Python, PyTorch, Image Captioning, ViT, DeiT, GPT-2, BERT, Model Quantization, Knowledge Distillation, Embeddings
● Researched deep learning methods for Image Captioning.
● Experimented with combinations of different vision encoders (ViT, DeiT) and text decoders (GPT-2, DistilGPT2).
● Evaluated the performance of large and small models under time-constrained training scenarios.
● Researched downsizing methods for large models (Knowledge Distillation, Quantization) to enable edge-device deployment.
● Quantized GPT-2, BERT, and ViT models to reduce their memory footprint and compute requirements.
● Developed a system that generates image captions, processes them through BERT to create embeddings, and saves them as metadata to enable text-based image searches.
Computer Vision
Project
Python, PyTorch, CNNs, Medical Image Analysis, Image Classification, Dataset Curation, Transformers, Deep Learning Fundamentals
Personal Project:
● Trained classification models with different CNN backbones to classify fundus images as normal, one of three disease categories (Diabetes, Glaucoma, Cataract), or other.
● Analyzed model performance and the dataset, and proposed measures to increase accuracy.
● Searched for additional data and wrote transformation scripts to align the new dataset with the old one.
● Identified and corrected inconsistencies in the dataset labels by writing correction scripts.
Other Projects:
● Mahakabi: Explored generative models for creative text generation.
● Vision Transformer Implementation: Built a Vision Transformer model from scratch.
● Anode: A novel project concept and implementation.
Teaching Assistant
01.2025 - 01.2025 |NAAMII
CRM
● Volunteered as a Teaching Assistant, supporting the Foundational Models lab and guiding participants through a project on SimCLR.
Lead, Children in Technology / Lead, Software Fellowship
07.2024 - 11.2024 |LOCUS
CRM
● Led a team of six to educate high school students on internet safety and technology.
● Conducted sessions in eight schools across four districts of Nepal.
● Organized a 10-day workshop on software fundamentals for over 250 beginner IT students.
● Led a team of over 30 volunteers in designing the syllabus and preparing teaching materials covering Web Design, Frontend, Backend, Deployment, and Data Science.
● Oversaw sponsorships, scheduling, and logistics for the event.
Assistant, Seminar on Generative AI
06.2023 - 06.2023 |IT Club
CRM
● Assisted participants in understanding generative model concepts and guided them through coding and executing the examples.