Nikita Kandpal


Registration: 15.07.2025

Specialization: Data Analyst

PostgreSQL PyTorch Spark Tableau AWS Python

Skills

Python

SQL

NoSQL

PySpark

PostgreSQL

Snowflake

BigQuery

ETL

Data Warehousing

Spark

Hadoop

Kafka

Databricks

TensorFlow

PyTorch

Scikit-learn

NumPy

Pandas

Keras

Deep Learning

NLP

EDA

AWS EC2

AWS Lambda

AWS S3

AWS RDS

AWS Redshift

AWS API Gateway

GCP

Microsoft Azure

Docker

Kubernetes

Tableau

Power BI

Matplotlib

Seaborn

Airflow

Terraform

Jenkins

Git

CI/CD

Agile

Microservices

Work experience

Data Analyst

08.2023 - present | IU Libraries

CSV, JSON, XML, Python, Pandas, NumPy, SQLite, Excel

● Accelerated data migration of digital library records and unstructured data from CSV, JSON, and XML using Python (Pandas, NumPy, SQLite), migrating 40% more records in 75% less time than expected.
● Identified data patterns and trends; built Python pipelines for bulk imports, reducing manual effort by 25%.
● Applied rule-based validation and cleaning with Python and Excel to improve data accuracy and reduce cataloging errors.

Data Science Research Assistant

01.2025 - 05.2025 | Indiana University

LLM, BERT, RoBERTa, GPT

● Built an LLM-powered pipeline to scan and filter 10,000+ research papers, achieving 85% accuracy in surfacing high-quality studies using transformer models (BERT, RoBERTa, GPT), analogous to object detection in vision tasks.
● Fine-tuned models and optimized data ingestion workflows, boosting theme classification accuracy by 30% and reducing runtime by 40%, showcasing scalable deep learning for high-throughput NLP pipelines.

ML Engineer

05.2024 - 08.2024 | Shure Incorporated

Random Forest, XGBoost, LSTM, Python, Airflow, Tableau, Power Apps

● Developed ML models (Random Forest, XGBoost, LSTM) for supplier trend forecasting and risk classification, boosting accuracy by 20% and accelerating issue resolution by 35%.
● Built automated data pipelines using Python and Airflow to deliver clean, timely training data from drop tests and quality logs, improving pipeline reliability by 40%.
● Integrated ML outputs into interactive Tableau dashboards to visualize failure patterns and risk scores, enabling QA teams to prioritize high-impact supplier issues.
● Enhanced the Power Apps-based Quality Lab system by integrating bulk upload and new test configurations, streamlining global lab processes and reducing manual data entry by 60%.

Data Analyst

09.2020 - 08.2023 | Standard Chartered Bank

PySpark, Scikit-learn, XGBoost, AWS, Airflow, Redshift, S3, Kafka, AWS Lambda, Tableau, CI/CD, Jenkins, Docker, Terraform

● Contributed to fraud detection and revenue forecasting models using PySpark, Scikit-learn, and XGBoost, improving signal accuracy by 20% and enabling data-driven financial planning.
● Managed scalable ML pipelines on AWS using Airflow, Redshift, S3, and containerized PySpark, ensuring reliable training and inference workflows.
● Supported real-time fraud detection via ML-driven anomaly pipelines using Kafka, AWS Lambda, and Tableau, improving risk visibility and reducing detection lag by 15%.
● Streamlined CI/CD workflows for ML model training and deployment using Jenkins, Docker, and Terraform, reducing release time by 65% and improving system reliability.

Data Scientist

05.2019 - 08.2019 | Rebel Foods

AdaBoost, CART, LDA, R

● Devised a predictive model for order prep time using kitchen load, staff availability, and order history, improving wait time estimates and boosting customer satisfaction by 80%.
● Enhanced kitchen energy efficiency by integrating AdaBoost, CART, and LDA to predict peak operational hours, cutting energy use by 40%; deployed a real-time monitoring dashboard via Shiny in R.

Educational background

Data Science (Master’s Degree)

Till 2025

Indiana University

Information Technology (Bachelor’s Degree)

Till 2020

SRM Institute of Science and Technology

Languages

English: Proficient