Nachiket Bhavsar
Portfolio
Flight-Price-Prediction
For this project, I acted as the Full-Stack Data Scientist and Developer, building an end-to-end flight fare prediction system that integrates machine learning with a functional web interface. My primary focus was the machine learning lifecycle: I performed extensive feature engineering on a complex dataset, extracting temporal features from flight timings and applying one-hot encoding to categorical variables such as airlines and routes. I developed and optimized a regression model (using Scikit-learn, Pandas, and NumPy) to predict ticket prices from historical trends. On the development side, I architected the application on a hybrid stack: Python and Flask serve the predictive model as a REST API, while PHP handles backend data connectivity or user interactions. This let me deliver a complete solution, from raw data cleaning and model training to a user-friendly web dashboard where travelers can receive real-time price estimates.
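The feature engineering step described above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the project's actual code: the record layout, the field names, and the airline list are assumptions, and in practice Pandas would typically handle both steps.

```python
from datetime import datetime


def temporal_features(departure: str, arrival: str) -> dict:
    """Derive simple time-based features from two ISO-8601 timestamps."""
    dep = datetime.fromisoformat(departure)
    arr = datetime.fromisoformat(arrival)
    return {
        "dep_hour": dep.hour,
        "dep_weekday": dep.weekday(),  # Monday = 0 ... Sunday = 6
        "dep_month": dep.month,
        "duration_min": int((arr - dep).total_seconds() // 60),
    }


def one_hot(value: str, categories: list, prefix: str) -> dict:
    """Encode one categorical value as 0/1 indicator columns."""
    return {f"{prefix}_{c}": int(value == c) for c in categories}


# Hypothetical record; the airline names and timestamps are made up.
AIRLINES = ["AirA", "AirB", "AirC"]
flight = {
    "airline": "AirB",
    "departure": "2019-03-24T22:20:00",
    "arrival": "2019-03-25T01:10:00",
}

# Merge both feature groups into one model-ready row.
row = {
    **temporal_features(flight["departure"], flight["arrival"]),
    **one_hot(flight["airline"], AIRLINES, "airline"),
}
print(row)
```

Each categorical value becomes its own 0/1 column, so a regression model does not read an artificial ordering into airline names.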
TITANIC-CLASSIFICATION
For this project, I served as the Lead Data Scientist, developing a robust machine learning model for the classic Titanic survival classification challenge. Following a structured data science lifecycle, I managed everything from initial Exploratory Data Analysis (EDA) to model deployment. I focused heavily on feature engineering: I handled missing data in the "Age" and "Embarked" columns and transformed categorical variables, such as gender and passenger class, into machine-ready formats using one-hot encoding and label encoding. To ensure high predictive performance, I experimented with multiple classification algorithms, including Logistic Regression, Random Forest, and Support Vector Machines (SVM). I optimized the final model through hyperparameter tuning and cross-validation, achieving significant accuracy in predicting survival outcomes from passenger demographics. By visualizing key correlations, such as the "women and children first" survival trend, I translated complex statistical patterns into a clear binary classification system.
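As a rough illustration of the missing-value handling and encoding described above, here is a small sketch using only the standard library. The sample rows are made up, and the choice of median imputation for "Age" and most-frequent imputation for "Embarked" is an assumption, not necessarily what the project used.

```python
from collections import Counter
from statistics import median

# Tiny made-up sample shaped like the Titanic columns mentioned above.
passengers = [
    {"Sex": "female", "Pclass": 1, "Age": 38.0, "Embarked": "C"},
    {"Sex": "male", "Pclass": 3, "Age": None, "Embarked": "S"},
    {"Sex": "female", "Pclass": 3, "Age": 26.0, "Embarked": None},
    {"Sex": "male", "Pclass": 2, "Age": 54.0, "Embarked": "S"},
]


def impute(rows):
    """Fill missing Age with the median and missing Embarked with the mode."""
    age_fill = median(r["Age"] for r in rows if r["Age"] is not None)
    ports = Counter(r["Embarked"] for r in rows if r["Embarked"] is not None)
    port_fill = ports.most_common(1)[0][0]
    return [
        {
            **r,
            "Age": r["Age"] if r["Age"] is not None else age_fill,
            "Embarked": r["Embarked"] if r["Embarked"] is not None else port_fill,
        }
        for r in rows
    ]


def encode(row):
    """Label-encode Sex; one-hot encode Pclass and Embarked."""
    features = {"Age": row["Age"], "Sex": 0 if row["Sex"] == "male" else 1}
    features.update({f"Pclass_{c}": int(row["Pclass"] == c) for c in (1, 2, 3)})
    features.update({f"Embarked_{p}": int(row["Embarked"] == p) for p in ("C", "Q", "S")})
    return features


encoded = [encode(r) for r in impute(passengers)]
for features in encoded:
    print(features)
```

The resulting numeric rows are in the form a classifier such as Logistic Regression or Random Forest expects as input.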
Women's-E-Commerce Clothing Reviews-Text-Mining
For this project, I acted as the Lead Data Scientist, developing an end-to-end text mining solution to bridge the gap between unstructured customer feedback and business strategy. Following the CRISP-DM framework, I engineered a Python-based pipeline to process and analyze over 23,000 e-commerce reviews. I handled the entire lifecycle, from text preprocessing and lemmatization to building classification models; my Logistic Regression model achieved 93% accuracy in predicting customer recommendations. To provide deeper insights, I implemented LDA topic modeling to automatically group reviews into themes such as "Fit & Sizing" and "Fabric Quality," and used UMAP and t-SNE to visualize high-dimensional customer sentiment data, ultimately transforming raw text into a clear roadmap for product improvement.
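To give a sense of how review text becomes model input, the sketch below tokenizes a few made-up reviews and computes plain TF-IDF weights with the standard library. It is an illustration under stated assumptions, not the project's pipeline: the stopword list is a tiny placeholder, lemmatization is omitted because it needs an NLP library, and the unsmoothed formula here differs from the smoothed, normalized defaults of Scikit-learn's `TfidfVectorizer`.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "is", "and", "it", "but", "this", "too", "i"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append(
            {
                term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return vectors


# Made-up reviews for illustration only.
reviews = [
    "The fabric is soft and the fit is perfect",
    "Runs small, the fit is too tight",
    "Love this fabric but it wrinkles",
]

vectors = tfidf(reviews)
for review, vector in zip(reviews, vectors):
    top_term = max(vector, key=vector.get)
    print(f"{review!r} -> highest-weighted term: {top_term}")
```

Terms that appear in fewer reviews receive higher weights, which is what lets a downstream classifier or topic model focus on the distinctive words in each review.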
