
Data Scientist & AI/ML Engineer
Transforming complex data into intelligent solutions that drive innovation and create measurable business impact across industries.
About Me
Results-driven Data Scientist with a proven track record of delivering enterprise-grade AI solutions that drive measurable business impact and operational efficiency.
Background
Master's in Data Science from UMass Dartmouth with 2+ years of hands-on experience architecting and deploying production-ready AI systems. Demonstrated expertise in reducing operational costs by 40% through intelligent automation and improving system accuracy by 28% through advanced ML techniques.
Led cross-functional teams in developing CNN and CGAN-based diagnostic systems for healthcare applications, achieving 94% accuracy in medical image classification. Architected battery optimization systems for electric vehicles that improved fault detection by 33% and improved battery life predictions.
Specialized in Large Language Model integration, RAG architectures, and prompt engineering for enterprise applications. Successfully deployed systems processing millions of daily transactions with sub-second response times and 99.9% uptime.
Core Competencies
Machine Learning Engineering
Production-scale ML systems with 99.9% uptime, serving millions of predictions daily across healthcare and automotive sectors.
Cloud Architecture
AWS Certified Cloud Practitioner with expertise in scalable, cost-optimized infrastructure, reducing operational costs by 40%.
Data Engineering
Real-time data pipelines processing 10M+ records daily with sub-second latency and automated quality monitoring.
AI Research & Development
Published researcher in computer vision and generative AI, with models achieving state-of-the-art performance metrics.
Technology Stack
Professional Experience
Building impactful AI solutions across healthcare and automotive industries
Key Achievements
- Designed CNN and CGAN-based pipelines improving breast cancer detection accuracy by 5%
- Achieved 40% faster training using GPU-accelerated AWS EC2 clusters with PyTorch DistributedDataParallel (DDP)
- Enabled real-time inference (3x faster) using TorchScript and model pruning
- Reduced deployment time by 60% using Docker and AWS Lambda
- Integrated LLM-powered diagnostic explanations for physician-facing outputs
Technologies Used
Key Achievements
- Developed LightGBM models achieving 18% higher accuracy for battery state-of-charge (SOC) and remaining-useful-life (RUL) prediction
- Built real-time anomaly detection system identifying battery faults 33% earlier
- Improved fault detection accuracy by 28% using feature importance analysis
- Reduced testing cycle time by 20% through collaboration on ML-driven insights
- Designed containerized ETL pipelines using Apache Airflow and Docker
Technologies Used
Featured Projects
Production-grade AI systems delivering measurable business impact across healthcare, automotive, and enterprise sectors.
April 2025 – May 2025
60% faster content generation
Built a real-time visualization tool for LLM agent decision-making using Graphiti Agent integration, tracking execution flows and toolchain interactions through an interactive Streamlit interface.
Key Achievements
- Integrated the Graphiti Agent from the Ottomator framework to visualize the decision-making flow of LLM agents in real time
- Used GraphitiTracer to hook into agent lifecycle events like on_agent_action, on_tool_end, and on_chain_end to track execution steps
- Parsed intermediate reasoning steps, tool usage, and model outputs to dynamically construct a visual workflow graph
- Leveraged Streamlit to build an interactive front-end for visualizing agent toolchains and trace paths
- Made the tool extensible and model-agnostic, supporting integration with different LLMs and custom toolchains
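The event-to-graph step described above can be sketched in plain Python. This is a minimal illustration, not the Graphiti or GraphitiTracer API: the `WorkflowTracer` class, its fields, and the method signatures are hypothetical, and only the hook names (on_agent_action, on_tool_end, on_chain_end) come from the project description. Each hook appends a node and links it to the previous step, and the result is exported as Graphviz DOT text.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowTracer:
    """Hypothetical tracer that records agent lifecycle events as a linear graph."""

    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def _add_step(self, kind, label):
        # Append a node and connect it to the step that preceded it.
        node_id = len(self.nodes)
        self.nodes.append({"id": node_id, "kind": kind, "label": label})
        if node_id > 0:
            self.edges.append((node_id - 1, node_id))
        return node_id

    # Hook names mirror the events listed above; signatures are assumptions.
    def on_agent_action(self, tool, tool_input):
        return self._add_step("action", f"{tool}({tool_input})")

    def on_tool_end(self, output):
        return self._add_step("tool_result", str(output))

    def on_chain_end(self, final_output):
        return self._add_step("final", str(final_output))

    def to_dot(self):
        # Render the recorded steps as Graphviz DOT text.
        lines = ["digraph trace {"]
        for n in self.nodes:
            lines.append(f'  n{n["id"]} [label="{n["kind"]}: {n["label"]}"];')
        for a, b in self.edges:
            lines.append(f"  n{a} -> n{b};")
        lines.append("}")
        return "\n".join(lines)


tracer = WorkflowTracer()
tracer.on_agent_action("search", "battery RUL")
tracer.on_tool_end("3 documents")
tracer.on_chain_end("summary")
dot_text = tracer.to_dot()
print(dot_text)
```

In a Streamlit front end, a DOT string like this could be passed to `st.graphviz_chart` to draw the trace.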
Technologies
March 2025 – April 2025
Pipeline Acceleration
Developed automated ETL pipelines with Apache Airflow and Spark, implementing real-time feature engineering that reduced processing latency by 75% while ensuring robust monitoring and error handling.
Key Achievements
- Built custom Airflow tasks using the @task decorator and managed dependencies using task chaining within a DAG
- Implemented real-time feature engineering with Apache Spark, reducing latency by 75%
- Configured Airflow connections and secrets management for API and database access using Astro Runtime
- Managed task scheduling, retries, and logging through Airflow's native UI for robust and transparent pipeline monitoring
Technologies
Feb 2025 – Mar 2025
89.7% prediction accuracy
Machine learning model built on Amazon SageMaker that predicts customer purchase behavior with 89.7% accuracy.
Key Achievements
- Built XGBoost model achieving 89.7% accuracy in customer purchase classification
- Implemented end-to-end MLOps pipeline using Amazon SageMaker and S3
- Deployed real-time API for seamless prediction serving
- Fine-tuned hyperparameters for optimal model performance
- Enhanced understanding of cloud-based model training and scalable AI deployment
Technologies
Jan 2025 – Feb 2025
Realistic fashion image generation
Deep Convolutional GAN (DCGAN) trained on the Fashion-MNIST dataset to generate realistic clothing images.
Key Achievements
- Trained Deep Convolutional GAN (DCGAN) on the Fashion-MNIST dataset
- Generated realistic fashion images using adversarial training
- Fine-tuned hyperparameters for improved image quality
- Explored latent space representations of fashion items
- Demonstrated AI applications in fashion design and synthetic data generation
Technologies
September 2024 – December 2024
86.34% diagnostic accuracy
HIPAA-compliant medical imaging platform with 94% diagnostic accuracy, processing 500+ scans daily.
Key Achievements
- Developed CNN architecture achieving 94% accuracy in medical image classification
- Implemented CGAN-based data augmentation, improving model robustness by 20%
- Built HIPAA-compliant infrastructure with end-to-end encryption and audit trails
- Deployed containerized solution with automated CI/CD, reducing deployment time by 80%
Technologies
May 2022 – July 2023
33% improvement in fault detection
ML-driven predictive maintenance system for electric vehicles, improving fault detection by 33% and extending battery life.
Key Achievements
- Developed LightGBM ensemble models with 92% accuracy for battery life prediction
- Implemented real-time anomaly detection using Isolation Forest, reducing false positives by 40%
- Built automated data collection system processing 1M+ sensor readings per hour
- Achieved $2M+ annual savings through predictive maintenance optimization
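For readers unfamiliar with Isolation Forest, the idea behind the anomaly detection bullet above can be sketched from scratch using only the standard library. This is an illustrative sketch, not the project's implementation (which is not shown here and may well have used a library such as scikit-learn). The sensor features (cell voltage, temperature), the synthetic readings, and all function names are assumptions made for the example. Points that random splits isolate quickly get short average path lengths and therefore higher anomaly scores.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x, depth=0):
    """Depth at which x lands, plus a correction for unsplit points in the leaf."""
    if tree[0] == "leaf":
        return depth + average_path_length(tree[1])
    _, dim, split, left, right = tree
    return path_length(left if x[dim] < split else right, x, depth + 1)


def fit_forest(points, n_trees=100, sample_size=64, seed=0):
    """Build n_trees isolation trees, each on a random subsample."""
    rng = random.Random(seed)
    size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(points, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, size


def anomaly_score(trees, size, x):
    """Score in (0, 1); higher means easier to isolate, i.e. more anomalous."""
    mean_depth = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(size))


# Synthetic (voltage, temperature) readings; values are made up for illustration.
data_rng = random.Random(42)
normal_readings = [(data_rng.gauss(3.7, 0.05), data_rng.gauss(25.0, 2.0))
                   for _ in range(300)]
forest, size = fit_forest(normal_readings)

normal_score = anomaly_score(forest, size, (3.7, 25.0))
fault_score = anomaly_score(forest, size, (3.1, 55.0))
print(f"typical reading: {normal_score:.3f}, suspect reading: {fault_score:.3f}")
```

In a streaming setting, each incoming reading would be scored against a forest fitted on recent healthy data, with an alert threshold tuned to trade off missed faults against false positives.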
Technologies
Technical Skills
Comprehensive technical expertise spanning the entire AI/ML development lifecycle, from research and prototyping to production deployment and monitoring.
- Deep Learning (PyTorch, TensorFlow)
- Computer Vision & NLP
- Large Language Models
- MLOps & Model Deployment
- Generative AI (GANs, VAEs)
- Reinforcement Learning
- AWS (SageMaker, EC2, S3, Lambda)
- Docker & Kubernetes
- CI/CD Pipelines
- Infrastructure as Code
- Microservices Architecture
- Auto-scaling & Load Balancing
- Apache Airflow & Kafka
- Real-time Data Processing
- ETL/ELT Pipelines
- Data Warehousing
- Stream Processing
- Data Quality & Governance
- Python (Advanced)
- SQL & NoSQL Databases
- JavaScript/TypeScript
- API Development (FastAPI, Flask)
- Version Control (Git)
- Software Architecture
Industry-recognized credentials demonstrating expertise and commitment to continuous learning
AWS Certified Cloud Practitioner
Amazon Web Services
2024
Machine Learning Specialization
Stanford University
2023
Deep Learning Specialization
DeepLearning.AI
2023
Data Engineering with Apache Airflow
IBM
2024
Advanced SQL for Data Scientists
Coursera
2023
Generative AI with Large Language Models
DeepLearning.AI
2024
Latest Blog Posts
Sharing insights on AI, machine learning, and data science
Write complex optimization problems exactly like mathematical formulas
Master distributed data processing with Python's most powerful big data tool
Artificial Intelligence is revolutionizing fields from autonomous vehicles to natural language processing...
Understanding when, why, and how to reduce dimensions in your data science projects
Battery Intelligence Research
Breakthrough research in machine learning-driven battery optimization that transformed industry standards and earned recognition as outstanding undergraduate research.
Research Impact
The Challenge
Traditional battery management systems were failing across the industry. Faults went undetected until catastrophic failure, and data processing couldn't keep up with real-time demands.
The Innovation
Developed advanced machine learning algorithms using LightGBM and Isolation Forest to predict battery failures before they occurred. Created containerized ETL pipelines with Apache Airflow.
Industry Recognition
Awarded "Outstanding Undergraduate Research Recognition" by the Department of Electrical and Electronics Engineering. Research now influences battery monitoring in commercial applications.
Key Achievements
18% Accuracy Improvement
LightGBM models for battery life prediction
33% Earlier Detection
Real-time anomaly detection system
28% Better Performance
Fault detection through feature analysis
Outstanding Research Award
Recognition for breakthrough work
Research Timeline
May 2022 - Selection
Chosen from 120 candidates for research opportunity
2022-2023 - Development
15-month intensive research and development phase
July 2023 - Recognition
Outstanding Research Award and industry adoption
Get In Touch
Let's discuss opportunities in AI, data science, or potential collaborations
Contact Information
Connect with me
Open to Opportunities
Currently seeking full-time Data Scientist positions and research collaborations in AI/ML, particularly in healthcare and automotive applications.