MEng Candidate — Electrical & Computer Engineering, University of Toronto (graduating December 2026)
MEM — Data Analytics & Product Innovation, University of Ottawa
P.Eng | PMP
I build end-to-end machine learning and data analytics pipelines at the intersection of AI, wireless communications, and research methods. My portfolio spans foundation model fine-tuning with LoRA for scenario-adaptive 6G beam prediction, deep reinforcement learning for network optimization, real-world mmWave beam prediction on live V2V measurements, and production RAG systems for technical document retrieval. It also covers real-time streaming ML pipelines with full MLOps infrastructure, transformer-based NLP for bibliometric analysis, mixed-methods research combining survey statistics with large-scale text analysis, and cloud-based big data engineering on Apache Spark and Microsoft Azure.
Current focus: parameter-efficient foundation model adaptation for 6G wireless systems, production MLOps pipelines, and LLM-powered standards retrieval — with an emphasis on reproducible, end-to-end systems that bridge academic research and engineering practice.
Machine Learning & AI
Deep Reinforcement Learning · DQN · CNN · LSTM · LSTM Autoencoder · RNN · SVM · Random Forest · BERTopic · Supervised Learning · Reward Regression · scikit-learn · PyTorch · Stable-Baselines3 · sentence-transformers
Foundation Models & Transfer Learning
LoRA/PEFT · HuggingFace Transformers · Large Wireless Model (LWM) · ONNX Runtime · INT8 Quantization · Weights & Biases (W&B) · DeepMIMOv3 · Transfer Learning · Rank Ablation · Model Compression
MLOps & Production ML
MLflow · Apache Airflow · Evidently AI · Docker · FastAPI · Model Serving · Experiment Tracking · Drift Monitoring · CI/CD · Feature Stores
Streaming & Data Engineering
Apache Kafka · Apache Spark Structured Streaming · DuckDB · dbt · Real-Time Pipelines · Sliding Window Features · Confluent Cloud
NLP & Text Analytics
BERTopic · VADER Sentiment Analysis · spaCy · NLTK · Topic Modeling · Qualitative Coding · UMAP · HDBSCAN
Retrieval-Augmented Generation & LLM Engineering
LangChain · LangChain LCEL · ChromaDB · OpenAI Embeddings · GPT-4o-mini · RAG Pipelines · Vector Databases · FastAPI · Streamlit · RAGAS Evaluation · LangSmith · Prompt Engineering
Data Analytics & Statistics
Bibliometric Analysis · Chi-Square · ANOVA · Ordinal Logistic Regression · Cronbach's Alpha · Mann-Whitney U · OLS Regression · Time Series Forecasting · Feature Engineering · pandas · NumPy · scipy · statsmodels
Big Data & Cloud Engineering
Apache Spark (RDD & DataFrame APIs) · Spark SQL · Scala · Databricks · Microsoft Azure Synapse Analytics · Azure Data Lake · Hadoop Ecosystem · NoSQL Databases (MongoDB · Cassandra)
API Integration & Data Collection
REST API pipelines · OpenAlex API · Semantic Scholar API · Reddit .json endpoints · Pagination & Rate Limiting
Wireless & Communications
mmWave Beam Prediction · Beamforming Codebook Optimization · V2V Communications · Massive MIMO · 6G AI-RAN · Power Allocation · Spectral Efficiency · Gymnasium Environments · DeepSense 6G
Visualization & Reporting
Matplotlib · Seaborn · Plotly (interactive) · Tableau · Power BI · Publication-quality figures (300 DPI)
Tools & Platforms
Python · Scala · SQL · Git · Jupyter · Google Colab · Microsoft Azure · AWS · Databricks
| Project | Description | Stack |
|---|---|---|
| LWM-LoRA: Scenario-Adaptive mmWave Beam Prediction | LoRA fine-tuning of the Large Wireless Model (LWM v1.1, 2.47M-param Transformer) for 64-beam mmWave prediction across 3 DeepMIMO city scenarios (12,658 samples). Custom LoRA injection into 49 attention layers (4.82% trainable params). Rank ablation r∈{2,4,8,16}; r=4 optimal at 76.8% top-1 accuracy (+7.4% vs baseline). Cross-scenario transfer with 20% target data matches full fine-tuning within 0.3%. ONNX INT8 deployment: 5.51× latency reduction, 69.5% size reduction. | PyTorch · HuggingFace · LoRA/PEFT · DeepMIMOv3 · ONNX Runtime · W&B |
| Real-Time Anomaly Detection MLOps Pipeline | End-to-end streaming ML pipeline on the Numenta Anomaly Benchmark (NAB): Kafka ingestion → Spark Structured Streaming → DuckDB feature store → LSTM autoencoder training → FastAPI serving → Airflow orchestration → Evidently AI drift monitoring. 270,723 sliding windows across 38 time-series; ROC-AUC 0.64; 0/7 features drifted between train and test distributions. | PyTorch · Kafka · Spark · MLflow · FastAPI · Airflow · Evidently AI · DuckDB |
| 3GPP Specification Assistant — Production RAG System | End-to-end RAG system for querying 3GPP 5G/6G technical specifications using natural language. Indexes 14 Release 18/19 specs (4,493 pages, 18,187 chunks) with OpenAI embeddings and ChromaDB. Evaluated with RAGAS: faithfulness 0.675, context recall 0.750. Served via FastAPI backend and Streamlit chat interface with LangSmith tracing. | LangChain · ChromaDB · GPT-4o-mini · FastAPI · Streamlit · RAGAS |
| DeepSense 6G Beam Prediction — CNN, LSTM, RNN, SVM, RF & DQN | End-to-end mmWave beam prediction on 112,189 real-world V2V measurements (Scenarios 36–39, 60 GHz). Random Forest achieved 22.6% Top-1 / 43.9% Top-3 accuracy — 14.2x above random baseline — outperforming all deep learning models. DQN analysis identified feature compression as the key RL bottleneck for high-cardinality beam selection. | PyTorch · scikit-learn · DeepSense 6G · Gymnasium |
| 6G Massive MIMO Resource Allocation — DQN vs Supervised Learning | DQN vs supervised learning for dynamic power allocation in a 7-cell, 70-user Massive MIMO environment. 4.4x reward improvement over random baseline; CNN and RNN matched DQN controller performance via reward regression — demonstrating supervised learning as a computationally efficient alternative to RL for 6G AI-RAN. | PyTorch · Stable-Baselines3 · Gymnasium |
| AI-in-Education Bibliometric + NLP Analysis | Dual-API pipeline collecting 4,403 papers via OpenAlex & Semantic Scholar. BERTopic discovered 27 research clusters. Chi-square confirmed significant post-ChatGPT topic shift (χ²=323.87, p<0.0001). Exponential publication growth modeled at 30.7%/year (R²=0.844). | BERTopic · VADER · OpenAlex API · pandas · scipy |
| AI in the Classroom — Mixed-Methods Survey + Reddit Analysis | Convergent-parallel mixed-methods pipeline integrating two survey datasets (n=625) with 465 Reddit posts. Ordinal regression identified Attitude Toward Use as the dominant predictor of AI adoption (OR=8.32, p<0.001). BERTopic + VADER surfaced a utility paradox: surveys show 7.44/10 utility ratings, while Reddit discourse on AI writing tools scored the lowest sentiment (0.155). | BERTopic · VADER · spaCy · statsmodels · Reddit API |
| Ontario Electricity Demand Forecasting | End-to-end ML pipeline on 109,000+ hourly records from 4 integrated data sources (IESO, Environment Canada, NASA POWER). 13 engineered features including temporal lags, degree days, and cyclical encodings. A 3-layer neural network delivered a 50.7% RMSE improvement over baseline (R²=0.9928), the best of 5 compared models. | scikit-learn · XGBoost · PyTorch · pandas |
| Big Data Analytics — Apache Spark & Azure Synapse | Distributed data processing using Apache Spark RDD and DataFrame APIs (Scala/Databricks) and cloud-scale SQL analytics on Microsoft Azure Synapse Analytics. Covers multi-file text corpus processing, retail transaction analysis across daily partitioned CSVs, and bike rental analytics with multi-table joins on a 24MB real-world dataset. | Apache Spark · Scala · Databricks · Azure Synapse · T-SQL |
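For context on the LWM-LoRA project above, here is a minimal NumPy sketch of the low-rank update LoRA applies to a frozen linear layer. The dimensions, alpha value, and function name are illustrative only, not the LWM's actual shapes or the project's injection code:

```python
import numpy as np

# LoRA leaves the pretrained weight W frozen and learns a low-rank update
# B @ A, so the adapted layer computes W x + (alpha / r) * B (A x).
# Only A and B, with r * (d_in + d_out) values, are trained.

def lora_forward(x, W, A, B, alpha):
    """Forward pass through a LoRA-adapted linear layer.

    x: (d_in,) input, W: (d_out, d_in) frozen weight,
    A: (r, d_in), B: (d_out, r) trainable low-rank factors.
    """
    r = A.shape[0]
    return W @ x + (alpha / r) * (B @ (A @ x))

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 4          # r=4 mirrors the ablation optimum above
W = rng.normal(size=(d_out, d_in))  # frozen pretrained weight
A = rng.normal(size=(r, d_in))
B = np.zeros((d_out, r))            # B starts at zero: adapter is a no-op

x = rng.normal(size=d_in)
assert np.allclose(lora_forward(x, W, A, B, alpha=8.0), W @ x)

# Trainable-parameter fraction for this single layer:
frac = (A.size + B.size) / (W.size + A.size + B.size)
```

Initializing B to zero is the standard LoRA choice: the adapter starts as an identity perturbation, so fine-tuning begins exactly at the pretrained model's behavior.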
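The DeepSense 6G project above reports Top-1 and Top-3 beam accuracy against a random baseline. A small NumPy sketch of that metric, with synthetic scores standing in for real model outputs (the data and shapes are hypothetical):

```python
import numpy as np

def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true beam index is among the k
    highest-scoring beams. scores: (n, num_beams), labels: (n,)."""
    topk = np.argsort(scores, axis=1)[:, -k:]  # k best beams per sample
    return float(np.mean([labels[i] in topk[i] for i in range(len(labels))]))

rng = np.random.default_rng(1)
n, num_beams = 1000, 64
labels = rng.integers(0, num_beams, size=n)
scores = rng.normal(size=(n, num_beams))  # an uninformed predictor

rand_top1 = top_k_accuracy(scores, labels, k=1)  # near 1/64 ≈ 1.6%
rand_top3 = top_k_accuracy(scores, labels, k=3)  # near 3/64 ≈ 4.7%
```

With 64 beams the random Top-1 baseline is about 1.6%, which is the denominator behind multiples like the "14.2x above random baseline" figure in the table.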
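The anomaly-detection pipeline above builds sliding-window features before training the LSTM autoencoder. A simplified, dependency-free sketch of that windowing step; the pipeline's actual Spark and DuckDB feature-store code is not reproduced here:

```python
import numpy as np

def sliding_windows(series, window, stride=1):
    """Overlapping windows over a 1-D series, the feature unit used by
    window-based anomaly detectors such as LSTM autoencoders."""
    n = (len(series) - window) // stride + 1
    return np.stack([series[i * stride : i * stride + window] for i in range(n)])

def window_features(series, window):
    """Per-window summary features: mean, std, min, max."""
    w = sliding_windows(series, window)
    return np.column_stack([w.mean(1), w.std(1), w.min(1), w.max(1)])

# A noisy sinusoid standing in for one NAB time-series:
rng = np.random.default_rng(2)
series = np.sin(np.linspace(0, 20, 500)) + 0.1 * rng.normal(size=500)
feats = window_features(series, window=50)  # (451, 4) feature matrix
```

Each of the 270,723 windows mentioned in the table would correspond to one row of such a feature matrix, computed incrementally over the Kafka stream rather than in batch.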
- 🔧 Latest build: LoRA-adapted Large Wireless Model for scenario-adaptive 6G beam prediction — custom LoRA injection, rank ablation, cross-scenario transfer, ONNX edge deployment
- 📡 Publishing reproducible ML pipelines across wireless communications and AI research domains
- 🔬 Research Analyst at ISTEP, University of Toronto (2025) — co-authored GenAI adoption study (n=124), mixed-methods analysis, UTERC 2025 poster presentation
- 🎓 MEng Candidate, Electrical & Computer Engineering, University of Toronto — graduating December 2026