yh67737/machine-learning-coursework

San Diego Wildfire Prediction & Machine Learning Coursework

This repository contains my work for the Machine Learning course: a final capstone project on wildfire prediction, plus a collection of weekly assignments and experiments covering machine learning (ML) and deep learning (DL) algorithms.

1. Capstone Project: San Diego Wildfire Risk & Intensity Prediction

This project uses machine learning to predict wildfire risk and potential fire intensity in San Diego County from historical meteorological data and satellite hotspot data.

Project Overview

The model consists of two prediction phases:

  1. Wildfire Risk Prediction (Binary Classification):
    • Objective: Predict whether a new wildfire will occur on a given day.
    • Models: Compared Logistic Regression, XGBoost, and Neural Networks.
    • Result: The Neural Network model achieved the best performance in terms of AUC-ROC and AUPRC.
  2. Fire Intensity Prediction (Multi-class Classification):
    • Objective: Classify fire intensity into "Small," "Medium," or "Large" based on Fire Radiative Power (FRP).
    • Challenges: Addressed severe class imbalance using SMOTE (Synthetic Minority Oversampling Technique).
    • Result: The model performed reasonably well on small fires but faced challenges in distinguishing between medium and large intensity fires due to data limitations.
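
To illustrate the oversampling step mentioned above, here is a minimal NumPy sketch of the idea behind SMOTE: each synthetic minority sample is placed on the line segment between a real minority sample and one of its nearest minority neighbours. This is a simplified illustration, not the project's code and not the Imbalanced-learn implementation. The function name `smote_like_oversample`, its parameters, and the toy data are all assumptions made for the example.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours (hypothetical helper, simplified SMOTE idea)."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise squared Euclidean distances within the minority class.
    diffs = X_min[:, None, :] - X_min[None, :, :]
    dist = (diffs ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)   # starting sample for each new point
    pick = rng.integers(0, k, size=n_new)   # which of its neighbours to move toward
    nb = neighbours[base, pick]
    gap = rng.random((n_new, 1))            # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[nb] - X_min[base])

# Toy minority class (e.g. "Large" fires) with two made-up features.
X_large = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like_oversample(X_large, n_new=6, k=2)
print(synthetic.shape)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.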

Data Sources

  • Fire Data: NASA FIRMS (VIIRS satellite data), including latitude, longitude, time, and FRP.
  • Weather Data: Visual Crossing Weather, containing hourly historical records (temperature, humidity, wind speed, etc.) for San Diego from 2020 to 2025.
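
Combining the two sources for the daily risk model requires bringing them to the same granularity. The sketch below shows one way this could be done with the standard library: hourly weather records are aggregated to one row per day, and a day is labelled positive if any hotspot was detected on it. The field names (`time`, `temp`, `humidity`, `wind_speed`), the chosen aggregates, and the labelling rule are assumptions for illustration; the project's actual schema and its definition of a "new" wildfire may differ.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def build_daily_table(hourly_weather, hotspots):
    """Aggregate hourly weather to daily features and flag days with a hotspot.
    Hypothetical helper; field names are assumed, not taken from the project."""
    by_day = defaultdict(list)
    for rec in hourly_weather:
        day = datetime.fromisoformat(rec["time"]).date()
        by_day[day].append(rec)

    # Any satellite detection on a calendar day marks that day as a fire day.
    fire_days = {datetime.fromisoformat(h["time"]).date() for h in hotspots}

    rows = []
    for day in sorted(by_day):
        recs = by_day[day]
        rows.append({
            "date": day.isoformat(),
            "temp_max": max(r["temp"] for r in recs),
            "humidity_mean": mean(r["humidity"] for r in recs),
            "wind_max": max(r["wind_speed"] for r in recs),
            "fire": int(day in fire_days),
        })
    return rows

# Made-up records, only to show the expected input shape.
hourly_weather = [
    {"time": "2021-07-01T13:00:00", "temp": 31.0, "humidity": 20.0, "wind_speed": 12.0},
    {"time": "2021-07-01T14:00:00", "temp": 33.0, "humidity": 30.0, "wind_speed": 18.0},
    {"time": "2021-07-02T13:00:00", "temp": 27.0, "humidity": 55.0, "wind_speed": 6.0},
]
hotspots = [{"time": "2021-07-01T21:30:00", "frp": 4.2}]

rows = build_daily_table(hourly_weather, hotspots)
for row in rows:
    print(row)
```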

2. Coursework & Experiments

This section documents the experiments conducted throughout the course, ranging from classical machine learning to advanced deep learning models.

Key Topics Covered

  • Regression & Classification:
    • Implementation of Linear and Polynomial Regression using Gradient Descent (Batch, Stochastic, Mini-batch).
    • Logistic Regression and Softmax Regression.
    • Binary and Multiclass classification tasks using datasets like MNIST and Fashion-MNIST.
  • Support Vector Machines (SVM):
    • Linear and Non-linear SVM classification and regression.
    • Application of the kernel trick with polynomial and RBF kernels.
  • Ensemble Learning:
    • Implementation of Voting classifiers, Bagging, and Pasting.
    • Random Forests, AdaBoost, and Gradient Boosting algorithms.
  • Deep Learning & Neural Networks:
    • Training Deep Neural Networks (DNN) with techniques like He initialization, Batch Normalization, and Dropout.
    • Optimization algorithms including Nesterov Accelerated Gradient, RMSProp, and Adam.
  • Sequence Models & NLP:
    • Time series forecasting using RNNs, LSTMs, and GRUs (e.g., predicting Chicago transit ridership).
    • Natural Language Processing tasks including sentiment analysis and text generation (Char-RNN).
    • Encoder-Decoder architectures and Attention mechanisms.
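
As an illustration of the first topic in the list, the three gradient descent variants differ only in how many samples are used per update. The sketch below fits a linear regression with a single function whose `batch_size` argument selects the variant: the full dataset gives Batch GD, 1 gives Stochastic GD, and anything in between gives Mini-batch GD. It is a minimal example on synthetic, noise-free data, not the coursework code; the function name, learning rate, and epoch count are assumptions.

```python
import numpy as np

def gradient_descent(X, y, batch_size, lr=0.1, epochs=200, seed=0):
    """Fit linear regression (MSE loss) by gradient descent.
    batch_size == len(X) -> Batch GD, 1 -> Stochastic GD, else Mini-batch GD."""
    rng = np.random.default_rng(seed)
    X_b = np.c_[np.ones(len(X)), X]      # prepend a bias column of ones
    theta = np.zeros(X_b.shape[1])
    m = len(X_b)
    for _ in range(epochs):
        order = rng.permutation(m)       # reshuffle the samples every epoch
        for start in range(0, m, batch_size):
            idx = order[start:start + batch_size]
            error = X_b[idx] @ theta - y[idx]
            grad = 2.0 / len(idx) * X_b[idx].T @ error   # gradient of the batch MSE
            theta -= lr * grad
    return theta

# Synthetic data: y = 4 + 3x with no noise, x drawn uniformly from [-1, 1].
data_rng = np.random.default_rng(42)
X = data_rng.uniform(-1.0, 1.0, size=(100, 1))
y = 4.0 + 3.0 * X[:, 0]

theta_batch = gradient_descent(X, y, batch_size=len(X))
theta_sgd = gradient_descent(X, y, batch_size=1)
theta_mini = gradient_descent(X, y, batch_size=16)
print(theta_batch, theta_sgd, theta_mini)
```

Because the toy data is noise-free, all three variants should recover an intercept close to 4 and a slope close to 3. With noisy data, the stochastic and mini-batch variants usually need a decaying learning rate to settle.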

Tech Stack

  • Language: Python 3.8
  • Data Processing: Pandas, NumPy
  • Visualization: Matplotlib, Seaborn
  • Machine Learning: Scikit-learn, XGBoost, Imbalanced-learn
  • Deep Learning: TensorFlow (Keras)
