CU Boulder [CSCA 5622] Introduction to Machine Learning: Supervised Learning - Final Project
This repository contains the final project for the CSCA 5622: Introduction to Machine Learning: Supervised Learning course at CU Boulder.
The primary goal of this project is to apply supervised machine learning techniques to a classification problem. Specifically, the project focuses on stellar classification: it uses a dataset of astronomical observations to classify celestial objects as stars, galaxies, or quasars.
The project includes:
- Exploratory Data Analysis (EDA) to understand the dataset's characteristics.
- Model Analysis and evaluation of various supervised learning algorithms.
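As a rough illustration of what evaluating a three-class classifier involves, the sketch below computes accuracy, a confusion table, and per-class recall using only the standard library. The label names (`GALAXY`, `QSO`, `STAR`) and the sample predictions are made up for illustration and are not taken from the notebook; in practice, scikit-learn's `sklearn.metrics` module provides `accuracy_score` and `confusion_matrix` for the same purpose.

```python
from collections import Counter

# Hypothetical class labels; the notebook's actual label names may differ.
CLASSES = ["GALAXY", "QSO", "STAR"]


def confusion_counts(y_true, y_pred, classes=CLASSES):
    """Return a nested dict: counts[true_label][predicted_label]."""
    pairs = Counter(zip(y_true, y_pred))
    return {t: {p: pairs[(t, p)] for p in classes} for t in classes}


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred, classes=CLASSES):
    """Recall per class: correct predictions / true members of that class."""
    counts = confusion_counts(y_true, y_pred, classes)
    recall = {}
    for c in classes:
        total = sum(counts[c].values())
        recall[c] = counts[c][c] / total if total else 0.0
    return recall


# Made-up labels for illustration only (not results from the project).
y_true = ["STAR", "GALAXY", "QSO", "GALAXY", "STAR", "QSO"]
y_pred = ["STAR", "GALAXY", "GALAXY", "GALAXY", "STAR", "QSO"]

print("accuracy:", accuracy(y_true, y_pred))
print("recall:", per_class_recall(y_true, y_pred))
```

Per-class recall is worth reporting alongside accuracy because a model can score well overall while still misclassifying most objects of a rarer class.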
The core of the project is contained within a single Jupyter Notebook.
- Stellar_Classification_Project.ipynb: The main notebook, containing all the data loading, EDA, feature engineering, model training, and performance analysis.
- (You may need to add a folder here for data/ if the dataset is included in the repository.)
This project is built using Python. You will need a working Python environment (3.x) and the following libraries.
- Python 3.x
- Jupyter Notebook or Jupyter Lab
- Standard ML stack: pandas, numpy, scikit-learn, matplotlib, seaborn
- Clone the repository:
    git clone https://github.com/tejasphatak/CSCA-5622-Supervised-Learning-Final-Project.git
    cd CSCA-5622-Supervised-Learning-Final-Project
- Install dependencies (using a virtual environment is highly recommended):
    pip install -r requirements.txt
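After installation, one way to confirm that the environment is ready is to check that each listed library can be found by Python. The snippet below is a minimal sketch, not part of the repository: `REQUIRED` and `missing_modules` are hypothetical names, and the list assumes the five libraries named above (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the libraries listed above; note that scikit-learn
# is imported under the name "sklearn".
REQUIRED = ["pandas", "numpy", "sklearn", "matplotlib", "seaborn"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Running this inside the same virtual environment that Jupyter uses helps catch the common case where packages were installed into a different interpreter.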
To replicate the analysis and view the results:
- Start the Jupyter Notebook server in the project directory:
    jupyter notebook
- Open the file Stellar_Classification_Project.ipynb.
- Run the cells sequentially to execute the EDA, train the classification models, and generate the final results and visualizations.
- Language: Python
- Environment: Jupyter Notebook
- Libraries: Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn