This repository contains the solution for the Data Challenge, focusing on detecting anomalies in the manufacturing of pneumatic cylinder bottom parts using machine learning models.
- 🎯 Motivation and Goals
- 📊 Data and Feature Exploration
- 🛠️ Concept and Methodology
- 🤖 Machine Learning Models
- 📈 Model Evaluation
- 🏆 Results
- ✅ Final Model and Applicability
- 🚀 Future Improvements
- ⚡ How to Run
- Background: CNC-milling process of pneumatic cylinders.
- Goal: Develop a machine learning model to classify bottom parts as:
  - False: Anomaly
  - True: No anomaly
This ensures quality control before further production steps and helps guarantee product functionality early on.
- Separation of true and false parts.
- Time series analysis across multiple sensors.
- Computation of statistical features: mean, RMS, kurtosis, skewness, etc.
- Generated 900 features per data point.
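
As a rough illustration of the statistical features listed above, the following sketch computes mean, RMS, skewness, and kurtosis for each sensor channel with NumPy. The function names (`extract_features`, `extract_all`) and the channel names are hypothetical and not taken from this repository, and the sketch assumes population moments and excess kurtosis, which may differ from the definitions actually used here.

```python
import numpy as np


def extract_features(signal):
    """Return summary statistics for one 1-D sensor signal."""
    x = np.asarray(signal, dtype=float)
    mean = x.mean()
    std = x.std()  # population standard deviation
    centered = x - mean
    if std == 0:
        # Skewness and kurtosis are undefined for a constant signal;
        # this sketch falls back to 0.0 (an assumption).
        skewness, kurt = 0.0, 0.0
    else:
        skewness = np.mean(centered**3) / std**3
        kurt = np.mean(centered**4) / std**4 - 3.0  # excess kurtosis
    return {
        "mean": float(mean),
        "rms": float(np.sqrt(np.mean(x**2))),
        "skewness": float(skewness),
        "kurtosis": float(kurt),
    }


def extract_all(channels):
    """Flatten per-channel statistics into one feature dict per part."""
    features = {}
    for name, signal in channels.items():
        for stat, value in extract_features(signal).items():
            features[f"{name}_{stat}"] = value
    return features


# Usage with two made-up channels (names are placeholders).
part = {"spindle_current": [0.9, 1.1, 1.0, 1.2], "vibration_x": [0.0, 0.3, -0.2, 0.1]}
print(extract_all(part))
```

Applying more statistics to more channels (and possibly to windows of each signal) is one way a feature count in the hundreds can arise.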

The data pipeline consists of:
- Feature Extraction ✨
- Feature Selection 🔍 (correlation-based, reduced 900 → 161 features)
- Data Split ✂️ (80% train / 20% validation, stratified)
- Class Imbalance Handling ⚖️ (SMOTE)
- Feature Scaling 📏 (StandardScaler)
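
The steps after feature extraction can be sketched roughly as follows, on synthetic data. Everything specific here is an assumption: the 0.95 correlation threshold, the helper names `drop_correlated` and `smote_like`, and the random data. In particular, `smote_like` is a simplified stand-in for SMOTE written in NumPy to keep the example self-contained; this README does not state which SMOTE implementation the repository uses.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def drop_correlated(X, threshold=0.95):
    """Greedily keep columns whose absolute Pearson correlation with every
    already kept column is at most `threshold`. Assumes no constant columns."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return keep


def smote_like(X_min, n_new, k=5, seed=0):
    """Simplified SMOTE-style oversampling: each synthetic sample lies on the
    segment between a minority sample and one of its k nearest minority
    neighbours. Needs at least two minority samples."""
    rng = np.random.default_rng(seed)
    k = min(k, len(X_min) - 1)
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])


# Synthetic stand-in data: 6 independent features plus 2 near-duplicates.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X = np.hstack([X, X[:, :2] + 0.01 * rng.normal(size=(200, 2))])
y = np.array([True] * 170 + [False] * 30)  # True = no anomaly, False = anomaly

# 1) Correlation-based feature selection.
keep = drop_correlated(X)
X_sel = X[:, keep]

# 2) Stratified 80/20 split.
X_train, X_val, y_train, y_val = train_test_split(
    X_sel, y, test_size=0.2, stratify=y, random_state=0
)

# 3) Oversample the anomaly class in the training split only.
X_anom = X_train[~y_train]
n_new = int(y_train.sum()) - len(X_anom)
X_bal = np.vstack([X_train, smote_like(X_anom, n_new)])
y_bal = np.concatenate([y_train, np.zeros(n_new, dtype=bool)])

# 4) Fit the scaler on training data, then apply it to both splits.
scaler = StandardScaler().fit(X_bal)
X_bal_s = scaler.transform(X_bal)
X_val_s = scaler.transform(X_val)

print(len(keep), X_bal_s.shape, X_val_s.shape)
```

Oversampling and scaler fitting use only the training split, so the validation set stays untouched by synthetic samples and by statistics computed from it.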
We experimented with the following models:
- MLP (Multilayer Perceptron) 🧠
- SVM (Support Vector Machine) 📐
- Random Forest (RF) 🌲
Each model’s hyperparameters were tuned by trial and error.
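
A minimal sketch of fitting and comparing the three model types with scikit-learn is shown below. The data is synthetic, and the hyperparameter values, the balanced-accuracy metric, and the `models` dictionary are placeholders chosen for illustration; they are not the settings or metrics used in this repository.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for the selected features.
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=8,
    weights=[0.2, 0.8], random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Placeholder hyperparameters; each model is wrapped with a scaler so that
# scaling statistics come from the training split only.
models = {
    "MLP": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    ),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "RF": make_pipeline(
        StandardScaler(),
        RandomForestClassifier(n_estimators=100, random_state=0),
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = balanced_accuracy_score(y_val, model.predict(X_val))
    print(f"{name}: balanced accuracy = {scores[name]:.3f}")
```

Balanced accuracy is used here because plain accuracy can look good on imbalanced data even when most anomalies are missed.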
- Uses all data channels.
- Fast training but requires heavy preprocessing.
- Models are accurate but not yet fully reliable for real-world deployment.
- Stronger oversampling of anomaly parts (SMOTE tuning).
- Collection of more sensor data to improve reliability.
# Clone this repository
git clone https://github.com/<your-username>/<repo-name>.git
# Install dependencies
pip install -r requirements.txt
# Train the model
python train.py
# Evaluate the model
python evaluate.py
