A machine learning model to classify websites as phishing/malicious or safe using URL and domain features.
📌 Overview
This project trains a machine learning model on a dataset of 30+ URL/Domain features (e.g., having_IP_Address, SSLfinal_State, Page_Rank) to predict if a website is:
✅ Safe (Legitimate)
❌ Phishing/Malicious
Dataset Features (test.csv):
Binary Indicators: having_IP_Address, Shortining_Service, HTTPS_token, etc.
Numerical Metrics: URL_Length, age_of_domain, web_traffic.
Target Label: Result (1 = Phishing, -1 = Safe).
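As a rough sketch of how the layout above could be read, the following separates the feature columns from the `Result` label using only the standard library, so it runs without the project's dependencies. The inline sample rows are made up for illustration, and `split_features_and_target` is a hypothetical helper, not something the repository is known to provide.

```python
import csv
import io

TARGET_COLUMN = "Result"  # label column named in this README: 1 = Phishing, -1 = Safe


def split_features_and_target(csv_file):
    """Read an open CSV file object and return (feature_names, X, y)."""
    reader = csv.DictReader(csv_file)
    # Every column except the target is treated as a feature.
    feature_names = [name for name in reader.fieldnames if name != TARGET_COLUMN]
    X, y = [], []
    for row in reader:
        X.append([int(row[name]) for name in feature_names])
        y.append(int(row[TARGET_COLUMN]))
    return feature_names, X, y


# Illustrative rows only; the real test.csv has 30+ feature columns.
SAMPLE_CSV = """having_IP_Address,URL_Length,SSLfinal_State,Result
1,1,-1,1
-1,-1,1,-1
"""

feature_names, X, y = split_features_and_target(io.StringIO(SAMPLE_CSV))
print(feature_names)
print(X, y)
```

For the real dataset, the same function could be called on `open("test.csv", newline="")`, assuming every column holds integer codes.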
Key Features:
✔️ High-Accuracy Model: Trained on 12,000+ samples (example dataset provided).
✔️ Real-Time Prediction: Analyze URLs dynamically.
✔️ Extensible: Add new features or retrain with updated data.
⚙️ Installation
Prerequisites
Python 3.8+
Libraries: pandas, scikit-learn, numpy
Steps
Clone the repo:
git clone https://github.com/absisi44/networksecurity.git
cd networksecurity
Install dependencies:
pip install -r requirements.txt
🚀 Usage
- Predict a Single URL
Run the trained model on a URL:
python predict.py --url "https://example.com"
Output Example:
🔍 Analyzing: https://example.com
✅ Prediction: SAFE (96% Confidence)
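The internals of `predict.py` are not shown here, so the following is only a sketch of how the `--url` flag and the two-line report above might be wired together. `build_parser` and `format_report` are hypothetical names, and the label and confidence are hard-coded where a real script would call the trained model.

```python
import argparse

# Label encoding taken from this README: 1 = Phishing, -1 = Safe.
LABEL_NAMES = {1: "PHISHING", -1: "SAFE"}
LABEL_ICONS = {1: "❌", -1: "✅"}


def build_parser():
    """Command-line interface with the single --url flag used above."""
    parser = argparse.ArgumentParser(description="Classify a URL as safe or phishing.")
    parser.add_argument("--url", required=True, help="URL to analyze")
    return parser


def format_report(url, label, confidence):
    """Render a prediction in the two-line style of the output example.

    `label` is 1 or -1; `confidence` is a probability in [0, 1].
    """
    return (
        f"🔍 Analyzing: {url}\n"
        f"{LABEL_ICONS[label]} Prediction: {LABEL_NAMES[label]} ({confidence:.0%} Confidence)"
    )


# Arguments are passed explicitly so the sketch runs outside a shell.
args = build_parser().parse_args(["--url", "https://example.com"])
# Placeholder values; in practice these would come from the model's prediction.
print(format_report(args.url, label=-1, confidence=0.96))
```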
- Train the Model
Retrain with your dataset (test.csv):
python train.py --dataset test.csv --model_output model.pkl
- Evaluate Performance
Generate accuracy metrics:
python evaluate.py --dataset test.csv --model model.pkl
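Which metrics `evaluate.py` reports is not stated, so as an illustration, accuracy, precision, and recall could be computed from true and predicted labels as below. The helper name `classification_metrics` and the sample labels are assumptions. Phishing (`1`) is treated as the positive class.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for labels encoded as 1 / -1."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    return {
        "accuracy": correct / len(pairs),
        # Guard against division by zero when a class is never predicted or present.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels for illustration: one phishing site is missed.
y_true = [1, 1, -1, -1, 1]
y_pred = [1, -1, -1, -1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Recall matters most when a missed phishing site is costlier than a false alarm. The numbers are only meaningful when computed on rows the model did not see during training.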
The model trains on these key features from test.csv:
| Feature Name | Type | Description | Example Values |
|---|---|---|---|
| having_IP_Address | Binary | 1=URL uses IP, -1=Domain name | 1, -1 |
| URL_Length | Numerical | -1=Short, 1=Long URL | -1, 1 |
| SSLfinal_State | Binary | 1=Valid SSL, -1=No/Invalid SSL | 1, -1 |
| having_Sub_Domain | Binary | 1=Multiple subdomains, -1=None | 1, -1 |
| Page_Rank | Numerical | -1=Low rank, 1=High trust score | -1, 1 |
| web_traffic | Numerical | -1=Low traffic, 1=High traffic | -1, 1 |
| Target (Result) | Binary | 1=Phishing, -1=Safe | 1, -1 |
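To make the encoding in the table concrete, here is a minimal sketch of how two of these features might be derived from a raw URL. The function names and the 54-character cut-off are assumptions for illustration; the repository's actual extraction rules may differ.

```python
import ipaddress
from urllib.parse import urlparse

LONG_URL_THRESHOLD = 54  # hypothetical cut-off, not taken from the repository


def having_ip_address(url):
    """Return 1 if the URL's host is a literal IP address, otherwise -1."""
    host = urlparse(url).hostname or ""
    try:
        ipaddress.ip_address(host)  # raises ValueError for domain names
        return 1
    except ValueError:
        return -1


def url_length(url):
    """Return 1 for a long URL and -1 for a short one."""
    return 1 if len(url) >= LONG_URL_THRESHOLD else -1


def extract_features(url):
    """Build a partial feature row using the column names from the table."""
    return {
        "having_IP_Address": having_ip_address(url),
        "URL_Length": url_length(url),
    }


print(extract_features("https://example.com"))
print(extract_features("http://192.168.0.1/login"))
```

Features such as `Page_Rank` or `web_traffic` depend on external data sources and are left out of this sketch.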
🤝 Contributing
Fork the repo.
Add features/fixes in a new branch:
git checkout -b feature/your-idea
Submit a Pull Request with:
Tests (tests/)
Updated docs (e.g., dataset.md)
Guidelines:
Follow PEP 8 style.
Document new features.
📬 Contact
GitHub: @absisi44
Email: absisi2009l@gamil.com
🔗 References
Phishing Dataset Research
Scikit-learn Documentation