🌐🛡️Network Security: Phishing & Malicious Website Detection

A machine learning model to classify websites as phishing/malicious or safe using URL and domain features.


📌 Overview

This project trains a machine learning model on a dataset of 30+ URL/Domain features (e.g., having_IP_Address, SSLfinal_State, Page_Rank) to predict if a website is:

✅ Safe (Legitimate)

❌ Phishing/Malicious

Dataset Features (test.csv):

Binary Indicators: having_IP_Address, Shortining_Service, HTTPS_token, etc.

Numerical Metrics: URL_Length, age_of_domain, web_traffic.

Target Label: Result (1 = Phishing, -1 = Safe).
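As a rough illustration of this layout, the sketch below separates the feature columns from the Result label using only the standard library. The inline sample rows and the helper name `split_features_and_target` are made up for the example; the real test.csv has 30+ feature columns.

```python
import csv
import io

# Hypothetical three-row sample shaped like test.csv (the real file has 30+ feature columns).
SAMPLE_CSV = """having_IP_Address,URL_Length,SSLfinal_State,Result
-1,1,1,-1
1,1,-1,1
-1,-1,1,-1
"""

# Label encoding as stated in this README.
LABELS = {1: "Phishing", -1: "Safe"}


def split_features_and_target(handle, target="Result"):
    """Return (feature_names, X, y) with every value parsed as an int."""
    reader = csv.DictReader(handle)
    feature_names = [name for name in reader.fieldnames if name != target]
    X, y = [], []
    for row in reader:
        X.append([int(row[name]) for name in feature_names])
        y.append(int(row[target]))
    return feature_names, X, y


feature_names, X, y = split_features_and_target(io.StringIO(SAMPLE_CSV))
print(feature_names)
print([LABELS[label] for label in y])
```

To read the actual file, an open file handle (for example `open("test.csv", newline="")`) can be passed in place of the `io.StringIO` object.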

Key Features:

✔️ High-Accuracy Model: Trained on 12,000+ samples (example dataset provided).

✔️ Real-Time Prediction: Analyze URLs dynamically.

✔️ Extensible: Add new features or retrain with updated data.


⚙️ Installation

Prerequisites

Python 3.8+

Libraries: pandas, scikit-learn, numpy

Steps

Clone the repo:

git clone https://github.com/absisi44/networksecurity.git
cd networksecurity

Install dependencies:

pip install -r requirements.txt


🚀 Usage

  1. Predict a Single URL

Run the trained model on a URL:

python predict.py --url "https://example.com"

Output Example:

🔍 Analyzing: https://example.com

✅ Prediction: SAFE (96% Confidence)
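The contents of predict.py are not shown in this README. The following is only a minimal sketch of how a command-line entry point of this kind could parse `--url` and print a result in the format above. The names `build_parser` and `format_prediction` are hypothetical, and the label and confidence are placeholder values rather than real model output.

```python
import argparse


def build_parser():
    """Argument parser exposing the --url flag used in the command above."""
    parser = argparse.ArgumentParser(description="Classify a URL as safe or phishing.")
    parser.add_argument("--url", required=True, help="URL to analyze")
    return parser


def format_prediction(url, label, confidence):
    """Format a result; label follows the README encoding (1 = phishing, -1 = safe)."""
    verdict = "❌ Prediction: PHISHING" if label == 1 else "✅ Prediction: SAFE"
    return f"🔍 Analyzing: {url}\n{verdict} ({confidence:.0%} Confidence)"


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--url", "https://example.com"])

# Placeholder values standing in for a trained model's prediction.
print(format_prediction(args.url, label=-1, confidence=0.96))
```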

  2. Train the Model

Retrain with your dataset (test.csv):

python train.py --dataset test.csv --model_output model.pkl
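This README does not say which algorithm train.py uses. As one possible shape for the training step, the sketch below fits a scikit-learn `RandomForestClassifier` (an assumption, not necessarily the project's model) on synthetic -1/1 features and serializes it with `pickle`, mirroring the `--model_output model.pkl` option.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for test.csv: 400 rows, 5 features encoded as -1/1.
rng = np.random.default_rng(0)
X = rng.choice([-1, 1], size=(400, 5))
y = X[:, 0]  # toy rule: the label simply copies the first feature

# Hold out a quarter of the rows, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Assumed model choice; any scikit-learn classifier could be swapped in.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Serialize and restore the fitted model, as a model.pkl file would.
model_bytes = pickle.dumps(model)
restored = pickle.loads(model_bytes)

accuracy = restored.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Writing `model_bytes` to disk (for example with `open("model.pkl", "wb")`) would produce a file that a separate prediction script could load. Pickle files should only be loaded from trusted sources.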

  3. Evaluate Performance

Generate accuracy metrics:

python evaluate.py --dataset test.csv --model model.pkl
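Which metrics evaluate.py reports is not documented here. As an illustration of what accuracy metrics for this task can look like, the sketch below computes accuracy, precision, and recall for the phishing class from true and predicted labels. The function name `evaluate` and the sample labels are hypothetical.

```python
def evaluate(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall, treating `positive` (1 = phishing) as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        # Guard against division by zero when a class never appears.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels using the README encoding (1 = phishing, -1 = safe).
y_true = [1, 1, -1, -1, 1, -1]
y_pred = [1, -1, -1, -1, 1, 1]

metrics = evaluate(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a phishing detector, recall on the phishing class is often worth watching alongside accuracy, since a missed phishing site is usually costlier than a false alarm.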


📊 Dataset Structure

The model trains on these key features from test.csv:

| Feature Name | Type | Description | Example Values |
|---|---|---|---|
| having_IP_Address | Binary | 1=URL uses IP, -1=Domain name | 1, -1 |
| URL_Length | Numerical | -1=Short, 1=Long URL | -1, 1 |
| SSLfinal_State | Binary | 1=Valid SSL, -1=No/Invalid SSL | 1, -1 |
| having_Sub_Domain | Binary | 1=Multiple subdomains, -1=None | 1, -1 |
| Page_Rank | Numerical | -1=Low rank, 1=High trust score | -1, 1 |
| web_traffic | Numerical | -1=Low traffic, 1=High traffic | -1, 1 |
| Target (Result) | Binary | 1=Phishing, -1=Safe | 1, -1 |
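To make the encoding concrete, here is a small sketch of how two of these features could be derived from a raw URL with the standard library. The helper names and the 54-character length threshold are assumptions for illustration; the project's actual extraction logic is not shown in this README.

```python
import ipaddress
from urllib.parse import urlparse


def having_ip_address(url):
    """Return 1 if the URL's host is an IP literal, else -1 (encoding from the table)."""
    host = urlparse(url).hostname or ""
    try:
        ipaddress.ip_address(host)
        return 1
    except ValueError:
        return -1


def url_length(url, threshold=54):
    """Return 1 for a long URL and -1 for a short one; the threshold is an assumed cutoff."""
    return 1 if len(url) >= threshold else -1


for url in ("http://192.168.0.1/login", "https://example.com"):
    print(url, having_ip_address(url), url_length(url))
```

Features such as Page_Rank or web_traffic depend on external lookups and cannot be computed from the URL string alone.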

🤝 Contributing

Fork the repo.

Add features/fixes in a new branch:

git checkout -b feature/your-idea

Submit a Pull Request with:

Tests (tests/)

Updated docs (e.g., dataset.md)

Guidelines:

Follow PEP 8 style.

Document new features.


📬 Contact

GitHub: @absisi44

Email: absisi2009l@gamil.com


🔗 References

Phishing Dataset Research

Scikit-learn Documentation
