A Python project that creates charts and visualizations for synthetic e-commerce data. It shows revenue trends, statistics, and predictions, and is a good way to learn how to work with data.
If you like this project, please star it!
- Main Dashboard
  - Money earned each day
  - How much money each month
  - Sales percentage
  - Number of transactions
  - Traffic and order value
  - Customer costs
  - Product categories
  - Devices (mobile, desktop, tablet)
  - Weekly profit
- Data Patterns
  - Show connections between numbers
  - How money is spread across categories
  - What days are busiest
  - Monthly patterns
  - Sales conversion spread
  - Funnel (how many people buy)
- Trends & Growth
  - Money trends over time
  - How fast money is growing
  - Total money made
  - How money is distributed
  - Normal distribution check
  - Moving averages
  - How risky is the money
  - Find weird days
  - Risk and reward
- Smart Predictions
  - Reduce data to main parts
  - Show main components
  - 2D data view
  - What matters most
  - Make predictions from data
  - Check if predictions are good

| What | Number |
|---|---|
| Days of data | 638 (Jan 2023 - Sep 2024) |
| Total money | $1.05 Billion |
| Number of sales | 95,688 |
| People who bought | 5% |
| Growth per month | 3,724.92% |
| Prediction fit (R²) | 0.9998 |
| Charts | 21+ |
- Python 3.8 or newer
- pip (comes with Python)
- Download the project

```bash
git clone https://github.com/kukyasin/ecommerce-data-analysis.git
cd ecommerce-data-analysis
```

- Create environment (optional but good)

```bash
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
```

- Install libraries

```bash
pip install -r requirements.txt
```

```bash
# First set of charts
python professional_data_analysis.py
# Second set of charts
python advanced_predictive_analysis.py
```

Money, sales, and performance charts
Connections between numbers and distributions
Smart analysis and predictions
You get:
- `dashboard_analysis.png` - Main money charts
- `statistical_analysis.png` - Data patterns
- `predictive_analysis.png` - Growth and trends
- `ml_analysis.png` - Smart predictions
- `ecommerce_data_processed.csv` - The numbers

All charts are high quality (300 DPI).
ecommerce-data-analysis/
├── professional_data_analysis.py # Main analysis script
├── advanced_predictive_analysis.py # ML and advanced analytics
├── requirements.txt # Python dependencies
├── LICENSE # MIT License
├── .gitignore # Git ignore rules
├── README.md # This file
└── outputs/ # Generated visualizations (created on run)
| Type | Tools |
|---|---|
| Data | Pandas, NumPy |
| Charts | Matplotlib, Seaborn |
| Math | SciPy |
| Predictions | Scikit-learn |
| Code | Python 3.8+ |
- Moving averages
- Trend lines
- Season patterns
- Growth tracking
- Change detection
- Basic math (mean, median)
- How data looks (spread, shape)
- Check if data is normal
- Find connections
- Find weird data
- Reduce data size
- Draw data in 2D
- Make predictions
- Check predictions
- Risk calculation
- Profit margins
- How many people buy
- Return on investment
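
Two of the techniques listed above, moving averages and finding unusual days, can be sketched in a few lines. This is a minimal illustration on made-up numbers, not the code from the project scripts: the helper name `flag_anomalies`, the 7-day window, and the toy series are assumptions. The 2.5 threshold matches the default shown in the customization notes.

```python
import numpy as np
import pandas as pd


def flag_anomalies(series: pd.Series, threshold: float = 2.5) -> pd.Series:
    """Return a boolean mask marking values whose |z-score| exceeds threshold."""
    z_scores = (series - series.mean()) / series.std(ddof=0)
    return z_scores.abs() > threshold


# Hypothetical toy data: a steady series with one injected spike.
rng = np.random.default_rng(42)
revenue = pd.Series(1000 + rng.normal(0, 10, size=60))
revenue.iloc[30] = 2000  # the "weird day"

# 7-day moving average; the first 6 values are NaN because the window is not full yet.
ma_7 = revenue.rolling(window=7).mean()

# Boolean mask of days that sit far from the mean.
anomalies = flag_anomalies(revenue)
print(revenue[anomalies])
```

A lower threshold flags more days; a higher one flags only extreme outliers.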
Best for executive-level reporting and KPI tracking. Shows overall business health through 9 key metrics.
Use Cases:
- Executive presentations
- Client reports
- Monthly business reviews
- Performance tracking
Deep-dive into data distributions and relationships. Useful for identifying patterns and anomalies.
Use Cases:
- Data exploration
- Pattern identification
- Category comparison
- Statistical validation
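
As a rough idea of what pattern identification and statistical validation can look like, the sketch below computes a correlation matrix and runs a normality test. It is only an illustration: the toy data is invented, and the choice of the Shapiro-Wilk test is an assumption, since the project scripts may use a different check.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical toy data: transactions loosely follow traffic.
rng = np.random.default_rng(0)
traffic = rng.normal(5000, 500, size=200)
df = pd.DataFrame({
    "Website_Traffic": traffic,
    "Transactions": traffic * 0.05 + rng.normal(0, 5, size=200),
})

# Pearson correlation between every pair of numeric columns.
corr = df.corr()

# Shapiro-Wilk test: a small p-value suggests the data is not normally distributed.
stat, p_value = stats.shapiro(df["Website_Traffic"])

print(corr.round(2))
print(f"Shapiro-Wilk statistic={stat:.3f}, p-value={p_value:.3f}")
```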
Advanced time series analysis for forecasting and trend detection.
Use Cases:
- Revenue forecasting
- Trend analysis
- Volatility assessment
- Risk management
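
A simple way to get a trend slope, a trend p-value, and a forward projection is an ordinary least-squares line over the day index. The sketch below uses `scipy.stats.linregress` on invented data. The 90-day horizon mirrors the projection reported in this README, but the approach itself is an assumption about how such numbers could be produced, not a copy of the project code.

```python
import numpy as np
from scipy import stats

# Hypothetical toy data: revenue grows by about 50 per day, plus noise.
rng = np.random.default_rng(1)
days = np.arange(200)
revenue = 10_000 + 50.0 * days + rng.normal(0, 200, size=days.size)

# Fit a straight line: revenue ~ intercept + slope * day.
trend = stats.linregress(days, revenue)

# Extend the fitted line 90 days past the last observed day.
horizon = days[-1] + 90
projected = trend.intercept + trend.slope * horizon

print(f"slope per day: {trend.slope:.2f}, p-value: {trend.pvalue:.3g}")
print(f"projected revenue on day {horizon}: {projected:.2f}")
```

A straight-line projection assumes the trend continues unchanged, so treat it as a baseline rather than a forecast with seasonality.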
Dimensionality reduction and predictive modeling insights.
Use Cases:
- Feature importance analysis
- Model diagnostics
- Dimensionality reduction
- Advanced analytics
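
Dimensionality reduction here most likely means PCA, which the customization notes also mention. The sketch below shows the general shape of it with scikit-learn: scale the features, project them to two components, and check how much variance each component explains. The toy matrix and the choice of two components are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: three strongly related columns plus one independent column.
rng = np.random.default_rng(2)
base = rng.normal(size=(300, 1))
related = [base + rng.normal(0, 0.1, size=(300, 1)) for _ in range(3)]
independent = rng.normal(size=(300, 1))
X = np.hstack(related + [independent])

# PCA is sensitive to scale, so standardize each column first.
X_scaled = StandardScaler().fit_transform(X)

# Project onto the two directions with the most variance (the "2D data view").
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_

print(X_2d.shape)
print(explained.round(3))
```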
The project generates synthetic e-commerce data with the following fields:
| Field | Type | Description |
|---|---|---|
| Date | datetime | Transaction date |
| Daily_Revenue | float | Revenue in dollars |
| Transactions | int | Number of transactions |
| Conversion_Rate | float | Conversion rate (%) |
| Average_Order_Value | float | Average order value ($) |
| Customer_Acquisition_Cost | float | CAC ($) |
| Website_Traffic | int | Daily sessions |
| Product_Category | string | Category (Electronics, Clothing, etc.) |
| Device_Type | string | Device (Mobile, Desktop, Tablet) |
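
To make the schema concrete, here is a minimal sketch that builds a DataFrame with the same columns. The function name `make_synthetic_data`, the distribution parameters, the shortened date range, and the two-item category list are all assumptions chosen for illustration; the real generator lives in `professional_data_analysis.py` and may differ.

```python
import numpy as np
import pandas as pd


def make_synthetic_data(start="2023-01-01", end="2023-03-31", seed=0):
    """Build a small synthetic dataset with the fields from the table above."""
    rng = np.random.default_rng(seed)
    dates = pd.date_range(start=start, end=end, freq="D")
    n = len(dates)

    traffic = rng.integers(8_000, 12_000, size=n)
    conversion = np.clip(rng.normal(5.0, 0.5, size=n), 0.1, None)  # percent
    transactions = (traffic * conversion / 100).astype(int)
    aov = np.clip(rng.normal(80.0, 10.0, size=n), 1.0, None)
    cac = np.clip(rng.normal(25.0, 5.0, size=n), 1.0, None)

    return pd.DataFrame({
        "Date": dates,
        "Daily_Revenue": transactions * aov,
        "Transactions": transactions,
        "Conversion_Rate": conversion,
        "Average_Order_Value": aov,
        "Customer_Acquisition_Cost": cac,
        "Website_Traffic": traffic,
        "Product_Category": rng.choice(["Electronics", "Clothing"], size=n),
        "Device_Type": rng.choice(["Mobile", "Desktop", "Tablet"], size=n),
    })


df = make_synthetic_data()
print(df.head())
```

Passing a fixed `seed` keeps the generated numbers reproducible between runs.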
ecommerce_data_processed.csv contains processed data with additional calculated fields:
- Month/Week periods
- Day of week
- ROI calculations
- Cumulative metrics
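
The calculated fields can be derived with standard pandas operations. The sketch below shows one way to do it on a tiny hand-written frame. The new column names (`Month`, `Week`, `Day_of_Week`, `Cumulative_Revenue`) are assumptions, and ROI is left out because its formula is not stated here.

```python
import pandas as pd

# Hypothetical five-day sample with only the columns needed for this example.
df = pd.DataFrame({
    "Date": pd.date_range("2023-01-01", periods=5, freq="D"),
    "Daily_Revenue": [100.0, 150.0, 120.0, 180.0, 200.0],
})

# Month and week periods, useful as group-by keys for monthly or weekly totals.
df["Month"] = df["Date"].dt.to_period("M")
df["Week"] = df["Date"].dt.to_period("W")

# Name of the weekday, for example to find the busiest days.
df["Day_of_Week"] = df["Date"].dt.day_name()

# Running total of revenue.
df["Cumulative_Revenue"] = df["Daily_Revenue"].cumsum()

print(df)
```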
Edit professional_data_analysis.py to change:
- Date range: `pd.date_range(start='2023-01-01', end='2024-09-30')`
- Product categories: Add/remove from `np.random.choice([...])`
- Data ranges: Adjust `np.random.normal()` parameters

```python
# Figure size (inches)
plt.rcParams['figure.figsize'] = (12, 9)
# Font size
plt.rcParams['font.size'] = 8
# DPI for saving
plt.savefig('output.png', dpi=300)
```

```python
# Moving average windows
ma_30 = df['Daily_Revenue'].rolling(window=30).mean()
# Anomaly detection threshold
z_scores > 2.5  # Change 2.5 for different sensitivity
# PCA components
pca = PCA(n_components=6)  # Change number of components
```

- Linear Regression R² Score: 0.9998
- RMSE: $12,441.72
- MAE: $10,614.42
- MAPE: 2.37%
- Trend p-value: < 0.001 (highly significant)
- Trend slope: $5,182.88 per day
- Projected 90-day revenue: $3,757,504.68
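
For readers who want to see where metrics like these come from, the sketch below fits a linear regression on invented data and computes R², RMSE, MAE, and MAPE with scikit-learn and NumPy. The data and the resulting values are illustrative only and are not the project's results.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical toy data: a linear trend over 100 days with some noise.
rng = np.random.default_rng(3)
X = np.arange(100, dtype=float).reshape(-1, 1)
y = 1_000.0 + 20.0 * X.ravel() + rng.normal(0, 15, size=100)

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

r2 = r2_score(y, y_pred)                               # share of variance explained
rmse = float(np.sqrt(mean_squared_error(y, y_pred)))   # root mean squared error
mae = mean_absolute_error(y, y_pred)                   # mean absolute error
mape = float(np.mean(np.abs((y - y_pred) / y)) * 100)  # mean absolute % error

print(f"R2={r2:.4f}  RMSE={rmse:.2f}  MAE={mae:.2f}  MAPE={mape:.2f}%")
```

These scores are computed on the same data the model was fitted on, so they describe fit quality. Scoring on a held-out period would give a better idea of forecasting accuracy.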
This project is suitable for:
- Learning Data Science
  - Practice visualization techniques
  - Learn statistical analysis
  - Understand machine learning workflows
- Portfolio Development
  - Showcase data analysis skills
  - Demonstrate visualization abilities
  - Build credibility for data roles
- Business Intelligence
  - Create reusable BI templates
  - Build analytics dashboards
  - Automate report generation
- Educational Purpose
  - Teaching data analysis
  - Business analytics courses
  - Statistics demonstrations

Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request

This project is licensed under the MIT License - see the LICENSE file for details.
- You can use this project for commercial and private purposes
- You can modify and distribute the code
- You must include the license and copyright notice
If you use this project in your research or work, please cite it as:
```bibtex
@software{ecommerce_analysis_2025,
  title={E-Commerce Data Analysis and Visualization},
  author={Kuk, Yasin},
  year={2025},
  url={https://github.com/kukyasin/ecommerce-data-analysis}
}
```

- Add interactive Plotly visualizations
- Create Streamlit web app
- Add more ML algorithms (forecasting, clustering)
- Database integration (MySQL, PostgreSQL)
- Real data import capabilities
- Automated report generation
- Email report delivery
- API endpoint creation
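ImportError: No module named 'pandas'

```bash
pip install -r requirements.txt
```

Matplotlib not displaying plots

```python
# Add to start of script
# The non-interactive Agg backend saves figures to files without needing a display
import matplotlib.pyplot as plt
plt.switch_backend('Agg')
```

Memory issues with large datasets

```python
import pandas as pd

# Process data in chunks
chunk_size = 10000
for chunk in pd.read_csv('data.csv', chunksize=chunk_size):
    process_chunk(chunk)  # process_chunk is a placeholder for your own function
```
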
- Reduce figure quality for faster rendering

```python
plt.savefig('output.png', dpi=100)  # Instead of 300
```

- Use subset of data for quick testing

```python
df = df.head(100)  # Test with 100 rows
```

- Disable warnings for cleaner output

```python
import warnings
warnings.filterwarnings('ignore')
```

- "Python for Data Analysis" by Wes McKinney
- "Hands-On Machine Learning" by Aurélien Géron
- "Storytelling with Data" by Cole Nussbaumer Knaflic
Yasin Kuk
If you find this project helpful, please:
- Star this repository ⭐
- Share it with others
- Report issues or suggest improvements
- Contribute to the project
- Initial release
- 4 professional dashboards
- 21+ visualizations
- Comprehensive documentation
- Linear regression model with an R² score of 0.9998
Last Updated: October 26, 2025
Version: 1.0.0
Status: Production Ready
This project uses synthetic data for demonstration purposes. The data is generated using realistic statistical distributions but does not represent actual business data. For production use with real data, ensure proper data validation and privacy compliance.
