A comprehensive database architecture for scalable e-commerce platforms, featuring both operational (OLTP) and analytical (OLAP) systems.
This project implements a complete data model for an e-commerce platform, balancing data integrity for daily operations with high-performance analytics capabilities.
Purpose: Daily transaction processing and business operations
- Normalized schema (3rd Normal Form)
- Data integrity focused design
- Real-time operations support
Purpose: Business intelligence and reporting
- Star schema design
- Optimized for aggregations and complex queries
- Historical tracking with SCD Type 2
regions- Geographical hierarchy (country, state, city)categories- Recursive hierarchical product categorizationproducts- Product catalog with pricing and availabilitycustomers- Customer master data with regional associationorders- Order header information with status trackingorder_items- Order line items with historical pricingtransactions- Payment processing and transaction history
fact_sales- Central fact table for sales metricsdim_date- Enriched time dimension with fiscal calendardim_product- Product dimension with SCD Type 2 supportdim_customer- Customer dimension with segmentationdim_region- Geographical dimensiondim_category- Category dimension with hierarchy support
- Categories → Recursive hierarchy via
parent_category_id - Products → Belong to one category (many-to-one)
- Customers → Associated with one region (many-to-one)
- Orders → Created by one customer (many-to-one)
- Order Items → Link orders and products (many-to-one)
- Transactions → Associated with orders (one-to-one/many)
-- Fast product categorization
CREATE INDEX idx_products_category ON products(category_id);
-- Active product filtering
CREATE INDEX idx_products_active ON products(is_available)
WHERE is_available = TRUE;
-- Customer order history
CREATE INDEX idx_orders_customer_status ON orders(customer_id, status);-- Time-based analysis
CREATE INDEX idx_fact_sales_date ON fact_sales(date_id);
-- Product performance
CREATE INDEX idx_fact_sales_product ON fact_sales(product_id);
-- Regional and category analysis
CREATE INDEX idx_fact_sales_category_region ON fact_sales(category_id, region_id);- Snapshot Approach:
order_items.unit_pricecaptures price at time of sale - SCD Type 2: Dimensions track historical changes with effective/expiration dates
- Price History: Separate
product_price_historytable for audit trail
- Operational: Full normalization for data integrity
- Analytical: Strategic denormalization for query performance
- Example:
fact_salesincludescategory_idto eliminate JOINs
- Partitioning:
fact_salesby date for large datasets - Materialized Views: Pre-aggregated metrics for common reports
- Covering Indexes: Optimized for critical query patterns
- Monitoring: Continuous performance tuning based on usage patterns
- Database Setup:
-- Execute the schema creation script
\i scheme.sql- Sample Queries:
-- Monthly sales by category
SELECT
d.year, d.month, c.name AS category,
SUM(fs.quantity) AS total_quantity,
SUM(fs.revenue) AS total_revenue
FROM fact_sales fs
JOIN dim_date d ON fs.date_id = d.date_id
JOIN dim_category c ON fs.category_id = c.category_id
GROUP BY d.year, d.month, c.name
ORDER BY d.year, d.month, total_revenue DESC;- ✅ Proper indexing strategy
- ✅ Foreign key constraints for data integrity
- ✅ SCD Type 2 for historical tracking
- ✅ Balanced normalization/denormalization
- ✅ Comprehensive data types and constraints
This project is open source and available under the MIT License.
Author: Peter Leukanič
Contact: Mail in profile info