🧾 Google AdWords Performance Analysis

End-to-End Marketing Analytics Pipeline for Keyword Performance, Cost Optimization, and Traffic Insights
Excel · Python · MySQL · Power BI

📌 Table of Contents

Overview
Business Problem
Project Objectives
Tools & Technologies
End-to-End Data Workflow
Dataset
Data Cleaning & Preparation
Data Modeling & SQL Schema
SQL Schema (Core with Queries)
Relationship Diagram (EER)
Power BI Data Model
Power BI Dashboard
Key Outcomes, Insights & Future Improvements
How to Run This Project
Repository Structure
Author & Contacts

📌 Overview

This project presents a complete, real-world Google AdWords analytics pipeline, transforming raw keyword-level marketing data into actionable insights and interactive dashboards.

The solution simulates an enterprise-grade ETL + BI workflow, commonly used by:

Digital marketing teams
Growth & performance analysts
Business intelligence units

From raw Excel files to Python-based preprocessing, normalized MySQL modeling, and Power BI dashboards, this project demonstrates strong fundamentals in data engineering, analytics, and visualization.

🎯 Business Problem

Marketing teams often struggle to answer questions such as:

Which keywords generate the most traffic at the lowest cost?
Are we overspending on high-CPC, low-return keywords?
How does keyword difficulty affect traffic contribution?
What are the monthly and seasonal traffic trends?
Which keywords should be scaled, optimized, or paused?

This project addresses these challenges by building a structured analytics framework for Google AdWords performance evaluation.

🎯 Project Objectives
Analyze keyword-level traffic, cost, and competition metrics
Identify high-impact and low-efficiency keywords
Track search demand and CPC trends over time
Build a normalized relational database schema
Deliver interactive Power BI dashboards for decision-making

🛠 Tools & Technologies

Tool	Purpose
Excel	Raw data preparation, formulas, lookup tables
Python (pandas)	Data cleaning, keyword ID generation
MySQL	Relational modeling & SQL querying
Power BI	Data modeling, DAX, dashboards
Git & GitHub	Version control & documentation

🔁 End-to-End Data Workflow

graph LR
    A[Raw Excel Files] --> B[Excel Formulas]
    B --> C[Python Processing]
    C --> D[MySQL Database]
    D --> E[Power BI Dashboard]

Each stage feeds clean, structured data into the next—mirroring real production analytics pipelines.

📂 Dataset

📑 Dataset Description

🔹 Source: Public dataset downloaded from Kaggle (Google AdWords–style keyword performance data)

🔹 Raw_data.xlsx (Simulated Google AdWords export) containing variables:

Keyword
Current & previous positions
Cost Per Click (CPC)
Traffic
Competition
Search volume
Number of SERP results

🧹 Data Cleaning & Preparation

Step 1: Excel Processing

Excel formulas were used to:

Aggregate search volume using SUMIF()
Calculate average keyword difficulty using AVERAGEIF()
Categorize difficulty levels using IF() logic
Create lookup tables using VLOOKUP()

Step 2: Python Processing

Using pandas, the following steps were performed:

Loaded raw Excel data with pd.read_excel()
Generated unique keyword_ID values
Ensured consistent foreign-key mapping across tables

df['keyword_ID'] = df['keyword'].astype('category').cat.codes

Step 3: Lookup Table Creation

The cleaned data was split into three reference tables:

keyword.csv — Unique keywords with IDs
search_volume.csv — Aggregated search volume
keyword_difficulty.csv — Average difficulty and difficulty labels

Excel Formulas Used

Formula	Purpose
`SUMIF()`	Aggregate total search volume
`AVERAGEIF()`	Compute average difficulty
`IF(B2>=50,"Hard","Moderate")`	Assign difficulty level
`VLOOKUP()`	Map keyword metadata

✅ These lookup tables support normalization and ensure data integrity in the SQL schema. These tables act as lookup/reference tables for SQL.

🗂 Data Modeling & SQL Schema

Dimension Tables

🔹 keyword

Column	Description
`keyword_ID`	Primary key
`keyword`	Keyword text

🔹 search_volume

Column	Description
`keyword_ID`	Foreign key
`search_volume`	Monthly search volume

🔹 keyword_difficulty

Column	Description
`keyword_ID`	Foreign key
`avg_difficulty`	Average difficulty score
`difficulty_level`	Hard / Moderate

🔹 Fact Table: website_traffic_data

Column	Description
`title`	Campaign / landing page
`keyword`	Keyword text
`keyword_ID`	Foreign key
`positions`	Current position
`previous_positions`	Previous position
`last_seen`	Date
`Search_Volume`	Monthly demand
`CPC`	Cost per click
`Traffic`	Estimated traffic
`Traffic_Percent`	Share of total traffic
`Traffic_Cost`	Cost attribution
`Traffic_Cost_Percent`	Cost share
`Competition`	Competition score
`Number_of_Results`	SERP results
`Keyword_difficulty`	Difficulty reference

✅ This acts as the central fact table enabling all analytical queries.

Main Table View:

🗃 SQL Schema (Core with Queries)

CREATE DATABASE IF NOT EXISTS traffic_project;
USE traffic_project;

CREATE TABLE website_traffic_data (
    title VARCHAR(250),
    keyword VARCHAR(250),
    keyword_ID INT,
    positions INT,
    previous_positions INT,
    last_seen DATE,
    Search_Volume INT,
    CPC DECIMAL(10,2),
    Traffic INT,
    Traffic_Percent DECIMAL(10,2),
    Traffic_Cost INT,
    Traffic_Cost_Percent DECIMAL(10,2),
    Competition DECIMAL(10,2),
    Number_of_Results INT,
    Keyword_difficulty INT
);

CREATE TABLE keyword (
    keyword_ID INT PRIMARY KEY,
    keyword VARCHAR(250)
);

CREATE TABLE keyword_difficulty (
    keyword_ID INT PRIMARY KEY,
    avg_difficulty DECIMAL(10,2),
    difficulty_level VARCHAR(50)
);

CREATE TABLE search_volume (
    keyword_ID INT PRIMARY KEY,
    Search_Volume INT
);

ALTER TABLE website_traffic_data
ADD FOREIGN KEY (keyword_ID) REFERENCES keyword(keyword_ID);

Foreign-key constraints ensure data integrity and normalization. These keys ensure consistent data joins between tables.

📘 Schema Screenshot:

🔑 Keys & Normalization

Imported the other CSVs into MySQL:

keyword
search_volume
keyword_difficulty

Then established relational integrity:

ALTER TABLE website_traffic_data
ADD FOREIGN KEY (keyword_ID) REFERENCES keyword(keyword_ID);

These keys ensure consistent data joins between tables.

🧠 Relationship Diagram (EER)

EER Diagram:

These diagrams visualize the 1-to-many relationships between:

Keywords → Traffic Data
Keyword Difficulty → Traffic Data
Search Volume → Traffic Data

🧩 Power BI Data Model

The dashboard follows a star-schema data model, ensuring accurate filtering, optimized performance, and scalable reporting.

Relationships:

Fact: website_traffic_data
Dimensions:
- keyword (via keyword_ID)
- keyword_difficulty
- search_volume

✅ This model ensures accurate filtering and joins.

Power BI Data Model:

🔢 DAX Measures Used

📊 Key Measures

Average CPC = AVERAGE(website_traffic_data[CPC])
Total Traffic = SUM(website_traffic_data[Traffic])
Total Search Volume = SUM(website_traffic_data[Search_Volume])
Total Traffic Cost = SUM(website_traffic_data[Traffic_Cost])
Traffic Percent = AVERAGE(website_traffic_data[Traffic_Percent])

Start Date = MIN(website_traffic_data[last_seen])
Last Date = MAX(website_traffic_data[last_seen])

Calendar = 
CALENDAR(
    [Start Date],
    [Last Date]
)

Month = FORMAT(Calendar[Date], "MMMM")
Quarter = QUARTER(Calendar[Date])
Year = YEAR(Calendar[Date])
Year & QTR = Year & " Q" & Quarter

These enable filtering, aggregation, and time-based visualizations.

📈 Power BI Dashboard

The Power BI report connects directly to MySQL database and provides interactive insights.

Dashboard Features

KPI Cards: Total Traffic, CPC, Search Volume, Cost
Line Chart: Traffic trends over time
Bar Chart: Traffic by keyword
Treemap: Search volume distribution
Pie/Donut Charts: Traffic distribution by keyword difficulty and by month
Slicers: Year, Quarter, Keyword filter

Power BI Dashboard Insights:

🚀 Key Outcomes, Insights & Future Improvements

Key Outcomes

Identified high-traffic, low-CPC keywords for scaling
Revealed high-cost, low-return keywords for optimization
Highlighted difficulty-based traffic contribution
Delivered a reusable marketing analytics framework

Insights Generated

High-search-volume keywords do not always deliver proportional traffic
Certain keywords have high CPC but low traffic efficiency
Majority of traffic comes from moderate to high difficulty keywords
Monthly traffic trends reveal seasonal demand patterns

Outcome & Impact

Enabled campaign performance monitoring through a single dashboard
Improved keyword prioritization strategy
Supported cost optimization decisions
Demonstrated the full analytics lifecycle: data modeling → SQL → BI

Future Improvements

Automate data ingestion using Google Ads API or scheduled ETL pipelines
Add advanced KPIs such as CTR, Conversion Rate, and ROAS
Implement time-series analysis for MoM/YoY and seasonality trends
Enhance the data model with campaign, device, and location dimensions
Introduce keyword clustering and performance forecasting
Deploy the solution on cloud platforms for scalability and real-time analytics

📂 Repository Structure

📦 google-adwords-performance-analysis
 ┣ 📄 Raw_data.xlsx
 ┣ 📄 website_traffic_data.csv
 ┣ 📄 keyword.csv
 ┣ 📄 keyword_difficulty.csv
 ┣ 📄 search_volume.csv
 ┣ 📄 SQL_schema_script.sql
 ┣ 📄 AdWords_Traffic_Dashboard.pbix
 ┣ 📁 Images
 ┃ ┣ 📷 main_table.png
 ┃ ┣ 📷 EER_Diagram.png
 ┃ ┣ 📷 mysql_schema_editor.png
 ┃ ┣ 📷 power_bi_dashboard.png
 ┃ ┗ 📷 powerbi_data_model.png
 ┗ 📄 README.md

📈 ⚙ How to Run This Project

1️⃣ Clone the Repository

git clone https://github.com/Samiul1947/google-adwords-performance-analysis.git

2️⃣ Review Source Files

Inspect Raw_data.xlsx and the processed .csv files
(Optional) Modify data using Excel formulas if needed

3️⃣ Set Up MySQL Database

Import all .csv files using MySQL Workbench
Execute the provided SQL schema to create tables and relationships
Ensure foreign keys are applied correctly

4️⃣ Open Power BI Dashboard

Open the provided .pbix file to view the interactive report
Or connect manually via: Home → Get Data → MySQL

👤 Author & Contacts

Samiul Gazi | Aspiring Data Analyst Passionate about data storytelling, unsupervised data exploration, and real-world analytics. 🎓 M.Sc. Economics – University of Calcutta
📍 Kolkata, West Bengal

📧 Email: sgaziamumh@gmail.com
🔗 LinkedIn: linkedin.com/in/samiul-gazi
🌐 GitHub: github.com/Samiul1947

If you liked this project, please ⭐ the repo!

⭐ For Recruiters & Hiring Managers
📌 This project demonstrates hands-on experience in ETL design, relational data modeling, SQL querying, DAX measures, and BI storytelling, which are core competencies for Data Analyst and Business Intelligence roles.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
dashboard		dashboard
data		data
images		images
notebooks		notebooks
reports		reports
sql_model		sql_model
README.md		README.md
sql_schema_scripts.sql		sql_schema_scripts.sql

Folders and files

Latest commit

History

Repository files navigation

🧾 Google AdWords Performance Analysis

📌 Table of Contents

📌 Overview

🎯 Business Problem

🎯 Project Objectives

🛠 Tools & Technologies

🔁 End-to-End Data Workflow

📂 Dataset

📑 Dataset Description

🧹 Data Cleaning & Preparation

Step 1: Excel Processing

Step 2: Python Processing

Step 3: Lookup Table Creation

Excel Formulas Used

🗂 Data Modeling & SQL Schema

Dimension Tables

🔹 Fact Table: website_traffic_data

Main Table View:

🗃 SQL Schema (Core with Queries)

📘 Schema Screenshot:

🔑 Keys & Normalization

🧠 Relationship Diagram (EER)

EER Diagram:

🧩 Power BI Data Model

Relationships:

Power BI Data Model:

🔢 DAX Measures Used

📊 Key Measures

📈 Power BI Dashboard

Dashboard Features

Power BI Dashboard Insights:

🚀 Key Outcomes, Insights & Future Improvements

Key Outcomes

Insights Generated

Outcome & Impact

Future Improvements

📂 Repository Structure

📈 ⚙ How to Run This Project

1️⃣ Clone the Repository

2️⃣ Review Source Files

3️⃣ Set Up MySQL Database

4️⃣ Open Power BI Dashboard

👤 Author & Contacts

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages