LEO CDP is an open-source, AI-first Customer Data Platform (CDP) framework that lets organizations build and operate their own fully customizable CDP infrastructure, with machine learning and big data at its core.
Designed for developers, data scientists, marketers, and enterprises, LEO CDP provides unified data collection, real-time customer analytics, audience segmentation, and personalized marketing, all while remaining self-hosted and privacy-friendly.
- The philosophy of Dataism → USPA → LEO CDP
- Democratize AI-powered data platforms for digital transformation
- Promote data sovereignty, on-premise intelligence, and open collaboration
Collect customer data from websites, mobile apps, CRM, POS, e-commerce platforms, customer service systems, social media, advertising platforms, IoT devices, and APIs. Unify all data into a single customer profile and source of truth.
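As a rough illustration of what unifying data into a single profile can mean, the sketch below merges records from several sources on a normalized email address. The record layout, the `unify_profiles` helper, and the choice of email as the join key are assumptions made for this example, not LEO CDP's actual data model or API.

```python
from collections import defaultdict


def unify_profiles(records):
    """Merge source records that share an email into one profile per customer."""
    profiles = defaultdict(lambda: {"sources": [], "attributes": {}})
    for record in records:
        # Hypothetical join key: the lower-cased, trimmed email address.
        key = record["email"].strip().lower()
        profile = profiles[key]
        profile["sources"].append(record["source"])
        # Later records fill in or overwrite attributes from earlier ones.
        profile["attributes"].update(record.get("attributes", {}))
    return dict(profiles)


records = [
    {"source": "website", "email": "An@Example.com", "attributes": {"last_page": "/pricing"}},
    {"source": "crm", "email": "an@example.com", "attributes": {"name": "An", "tier": "gold"}},
    {"source": "pos", "email": "binh@example.com", "attributes": {"last_store": "HCMC-01"}},
]

unified = unify_profiles(records)
for email, profile in unified.items():
    print(email, profile["sources"], profile["attributes"])
```

Real identity resolution usually needs more than one key (device IDs, phone numbers, fuzzy matching); a single exact key keeps the example small.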
Build a comprehensive, real-time view of every customer by combining behavioral, transactional, demographic, and engagement data across all touchpoints.
Leverage machine learning to automatically create intelligent audiences and predictive customer segments using:
- RFM Analysis
- Customer Lifetime Value (CLV)
- Churn Prediction
- Purchase Propensity
- Lead Scoring
- Dynamic Audience Generation
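To make the first item concrete, here is a minimal RFM (Recency, Frequency, Monetary) sketch in plain Python. The order tuples, the function names, and the fixed cut points are illustrative assumptions; they do not describe how LEO CDP computes its segments, and production scoring typically uses quantiles over the whole customer base.

```python
from datetime import date


def rfm_metrics(orders, today):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    metrics = {}
    for customer_id, order_date, amount in orders:
        days_ago = (today - order_date).days
        recency, frequency, monetary = metrics.get(customer_id, (days_ago, 0, 0.0))
        metrics[customer_id] = (min(recency, days_ago), frequency + 1, monetary + amount)
    return metrics


def score(value, thresholds, lower_is_better=False):
    """Map a raw value to a 1-3 score using two fixed cut points."""
    low, high = thresholds
    if lower_is_better:
        return 3 if value <= low else 2 if value <= high else 1
    return 1 if value < low else 2 if value < high else 3


def rfm_scores(metrics):
    # Cut points below are assumptions chosen only for this example.
    return {
        cid: (
            score(r, (30, 90), lower_is_better=True),  # recent buyers score higher
            score(f, (2, 5)),
            score(m, (100, 500)),
        )
        for cid, (r, f, m) in metrics.items()
    }


orders = [
    ("c1", date(2024, 5, 20), 120.0),
    ("c1", date(2024, 4, 2), 80.0),
    ("c2", date(2023, 12, 1), 40.0),
]
metrics = rfm_metrics(orders, today=date(2024, 6, 1))
scores = rfm_scores(metrics)
print(scores)
```

A segment such as "recent, low-frequency buyers" can then be expressed as a filter on the score tuple.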
Capture customer interactions and behavioral events in real time. Visualize and analyze customer journeys across channels to understand engagement patterns, conversion paths, and drop-off points.
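One simple way to look at conversion paths and drop-off points is an ordered funnel over the event stream. The sketch below assumes events arrive as `(user_id, event_name)` pairs sorted by time; the event names and the `funnel_counts` helper are hypothetical and not part of any LEO CDP API.

```python
def funnel_counts(events, steps):
    """Count how many users reached each funnel step, in order."""
    progress = {}  # user_id -> index of the next step that user still has to reach
    for user_id, event_name in events:  # events are assumed to be sorted by time
        next_step = progress.get(user_id, 0)
        if next_step < len(steps) and event_name == steps[next_step]:
            progress[user_id] = next_step + 1
    return [
        sum(1 for reached in progress.values() if reached > index)
        for index in range(len(steps))
    ]


steps = ["page_view", "add_to_cart", "purchase"]
events = [
    ("u1", "page_view"), ("u2", "page_view"), ("u3", "page_view"),
    ("u4", "add_to_cart"),  # skipped the first step, so not counted in the funnel
    ("u1", "add_to_cart"), ("u2", "add_to_cart"),
    ("u1", "purchase"),
]

counts = funnel_counts(events, steps)
for step, count in zip(steps, counts):
    print(f"{step}: {count} users")
```

The difference between consecutive counts is the number of users who dropped off at that step.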
Apply machine learning and AI models to uncover hidden patterns, predict customer behavior, identify growth opportunities, and generate actionable business insights.
Activate customer segments instantly across marketing, sales, customer service, and advertising platforms. Trigger actions based on customer behavior, events, attributes, and predictive scores.
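Triggers based on attributes and predictive scores can be thought of as rules evaluated against a profile. The following is a minimal sketch under that assumption; the rule format, field names, and action names are invented for illustration and do not reflect LEO CDP's configuration format.

```python
import operator

# Comparison operators supported by this hypothetical rule format.
OPERATORS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}


def matches(profile, conditions):
    """True if the profile satisfies every (field, op, value) condition."""
    return all(
        field in profile and OPERATORS[op](profile[field], value)
        for field, op, value in conditions
    )


def actions_for(profile, rules):
    """Return the actions of all rules whose conditions the profile meets."""
    return [rule["action"] for rule in rules if matches(profile, rule["when"])]


rules = [
    {
        "when": [("churn_score", ">=", 0.7), ("segment", "==", "vip")],
        "action": "notify_account_manager",
    },
    {
        "when": [("last_event", "==", "cart_abandoned")],
        "action": "send_reminder_email",
    },
]

profile = {"segment": "vip", "churn_score": 0.82, "last_event": "page_view"}
triggered = actions_for(profile, rules)
print(triggered)
```

Keeping rules as data rather than code makes it easier to edit them from a UI or store them alongside segment definitions.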
Support real-time and batch data pipelines using event-driven architecture. Integrate with Apache Airflow for data ingestion, transformation, workflow orchestration, and automation.
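The sketch below shows, in plain Python, one way an event-driven pipeline can serve both paths: each event goes to real-time handlers immediately and is also buffered for batch processing. The `EventPipeline` class is a hypothetical stand-in; in a deployment the batch sink would more likely be a job scheduled by an orchestrator such as Airflow.

```python
class EventPipeline:
    """Dispatch each event to real-time handlers and buffer it for batch flushes."""

    def __init__(self, batch_size, batch_sink):
        self.batch_size = batch_size
        self.batch_sink = batch_sink   # callable that receives a list of events
        self.realtime_handlers = []    # callables that receive a single event
        self.buffer = []

    def subscribe(self, handler):
        self.realtime_handlers.append(handler)

    def publish(self, event):
        for handler in self.realtime_handlers:
            handler(event)             # real-time path: handled immediately
        self.buffer.append(event)      # batch path: handled on flush
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.batch_sink(list(self.buffer))
            self.buffer.clear()


seen, batches = [], []
pipeline = EventPipeline(batch_size=2, batch_sink=batches.append)
pipeline.subscribe(lambda event: seen.append(event["name"]))

for name in ["page_view", "add_to_cart", "purchase"]:
    pipeline.publish({"name": name})
pipeline.flush()  # flush the remaining partial batch

print(seen)
print([len(batch) for batch in batches])
```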
Deliver personalized customer experiences across every channel using Agentic AI and Large Language Models (LLMs). Automatically generate content, recommend next-best actions, optimize campaigns, and orchestrate customer journeys in real time based on customer behavior, intent, and business goals.
Key capabilities include:
- AI-generated personalized content
- Next Best Action recommendations
- Dynamic journey orchestration
- Real-time offer personalization
- Conversational AI assistants
- Intelligent campaign optimization
- Context-aware customer engagement
- Autonomous AI-driven marketing workflows
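A Next Best Action recommendation can be reduced, in its simplest form, to picking the eligible action with the highest expected value. The sketch below assumes propensity scores are already attached to the profile (hard-coded here, where a model would normally supply them); the action catalog and the `next_best_action` function are hypothetical.

```python
def next_best_action(profile, actions):
    """Pick the eligible action with the highest propensity * value, or None."""
    eligible = [action for action in actions if action["eligible"](profile)]
    if not eligible:
        return None
    best = max(
        eligible,
        key=lambda action: profile["propensity"].get(action["name"], 0.0) * action["value"],
    )
    return best["name"]


# Hypothetical action catalog: business value plus an eligibility check.
actions = [
    {"name": "discount_offer", "value": 20, "eligible": lambda p: not p["opted_out"]},
    {"name": "upsell_premium", "value": 100, "eligible": lambda p: p["tier"] == "basic"},
]

profile = {
    "tier": "basic",
    "opted_out": False,
    # Assumed model outputs: probability that the customer accepts each action.
    "propensity": {"discount_offer": 0.5, "upsell_premium": 0.2},
}

choice = next_best_action(profile, actions)
print(choice)
```

Here the upsell is chosen because its expected value (0.2 × 100) exceeds that of the discount (0.5 × 20), even though the discount is more likely to be accepted.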
Integrate seamlessly with CRM, ERP, Marketing Automation, Data Warehouse, Business Intelligence, AI platforms, and third-party applications through APIs, webhooks, and modular services.
Ensure enterprise-grade reliability with:
- Consent Management
- Data Privacy Controls
- Role-Based Access Control (RBAC)
- Audit Logs
- Data Governance
- On-Premise & Private Cloud Deployment
- Multi-Tenant Architecture
- Docker & Kubernetes Support
- Prometheus & Grafana Monitoring
- High Availability & Scalable Infrastructure
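To show how consent management, RBAC, and audit logs can work together, here is a minimal sketch in which an activation is allowed only when the user's role grants the permission and the customer has consented to the channel, with every attempt recorded. The roles, permission names, and `activate` function are assumptions for illustration, not LEO CDP's actual access model.

```python
# Hypothetical role-to-permission mapping.
ROLE_PERMISSIONS = {
    "admin": {"profile:read", "profile:activate", "role:manage"},
    "marketer": {"profile:read", "profile:activate"},
    "analyst": {"profile:read"},
}

audit_log = []


def has_permission(role, permission):
    return permission in ROLE_PERMISSIONS.get(role, set())


def activate(user, role, profile, channel):
    """Allow activation only with the right role AND recorded consent for the channel."""
    allowed = has_permission(role, "profile:activate") and channel in profile.get("consent", set())
    # Every attempt is logged, whether it was allowed or denied.
    audit_log.append(
        {"user": user, "profile": profile["id"], "channel": channel, "allowed": allowed}
    )
    return allowed


profile = {"id": "p-1", "consent": {"email"}}
results = [
    activate("lan", "marketer", profile, "email"),  # role and consent both fine
    activate("lan", "marketer", profile, "sms"),    # no consent for SMS
    activate("minh", "analyst", profile, "email"),  # role lacks the permission
]
print(results)
print(len(audit_log), "audit entries")
```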
- Break away from SaaS lock-in. Full customization and ownership of your CDP.
- Ideal for agencies, startups, enterprises, and researchers building AI-powered marketing stacks.
- Open source encourages transparency, innovation, and community-driven evolution.
| Feature | Status |
|---|---|
| ✅ Core CDP Platform (Profiles, Events, Segmentation) | Complete |
| ✅ CDP SDKs (JavaScript, Python) | Complete |
| 🔄 Identity Resolution with Graph + Vector Matching | In Progress |
| 🔄 AI Assistant (Chatbot for Audience Insights & Suggestions) | In Progress |
| 🔄 Agentic AI: Personalizing the Customer Experience | In Progress |
| 🔄 Embedding Model for Customer Vector Search (via Qdrant) | In Progress |
| 🆕 CDP Mobile SDKs (Android, iOS, React Native) | Planned |
| 🆕 Open Source Campaign Management UI | Planned |
| 🆕 Integration Marketplace for Martech Tools | Planned |
| 🆕 Webhook + Event Bus Support (Kafka / RabbitMQ / SQS) | Planned |
| 🆕 Federated Identity Graph using OpenID & OAuth | Planned |
Want to contribute? Join the community!
- URL: https://dcdp.bigdatavietnam.org
- Username: demo
- Password: 12345678
- 🇻🇳 Documentation in Vietnamese
- 🧠 CDP Handbook 2023
- 📊 Data Model & Journey Map
- ⚙️ Analytics Core Functions
- 💡 Data Strategy with LEO CDP
- Backend: Java 11 (Amazon Corretto), Python 3.10 or Python 3.12
- Database: ArangoDB 3.11 (Multi-model: Document + Graph + Search)
- Monitoring: Prometheus 2 + Grafana 8
- Data Pipeline: Apache Airflow
- Analytics & ML: Jupyter Notebook / Google Colab
- Messaging: Redis 8, OneSignal, Firebase
- Deployment: Ubuntu 22 LTS, Docker, On-Prem / Cloud
- Google Cloud, AWS, VNG Cloud, Viettel Cloud or your own private infrastructure
See: Installation Guide
Created and maintained by Trieu Nguyen (Thomas) — Founder of the LEO CDP Framework and advocate for open, data-driven innovation.
🌐 Connect: https://www.facebook.com/dataism.one
Released under the MIT License.
You are free to:
- ✅ Use in personal and commercial projects
- ✅ Modify and extend the source code
- ✅ Build and distribute your own white-label solutions
- ✅ Integrate into proprietary products
Attribution is appreciated and helps support the continued growth of the open-source community.
Built for organizations that want to own their customer data, AI capabilities, and digital future.
Special thanks to all contributors who have helped improve the framework.
⭐ If this project helps your business or development team, consider starring the repository and sharing it with the community.
- Bugs or ideas? Email: trieu@leocdp.com
- Join our learning group: BigDataVietnam.org
- YouTube: @bigdatavn
- Blog: knowledge.leocdp.net

