polloncarlos/README.md

Carlos Pollon

Data Scientist: end-to-end projects focused on real business impact.

LinkedIn Portfolio Gmail


Featured Projects

🎯 PA005 — Customer Value Segmentation

E-commerce customer clustering with deployment on AWS

End-to-end unsupervised segmentation pipeline: feature engineering with 17 behavioral variables, an experimental comparison of KMeans, GMM, hierarchical clustering, and DBSCAN (with embeddings from Random Forest + UMAP), a Silhouette Score of 0.72, and production deployment on EC2 + RDS PostgreSQL + Metabase.
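For readers unfamiliar with the metric, the Silhouette Score quoted above can be sketched in plain Python. This is a minimal illustration with made-up 2D points, not the project's code; the function name `silhouette_score` mirrors the scikit-learn function of the same name, but the implementation below is written from the metric's definition and uses Euclidean distance as an assumption.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so it does not affect the sum)
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(dist(point, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated groups of two points each.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(silhouette_score(toy_points, [0, 0, 1, 1]))  # labels match the groups
print(silhouette_score(toy_points, [0, 1, 0, 1]))  # labels cut across the groups
```

Scores close to 1 indicate compact, well-separated clusters; scores near or below 0 indicate overlapping or misassigned points.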

Business outcome: identified 35 VIP customers (0.8% of the base) who account for 24% of total revenue, and 1,200 customers at risk of churn.
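A revenue-concentration figure of this kind can be computed as sketched below. The helper `top_share` and the revenue values are hypothetical and only illustrate the arithmetic; they are not taken from the project's data.

```python
def top_share(revenues, top_n):
    """Return (share of customers, share of revenue) for the top_n spenders."""
    if not 0 < top_n <= len(revenues):
        raise ValueError("top_n must be between 1 and the number of customers")
    ranked = sorted(revenues, reverse=True)  # highest spenders first
    total = sum(ranked)
    if total == 0:
        raise ValueError("total revenue must be non-zero")
    return top_n / len(ranked), sum(ranked[:top_n]) / total


# Made-up revenue per customer, for illustration only.
toy_revenues = [500, 300, 50, 50, 40, 30, 20, 10]
customer_share, revenue_share = top_share(toy_revenues, top_n=2)
print(f"{customer_share:.1%} of customers generate {revenue_share:.1%} of revenue")
```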

🔗 Repository


📈 PA004 — Health Insurance Cross-Sell Ranking

Ranking customers by purchase propensity

Learning to Rank solution that prioritizes the customers most likely to purchase vehicle insurance. Random Forest and XGBoost models are evaluated with Gain@K, Lift@K, and NDCG. Includes an estimate of financial uplift and Google Sheets integration through a Flask API.
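As a rough illustration of two of the ranking metrics named above, the sketch below computes Gain@K (the fraction of all buyers captured in the top K ranked customers) and Lift@K (that gain relative to contacting K customers at random). The function names, labels, and scores are assumptions made for this example and are not taken from the repository.

```python
def gain_at_k(y_true, scores, k):
    """Fraction of all positives found among the k highest-scored customers."""
    if not 0 < k <= len(y_true):
        raise ValueError("k must be between 1 and the number of customers")
    # Sort labels by model score, highest propensity first.
    ranked = [
        label
        for _, label in sorted(
            zip(scores, y_true), key=lambda pair: pair[0], reverse=True
        )
    ]
    total_positives = sum(ranked)
    if total_positives == 0:
        raise ValueError("y_true contains no positive labels")
    return sum(ranked[:k]) / total_positives


def lift_at_k(y_true, scores, k):
    """Gain@K divided by the share of the population contacted (k / n)."""
    return gain_at_k(y_true, scores, k) / (k / len(y_true))


# Made-up labels (1 = bought the insurance) and model scores.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
print(gain_at_k(toy_labels, toy_scores, k=3))
print(lift_at_k(toy_labels, toy_scores, k=3))
```

A lift above 1 means the ranked list finds buyers faster than random contact would.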

🔗 Repository


📦 PA003 — Rossmann Sales Forecast

End-to-end sales forecasting with deployment via Telegram

XGBoost model with feature selection via Boruta + ExtraTrees and hyperparameter tuning with Optuna, optimized for a memory-constrained environment (512 MB). Deployed as a Flask API with a Telegram bot for querying forecasts by store.

🔗 Repository


🧠 Tech Stack

| Category | Tools |
|---|---|
| Language | Python 3.11 |
| Machine Learning | Scikit-learn, XGBoost, UMAP |
| Data | Pandas, NumPy, SQLAlchemy |
| Cloud | AWS EC2, S3, RDS |
| Database | PostgreSQL, MySQL |
| Dashboards | Metabase, Streamlit |
| Deploy | Flask, REST API |
| Environment | Jupyter Notebook, VSCode |

Pinned

  1. customer_value_segmentation (Public)

    Customer segmentation pipeline using tree-based embedding clustering, AWS cloud infrastructure, and automated prediction workflows for high-value customer identification.

    Jupyter Notebook

  2. health_insurance_ranking (Public)

    Machine learning solution for identifying high-propensity customers for vehicle insurance cross-sell. The project covers the full data science lifecycle and translates model performance into busine…

    Jupyter Notebook

  3. rossmann_sales_predict (Public)

    End-to-end Rossmann sales forecasting (CRISP-DM). Advanced feature engineering with Boruta + ExtraTrees and Optuna-tuned XGBoost. Model optimized under real production memory constraints (512 MB), …

    Jupyter Notebook

  4. curry_company (Public)

    Interactive dashboard built with Streamlit to analyze sales and product data from a fictional food company. Includes insights by time, product category, and sales channel to support strategic decis…

    Python

  5. zomato_restaurant (Public)

    An interactive dashboard built with Streamlit to explore Zomato’s global restaurant data. Includes insights by location, country, city, and cuisine type, based on public data from Kaggle.

    Python

  6. ensaio_machine_learning (Public)

    Classification, regression, and clustering projects covering Machine Learning fundamentals.

    Jupyter Notebook