This repository contains a collection of statistical consulting projects, each addressing real-world data problems using a variety of statistical techniques. The goal is to provide clear, reproducible, and practical analyses that demonstrate proficiency in applied statistics and data science.
A utility for calculating empirical quantiles from data, useful for non-parametric statistical summaries and comparisons.
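As a rough illustration of what an empirical quantile is, the sketch below inverts the empirical CDF by picking the smallest order statistic whose cumulative proportion reaches `p`. The helper name `empirical_quantile` is hypothetical and is not taken from the project code; the result is compared against base R's `quantile()` with `type = 1`, which uses the same definition.

```r
# Hypothetical helper (not the project's function): inverse-ECDF quantile
empirical_quantile <- function(x, probs) {
  x <- sort(x[!is.na(x)])                 # drop missing values, order the sample
  n <- length(x)
  stopifnot(n > 0, all(probs >= 0 & probs <= 1))
  idx <- pmax(1, ceiling(n * probs))      # smallest k with k / n >= p
  x[idx]
}

x <- c(7, 1, 4, 9, 3)
probs <- c(0.25, 0.5, 0.9)
q_manual <- empirical_quantile(x, probs)
q_base <- quantile(x, probs, type = 1, names = FALSE)  # base R equivalent
print(q_manual)
print(q_base)
```

Other `type` values in `quantile()` interpolate between order statistics, so they can give different answers on small samples.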
A predictive modeling project that uses logistic regression, principal component analysis (PCA), and penalized regression with glmnet to identify individuals at risk of heart disease based on clinical attributes.
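A minimal sketch of the PCA-plus-logistic-regression idea follows, using base R only. The data are simulated stand-ins, the column names (`age`, `chol`, `max_hr`) are assumptions for illustration, and the choice of two components is arbitrary; none of this reflects the project's actual dataset or model. A penalized fit with glmnet would replace the `glm()` step but is omitted here to keep the example dependency-free.

```r
set.seed(1)
n <- 500

# Simulated clinical-style predictors (assumed names, not the project's data)
X <- data.frame(
  age    = rnorm(n, mean = 55, sd = 9),
  chol   = rnorm(n, mean = 240, sd = 50),
  max_hr = rnorm(n, mean = 150, sd = 22)
)

# Simulated binary outcome from an assumed logistic model
lin <- -0.5 + 0.08 * (X$age - 55) - 0.03 * (X$max_hr - 150)
y <- rbinom(n, size = 1, prob = plogis(lin))

# PCA on centered and scaled predictors
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Logistic regression on the first two principal components
dat <- data.frame(y = y, pca$x[, 1:2])
fit <- glm(y ~ PC1 + PC2, data = dat, family = binomial)

# Predicted probabilities of the positive class
risk <- predict(fit, type = "response")
summary(risk)
```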
Statistical modeling and hypothesis testing applied to datasets related to MAFLD, aiming to identify risk factors and disease patterns.
A financial data analysis focusing on the Lo 30 portfolio, applying multiple regression to understand the influence of various economic indicators.
Construction and interpretation of Receiver Operating Characteristic (ROC) curves for evaluating binary classifiers.
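To make the construction concrete, the sketch below builds ROC points by sweeping a threshold over the classifier scores and then computes the area under the curve with the trapezoidal rule. The function names `roc_points` and `auc_trapezoid` and the toy scores are hypothetical; the project itself may rely on a dedicated package instead.

```r
# Hypothetical helper: ROC points from numeric scores and 0/1 labels
roc_points <- function(scores, labels) {
  stopifnot(length(scores) == length(labels), all(labels %in% c(0, 1)))
  # Start at Inf so the curve begins at (0, 0), then lower the threshold
  thresholds <- c(Inf, sort(unique(scores), decreasing = TRUE))
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  tpr <- sapply(thresholds, function(t) sum(scores >= t & labels == 1) / n_pos)
  fpr <- sapply(thresholds, function(t) sum(scores >= t & labels == 0) / n_neg)
  data.frame(threshold = thresholds, fpr = fpr, tpr = tpr)
}

# Hypothetical helper: trapezoidal area under the ROC points
auc_trapezoid <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

# Toy example (made-up scores, for illustration only)
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4)
labels <- c(1, 1, 0, 1, 0, 0)
roc <- roc_points(scores, labels)
auc <- auc_trapezoid(roc)
print(roc)
print(auc)
```

The AUC equals the proportion of positive/negative pairs in which the positive case receives the higher score, which gives a quick sanity check on small examples.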
Code for computing required sample sizes, useful for planning studies with sufficient power to detect meaningful effects.
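One way such a function might look is sketched below, assuming a two-sided, two-sample t-test with equal group sizes. The wrapper name `sample_size_two_sample` and its defaults are assumptions; it simply delegates to `power.t.test()` from the base `stats` package and rounds up to a whole number of subjects per group.

```r
# Hypothetical wrapper around stats::power.t.test (two-sample, two-sided)
sample_size_two_sample <- function(delta, sd = 1, power = 0.8, sig_level = 0.05) {
  res <- power.t.test(
    delta = delta, sd = sd,
    sig.level = sig_level, power = power,
    type = "two.sample", alternative = "two.sided"
  )
  ceiling(res$n)  # required subjects per group, rounded up
}

n_medium <- sample_size_two_sample(delta = 0.5)  # medium standardized effect
n_large  <- sample_size_two_sample(delta = 0.8)  # larger effect needs fewer subjects
print(c(medium = n_medium, large = n_large))
```

Other designs (paired tests, proportions, ANOVA) would need a different power routine, such as `power.prop.test()` or `power.anova.test()`.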
Each project folder contains its own:
- Datasets (if applicable)
- R scripts or notebooks
- Output files or visualizations
- Notes or documentation explaining the workflow
- R
- RMarkdown
- Tidyverse packages
- Statistical modeling and machine learning libraries
Wei-Chieh (Oscar) Chen
Graduate Student in Applied Statistics
Syracuse University