Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ remotes::install_github("openpharma/bonsaiforest2")

This example demonstrates the usage of `bonsaiforest2` for subgroup treatment effect estimation across multiple overlapping subgroups (Age, Region, and Biomarker) using a Global Modeling approach with a Regularized Horseshoe prior.

```{r example-1}
```{r example-1, message=FALSE, warning=FALSE, results='hide'}
library(bonsaiforest2)

# 1. Simulate trial data
Expand Down Expand Up @@ -59,6 +59,11 @@ fit <- run_brms_analysis(
subgroup_effects <- summary_subgroup_effects(
brms_fit = fit
)
```

```{r}
# Print summary
print(subgroup_effects)

# 4. Visualize Results
plot(subgroup_effects)
Expand Down
175 changes: 29 additions & 146 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -55,156 +55,39 @@ fit <- run_brms_analysis(
sigma_ref = 3,
chains = 2, iter = 1000, warmup = 500 #
)
#> Step 1: Preparing formula and data...
#> Converting treatment variable 'trt' to numeric binary (0/1). 'Active' = 0, 'Control' = 1
#> Note: Treatment 'trt' automatically added to unshrunk terms.
#> Note: Applied one-hot encoding to shrunken factor 'age_cat' (will be used with ~ 0 + ...)
#> Note: Applied one-hot encoding to shrunken factor 'region' (will be used with ~ 0 + ...)
#> Note: Applied one-hot encoding to shrunken factor 'biomarker' (will be used with ~ 0 + ...)
#> Note: Applied dummy encoding (contr.treatment) to unshrunken factor 'age_cat'
#> Note: Applied dummy encoding (contr.treatment) to unshrunken factor 'region'
#> Note: Applied dummy encoding (contr.treatment) to unshrunken factor 'biomarker'
#> DEBUG: Creating sub-formulas...
#> - all_unshrunk_terms: age_cat, region, biomarker, trt
#> - shrunk_prog_terms:
#> - shrunk_pred_formula: trt:age_cat + trt:region + trt:biomarker
#> DEBUG: Final formula object:
#> outcome ~ unshrunktermeffect + shpredeffect
#> unshrunktermeffect ~ age_cat + region + biomarker + trt
#> shpredeffect ~ 0 + trt:age_cat + trt:region + trt:biomarker
#>
#> Step 2: Fitting the brms model...
#> Using trt_var from prepared_model: trt
#> Using sigma_ref = 3
#> Computed outcome mean: -0.027
#> Using default priors for unspecified effects:
#> - intercept: normal(-0.027484445779618, 15)
#> - unshrunk terms: normal(0, 15)
#> - shrunk predictive: horseshoe(1)
#> Adding sigma prior: normal(0, 3)
#> Fitting brms model...
#> Compiling Stan program...
#> Trying to compile a simple C file
#> Running /usr/lib/R/bin/R CMD SHLIB foo.c
#> using C compiler: ‘gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0’
#> gcc -I"/usr/share/R/include" -DNDEBUG -I"/usr/local/lib/R/site-library/Rcpp/include/" -I"/usr/local/lib/R/site-library/RcppEigen/include/" -I"/usr/local/lib/R/site-library/RcppEigen/include/unsupported" -I"/usr/local/lib/R/site-library/BH/include" -I"/home/pedreram/R/x86_64-pc-linux-gnu-library/4.4/StanHeaders/include/src/" -I"/home/pedreram/R/x86_64-pc-linux-gnu-library/4.4/StanHeaders/include/" -I"/usr/local/lib/R/site-library/RcppParallel/include/" -I"/home/pedreram/R/x86_64-pc-linux-gnu-library/4.4/rstan/include" -DEIGEN_NO_DEBUG -DBOOST_DISABLE_ASSERTS -DBOOST_PENDING_INTEGER_LOG2_HPP -DSTAN_THREADS -DUSE_STANC3 -DSTRICT_R_HEADERS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -D_HAS_AUTO_PTR_ETC=0 -include '/home/pedreram/R/x86_64-pc-linux-gnu-library/4.4/StanHeaders/include/stan/math/prim/fun/Eigen.hpp' -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1 -fpic -g -O2 -ffile-prefix-map=/build/r-base-JpkSDg/r-base-4.4.3=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -c foo.c -o foo.o
#> In file included from /usr/local/lib/R/site-library/RcppEigen/include/Eigen/Core:19,
#> from /usr/local/lib/R/site-library/RcppEigen/include/Eigen/Dense:1,
#> from /home/pedreram/R/x86_64-pc-linux-gnu-library/4.4/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22,
#> from <command-line>:
#> /usr/local/lib/R/site-library/RcppEigen/include/Eigen/src/Core/util/Macros.h:679:10: fatal error: cmath: No such file or directory
#> 679 | #include <cmath>
#> | ^~~~~~~
#> compilation terminated.
#> make: *** [/usr/lib/R/etc/Makeconf:195: foo.o] Error 1
#> Start sampling
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 7.6e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.76 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
#> Chain 1: Iteration: 100 / 1000 [ 10%] (Warmup)
#> Chain 1: Iteration: 200 / 1000 [ 20%] (Warmup)
#> Chain 1: Iteration: 300 / 1000 [ 30%] (Warmup)
#> Chain 1: Iteration: 400 / 1000 [ 40%] (Warmup)
#> Chain 1: Iteration: 500 / 1000 [ 50%] (Warmup)
#> Chain 1: Iteration: 501 / 1000 [ 50%] (Sampling)
#> Chain 1: Iteration: 600 / 1000 [ 60%] (Sampling)
#> Chain 1: Iteration: 700 / 1000 [ 70%] (Sampling)
#> Chain 1: Iteration: 800 / 1000 [ 80%] (Sampling)
#> Chain 1: Iteration: 900 / 1000 [ 90%] (Sampling)
#> Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 1.05 seconds (Warm-up)
#> Chain 1: 0.986 seconds (Sampling)
#> Chain 1: 2.036 seconds (Total)
#> Chain 1:
#>
#> SAMPLING FOR MODEL 'anon_model' NOW (CHAIN 2).
#> Chain 2:
#> Chain 2: Gradient evaluation took 3.6e-05 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.36 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2:
#> Chain 2:
#> Chain 2: Iteration: 1 / 1000 [ 0%] (Warmup)
#> Chain 2: Iteration: 100 / 1000 [ 10%] (Warmup)
#> Chain 2: Iteration: 200 / 1000 [ 20%] (Warmup)
#> Chain 2: Iteration: 300 / 1000 [ 30%] (Warmup)
#> Chain 2: Iteration: 400 / 1000 [ 40%] (Warmup)
#> Chain 2: Iteration: 500 / 1000 [ 50%] (Warmup)
#> Chain 2: Iteration: 501 / 1000 [ 50%] (Sampling)
#> Chain 2: Iteration: 600 / 1000 [ 60%] (Sampling)
#> Chain 2: Iteration: 700 / 1000 [ 70%] (Sampling)
#> Chain 2: Iteration: 800 / 1000 [ 80%] (Sampling)
#> Chain 2: Iteration: 900 / 1000 [ 90%] (Sampling)
#> Chain 2: Iteration: 1000 / 1000 [100%] (Sampling)
#> Chain 2:
#> Chain 2: Elapsed Time: 1.059 seconds (Warm-up)
#> Chain 2: 0.77 seconds (Sampling)
#> Chain 2: 1.829 seconds (Total)
#> Chain 2:
#> Warning: There were 2 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
#>
#> Analysis complete.

# 3. Derive Marginal Treatment Effects
subgroup_effects <- summary_subgroup_effects(
brms_fit = fit
)
#> Using trt_var from model attributes: trt
#> Using response_type from model attributes: continuous
#> --- Calculating specific subgroup effects... ---
#> Using data from model attributes
#> Step 1: Identifying subgroups and creating counterfactuals...
#> `subgroup_vars` set to 'auto'. Detecting from model...
#> All coefficient names:
#> unshrunktermeffect_Intercept
#> unshrunktermeffect_age_cat>EQ65
#> unshrunktermeffect_regionEU
#> unshrunktermeffect_regionUS
#> unshrunktermeffect_biomarkerPos
#> unshrunktermeffect_trt
#> shpredeffect_trt:age_cat<65
#> shpredeffect_trt:age_cat>EQ65
#> shpredeffect_trt:regionEU
#> shpredeffect_trt:regionUS
#> shpredeffect_trt:biomarkerPos
#> Looking for treatment interactions with pattern: 'trt:'
#> Found 5 treatment interaction coefficients
#> Treatment interaction coefficients found:
#> shpredeffect_trt:age_cat<65
#> shpredeffect_trt:age_cat>EQ65
#> shpredeffect_trt:regionEU
#> shpredeffect_trt:regionUS
#> shpredeffect_trt:biomarkerPos
#> Detected subgroup variable 'age_cat' from coefficient 'shpredeffect_trt:age_cat<65'
#> Detected subgroup variable 'age_cat' from coefficient 'shpredeffect_trt:age_cat>EQ65'
#> Detected subgroup variable 'region' from coefficient 'shpredeffect_trt:regionEU'
#> Detected subgroup variable 'region' from coefficient 'shpredeffect_trt:regionUS'
#> Detected subgroup variable 'biomarker' from coefficient 'shpredeffect_trt:biomarkerPos'
#> ...detected subgroup variable(s): age_cat, region, biomarker
#> Step 2: Generating posterior predictions...
#> ... detected Fixed Effects (Colon model). Predicting with re_formula = NA.
#> ... (predicting expected outcomes)...
#> Step 3: Calculating marginal effects...
#> ... processing age_cat
#> ... processing region
#> ... processing biomarker
#> Done.
```

``` r
# Print summary
print(subgroup_effects)
#> $estimates
#> # A tibble: 7 × 4
#> Subgroup Median CI_Lower CI_Upper
#> <chr> <dbl> <dbl> <dbl>
#> 1 age_cat: <65 0.128 -0.241 0.516
#> 2 age_cat: >=65 -0.184 -0.566 0.219
#> 3 region: Asia -0.109 -0.544 0.366
#> 4 region: EU -0.349 -0.828 0.101
#> 5 region: US 0.387 -0.0988 0.918
#> 6 biomarker: Neg -0.0508 -0.426 0.258
#> 7 biomarker: Pos -0.00256 -0.326 0.346
#>
#> $response_type
#> [1] "continuous"
#>
#> $ci_level
#> [1] 0.95
#>
#> $trt_var
#> [1] "trt"
#>
#> attr(,"class")
#> [1] "subgroup_summary"

# 4. Visualize Results
plot(subgroup_effects)
Expand All @@ -213,4 +96,4 @@ plot(subgroup_effects)
#> Done.
```

<img src="man/figures/README-example-1-1.png" width="100%" />
<img src="man/figures/README-unnamed-chunk-3-1.png" width="100%" />
11 changes: 10 additions & 1 deletion vignettes/Advanced_Functionalities.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -84,7 +84,7 @@ The choice of prior is central to Bayesian shrinkage. `bonsaiforest2` provides s
Priors are specified using separate parameters for each component, with a **required** `sigma_ref` parameter that serves as the reference scale for all prior distributions.

| Prior Component | Parameter Name | Default Prior | Notes |
|:-----------------|:-----------------|:-----------------|:-----------------|
|:---|:---|:---|:---|
| **Reference Scale** | `sigma_ref` | **REQUIRED** | Must be user-specified: `sd(outcome)` for continuous/count, typically `1` for binary/survival |
| **Intercept** | `intercept_prior` | `normal(mean(outcome), 5*sigma_ref)` | Automatically centered at observed mean |
| **Unshrunk Terms** | `unshrunk_prior` | `normal(0, 5*sigma_ref)` | Weakly informative |
Expand Down Expand Up @@ -121,6 +121,15 @@ sim_data <- data.frame(
sim_data$trt <- factor(sim_data$trt, levels = c(0, 1))
sim_data$region <- as.factor(sim_data$region)
sim_data$subgroup <- as.factor(sim_data$subgroup)

# 2. Prepare the model formula
prepared_model <- prepare_formula_model(
data = sim_data,
response_formula = Surv(time, status) ~ trt,
shrunk_predictive_formula = ~ 0 + trt:region,
unshrunk_terms_formula = ~ age,
response_type = "survival"
)
```

### Example 2: Using Default Priors (Recommended)
Expand Down
Loading