This repository contains instructions for reproducing the figures from the ALPACA manuscript.
Note that the reproduced figures differ slightly from the published ones in shapes, colours, layout and style, because many of the published figures were enhanced after generation.
To test the ALPACA model, follow the instructions at https://github.com/McGranahanLab/ALPACA-model
To reproduce the figures follow these steps:
- Open the terminal and choose a directory, for example:
cd GitHub
- If you don't have git installed, follow the instructions here: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git
- Clone the repository:
git clone https://github.com/McGranahanLab/ALPACA-paper.git
- Navigate to the ALPACA-paper directory:
cd ALPACA-paper
- Download the data package (~670 MB) from https://zenodo.org/records/15519765 using curl:
curl -O https://zenodo.org/records/15519765/files/_assets.tar.gz
curl -O https://zenodo.org/records/15519765/files/output.tar.gz
or
wget https://zenodo.org/records/15519765/files/_assets.tar.gz
wget https://zenodo.org/records/15519765/files/output.tar.gz
You can also download the files with a web browser and place them in the ALPACA-paper directory.
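An interrupted download is a common cause of extraction errors, so it can be worth checking both archives before unpacking them. The following is a minimal sketch, not part of the repository: `check_archive` is a hypothetical helper that only confirms the file is non-empty and that tar can list its contents. It does not verify checksums.

```shell
# Hypothetical helper (not part of ALPACA-paper): succeed only if the
# archive exists, is non-empty, and can be listed by tar without errors.
check_archive() {
    if [ ! -s "$1" ]; then
        echo "missing or empty: $1" >&2
        return 1
    fi
    # -t lists the archive contents without extracting anything
    if ! tar -tf "$1" > /dev/null 2>&1; then
        echo "not a readable tar archive: $1" >&2
        return 1
    fi
    echo "ok: $1"
}

# Example usage, run from the ALPACA-paper directory:
# check_archive _assets.tar.gz && check_archive output.tar.gz
```

If either check fails, downloading the affected file again is usually the simplest fix.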
- Extract the data to the ALPACA-paper directory
tar -xvf _assets.tar.gz
tar -xvf output.tar.gz
If the tar command is not available, you can unpack the archives using 7-Zip (https://www.7-zip.org)
- Install conda if you don't already have it:
See instructions at: https://conda.io/projects/conda/en/latest/user-guide/install/index.html
- Create the conda environment from the supplied YAML file:
conda env create -f alpaca_figures.yml
- If creating the environment from the YAML file does not work, create a new environment and install the required packages and libraries manually (or use the Docker image, as described further down):
conda create -n alpaca_figures python=3.8 r-essentials --channel conda-forge
conda activate alpaca_figures
# packages/libraries below are required to reproduce the figures:
pip install papermill
pip install jupyterlab
pip install seaborn
pip install plotly
pip install pandas
pip install kaleido
pip install networkx
conda install conda-forge::r-data.table
conda install conda-forge::r-dplyr
conda install conda-forge::r-ggpubr
conda install conda-forge::r-survminer
conda install conda-forge::r-survival
conda install conda-forge::r-optparse
conda install bioconda::bioconductor-genomicranges
conda install conda-forge::r-lmertest
# start R and install the three remaining libraries not available via conda:
R # start R
install.packages("igraph")
install.packages("tidytext")
install.packages("ggridges")
q() # quit R
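After a manual installation it can be useful to confirm that the main command-line tools are on the PATH before running anything. The sketch below is an illustration only: `missing_tools` is a hypothetical helper, and the tool names in the usage line (`papermill`, `jupyter`, `R`, `Rscript`) are assumptions about what the packages above install. It does not check Python or R library imports.

```shell
# Hypothetical helper (not part of ALPACA-paper): print every tool that
# is not on PATH; the exit status is 0 only if nothing is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example usage (tool names are assumptions):
# missing_tools papermill jupyter R Rscript || echo "install the tools listed above" >&2
```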
- Activate the environment:
conda activate alpaca_figures
- To reproduce the figures, navigate to the project directory and execute the script below. Make sure that the alpaca_figures environment is active. Figures will be placed in the ALPACA-paper/figures directory.
cd ALPACA-paper
./run_all_figures.sh
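Running the script in the wrong environment tends to fail partway through with missing-package errors. As a hedged sketch, the check below relies on `CONDA_DEFAULT_ENV`, which conda sets to the name of the active environment (or to its path if the environment was activated by path). `require_conda_env` is a hypothetical helper, not part of the repository.

```shell
# Hypothetical helper (not part of ALPACA-paper): succeed only if the
# named conda environment is the active one.
require_conda_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" != "$1" ]; then
        echo "expected conda environment '$1', found '${CONDA_DEFAULT_ENV:-none}'" >&2
        return 1
    fi
}

# Example usage:
# require_conda_env alpaca_figures && ./run_all_figures.sh
```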
- Alternatively, use the Docker image. This procedure requires approximately 10 GB of disk space to install a containerised Linux distribution.
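The available disk space can be checked from the terminal before installing anything. This is a rough sketch: `free_gb` is a hypothetical helper, and it reports free space on the filesystem holding the given directory, which may not be the location where Docker stores its images on your system.

```shell
# Hypothetical helper (not part of ALPACA-paper): print the free space,
# in whole gigabytes, on the filesystem that holds the given directory
# (defaults to the current directory).
free_gb() {
    # df -P uses the portable POSIX layout; -k reports 1024-byte blocks.
    # Field 4 of the second output line is the available space.
    df -Pk "${1:-.}" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Example usage (10 is the approximate requirement quoted above):
# [ "$(free_gb .)" -ge 10 ] || echo "less than 10 GB free" >&2
```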
- Install Docker and run Docker Desktop app: https://docs.docker.com/desktop/install/
- Navigate to the project directory:
cd ALPACA-paper
- Build the image:
docker build -f Dockerfile -t alpaca_container .
- Once the build has completed (~10 minutes), run the container:
docker run -it --name alpaca_test alpaca_container /bin/bash
- You should now be within the containerised system and your terminal prompt should look similar to:
root@7e2a1eb2227a:/app#
- To activate the conda virtual environment run the following commands:
conda init
source /root/.bashrc
conda activate alpaca
- To recreate the figures run:
./run_all_figures.sh
- To inspect the model results and the figures, export them from the container. First, exit the container by typing:
exit
- You should be back in the project root directory 'ALPACA-paper'. Copy the example output and figures with the following commands:
mkdir -p figures
docker cp alpaca_test:/app/figures .
mkdir -p output
docker cp alpaca_test:/app/output/example_cohort ./output
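To confirm that the copy produced something, the number of files in the exported directories can be counted. The sketch below uses a hypothetical helper, `count_files`, which is not part of the repository. It only counts regular files and says nothing about whether the figures themselves are complete.

```shell
# Hypothetical helper (not part of ALPACA-paper): count regular files
# under a directory, including its subdirectories.
count_files() {
    # tr strips the padding that some wc implementations add
    find "$1" -type f | wc -l | tr -d ' '
}

# Example usage, run from the ALPACA-paper directory:
# [ "$(count_files figures)" -gt 0 ] || echo "no figures were copied" >&2
```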
- Stop and remove the container, then remove the image from your system with:
docker stop alpaca_test
docker rm alpaca_test
docker rmi alpaca_container
Docker images and conda environments might not work on Apple M1/M2 chips. In that case, install all the packages and libraries from the command line or from R. Make sure to install Python 3.9 and gurobipy 11.
Python:
pip install pandas
pip install kneed
pip install gurobipy==11
# packages/libraries below are required to reproduce the figures:
pip install papermill
pip install jupyterlab
pip install seaborn
pip install plotly
pip install pandas
pip install kaleido
pip install networkx
R:
data.table
dplyr
ggpubr
survminer
survival
optparse
GenomicRanges
lmerTest
igraph
tidytext
ggridges
This procedure has been tested in Linux (CentOS Linux 7, Linux 3.10.0-1160.62.1.el7.x86_64) and macOS (Sonoma 14.4.1) environments. For the full list of dependencies, please see the alpaca_figures.yml file. A test run, including downloads, environment creation and figure generation, takes approximately 1-2 hours on a standard laptop.