philfung/openvaxx

Contributions Last Commit License: MIT

💉 OpenVAXX: A guide to producing a personalized mRNA cancer vaccine

From biopsy to syringe: this is an end-to-end overview of how to synthesize a personalized mRNA cancer vaccine in a private lab. It focuses on open-source, state-of-the-art software tools paired with "best-tool-for-the-job" benchtop lab equipment.

Caution

⚠️ RESEARCH & EDUCATION USE ONLY. NOT MEDICAL ADVICE. This is a reference for educational purposes. Building mRNA vaccines involves severe biological hazards, requiring strict oversight and qualified personnel. The authors assume no liability for misuse. Do not attempt any part of this workflow.

Try the Interactive Guide

  • Contributing: Open to contributors.
  • Feature Requests: Please open a GitHub issue.

System Architecture

This pipeline is divided into two consecutive halves:

  1. Data to Blueprint: Ingests raw sequencing data, utilizes neural networks to identify immunogenic targets, and compiles a stabilized digital mRNA sequence.
  2. Blueprint to Vial: Converts the digital .FASTA sequence into physical DNA, automates In Vitro Transcription (IVT), and formulates the final LNP drug product.

Workflow, Part 1: Upstream Digital Pipeline ("Data to Blueprint")

Step 1: Reading the Blueprint

Goal: Convert samples into raw, unassembled sequence reads to establish a healthy baseline and identify anomalies.

  • Hardware: Illumina NextSeq 2000 or Element AVITI
  • Alt. (Outsourced): Novogene, Azenta, Eurofins
  • Est. Cost: ~$300k fixed + ~$1k / pt (In-House) or ~$2.5k / pt (Outsourced)
  • Inputs:
    • Tumor biopsy - at least 35 mg of tissue
    • Normal blood (healthy baseline) - standard 4 mL EDTA tube
  • Process: The machine reads extracted DNA/RNA, turning biological chemistry into digital text.
  • Outputs:
    1. baseline-normal.[FASTQ](https://en.wikipedia.org/wiki/FASTQ_format) - Normal blood Whole Exome Sequencing (~30X-50X)
    2. tumor-exome.[FASTQ](https://en.wikipedia.org/wiki/FASTQ_format) - Tumor biopsy Whole Exome Sequencing (~100X-500X)
    3. tumor-rna.[FASTQ](https://en.wikipedia.org/wiki/FASTQ_format) - Tumor biopsy RNA-Seq (~50M-100M reads).
    4. tumor-rna-quantification.tsv - Tumor gene expression levels. Made using Salmon or Kallisto on tumor-rna.FASTQ.
    5. [patient-hla.txt](https://support.illumina.com/content/dam/illumina-support/help/BaseSpace_App_WGS_v6_OLH_15050955_03/Content/Source/Informatics/Apps/HLATypingFormat_appISCWGS.htm#) - Patient HLA profile (MHC Class I & II typing). Made using OptiType or HLA-HD on baseline-normal.FASTQ.
  • File Format: .[FASTQ](https://en.wikipedia.org/wiki/FASTQ_format) & .txt
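
The depth targets above (~30X-50X normal, ~100X-500X tumor) can be sanity-checked with simple arithmetic: mean depth is roughly the number of usable sequenced bases divided by the size of the targeted region. The sketch below is only an illustration; the read count, read length, 35 Mb exome target size, and on-target fraction are hypothetical values, not figures from this guide.

```python
def mean_coverage(read_pairs: int, read_length: int,
                  target_bases: int, on_target_fraction: float = 1.0) -> float:
    """Approximate mean depth = usable sequenced bases / target size."""
    if target_bases <= 0:
        raise ValueError("target_bases must be positive")
    total_bases = read_pairs * 2 * read_length  # paired-end: two reads per pair
    return total_bases * on_target_fraction / target_bases


# Hypothetical run: 40M read pairs, 2x150 bp, 35 Mb exome target, 60% on target.
depth = mean_coverage(40_000_000, 150, 35_000_000, on_target_fraction=0.6)
print(f"Estimated mean depth: {depth:.0f}X")
```

Under these assumed inputs the estimate lands inside the tumor exome range listed above; real depth depends on duplication rate and capture efficiency, which this sketch ignores.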

Step 2: Spotting the Typos

Goal: Compare healthy code against tumor code to isolate tumor-specific (somatic) mutations.

  • Hardware: None
  • Software: Take the consensus of calls from multiple open-source somatic variant callers:
    1. GATK Mutect2
    2. Google's DeepSomatic
    3. Illumina's Strelka2
  • Inputs: baseline-normal.[FASTQ](https://en.wikipedia.org/wiki/FASTQ_format), tumor-exome.[FASTQ](https://en.wikipedia.org/wiki/FASTQ_format), Human Reference Genome (.[FASTA](https://en.wikipedia.org/wiki/FASTA_format))
  • Process: Aligns reads and mathematically subtracts healthy DNA from tumor DNA to isolate somatic mutations.
  • Outputs:
    1. somatic-variants.[VCF](https://en.wikipedia.org/wiki/Variant_Call_Format) - All raw mutation candidates
    2. filtered-variants.[VCF](https://en.wikipedia.org/wiki/Variant_Call_Format) - High-confidence, tumor-only mutations
  • File Format: .[VCF](https://en.wikipedia.org/wiki/Variant_Call_Format)
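
One simple way to take a consensus across callers is to keep only variants reported by at least N of them. The sketch below assumes each caller's output has already been normalized to plain VCF text with matching chromosome naming; the function names and toy records are illustrative, not part of any of the tools listed above.

```python
from collections import Counter


def variant_keys(vcf_text: str) -> set:
    """Extract (chrom, pos, ref, alt) keys from VCF text, skipping header lines."""
    keys = set()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alts = line.split("\t")[:5]
        for alt in alts.split(","):  # split multi-allelic records
            keys.add((chrom, int(pos), ref, alt))
    return keys


def consensus(callsets: dict, min_callers: int = 2) -> list:
    """Return variants reported by at least `min_callers` callers, sorted."""
    counts = Counter()
    for keys in callsets.values():
        counts.update(keys)
    return sorted(k for k, n in counts.items() if n >= min_callers)


# Toy records (hypothetical), one string per caller.
calls = {
    "mutect2": variant_keys("#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tT\nchr2\t200\t.\tG\tC\n"),
    "deepsomatic": variant_keys("chr1\t100\t.\tA\tT\nchr3\t300\t.\tC\tA\n"),
    "strelka": variant_keys("chr1\t100\t.\tA\tT\nchr2\t200\t.\tG\tC\n"),
}
two_of_three = consensus(calls, min_callers=2)
all_three = consensus(calls, min_callers=3)
print(two_of_three)
print(all_three)
```

A stricter threshold trades sensitivity for confidence; a production pipeline would also left-align and normalize indels before comparing keys.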

Step 3: Picking the Targets

Goal: Use AI to predict which mutations the immune system will recognize as a threat.

  • Hardware: None
  • Software: Run nextNEOpi (open-source neoantigen prediction pipeline) with one or more peptide-MHC binding prediction tools:
    1. MHCflurry (open-source)
    2. NetMHCpan (free for academic use; commercial use requires a license)
  • Inputs: filtered-variants.[VCF](https://en.wikipedia.org/wiki/Variant_Call_Format), [Patient HLA profile](https://support.illumina.com/content/dam/illumina-support/help/BaseSpace_App_WGS_v6_OLH_15050955_03/Content/Source/Informatics/Apps/HLATypingFormat_appISCWGS.htm#) (.txt), tumor-rna-quantification.tsv - Filter candidates by expression level
  • Process: Neural networks predict which mutated peptides are most likely to be presented by the patient's HLA molecules and trigger an immune response.
  • Outputs: [ranked-predictions.tsv](https://pvactools.readthedocs.io/en/7.0.0_docs/pvacseq/output_files.html) - Leaderboard of best targets
  • File Format: .tsv
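
The filtering and ranking idea can be sketched as follows: drop candidates with weak predicted binding or low tumor expression, then sort the rest. The column names (`peptide`, `ic50_nm`, `tpm`), the 500 nM and 1 TPM cutoffs, and the toy rows are assumptions for illustration only; they are not the actual nextNEOpi output schema.

```python
import csv
import io


def rank_candidates(tsv_text: str, max_ic50_nm: float = 500.0,
                    min_tpm: float = 1.0) -> list:
    """Keep strong predicted binders that are expressed, best binder first."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    parsed = [
        {"peptide": r["peptide"], "ic50_nm": float(r["ic50_nm"]), "tpm": float(r["tpm"])}
        for r in rows
    ]
    kept = [r for r in parsed if r["ic50_nm"] <= max_ic50_nm and r["tpm"] >= min_tpm]
    # Lower IC50 means stronger predicted binding; break ties by higher expression.
    return sorted(kept, key=lambda r: (r["ic50_nm"], -r["tpm"]))


# Hypothetical predictions with placeholder peptide names.
toy = (
    "peptide\tic50_nm\ttpm\n"
    "pep_A\t45.0\t12.3\n"
    "pep_B\t800.0\t50.0\n"   # dropped: weak predicted binding
    "pep_C\t120.0\t0.2\n"    # dropped: barely expressed
    "pep_D\t120.0\t8.0\n"
)
ranked = rank_candidates(toy)
print([r["peptide"] for r in ranked])
```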

Step 4: Writing the New Code

Goal: Compile the top predicted targets into a printable digital blueprint.

  • Hardware: None
  • Software:
    1. Generate protein string: NeoDesign or pVACvector
    2. Generate multiple candidate mRNA sequences: mRNAfold
    3. Select best mRNA sequence: mRNABERT
  • Inputs: Top targets from [ranked-predictions.tsv](https://pvactools.readthedocs.io/en/7.0.0_docs/pvacseq/output_files.html)
  • Process: Organize the cancer markers into a safe, logical order and then translate those instructions into a highly stable genetic "recipe."
  • Outputs: [vaccine-construct.fa](https://en.wikipedia.org/wiki/FASTA_format) - Master mRNA sequence
  • File Format: .fa
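
As a small illustration of the output format, the sketch below writes a sequence as a FASTA record with wrapped lines and reports its GC fraction, a quick check commonly run on designed sequences. The record name and the toy sequence are placeholders, not a real construct, and this does not reproduce what mRNAfold or mRNABERT do.

```python
def to_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Format one FASTA record: a '>' header line, then fixed-width sequence lines."""
    seq = "".join(sequence.split()).upper()
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(seq.count(base) for base in "GC") / len(seq)


# Placeholder sequence (hypothetical), 90 nt long.
toy_seq = "ATGGCGTAA" * 10
record = to_fasta("toy-construct", toy_seq)
gc = gc_fraction(toy_seq)
print(record, end="")
print(f"GC fraction: {gc:.2f}")
```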

Workflow, Part 2: Downstream Physical Pipeline ("Blueprint to Vial")

Step 5: Printing the Master Copy

Goal: Convert the digital blueprint back into a physical, readable linear DNA template.

  • Hardware: Benchtop DNA Synthesizer (e.g., Telesis Bio BioXp)
  • Alt. (Outsourced): Twist, IDT, GenScript, Azenta
  • Est. Cost: ~$100k fixed + ~$600 / rxn (In-House) or ~$200-$900 / rxn (Outsourced)
  • Inputs:
    • [vaccine-construct.fa](https://en.wikipedia.org/wiki/FASTA_format) blueprint
    • Reagents - Oligonucleotides, BspQI restriction enzymes, AMPure XP purification beads (cell-free route) or competent E. coli cells, LB media, miniprep kit (plasmid route)
  • Process: Two synthesis routes are available - choose one:
    1. Cell-Free / Linear (recommended for speed): The BioXp system prints the DNA template directly from the digital sequence.
    2. Plasmid-Based (traditional): Gibson Assembly stitches oligonucleotides into a DNA plasmid, which is then linearized with enzymes.
  • Outputs: ~1.5 mL Purified linear DNA template (~75 µg)
  • File Format: Liquid DNA

Step 6: Creating the mRNA

Goal: Transcribe DNA into functional, immune-cloaked mRNA.

  • Hardware: Telesis Bio BioXp
  • Alt. (Outsourced): TriLink, GenScript, BiCell Scientific
  • Est. Cost: ~$250k fixed + ~$2k / rxn (In-House) or ~$1k-$3k / rxn (Outsourced)
  • Inputs:
    • ~1.5 mL Purified linear DNA template (~75 µg)
    • IVT Reagents (RNA Polymerase, N1-methylpseudouridine, CleanCap AG)
  • Process: Automated In Vitro Transcription (IVT) systems synthesize the mRNA strand from the DNA template. The process includes:
    1. DNase I digestion to remove the template
    2. multi-stage purification (e.g., magnetic beads or HPLC) to isolate pure, functional mRNA.
  • Outputs: ~5.0 mL Highly pure mRNA (~1.0 mg)
  • File Format: Liquid mRNA

Step 7: Packaging for Delivery

Goal: Wrap mRNA in a protective lipid nanoparticle to allow human cell entry.

  • Hardware: Unchained Labs Sunshine / NanoAssemblr Ignite
  • Alt. (Outsourced): VectorBuilder, Lonza, Vernal Biosciences
  • Est. Cost: ~$150k fixed + ~$500 / rxn (In-House) or ~$2k-$5k / rxn (Outsourced)
  • Inputs:
    • ~5.0 mL Highly pure mRNA (~1.0 mg)
    • 4-Lipid Cocktail (ALC-0315, PEG-Lipid, DSPC, Cholesterol)
  • Process: Rapid microfluidic mixing drives the negatively charged mRNA and the ionizable lipids (which carry a positive charge under acidic conditions) to self-assemble into nanoparticles.
  • Outputs: ~12 mL Raw mRNA-LNP mixture (~0.9 mg encapsulated)
  • File Format: LNP Mixture
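
The volumes and masses quoted in Steps 5 through 7 imply a few derived numbers that are easy to check: concentration is mass divided by volume, and encapsulation efficiency is encapsulated mass divided by input mass. The sketch below uses only the approximate figures listed in this guide; the variable names are illustrative.

```python
def concentration_ug_per_ml(mass_ug: float, volume_ml: float) -> float:
    """Concentration in µg/mL from mass (µg) and volume (mL)."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return mass_ug / volume_ml


# Approximate figures from Steps 5-7 of this guide.
dna_conc = concentration_ug_per_ml(75.0, 1.5)      # DNA template
mrna_conc = concentration_ug_per_ml(1000.0, 5.0)   # purified mRNA (1.0 mg)
lnp_conc = concentration_ug_per_ml(900.0, 12.0)    # encapsulated mRNA in raw LNP mix
encapsulation_efficiency = 900.0 / 1000.0          # encapsulated / input mRNA

print(f"DNA template: {dna_conc:.0f} µg/mL")
print(f"mRNA: {mrna_conc:.0f} µg/mL")
print(f"LNP mixture: {lnp_conc:.0f} µg/mL encapsulated")
print(f"Encapsulation efficiency: {encapsulation_efficiency:.0%}")
```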

Step 8: Quality Check and Bottling

Goal: Validate integrity, size, and concentration before finalizing for injection.

  • Hardware: Unchained Labs Stunner & TFF System
  • Alt. (Outsourced): CordenPharma, uBriGene, VectorBuilder
  • Est. Cost: ~$100k fixed + ~$100 / rxn (In-House) or ~$1k-$3k / rxn (Outsourced)
  • Inputs: ~12 mL Raw mRNA-LNP mixture
  • Process: Dynamic light scattering (DLS) verifies that particle size falls within the 60-100 nm range. Tangential flow filtration (TFF) washes out residual ethanol.
  • Outputs: 10 x 1.0 mL sterile glass vials (approx. 10 doses)
  • File Format: Final Vaccine Product
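
The per-step cost estimates above can be totaled to compare the in-house and outsourced routes. The sketch below sums the figures listed for Steps 1, 5, 6, 7, and 8 under two assumptions that the guide does not state: one reaction per patient at each step, and no shared equipment between steps (Steps 5 and 6 both list the BioXp, so the fixed total may be overstated).

```python
# (fixed_usd, in_house_per_run_usd, outsourced_low_usd, outsourced_high_usd)
# Figures copied from the "Est. Cost" lines of each hardware step.
STEP_COSTS = {
    "1 Sequencing":    (300_000, 1_000, 2_500, 2_500),
    "5 DNA template":  (100_000,   600,   200,   900),
    "6 mRNA (IVT)":    (250_000, 2_000, 1_000, 3_000),
    "7 LNP packaging": (150_000,   500, 2_000, 5_000),
    "8 QC / bottling": (100_000,   100, 1_000, 3_000),
}

fixed_total = sum(c[0] for c in STEP_COSTS.values())
in_house_per_patient = sum(c[1] for c in STEP_COSTS.values())
outsourced_low = sum(c[2] for c in STEP_COSTS.values())
outsourced_high = sum(c[3] for c in STEP_COSTS.values())

print(f"In-house: ~${fixed_total:,} fixed + ~${in_house_per_patient:,} per patient")
print(f"Outsourced: ~${outsourced_low:,} to ~${outsourced_high:,} per patient")
```

These totals exclude the software-only steps (2 through 4), which list no hardware cost, as well as labor, facilities, and consumables not itemized above.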

Web App

You can explore the system architecture interactively through our Flutter-based web app.

Running it locally

The interactive workflow is a Flutter application. To run it:

  1. Navigate to the directory:
    cd flutter_website
  2. Install dependencies:
    flutter pub get
  3. Start the development server:
    flutter run -d chrome
  4. Or, to build for production (append --wasm for a WebAssembly build):
    flutter build web --release --base-href "/openvaxx/"
    The website will be available in the build/web folder.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
