From biopsy to syringe: an end-to-end overview of how to synthesize a personalized mRNA cancer vaccine in a private lab. It focuses on open-source, state-of-the-art software tools paired with "best-tool-for-the-job" benchtop lab equipment.
Caution
- Contributing: Open to contributors.
- Feature Requests: Please open a GitHub issue.
- System Architecture
- Workflow, Part 1: Upstream Digital Pipeline ("Data to Blueprint")
- Workflow, Part 2: Downstream Physical Pipeline ("Blueprint to Vial")
- Web App
This pipeline is divided into two continuous halves:
- Data to Blueprint: Ingests raw sequencing data, utilizes neural networks to identify immunogenic targets, and compiles a stabilized digital mRNA sequence.
- Blueprint to Vial: Converts the digital .FASTA sequence into physical DNA, automates In Vitro Transcription (IVT), and formulates the final LNP drug product.
Goal: Convert tissue samples into raw sequence reads, establishing a healthy baseline against which tumor anomalies can be identified.
- Hardware: Illumina NextSeq 2000 or Element AVITI
- Alt. (Outsourced): Novogene, Azenta, Eurofins
- Est. Cost: ~$300k fixed + ~$1k / pt (In-House) or ~$2.5k / pt (Outsourced)
- Inputs:
- Tumor biopsy - at least 35 mg of tissue
- Normal blood (healthy baseline) - standard 4 mL EDTA tube
- Process: The machine reads extracted DNA/RNA, turning biological chemistry into digital text.
- Outputs:
- baseline-normal.[FASTQ](https://en.wikipedia.org/wiki/FASTQ_format) - Normal blood Whole Exome Sequencing (~30X-50X)
- tumor-exome.[FASTQ](https://en.wikipedia.org/wiki/FASTQ_format) - Tumor biopsy Whole Exome Sequencing (~100X-500X)
- tumor-rna.[FASTQ](https://en.wikipedia.org/wiki/FASTQ_format) - Tumor biopsy RNA-Seq (~50M-100M reads)
- tumor-rna-quantification.tsv - Tumor gene expression levels. Made using Salmon / Kallisto on the FASTQ file.
- [patient-hla.txt](https://support.illumina.com/content/dam/illumina-support/help/BaseSpace_App_WGS_v6_OLH_15050955_03/Content/Source/Informatics/Apps/HLATypingFormat_appISCWGS.htm#) - Patient HLA profile (MHC Class I & II typing). Made using OptiType or HLA-HD on baseline-normal.FASTQ
- File Format: .[FASTQ](https://en.wikipedia.org/wiki/FASTQ_format), .tsv & .txt
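The depth targets listed above (~30X-50X, ~100X-500X) relate to read counts by simple arithmetic: mean depth is roughly the number of sequenced bases divided by the size of the targeted region. The Python sketch below illustrates that relationship only. The 150 bp read length and ~35 Mb exome target are illustrative assumptions, not values specified by this pipeline, and the estimate ignores duplicates and off-target reads.

```python
import gzip


def count_fastq_reads(path):
    """Count records in a FASTQ file (4 lines per record); handles .gz files."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return sum(1 for _ in handle) // 4


def mean_depth(n_reads, read_length_bp, target_size_bp):
    """Approximate mean coverage: total sequenced bases / target size."""
    return n_reads * read_length_bp / target_size_bp


# Assumed values, for illustration only.
EXOME_TARGET_BP = 35_000_000  # hypothetical exome capture size
READ_LENGTH_BP = 150          # hypothetical read length

# Reads needed to reach 100X on the assumed target.
reads_for_100x = 100 * EXOME_TARGET_BP / READ_LENGTH_BP
print(f"{reads_for_100x:,.0f} reads for ~100X")
```

In practice, `count_fastq_reads("tumor-exome.FASTQ")` would feed `mean_depth` to check a finished run against its target depth.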
Goal: Compare healthy code against tumor code to isolate cancer-specific (somatic) mutations.
- Hardware: None
- Software: Take the convergent results from multiple open-source genomic analysis tools:
- GATK Mutect2
- Google's DeepSomatic
- Illumina's Strelka
- Inputs:
- baseline-normal.[FASTQ](https://en.wikipedia.org/wiki/FASTQ_format)
- tumor-exome.[FASTQ](https://en.wikipedia.org/wiki/FASTQ_format)
- Human Reference Genome (.[FASTA](https://en.wikipedia.org/wiki/FASTA_format))
- Process: Aligns reads and mathematically subtracts healthy DNA from tumor DNA to isolate somatic mutations.
- Outputs:
- somatic-variants.[VCF](https://en.wikipedia.org/wiki/Variant_Call_Format) - All raw mutation candidates
- filtered-variants.[VCF](https://en.wikipedia.org/wiki/Variant_Call_Format) - High-confidence, tumor-only mutations
- File Format: .[VCF](https://en.wikipedia.org/wiki/Variant_Call_Format)
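One simple way to take the convergent results of several callers is to keep only variants reported by at least two of them. The sketch below is one possible approach, not necessarily how this project builds its filtered VCF. It assumes each caller's output has already been normalized so that the same mutation has identical CHROM/POS/REF/ALT fields. The `min_callers` threshold and the toy records are hypothetical.

```python
from collections import Counter


def variant_keys(vcf_text):
    """Return the set of (chrom, pos, ref, alt) keys found in VCF-formatted text."""
    keys = set()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip header and blank lines
        chrom, pos, _id, ref, alts = line.split("\t")[:5]
        for alt in alts.split(","):  # split multi-allelic records
            keys.add((chrom, int(pos), ref, alt))
    return keys


def consensus(call_sets, min_callers=2):
    """Keep variants reported by at least `min_callers` of the callers."""
    counts = Counter(key for keys in call_sets for key in keys)
    return {key for key, n in counts.items() if n >= min_callers}


# Toy records standing in for the three callers' outputs.
HEADER = "#CHROM\tPOS\tID\tREF\tALT\n"
mutect2 = HEADER + "chr1\t100\t.\tA\tT\nchr2\t200\t.\tG\tC\n"
deepsomatic = HEADER + "chr1\t100\t.\tA\tT\n"
strelka = HEADER + "chr2\t200\t.\tG\tC\nchr3\t300\t.\tC\tA\n"

agreed = consensus([variant_keys(v) for v in (mutect2, deepsomatic, strelka)])
print(sorted(agreed))
```

Requiring agreement trades sensitivity for precision: a variant seen by only one caller (chr3:300 in the toy data) is dropped.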
Goal: Use AI to predict which mutations the immune system will recognize as a threat.
- Hardware: None
- Software: Run nextNEOpi (open-source neoantigen prediction pipeline) with 1 or more peptide-MHC binding prediction tools:
- Inputs:
- filtered-variants.[VCF](https://en.wikipedia.org/wiki/Variant_Call_Format)
- [Patient HLA profile](https://support.illumina.com/content/dam/illumina-support/help/BaseSpace_App_WGS_v6_OLH_15050955_03/Content/Source/Informatics/Apps/HLATypingFormat_appISCWGS.htm#) (.txt)
- tumor-rna-quantification.tsv - Filter candidates by expression level
- Process: Neural networks predict which mutations will most effectively trigger an immune response based on the patient HLA receptors.
- Outputs:
- [ranked-predictions.tsv](https://pvactools.readthedocs.io/en/7.0.0_docs/pvacseq/output_files.html) - Leaderboard of best targets
- File Format: .tsv
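To make "filter by expression, then rank" concrete, here is a minimal sketch of shortlisting candidates from a tab-separated predictions table. The column names (`peptide`, `ic50_nm`, `tpm`), the thresholds, and the toy rows are all hypothetical. The real output schema of nextNEOpi or pVACtools differs and should be checked against their documentation.

```python
import csv
import io


def shortlist(tsv_text, max_ic50_nm=500.0, min_tpm=1.0, top_n=20):
    """Keep expressed, predicted-binding candidates, strongest binders first."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = [
        row for row in rows
        if float(row["ic50_nm"]) <= max_ic50_nm and float(row["tpm"]) >= min_tpm
    ]
    kept.sort(key=lambda row: float(row["ic50_nm"]))  # lower IC50 = tighter binding
    return kept[:top_n]


# Toy table with made-up peptides and scores (not real predictions).
toy_tsv = (
    "peptide\tic50_nm\ttpm\n"
    "AAAAAAAAL\t320.0\t12.5\n"
    "CCCCCCCCV\t45.0\t3.1\n"
    "DDDDDDDDI\t30.0\t0.2\n"   # strong binder, but barely expressed
    "EEEEEEEEL\t900.0\t50.0\n"  # well expressed, but weak binder
)

for row in shortlist(toy_tsv):
    print(row["peptide"], row["ic50_nm"], row["tpm"])
```

Both filters matter: a strong predicted binder that the tumor barely expresses is as poor a target as a highly expressed peptide that does not bind.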
Goal: Compile the top predicted targets into a printable digital blueprint.
- Hardware: None
- Software:
- Generate protein string: NeoDesign or pVACvector
- Generate multiple candidate mRNA sequences: mRNAfold
- Select best mRNA sequence: mRNABERT
- Inputs: Top targets from [ranked-predictions.tsv](https://pvactools.readthedocs.io/en/7.0.0_docs/pvacseq/output_files.html)
- Process: Organize the cancer markers into a safe, logical order and then translate those instructions into a highly stable genetic "recipe."
- Outputs:
- [vaccine-construct.fa](https://en.wikipedia.org/wiki/FASTA_format) - Master mRNA sequence
- File Format: .fa
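Before the blueprint leaves the digital half of the pipeline, the FASTA file can be given a cheap sanity check. The sketch below is not part of NeoDesign, pVACvector, mRNAfold, or mRNABERT. It is a standalone illustration that parses FASTA text, flags unexpected characters, and reports GC content. The toy record is made up.

```python
def read_fasta(text):
    """Parse FASTA text into a {header: sequence} dictionary."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line.upper()  # sequences may wrap across lines
    return records


def invalid_characters(seq, alphabet="ACGTU"):
    """Return the set of characters that are not valid nucleotides."""
    return set(seq) - set(alphabet)


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Toy record, not a real construct.
toy_fasta = ">toy-construct\nAUGGCC\nUAA\n"

for name, seq in read_fasta(toy_fasta).items():
    print(name, len(seq), sorted(invalid_characters(seq)), round(gc_fraction(seq), 2))
```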
Goal: Convert the digital blueprint back into a physical, readable linear DNA template.
- Hardware: Benchtop DNA Synthesizer (e.g., Telesis Bio BioXp)
- Alt. (Outsourced): Twist, IDT, GenScript, Azenta
- Est. Cost: ~$100k fixed + ~$600 / rxn (In-House) or ~$200-$900 / rxn (Outsourced)
- Inputs:
- [vaccine-construct.fa](https://en.wikipedia.org/wiki/FASTA_format) blueprint
- Reagents - Oligonucleotides, BspQI restriction enzymes, AMPure XP purification beads (cell-free route) or competent E. coli cells, LB media, miniprep kit (plasmid route)
- Process: Two synthesis routes are available - choose one:
- Cell-Free / Linear (recommended for speed): The BioXp system prints the DNA template directly from the digital sequence.
- Plasmid-Based (traditional): Gibson Assembly stitches oligonucleotides into a DNA plasmid, which is then linearized with enzymes.
- Outputs: ~1.5 mL purified linear DNA template (~75 µg)
- File Format: Liquid DNA
Goal: Transcribe DNA into functional, immune-cloaked mRNA.
- Hardware: Telesis Bio BioXp
- Alt. (Outsourced): TriLink, GenScript, BiCell Scientific
- Est. Cost: ~$250k fixed + ~$2k / rxn (In-House) or ~$1k-$3k / rxn (Outsourced)
- Inputs:
- ~1.5 mL purified linear DNA template (~75 µg)
- IVT Reagents (RNA Polymerase, N1-methylpseudouridine, CleanCap AG)
- Process: Automated In Vitro Transcription (IVT) systems synthesize the mRNA strand from the DNA template. The process includes:
- DNase I digestion to remove the template
- multi-stage purification (e.g., magnetic beads or HPLC) to isolate pure, functional mRNA.
- Outputs: ~5.0 mL Highly pure mRNA (~1.0 mg)
- File Format: Liquid mRNA
Goal: Wrap mRNA in a protective lipid nanoparticle to allow human cell entry.
- Hardware: Unchained Labs Sunshine / NanoAssemblr Ignite
- Alt. (Outsourced): VectorBuilder, Lonza, Vernal Biosciences
- Est. Cost: ~$150k fixed + ~$500 / rxn (In-House) or ~$2k-$5k / rxn (Outsourced)
- Inputs:
- ~5.0 mL Highly pure mRNA (~1.0 mg)
- 4-Lipid Cocktail (ALC-0315, PEG-Lipid, DSPC, Cholesterol)
- Process: Rapid microfluidic mixing drives the negatively charged mRNA and the ionizable lipid (positively charged at acidic pH) to self-assemble into nanoparticles.
- Outputs: ~12 mL Raw mRNA-LNP mixture (~0.9 mg encapsulated)
- File Format: LNP Mixture
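The input and output figures for this step imply an encapsulation efficiency and a bulk concentration, worked through below as a small sketch. It uses only the approximate numbers quoted above (~1.0 mg mRNA in, ~0.9 mg encapsulated, ~12 mL out). Real values come from measurement, not from this arithmetic.

```python
def encapsulation_efficiency(encapsulated_mg, input_mg):
    """Fraction of the input mRNA that ends up inside nanoparticles."""
    return encapsulated_mg / input_mg


def concentration_mg_per_ml(mass_mg, volume_ml):
    """Mass concentration of encapsulated mRNA in the bulk mixture."""
    return mass_mg / volume_ml


# Approximate figures quoted for this step.
efficiency = encapsulation_efficiency(encapsulated_mg=0.9, input_mg=1.0)
bulk = concentration_mg_per_ml(mass_mg=0.9, volume_ml=12.0)
print(f"{efficiency:.0%} encapsulated, {bulk:.3f} mg/mL in the raw mixture")
```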
Goal: Validate integrity, size, and concentration before finalizing for injection.
- Hardware: Unchained Labs Stunner & TFF System
- Alt. (Outsourced): CordenPharma, uBriGene, VectorBuilder
- Est. Cost: ~$100k fixed + ~$100 / rxn (In-House) or ~$1k-$3k / rxn (Outsourced)
- Inputs: ~12 mL Raw mRNA-LNP mixture
- Process: Dynamic light scattering (DLS) verifies that particles fall within 60-100 nm. Tangential flow filtration (TFF) washes out ethanol.
- Outputs: 10 x 1.0 mL sterile glass vials (approx. 10 doses)
- File Format: Final Vaccine Product
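The per-step cost estimates above can be rolled up into pipeline totals. The sketch below sums them and computes a rough break-even point between in-house and outsourced work. It assumes one reaction per patient at every step, so "/ pt" and "/ rxn" are treated alike, and it ignores compute, labor, and consumables not listed. The step labels are descriptive placeholders rather than names from this document.

```python
# (step label, in-house fixed $, in-house per-run $, outsourced per-run (low, high) $)
STEPS = [
    ("Sequencing",      300_000, 1_000, (2_500, 2_500)),
    ("DNA template",    100_000,   600, (200, 900)),
    ("IVT",             250_000, 2_000, (1_000, 3_000)),
    ("LNP formulation", 150_000,   500, (2_000, 5_000)),
    ("QC and fill",     100_000,   100, (1_000, 3_000)),
]


def totals(steps):
    """Sum fixed, in-house per-run, and outsourced (low, high) per-run costs."""
    fixed = sum(step[1] for step in steps)
    in_house = sum(step[2] for step in steps)
    out_low = sum(step[3][0] for step in steps)
    out_high = sum(step[3][1] for step in steps)
    return fixed, in_house, out_low, out_high


def break_even_runs(fixed, in_house, outsourced):
    """Number of runs after which in-house work becomes cheaper than outsourcing."""
    return fixed / (outsourced - in_house)


fixed, in_house, out_low, out_high = totals(STEPS)
print(f"In-house: ${fixed:,} fixed + ${in_house:,} per run")
print(f"Outsourced: ${out_low:,} to ${out_high:,} per run")
print(f"Break-even: {break_even_runs(fixed, in_house, out_high):.0f} to "
      f"{break_even_runs(fixed, in_house, out_low):.0f} runs")
```

Under these assumptions the fixed equipment cost only pays off after on the order of a hundred or more patients, which is why each hardware step lists an outsourced alternative.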
You can explore the system architecture interactively through our Flutter-based web app.
The interactive workflow is a Flutter application. To run it:
- Navigate to the directory:
cd flutter_website
- Install dependencies:
flutter pub get
- Start the development server:
flutter run -d chrome
- Or, to build for production (append --wasm to compile to WebAssembly):
flutter build web --release --base-href "/openvaxx/"
The website will be available in the build/web folder.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.