Running Xclone on multi-region data #39

@tpjones15

Hello,

I was wondering what the best approach would be for multi-region data from the same tumour.

Would it be:

  1. Run the BAF preprocessing separately per sample:
xcltk baf  \
    --label        {sample name}    \
    --sam          {BAM file}       \
    --barcode      {barcode file}   \
    --snpvcf       {genome1K.phase3.SNP_AF5e2.chr1toX.hg38.vcf.gz}  \
    --region       {annotate_genes_hg38_update_20230126.txt}        \
    --outdir       {output folder}          \
    --gmap         {Eagle_v2.4.1/tables/genetic_map_hg38_withX.txt.gz}  \
    --eagle        {Eagle_v2.4.1/eagle}      \
    --paneldir     {1000G_hg38}              \
    --ncores       10
  2. Combine the sample-level adatas into a tumour-level object for the RDR module
  3. Combine the output of the sample-level xcltk runs:
import xclone

data_dir = "xxx/xxx/xxx/"
AD_file = data_dir + "AD.mtx"
DP_file = data_dir + "DP.mtx"
mtx_barcodes_file = data_dir + "barcodes.tsv"  # cell barcodes
anno_file = data_dir + "xxx.csv"  # cell annotation file (placeholder name)

# use the default gene annotation; hg38 to match the xcltk inputs above
BAF_adata = xclone.pp.xclonedata([AD_file, DP_file], 'BAF',
                                 mtx_barcodes_file,
                                 genome_mode = "hg38_genes")
# attach per-cell annotations, matched on the "cell" barcode column
BAF_adata = xclone.pp.extra_anno(BAF_adata, anno_file, barcodes_key = "cell",
            cell_anno_key = ["Clone_ID", "Type", "cell_type"], sep = ",")

That is, doing the above for each sample and then combining the BAF_adata objects?
4. Run the BAF and RDR modules on these combined objects
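
To make the "combine" steps concrete, here is a minimal sketch of what I have in mind, written in plain Python on toy data rather than with XClone itself. It assumes every sample was run with the same --region file, so feature indices line up across samples, and that cell barcodes need a sample prefix to stay unique across regions. The names `tag_barcodes`, `combine_samples`, `regionA` and `regionB` are made up for illustration and are not part of xcltk or XClone.

```python
from collections import Counter


def tag_barcodes(sample, barcodes):
    """Prefix each cell barcode with its sample label so it stays unique."""
    return [f"{sample}_{bc}" for bc in barcodes]


def combine_samples(samples):
    """Concatenate per-sample sparse count matrices along the cell axis.

    `samples` maps a sample label to a dict with:
      "barcodes": list of cell barcodes
      "entries":  list of (feature_idx, cell_idx, count) triplets, 0-based
    Assumption: all samples share the same feature list (same --region file),
    so only the cell index has to be shifted.
    """
    all_barcodes, sample_of_cell, all_entries = [], [], []
    for sample, data in samples.items():
        offset = len(all_barcodes)  # column shift for this sample's cells
        all_barcodes.extend(tag_barcodes(sample, data["barcodes"]))
        sample_of_cell.extend([sample] * len(data["barcodes"]))
        all_entries.extend((f, c + offset, n) for f, c, n in data["entries"])

    # Guard against barcode collisions that survive the prefixing.
    dupes = [bc for bc, k in Counter(all_barcodes).items() if k > 1]
    if dupes:
        raise ValueError(f"duplicate barcodes after tagging: {dupes[:3]}")
    return all_barcodes, sample_of_cell, all_entries


# Toy data: two regions of one tumour that share a raw barcode.
toy = {
    "regionA": {"barcodes": ["AAAC-1", "AAAG-1"],
                "entries": [(0, 0, 3), (1, 1, 5)]},
    "regionB": {"barcodes": ["AAAC-1"],
                "entries": [(0, 0, 2)]},
}
barcodes, sample_labels, entries = combine_samples(toy)
```

With real AnnData objects I would expect `anndata.concat` (with its `label` and `keys` arguments) to do the equivalent concatenation along the cell axis, but I am not sure whether XClone expects the combined object to be built this way, which is really the question.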

Any help on this would be much appreciated.

Best,
Tom
