Running XClone on multi-region data

Hello,

I was wondering what the best approach would be for running XClone on multi-region data (several samples) from the same tumour.

Would it be:
1. Run the BAF preprocessing separately per sample:
```
xcltk baf  \
    --label        {sample name}    \
    --sam          {BAM file}       \
    --barcode      {barcode file}   \
    --snpvcf       {genome1K.phase3.SNP_AF5e2.chr1toX.hg38.vcf.gz}  \
    --region       {annotate_genes_hg38_update_20230126.txt}        \
    --outdir       {output folder}          \
    --gmap         {Eagle_v2.4.1/tables/genetic_map_hg38_withX.txt.gz}  \
    --eagle        {Eagle_v2.4.1/eagle}      \
    --paneldir     {1000G_hg38}              \
    --ncores       10
```
2. Combine the sample-level AnnData objects into a tumour-level object for the RDR module.
3. Combine the output of the sample-level xcltk runs:
```
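import xclone

data_dir = "xxx/xxx/xxx/"  # xcltk output folder for one sample
AD_file = data_dir + "AD.mtx"
DP_file = data_dir + "DP.mtx"
mtx_barcodes_file = data_dir + "barcodes.tsv"  # cell barcodes
anno_file = "xxx/xxx.csv"  # placeholder: comma-separated cell annotation table

# Use the default gene annotation; the genome build should match the
# hg38 references passed to xcltk above.
BAF_adata = xclone.pp.xclonedata([AD_file, DP_file], 'BAF',
                                 mtx_barcodes_file,
                                 genome_mode = "hg38_genes")
# Attach per-cell annotations, matched on the "cell" barcode column.
BAF_adata = xclone.pp.extra_anno(BAF_adata, anno_file, barcodes_key = "cell",
            cell_anno_key = ["Clone_ID", "Type", "cell_type"], sep = ",")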
```
That is, do the above for each sample and then combine the resulting BAF_adata objects?
4. Run the BAF and RDR modules on these combined objects.

Any help on this would be much appreciated.

Best,
Tom 
