Interpreting Sequence_Inference output

Hello,

I would like to train a supervised model on sequences with known antigen specificity, then use that model to classify new TCR sequences as potentially targeting certain antigens. I have followed along with the tutorials, but I am still unclear on the best way to do this. I believe the closest is the "8 - VAE Inference.ipynb" tutorial, but with a supervised model rather than the unsupervised one. However, I am unclear on how to interpret the output from Sequence_Inference. I am using the example data Mouse Antigens for the model and Rudqvist for the new dataset. The resulting "features" object is 23856x9, which I believe corresponds to the individual TCR sequences (23856) and 9 different antigens, with the entries being scores for how well each TCR sequence fits that antigen.

1) Does a higher or lower score mean the TCR sequence fits better with the given antigen?
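
To make the question concrete, here is a minimal sketch of how I would read the matrix if a higher score means a better fit. This is my assumption, not something I found in the documentation. The toy scores and the antigen names are made-up placeholders, not real output, and `top_class_per_sequence` is a hypothetical helper of my own, not part of the library.

```python
def top_class_per_sequence(scores, class_names):
    """Return (label, score) for the highest-scoring column of each row.

    Assumes rows are sequences, columns are antigen classes, and that a
    HIGHER score means a better fit (unconfirmed assumption).
    """
    results = []
    for row in scores:
        if len(row) != len(class_names):
            raise ValueError("each row needs one score per class name")
        # Index of the largest score in this row
        best = max(range(len(row)), key=lambda i: row[i])
        results.append((class_names[best], row[best]))
    return results


# Toy stand-in for the 23856x9 "features" object (2 sequences, 3 classes)
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
]
toy_names = ["antigen_A", "antigen_B", "antigen_C"]  # placeholder labels

calls = top_class_per_sequence(toy_scores, toy_names)
for label, score in calls:
    print(label, score)
```

If lower scores turn out to mean a better fit, the same helper would need `min` instead of `max`.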

I tried to assess this myself by looking at the features of the supervised model; however, this object has 224 columns. I was expecting it to have 9, corresponding to the different antigens.

2) What do the columns of the features object from the supervised model correspond to?

3) Would you suggest this method of classification, or something more akin to this tutorial "3 - Supervised Sequence Regression.ipynb"?

Thank you for your help!






Interpreting Sequence_Inference output #46
