Hello,
I would like to train a supervised model on TCR sequences with known antigen specificity, then use that model to classify new TCR sequences as potentially targeting certain antigens. I have followed the tutorials but am still unclear on the best way to do this. I believe the closest is the "8 - VAE Inference.ipynb" tutorial, but with a supervised model rather than an unsupervised one. However, I am unclear on how to interpret the output from Sequence_Inference. I am using the example Mouse Antigens data to train the model and the Rudqvist data as the new dataset. The resulting "features" object is 23856x9, which I believe corresponds to the individual TCR sequences (23856) and the 9 different antigens, with the entries being scores for how well each TCR sequence fits each antigen.
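To make the intended use concrete, here is a minimal sketch of how I would turn such a score matrix into per-sequence antigen calls. It uses a small made-up array rather than the real inference output, the names `scores`, `antigen_labels`, and `top_antigen` are hypothetical, and it assumes that a higher score means a better fit, which I have not confirmed.

```python
# Hypothetical stand-in for the inference output: one row per TCR sequence,
# one column per antigen class. All values here are made up for illustration.
antigen_labels = ["antigen_A", "antigen_B", "antigen_C"]  # placeholder names
scores = [
    [0.70, 0.20, 0.10],
    [0.05, 0.15, 0.80],
    [0.30, 0.45, 0.25],
]


def top_antigen(row, labels):
    """Return (label, score) for the highest-scoring column of one row."""
    best = max(range(len(row)), key=lambda i: row[i])
    return labels[best], row[best]


# ASSUMPTION: a higher score means a better fit. If the opposite turns out
# to be true, max() inside top_antigen would need to become min().
calls = [top_antigen(row, antigen_labels) for row in scores]

for idx, (label, score) in enumerate(calls):
    print(f"sequence {idx}: {label} (score={score:.2f})")
```

With the real data, `scores` would be the 23856x9 object and `antigen_labels` the 9 antigen names in the model's column order, which I would also need to confirm.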
- Does a higher or lower score mean the TCR sequence fits better with the given antigen?
I tried to assess this myself by looking at the features of the supervised model; however, that object has 224 columns, whereas I expected 9, one for each antigen.
- What do the columns of the features object from the supervised model correspond to?
- Would you suggest this method of classification, or something more akin to the tutorial "3 - Supervised Sequence Regression.ipynb"?
Thank you for your help!