Hi,
While working through #108, I hit another interesting case on 2896b833af1259e9cd5907fa6819c3da53beb88b:
As an example, I took K03455.1 and masked the start of PR with Ns:
K03455_masked.fna.gz
Running through HIVDB via Stanford correctly identifies that AA 1-33 of PR are missing:

With sierra-local, I see a full alignment of PR in the JSON and no warnings:
K03455_masked_results.json
I get the same or similar results with both `post-align` and `nucamino`. It seems to me that we should handle `X` amino acids differently from other amino acids, more like HIVDB does. Many users mask low- or no-coverage bases in their sequences, so from a quality perspective it is important to report that we may miss important variants at those sites.
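To make the idea concrete, here is a rough sketch of the kind of check I have in mind. It is not sierra-local code: `masked_aa_ranges` is a hypothetical helper, and it assumes the input is an in-frame nucleotide sequence for a single gene (e.g. PR) in which masked bases are written as `N`. It only reports which amino-acid positions cannot be called, which could then be turned into a warning in the JSON.

```python
def masked_aa_ranges(nuc_seq, masked_chars="N"):
    """Return 1-based inclusive (start, end) amino-acid ranges whose
    codons contain at least one masked base.

    Hypothetical helper: assumes nuc_seq starts at codon 1 of the gene
    and is in frame; trailing partial codons are ignored.
    """
    seq = nuc_seq.upper()
    n_codons = len(seq) // 3
    ranges = []
    run_start = None  # 1-based position where the current masked run began

    for i in range(n_codons):
        codon = seq[3 * i: 3 * i + 3]
        is_masked = any(base in masked_chars for base in codon)
        if is_masked and run_start is None:
            run_start = i + 1
        elif not is_masked and run_start is not None:
            # Run ended at the previous codon, whose 1-based index is i.
            ranges.append((run_start, i))
            run_start = None

    # Close a run that extends to the end of the sequence.
    if run_start is not None:
        ranges.append((run_start, n_codons))
    return ranges


# Toy input: 33 masked codons followed by 66 unmasked ones (99 codons, like PR).
toy_pr = "NNN" * 33 + "CCT" * 66
print(masked_aa_ranges(toy_pr))  # [(1, 33)]
```

A result like `[(1, 33)]` could be surfaced as a warning such as "PR positions 1-33 are missing" instead of silently reporting a full alignment.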