From the Nature Methods paper, it seems like NTv2 was trained on datasets that included the token. Was NTv3 also trained with it, so that its embedding is meaningful? It's in the tokenizer vocab, but there's no explicit mention of it in the bioRxiv paper for NTv3.