Open
Description
Allow users to specify a list of tokens which can be treated as atomic units (i.e., analogously to bytes). This will enable correct handling of multi-byte EOS tokens, which currently must be generated one byte at a time.
The TokenByteTrie already provides support for this via the optional atomic_tokens argument, which can take a list of tokens that are to be treated as atomic units rather than being split into bytes.
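To illustrate the idea of an atomic unit, here is a minimal toy sketch. It is not the actual TokenByteTrie implementation; `build_toy_trie` and its nested-dict layout are hypothetical and only assume that an atomic token contributes a single edge while every other token is split into bytes.

```python
def build_toy_trie(vocab, atomic_tokens=()):
    """Build a nested-dict trie; hypothetical stand-in, not the real TokenByteTrie."""
    atomic = set(atomic_tokens)
    root = {}
    for token in vocab:
        # An atomic token is one unit (the whole bytes object);
        # any other token is split into its individual byte values.
        units = [token] if token in atomic else list(token)
        node = root
        for unit in units:
            node = node.setdefault(unit, {})
        node[None] = token  # leaf marker: the token that ends at this node
    return root


vocab = [b"ab", b"<eos>"]
trie = build_toy_trie(vocab, atomic_tokens=[b"<eos>"])
```

In this toy version, `b"<eos>"` is reachable from the root in one step, whereas `b"ab"` still takes two byte-level steps.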
We will need to refactor the prefill function, which constructs a beam state from a byte sequence (e.g., for prompted generation). This function currently steps the beam one byte at a time, which will cause problems when the byte sequence of an atomic token appears in the input to prefill.
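One possible direction, sketched below under assumptions, is to segment the prefill input into units before stepping the beam, so that an atomic token is consumed in a single step. The helper `split_into_units` is hypothetical and not part of the existing code; it uses greedy longest-match, and whether a matching byte run in a prompt should always be treated as the atomic token is a design decision this sketch does not settle.

```python
from typing import List, Sequence, Union

Unit = Union[int, bytes]  # a single byte value, or a whole atomic token


def split_into_units(data: bytes, atomic_tokens: Sequence[bytes]) -> List[Unit]:
    """Split data into byte values and atomic tokens (hypothetical helper)."""
    # Try longer atomic tokens first so the longest match wins.
    candidates = sorted((t for t in atomic_tokens if t), key=len, reverse=True)
    units: List[Unit] = []
    i = 0
    while i < len(data):
        match = next((t for t in candidates if data.startswith(t, i)), None)
        if match is not None:
            units.append(match)  # consume the atomic token as one unit
            i += len(match)
        else:
            units.append(data[i])  # fall back to a single byte value
            i += 1
    return units


units = split_into_units(b"hi<eos>", [b"<eos>"])
# units == [104, 105, b"<eos>"]
```

A refactored prefill could then step the beam once per unit instead of once per byte.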