speculative-sampling

Here are 4 public repositories matching this topic...

2.24x decode TPS increase on Qwen 3.6 27B at temperature 0.6 | Native MTP speculative decoding on Apple Silicon with no external drafter.

ToyLLM: Learning LLM from Scratch

deep-learning gpt gpt2 llm large-language-model speculative-sampling

Implementation of speculative sampling as described in "Accelerating Large Language Model Decoding with Speculative Sampling"

Efficient speculative sampling for language models
