EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees
