
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
