
ReLU$^2$ Wins: Discovering Efficient Activation Functions for Sparse LLMs
