TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices
