01 Training Parallelism
   Model Parallelism
02 Mixture-of-Experts (MoE)
03 Other Memory-Saving Designs
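To make the "Model Parallelism" entry above concrete, here is a minimal sketch (not taken from the original post) of naive model parallelism in PyTorch: the layers of a single model are split across two GPUs, and activations are moved between devices during the forward pass. The layer sizes, device ids, and class name are illustrative assumptions.

# Minimal model-parallelism sketch: two pipeline-free stages on two GPUs.
# Assumes at least two CUDA devices are available.
import torch
import torch.nn as nn

class TwoGPUModel(nn.Module):
    def __init__(self):
        super().__init__()
        # First half of the layers lives on GPU 0, second half on GPU 1.
        self.part1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to("cuda:0")
        self.part2 = nn.Sequential(nn.Linear(4096, 1024)).to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        # Move activations to the second device before the second stage.
        return self.part2(x.to("cuda:1"))

if __name__ == "__main__":
    model = TwoGPUModel()
    out = model(torch.randn(8, 1024))
    print(out.shape)  # torch.Size([8, 1024]), tensor resides on cuda:1

In this naive form only one GPU is busy at a time; pipeline parallelism (covered under "Training Parallelism") improves utilization by splitting each batch into micro-batches that flow through the stages concurrently.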
Reference:
@article{weng2021large,
    title   = "How to Train Really Large Models on Many GPUs?",
    author  = "Weng, Lilian",
    journal = "lilianweng.github.io/lil-log",
    year    = "2021",
    url     = "https://lilianweng.github.io/lil-log/2021/09/24/train-large-neural-networks.html"
}