The Bayesian Geometry of Transformer Attention
