• 17 Feb 2025
  • Research result

Symmetric Dot-Product Attention for Efficient Training of BERT Language Models