09 Oct 2024 · Research output

Symmetric Dot-Product Attention for Efficient Training of BERT Language Models