We propose a Transformer neural network architecture specifically designed for lattice QCD, focusing on preserving the fundamental symmetries required in lattice gauge theory. The proposed architecture is gauge covariant (equivariant), so that it respects the gauge symmetry of the lattice theory, and it is also equivariant under lattice spacetime symmetries such as rotations and translations.
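For reference, the gauge symmetry in question acts on the link variables in the standard lattice way (the conventions below are ours, with SU(3) taken as an illustrative gauge group):
\[
U_\mu(n) \;\to\; \Omega(n)\, U_\mu(n)\, \Omega^\dagger(n+\hat\mu), \qquad \Omega(n) \in SU(3),
\]
and gauge covariance of a network layer $f$ means that its output links transform in the same way, $f[U]_\mu(n) \to \Omega(n)\, f[U]_\mu(n)\, \Omega^\dagger(n+\hat\mu)$, whenever the input links are transformed.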
A key feature of our approach lies in the attention matrix, which forms the core of the Transformer architecture. To preserve the symmetries, we define the attention matrix through a Frobenius inner product between link variables and extended staples. Since both objects connect the same pair of sites and transform identically at their endpoints, this construction ensures that the attention matrix remains invariant under gauge transformations, thereby keeping the entire Transformer architecture gauge covariant.
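The gauge invariance of such a score can be checked numerically. The following is a minimal sketch under our own assumptions (toy SU(2) matrices standing in for a link variable and an extended staple; the helper names random_su2 and frobenius are illustrative and not taken from our implementation):

```python
import numpy as np

def random_su2():
    """Random SU(2) matrix built from a normalized quaternion."""
    a = np.random.randn(4)
    a /= np.linalg.norm(a)
    return np.array([[ a[0] + 1j * a[1],  a[2] + 1j * a[3]],
                     [-a[2] + 1j * a[3],  a[0] - 1j * a[1]]])

def frobenius(A, B):
    """Frobenius inner product <A, B> = Tr(A^dagger B)."""
    return np.trace(A.conj().T @ B)

# Objects connecting site n to site n + mu:
U = random_su2()                                 # link variable U_mu(n)
V = random_su2() @ random_su2() @ random_su2()   # stand-in for an extended staple

# Gauge transformation matrices at the two endpoints.
Om_n, Om_nmu = random_su2(), random_su2()
U_g = Om_n @ U @ Om_nmu.conj().T
V_g = Om_n @ V @ Om_nmu.conj().T                 # the staple transforms identically

# The (real part of the) Frobenius inner product is unchanged,
# so an attention score built from it is gauge invariant.
print(np.isclose(frobenius(U, V).real, frobenius(U_g, V_g).real))  # True
```

The same cancellation of the endpoint gauge rotations holds for any extended staple connecting the same two sites, which is why the attention matrix stays gauge invariant regardless of the staple shape.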
We evaluated the performance of the gauge covariant Transformer in the context of self-learning HMC. Numerical experiments show that the proposed architecture achieves higher performance than gauge covariant neural networks, demonstrating its potential to improve lattice QCD calculations.