Gauge invariance
Jean Zinn-Justin and Riccardo Guida (2008), Scholarpedia, 3(12):8287. doi:10.4249/scholarpedia.8287, revision #147398.
In electrodynamics, the structure of the field equations is such that the electric field \(\mathbf{E}(t,\mathbf{x})\) and the magnetic field \( \mathbf{B}(t,\mathbf{x})\) can be expressed in terms of a scalar field \(A_0(t,\mathbf{x})\) (scalar potential) and a vector field \( \mathbf{A}(t,\mathbf{x})\) (vector potential). The term gauge invariance refers to the property that a whole class of scalar and vector potentials, related by so-called gauge transformations, describe the same electric and magnetic fields. As a consequence, the dynamics of the electromagnetic fields and the dynamics of a charged system in an electromagnetic background do not depend on the choice of the representative \((A_0(t,\mathbf{x}),\mathbf{A}(t,\mathbf{x}))\) within the appropriate class. The concept of gauge invariance has then been extended to more general theories like, for example, Yang-Mills theories or General Relativity.
The classical variational principle and gauge invariance
The emphasis on variational principles in an article devoted to gauge invariance has the following motivation. The property that classical equations can be derived from a variational principle has played an essential role in the quantization of the corresponding models. In the case of electromagnetism, this has necessitated the introduction of a vector potential and the concept of gauge invariance. Conversely, the description of quantum mechanics in terms of path integrals yields a natural explanation for the appearance of variational principles in classical mechanics.
Euler-Lagrange equations
Around 1750, Euler and Lagrange developed the variational calculus. Lagrange showed that the equations of motion of Newtonian mechanics can be derived from a variational principle. Given a dynamical system expressed in terms of some coordinates \(\mathbf{q}\equiv\left(q^1,q^2,\ldots\right)\ ,\) one constructs a Lagrange function or Lagrangian \(\mathcal{L}( \mathbf{q}(t),\dot{\mathbf{q}}(t);t)\ ,\) where the total derivative of a quantity \(X\) with respect to the time \(t\) has been denoted by \(\dot{X}\ .\) The time integral of the Lagrangian, \[\tag{1} \mathcal{S}(\mathbf{q})=\int_{t'}^{t''}\mathrm{d} t\, \mathcal{L}\left( \mathbf{q}(t),\dot{\mathbf{q}}(t);t\right),\]
is called the action. Imposing the stationarity of the action with respect to variations of the trajectory \( \mathbf{q} (t)\ ,\) with fixed boundary conditions \(\mathbf{q}(t')=\mathbf{q}'\) and \(\mathbf{q}(t'')=\mathbf{q}''\ ,\) one recovers the equations governing the classical motion in a form called Euler-Lagrange equations: \[\tag{2} \mathcal{S}\left(\mathbf{q}+\delta\mathbf{q}\right)-\mathcal{S}\left(\mathbf{q}\right)=O(\|\delta\mathbf{q}\|^2)\ \Rightarrow\ \frac{\delta \mathcal{S}}{\delta q^i}=0\ \Rightarrow\ {\partial\mathcal{ L}\over\partial q^i}-{\mathrm{d} \over\mathrm{d} t}{\partial\mathcal{L} \over\partial \dot{q}^i}=0\,. \]
Note that the addition of a total derivative to the Lagrangian, \(\mathcal{L}\mapsto\mathcal{L}_{\Omega}=\mathcal{L}+\frac{\mathrm{d}\Omega(t,\mathbf{q}(t))}{\mathrm{d}t}\ ,\) changes the action only by boundary terms, \(\mathcal{S}\mapsto \mathcal{S}_{\Omega}=\mathcal{S}+\Omega (t'',\mathbf{q}'')-\Omega(t',\mathbf{q}')\ ,\) which do not affect the equation of motion.
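As a short check of this statement (a sketch using only the chain rule, with \(\partial_k\equiv\partial/\partial q^k\) as shorthand), one may verify that the added term drops out of the Euler-Lagrange operator. Writing \(\frac{\mathrm{d}\Omega}{\mathrm{d}t}=\frac{\partial\Omega}{\partial t}+\sum_k \dot{q}^k\partial_k\Omega\ ,\) one finds \[\frac{\partial}{\partial q^i}\frac{\mathrm{d}\Omega}{\mathrm{d}t}=\frac{\partial^2\Omega}{\partial q^i\,\partial t}+\sum_k\dot{q}^k\partial_i\partial_k\Omega\,,\qquad \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial}{\partial\dot{q}^i}\frac{\mathrm{d}\Omega}{\mathrm{d}t}=\frac{\mathrm{d}}{\mathrm{d}t}\,\partial_i\Omega=\frac{\partial^2\Omega}{\partial t\,\partial q^i}+\sum_k\dot{q}^k\partial_k\partial_i\Omega\,,\] so that the two contributions cancel for any smooth \(\Omega\) and the equations (2) derived from \(\mathcal{L}_\Omega\) and from \(\mathcal{L}\) coincide.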
In the following the coordinates \(\mathbf{q}:=(q^1,q^2,q^3)\) will refer to the standard Cartesian orthogonal coordinates, also denoted by \(\mathbf{x}:=(x^1,x^2,x^3)\ .\) We will also use the differential (vector) operator \(\nabla:=(\frac{\partial}{\partial x^1},\frac{\partial}{\partial x^2},\frac{\partial}{\partial x^3})\ .\)
The classical non-relativistic particle in a magnetic field
Gauss's law for magnetism states that the flux of the magnetic field \( \mathbf{B}(t,\mathbf{q})\) through any closed surface must vanish. In differential form, Gauss's law reads \[\tag{3} \nabla \cdot \mathbf{B}=\,0\,.\]
In a contractible manifold, thanks to Poincaré's lemma, Gauss's law (3) can be integrated by introducing a vector potential \( \mathbf{A}(t,\mathbf{q})\ ,\) in the form \[\tag{4} \mathbf{B} (t, \mathbf{q})=-\nabla\times \mathbf{A}(t, \mathbf{q}) .\]
Note that we have chosen for \(\mathbf{A}\) a sign convention opposite to the more usual one, to ensure a consistency of conventions between Abelian and non-Abelian gauge transformations. The identity \(\nabla \cdot \left(\nabla\times\mathbf{A}\right)=0\ ,\) valid for any smooth \( \mathbf{A}(t,\mathbf{q})\ ,\) implies that the expression (4) satisfies Gauss's law (3). Moreover, from the identity \(\nabla\times\left(\nabla \Omega\right)=0\ ,\) valid for any smooth \(\Omega(t,\mathbf{q})\ ,\) it follows that two vector potentials \( \mathbf{A}\) and \( \mathbf{A}^{\Omega}\ ,\) related by a gauge transformation \[\tag{5} \mathbf{A}^{\Omega}(t, \mathbf{q})= \mathbf{A}(t, \mathbf{q})-\nabla\Omega(t,\mathbf{q}) \,,\]
correspond to the same physical magnetic field. Therefore, the vector potential is not considered to be a physical quantity in classical mechanics. Vector potentials related by a gauge transformation form, from the physical viewpoint, an equivalence class.
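As an illustration (a standard textbook example, not part of the general argument; the constant \(B\) and the two specific potentials are choices made here), consider a uniform magnetic field \(\mathbf{B}=(0,0,B)\ .\) With the sign convention (4), and with \(x^1,x^2\) denoting Cartesian coordinates (not powers), two possible vector potentials are \[\mathbf{A}=\tfrac{1}{2}B\,(x^2,-x^1,0)\,,\qquad \mathbf{A}'=B\,(x^2,0,0)\,,\] since in both cases \((\nabla\times\mathbf{A})_3=\partial_1A_2-\partial_2A_1=-B\) while the other components of the curl vanish. The two potentials differ by a gradient, \[\mathbf{A}'-\mathbf{A}=\tfrac{1}{2}B\,(x^2,x^1,0)=-\nabla\Omega\quad\mathrm{with}\quad \Omega(\mathbf{q})=-\tfrac{1}{2}B\,x^1x^2\,,\] so that \(\mathbf{A}'=\mathbf{A}^\Omega\) in the sense of (5).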
For simplicity, we first assume that \(\mathbf{B}(\mathbf{q})\ ,\) the vector potential \(\mathbf{A}(\mathbf{q})\) and, thus, the function \(\Omega\) in the transformation (5), are time-independent. The equation of motion of a non-relativistic particle of charge \(e\) and mass \(m \) in a static external magnetic field \(\mathbf{B}(\mathbf{q})\) takes the form \[\tag{6} m \,\ddot{\mathbf{q}}= e \,\dot{\mathbf{q}}\times \mathbf{B} ( \mathbf{q})\,.\]
Remarkably enough, both equations (3) and (6) can be derived simultaneously from an action principle. One considers the Lagrangian \[\tag{7} \mathcal{L}( \mathbf{q},\dot{\mathbf{q}})={1\over2} \, m \,\dot{\mathbf{q}}^2-e \, \mathbf{A}( \mathbf{q})\cdot \dot{\mathbf{q}}\, ,\]
where \(\mathbf{A}( \mathbf{q})\) is a given vector field. The corresponding Euler-Lagrange equations (2) then read \[\tag{8} m \,\frac{\mathrm{d}^2q^i}{\mathrm{d} t^2}-e\,\frac{\mathrm{d}}{\mathrm{d}t}\left(A_i(\mathbf{q}(t))\right)= -e \sum_{k=1}^3 {\dot{q}}^k \frac{\partial A_k}{\partial q^i}.\]
Using the identity \(\left((\nabla\times\mathbf{A})\times \dot{\mathbf{q}}\right)_i=\sum_k \left({\dot{q}}^k\partial_k A_i-{\dot{q}}^k\partial_i A_k\right)\ ,\) one can rewrite (8) as \[m \,\ddot{\mathbf{q}}= e \left(\nabla\times \mathbf{A}(\mathbf{q})\right) \times \dot{\mathbf{q}}\,.\] Identifying the vector \(-\nabla\times \mathbf{A}(\mathbf{q})\) with the magnetic field \(\mathbf{B}(\mathbf{q})\ ,\) one recovers both Gauss's law in the form (4) and the equation of motion (6).
The principle of gauge invariance
The form (7) directly shows that in a time-independent gauge transformation (5) the Lagrangian is only modified by a total time-derivative \(e\,\dot{\mathbf{q}}\cdot\nabla\Omega(\mathbf{q})=e\,\mathrm{d}\Omega(\mathbf{q})/\mathrm{d}t\ .\) As explained before, this change does not affect the equations of motion, which can thus be expressed in terms of gauge-independent quantities only. Conversely, the principle of gauge invariance of the equations of motion, that is, the property that the Lagrangian should vary only by a total time derivative in a time-independent gauge transformation, constrains the form of the vector potential term, hence the explicit form of the equations of motion.
Looking for a Lagrangian appropriate to general time-dependent vector potentials, it is then natural to require invariance with respect to time-dependent gauge transformations. However, a time-dependent gauge transformation (5) no longer adds to the Lagrangian (7) a total time-derivative. Indeed, \[\mathcal{L}\mapsto \mathcal{L}_\Omega= \mathcal{L}+e\,\dot{\mathbf{q}}\cdot\nabla\Omega( t,\mathbf{q}) = \mathcal{L}+e\,{\mathrm{d}\Omega( t,\mathbf{q}) \over\mathrm{d}t} -e\,{\partial \Omega( t,\mathbf{q}) \over \partial t}\,.\] One is thus led to consider the more general Lagrangian \[\tag{9} \mathcal{L}( \mathbf{q},\dot{\mathbf{q}};t)={1\over2} \,m \,\dot{\mathbf{q}}^2-e\, \mathbf{A}(t, \mathbf{q})\cdot \dot{\mathbf{q}}-e \,A_0(t, \mathbf{q}), \]
where, to cancel the partial time-derivative \(-e\,{\partial \Omega( t,\mathbf{q}) \over \partial t}\ ,\) one has introduced an additional scalar potential \(A_0( t,\mathbf{q}) \) transforming as \[\tag{10} A_0(t, \mathbf{q}) \mapsto A_0^\Omega(t, \mathbf{q})= A_0(t, \mathbf{q})-{\partial\Omega( t,\mathbf{q}) \over \partial t}\]
under time-dependent gauge transformations. It follows that under time-dependent gauge transformations (5), (10) the Lagrangian (9) transforms as \[\tag{11} \mathcal{L}\mapsto \mathcal{L}_\Omega= \mathcal{L}+e\,{\mathrm{d}\Omega( t,\mathbf{q}) \over\mathrm{d}t}\]
The corresponding (gauge-invariant) classical equations of motion then read \[m\,\ddot{\mathbf{q}}= e\, \dot{\mathbf{q}} \times \mathbf{B}(t,\mathbf{q}) +e\, \mathbf{E}(t,\mathbf{q}),\] where the gauge-invariant quantity \[\mathbf{E}(t,\mathbf{q})={\partial\mathbf{A}(t, \mathbf{q}) \over\partial t}-\nabla A_0(t, \mathbf{q}) \] can be identified with the electric field, and, still, \(\mathbf{B}(t,\mathbf{q})=-\nabla\times \mathbf{A}(t,\mathbf{q})\ .\) When all fields are static, \(A_0(\mathbf{q})\) can be identified with the electrostatic potential and \( e A_0(\mathbf{q})\) corresponds to the electrostatic energy.
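The step from the Lagrangian (9) to these equations can be sketched as follows (a reconstruction along the lines of (8), with \(\partial_i\equiv\partial/\partial q^i\) as shorthand). The Euler-Lagrange equations (2) give \[m\,\ddot{q}^i-e\,\frac{\mathrm{d}A_i}{\mathrm{d}t}=-e\sum_k\dot{q}^k\partial_iA_k-e\,\partial_iA_0\,,\qquad \frac{\mathrm{d}A_i}{\mathrm{d}t}=\frac{\partial A_i}{\partial t}+\sum_k\dot{q}^k\partial_kA_i\,,\] and thus \[m\,\ddot{q}^i=e\left(\frac{\partial A_i}{\partial t}-\partial_iA_0\right)+e\sum_k\left(\dot{q}^k\partial_kA_i-\dot{q}^k\partial_iA_k\right)=e\,E_i+e\left(\dot{\mathbf{q}}\times\mathbf{B}\right)_i\,.\] The gauge invariance of \(\mathbf{E}\) follows from (5) and (10): \[\mathbf{E}^\Omega=\frac{\partial}{\partial t}\left(\mathbf{A}-\nabla\Omega\right)-\nabla\left(A_0-\frac{\partial\Omega}{\partial t}\right)=\mathbf{E}\,.\]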
The classical Hamiltonian in a magnetic and electric field
From the classical Lagrangian (9), after the Legendre transformation, \[H({\mathbf p},{\mathbf q})+\mathcal{L}(\mathbf {q},\dot{\mathbf{q}})=\mathbf{p}\cdot\dot{\mathbf{q}}\quad \mathrm{with}\quad p_i:={\partial\mathcal{L}\over \partial {\dot{q}}^i}=m{\dot{q}}^i-e A_i\,,\] one obtains the classical Hamiltonian, function of the phase space variables \(({\mathbf p},{\mathbf q})\ ,\) \[\tag{12} H({\mathbf p},{\mathbf q};t)={1\over2m}\left( {\mathbf p}+e {\mathbf A}(t, {\mathbf q})\right)^2+e A_0( t, {\mathbf q}).\]
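Explicitly (a short intermediate computation, filling in the step between the definition of \(\mathbf{p}\) and (12)), inverting the relation between momentum and velocity, \(\dot{\mathbf{q}}=(\mathbf{p}+e\mathbf{A})/m\ ,\) one obtains \[H=\mathbf{p}\cdot\dot{\mathbf{q}}-\mathcal{L}=\left(\mathbf{p}+e\mathbf{A}\right)\cdot\dot{\mathbf{q}}-{1\over2}\,m\,\dot{\mathbf{q}}^2+e\,A_0={1\over m}\left(\mathbf{p}+e\mathbf{A}\right)^2-{1\over2m}\left(\mathbf{p}+e\mathbf{A}\right)^2+e\,A_0\,,\] which reproduces (12).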
Under a gauge transformation (5), (10) the Lagrangian transforms by a total time derivative, (11). In the Hamiltonian framework, this corresponds to a shift of the conjugate momentum and the scalar potential since \[H_\Omega(\mathbf{p},\mathbf{q};t) ={1\over2m}\left(\mathbf{p}+e {\mathbf A}^\Omega(t, {\mathbf q})\right)^2+e A_0^\Omega( t, {\mathbf q})=H(\mathbf{p}-e\nabla\Omega,\mathbf{q};t)-e{\partial\Omega\over\partial t}\,.\] In particular, in the presence of a magnetic field, the relation between velocity and momentum \[\dot{q}^i ={\partial H \over\partial p_i}={1\over m}\left(p_i +eA_i\right),\] is gauge-dependent, showing that the conjugate momentum \({\mathbf p}\) is no longer a physical quantity; only the covariant conjugate momentum \({\mathbf p}+e {\mathbf A}( t,{\mathbf q})\) is.
The quantum non-relativistic, spinless particle in a static magnetic field
Quantum mechanics. The Schrödinger equation
Quantum mechanics is built up out of physical states, described by vectors of unit norm \(\psi\) belonging to a complex Hilbert space \(\mathcal{H}\) (pure states), and out of operators \(\hat{O}\) acting on \(\mathcal{H}\ .\) In particular, physical observables are described by self-adjoint operators. Physical predictions in quantum mechanics involve operator expectation values of the form \((\psi,\hat{O}\psi)\) (where \((\cdot,\cdot)\) is the scalar product in \(\mathcal{H}\)); it follows that they are not affected by unitary transformations which act both on state vectors and on operators as \[\tag{13} \psi\mapsto U\psi, \quad \hat{O}\mapsto U \hat{O} U^\dagger,\]
where \(U\) is a unitary operator, that is, such that \( UU^\dagger=U^\dagger U=1\ .\) From these rules, it is also clear that the multiplication of state vectors by a phase (a \(U(1)\) global group transformation) \(\psi\mapsto \mathrm{e}^{i\theta}\psi\) leaves operators and physical predictions unchanged.
The time evolution of state vectors \(\psi(t )\in \mathcal{H}\) is governed by the Schrödinger equation (Schrödinger 1926) \[\tag{14} i\hbar{\partial \psi(t )\over\partial t}=\hat {H}\psi(t ),\]
where \(\hat{H}\) is the quantum Hamiltonian (self-adjoint) operator and \(2\pi\hbar\) is Planck's constant.
Quantum Hamiltonian in a magnetic field and gauge invariance
We now consider the Hamiltonian (12) restricted to \(\mathbf{A}(\mathbf{q})\) time-independent and \(A_0(t,\mathbf{q})=0\ .\)
The `correspondence principle' suggests taking as quantum Hamiltonian the classical Hamiltonian (12) in which the phase space variables, position \(\mathbf{q}\) and conjugate momentum \( \mathbf{p}\ ,\) are replaced by the corresponding self-adjoint quantum operators \(\hat{\mathbf{q}}\) and \(\hat {\mathbf{p}}\ ,\) which satisfy the canonical commutation relations \([\hat{p}_j,\hat{q}^k]=-i\hbar\delta_{j}^k\ .\) Complemented by the hermiticity condition (self-adjointness of observable operators) to fix the order of operators in products, this leads to the quantum Hamiltonian operator \[\hat{H}={1\over2m}\left(\hat {\mathbf p}+e{\mathbf A}(\hat {\mathbf q})\right)^2.\] One then verifies the identity \[\tag{15} U\left(\hat {\mathbf p}+e{\mathbf A}(\hat {\mathbf q})\right)U^\dagger =\hat {\mathbf p}+e{\mathbf A}^\Omega(\hat {\mathbf q}),\]
where \[\tag{16} U=\mathrm{e}^{i e\Omega(\hat{\mathbf{q}})/\hbar}\]
and \({\mathbf A}^\Omega\) is a gauge transform of \({\mathbf A}\ :\) \[\tag{17} \mathbf{A}^\Omega({\mathbf q})={\mathbf A}({\mathbf q})-\nabla \Omega ({\mathbf q}).\]
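The identity (15) can be checked as follows (a sketch relying only on the canonical commutation relations, which imply \([\hat{p}_j,f(\hat{\mathbf{q}})]=-i\hbar\,\partial_jf(\hat{\mathbf{q}})\) for a smooth function \(f\)). Since \(U\) is a function of \(\hat{\mathbf{q}}\) only, it commutes with \(\mathbf{A}(\hat{\mathbf{q}})\ ,\) while \[U\,\hat{p}_j\,U^\dagger=\hat{p}_j+U\,[\hat{p}_j,U^\dagger]=\hat{p}_j-i\hbar\,U\,\partial_jU^\dagger=\hat{p}_j-i\hbar\left(-{ie\over\hbar}\right)\partial_j\Omega(\hat{\mathbf{q}})=\hat{p}_j-e\,\partial_j\Omega(\hat{\mathbf{q}})\,.\] Adding \(e\mathbf{A}(\hat{\mathbf{q}})\) on both sides yields (15) with \(\mathbf{A}^\Omega\) given by (17).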
A gauge transformation of the vector potential thus induces a unitary transformation on the covariant conjugate momentum operator \(\hat {\mathbf p}+e{\mathbf A}(\hat {\mathbf q})\) and, therefore, on the Hamiltonian: \[U \hat {H}(\mathbf{A})U^\dagger = {1\over 2m}\left[ U\left(\hat {\mathbf p}+e{\mathbf A}(\hat {\mathbf q})\right)U^\dagger \right]^2=\hat{H}(\mathbf{A}^\Omega),\] where we have introduced the notation \(\hat{H}(\mathbf{A})\) to emphasize the dependence of the Hamiltonian on the vector potential. We now perform the corresponding unitary transformation on the vector \(\psi(t)\) in the Schrödinger equation (14), \[\tag{18} \psi_\Omega(t)=\mathrm{e}^{i e\Omega(\hat{\mathbf{q}})/\hbar} \psi(t),\]
a transformation also called gauge transformation in this context, and obtain the unitarily equivalent Schrödinger equation: \[i\hbar {\partial\psi_\Omega(t) \over \partial t}= \hat {H}(\mathbf{A}^\Omega)\psi_\Omega(t).\]
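This follows in one line from (14) (spelled out here for completeness, using the fact that \(U\) is time-independent when \(\Omega\) is): \[i\hbar{\partial\psi_\Omega(t)\over\partial t}=U\,i\hbar{\partial\psi(t)\over\partial t}=U\hat{H}(\mathbf{A})\psi(t)=U\hat{H}(\mathbf{A})U^\dagger\,U\psi(t)=\hat{H}(\mathbf{A}^\Omega)\psi_\Omega(t)\,.\]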
We can now formulate the principle of gauge invariance in quantum mechanics: all physical results should be invariant under a simultaneous gauge transformation of the vector field \(\mathbf{A}\) and of the state vector \(\psi(t)\) (equations (17), (18)). Gauge invariance ensures that physical results do not depend on the specific choice of the vector potential in the equivalence class. The principle of gauge invariance requires that physical operators \(\hat{O}\) undergo the same unitary transformation: \[\tag{19} \hat{O}({\mathbf A}^\Omega)= U \hat{O}(\mathbf {A}) U^\dagger\,.\]
For operators that are functions of \(\hat{\mathbf{ p}}\) and \(\hat{\mathbf{q}}\ ,\) this implies that they can depend on \(\hat{\mathbf{ p}}\) only through the combination \(\hat {\mathbf p}+e{\mathbf A}(\hat {\mathbf q})\ .\)
Conversely, the condition of gauge invariance, in the sense of demanding that physical results should remain unchanged under a gauge transformation (18) without performing the corresponding unitary transformations (13) on operators, is a dynamical principle, in contrast with the condition of global invariance: it implies the introduction in the Hamiltonian of a vector potential and, thus, the presence of a magnetic field as well as a specific form of physical operators.
The wave function representation
The Hilbert space can be chosen in such a way that state vectors are represented by functions \(\psi(t ,\mathbf{q})\) (wave functions) on which \(\hat{\mathbf{q}},\,\hat {\mathbf{p}}\) act like \[\hat{\mathbf{q}}\psi(t ,\mathbf{q})=\mathbf{q}\psi(t ,\mathbf{q}),\quad \hat {\mathbf{p}}\psi(t ,\mathbf{q})={\hbar\over i}\nabla \psi(t ,\mathbf{q}).\] The unitary transformation \(U\) in (16) then takes the form \[\tag{20} \psi(t,\mathbf{q})\mapsto \psi_\Omega(t,\mathbf{q}):=\mathrm{e}^{i e\Omega (\mathbf{q})/\hbar}\psi(t,\mathbf{q})\equiv U(\mathbf{q})\, \psi(t,\mathbf{q}) .\]
Acting on wave functions \(\psi(t,\mathbf{q})\ ,\) these transformations generate, at each point \( \mathbf{q}\) in space, a representation of the multiplicative group \(U(1)\) of complex numbers of modulus 1. Group transformations that vary pointwise are called local group transformations, in contrast to global group transformations (or rigid group transformations) where the transformation acts in the same way everywhere (in the present case, \(\Omega( \mathbf{q})\) constant for all \(\mathbf{q}\)). Gauge invariance then implies the invariance of physical results under local \(U(1)\) transformations.
The covariant conjugate momentum acts as a first order differential operator. One defines \[{i\over\hbar}\left(\hat {\mathbf p}+e{\mathbf A}(\hat {\mathbf q})\right) \mapsto \mathbf{D}:=\nabla+{ie\over\hbar}{\mathbf A}({\mathbf q})\,.\] As a consequence of the transformation (15), the covariant derivative \(\mathbf{D} \) satisfies \[\tag{21} U(\mathbf{q})\, \mathbf{D}_{\mathbf A} U^\dagger(\mathbf{q}) =\mathbf{D}_{{\mathbf A}^\Omega}\ \Leftrightarrow\ U(\mathbf{q})\, \mathbf{D}_{\mathbf A}=\mathbf{D}_{{\mathbf A}^\Omega}\, U(\mathbf{q})\,,\]
where we have used the notation \(\mathbf{D}_{\mathbf A}\) to emphasize its dependence on \( {\mathbf A}\ .\) Transformations (21) imply that \([\mathbf{D}\psi](\mathbf{q})\) and \([\mathbf{D}^2\psi](\mathbf{q})\) gauge-transform as tensors, that is, like \(\psi\) itself: for example, \(\mathbf{D}_{{\mathbf A}^\Omega}\psi_\Omega=U(\mathbf{q})\,\mathbf{D}_{\mathbf A}\psi\ .\) The important point is that implementing gauge invariance in this context amounts to replacing normal derivatives by covariant derivatives. The quantum Hamiltonian is then a second order differential operator that can be written as \[\hat {H}:=-{\hbar^2\over2 m} \mathbf{D}^2 .\] The components \(D_i\) of \(\mathbf{D} \) satisfy \[ [D_i,D_j]={ie\over\hbar}\left(\nabla_i A_j-\nabla_j A_i\right)={ie\over\hbar}F_{ij} \quad \mathrm{with}\quad F_{ij}=-\sum_k \epsilon_{ijk} B_k \,,\] where \( \epsilon_{ijk}\) is the completely antisymmetric tensor with \(\epsilon_{123}=1\ .\) Gauge transformations of covariant derivatives (21) imply that \(U(\mathbf{q})\,F_{ij}({\mathbf A})U^\dagger(\mathbf{q})=F_{ij}({\mathbf A}^\Omega)\ .\) Thus, because \(F_{ij}\) commutes with \(U(\mathbf{q})\) (which is just a phase), it is gauge-invariant.
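The commutator can be verified by acting on a test wave function \(\psi\) (a routine computation, included here as a sketch, with \(\partial_i\equiv\partial/\partial q^i\)): \[D_iD_j\psi=\partial_i\partial_j\psi+{ie\over\hbar}\left(\partial_iA_j\right)\psi+{ie\over\hbar}\left(A_j\partial_i\psi+A_i\partial_j\psi\right)-{e^2\over\hbar^2}A_iA_j\psi\,.\] All terms except the second are symmetric in \(i\leftrightarrow j\) and cancel in the commutator, leaving \([D_i,D_j]\psi={ie\over\hbar}\left(\partial_iA_j-\partial_jA_i\right)\psi\ .\) With the convention (4), \(\partial_iA_j-\partial_jA_i=\sum_k\epsilon_{ijk}(\nabla\times\mathbf{A})_k=-\sum_k\epsilon_{ijk}B_k\ ,\) for example \(F_{12}=-B_3\ .\)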
One can generalize the formalism to time-dependent vector potentials and gauge transformations \[\tag{22} \psi(t,\mathbf{q})\mapsto\psi_\Omega(t,\mathbf{q}):=\mathrm{e}^{i e\Omega (t,\mathbf{q})/\hbar}\psi(t,\mathbf{q})\equiv U(t,\mathbf{q})\, \psi(t,\mathbf{q})\,. \]
This implies introducing a scalar potential \(A_0\) and replacing in the Schrödinger equation the time-derivative by the covariant time-derivative \(D_0=\frac{\partial}{\partial t}+{ie\over\hbar}A_0\) (or, equivalently, using the quantized version of the classical Hamiltonian (12)): \[\tag{23} i\hbar\left(\frac{\partial}{\partial t}+{ie\over\hbar}A_0\right)\psi(t,\mathbf{q})= -{\hbar^2\over2 m} \mathbf{D}^2 \psi(t,\mathbf{q})\,.\]
The Schrödinger equation (23) then describes the evolution of a spinless charged particle in a magnetic and an electric field.
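As a consistency check (a sketch, writing the time derivative acting on \(\psi(t,\mathbf{q})\) as a partial derivative and using the transformation (10) of \(A_0\)), the covariant time-derivative of the transformed wave function reads \[\left(\frac{\partial}{\partial t}+{ie\over\hbar}A_0^\Omega\right)U\psi=U\left(\frac{\partial}{\partial t}+{ie\over\hbar}\frac{\partial\Omega}{\partial t}+{ie\over\hbar}A_0-{ie\over\hbar}\frac{\partial\Omega}{\partial t}\right)\psi=U\left(\frac{\partial}{\partial t}+{ie\over\hbar}A_0\right)\psi\,,\] with \(U\equiv U(t,\mathbf{q})=\mathrm{e}^{ie\Omega(t,\mathbf{q})/\hbar}\ .\) Combined with (21), this shows that both sides of (23) are multiplied by the same phase \(U(t,\mathbf{q})\) under a time-dependent gauge transformation.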
Path integrals
Following Feynman (Feynman 1948), quantum mechanics can be alternatively formulated in terms of path integrals. In this formalism, the matrix elements of the quantum evolution operator \(\mathcal{U}(t'',t') \) between times \( t'\) and \( t''\) are given by a sum over all possible (classical and non-classical) trajectories \(\mathbf{q}(t)\) (paths), which in the simplest cases can be written as \[\tag{24} \langle \mathbf{q}'' \left| \mathcal{U}(t'',t') \right| \mathbf{q}' \rangle = \int \left [ \mathrm{d} \mathbf{q} (t) \right] \exp\left({i \over \hbar}\mathcal{S} (\mathbf{q} )\right) \quad \hbox{with boundary conditions}\quad \mathbf{q}(t')=\mathbf{q}' , \ \mathbf{q}(t'')=\mathbf{q}'', \]
where \( \mathcal{S} (\mathbf{q} ) \) is the classical action defined in (1).
This formulation of quantum mechanics actually explains why equations of motion in classical mechanics can be derived from a variational principle in the form (2). In the classical limit, that is, when the typical classical action is large with respect to \(\hbar\ ,\) the path integral can be approximated by using the stationary phase method. The sum over paths is thus dominated by paths that leave the action stationary: the classical paths that satisfy (2). This property generalizes to relativistic quantum field theory.
In the case of a spinless non-relativistic particle in electric and magnetic fields, the Lagrangian (9) changes under a gauge transformation by a total time derivative, (11). The action then changes by boundary terms and, correspondingly, the evolution operator transforms as \[\langle \mathbf{q}'' \left|\mathcal{U}(t'',t') \right| \mathbf{q}'\rangle\mapsto\, U(t'',\mathbf{q''})\,\langle \mathbf{q}'' \left|\mathcal{U}(t'',t') \right| \mathbf{q}'\rangle \, U^\dagger(t',\mathbf{q}')\] which is consistent with the transformation (22) of the wave function.
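In more detail (a short sketch of the argument), integrating (11) between \(t'\) and \(t''\) with the boundary conditions of (24) gives \[\mathcal{S}_\Omega(\mathbf{q})=\mathcal{S}(\mathbf{q})+e\left[\Omega(t'',\mathbf{q}'')-\Omega(t',\mathbf{q}')\right]\ \Rightarrow\ \exp\left({i\over\hbar}\mathcal{S}_\Omega(\mathbf{q})\right)=\mathrm{e}^{ie\Omega(t'',\mathbf{q}'')/\hbar}\,\exp\left({i\over\hbar}\mathcal{S}(\mathbf{q})\right)\mathrm{e}^{-ie\Omega(t',\mathbf{q}')/\hbar}\,.\] The two phase factors depend only on the end-points, not on the path, and can thus be factorized out of the path integral (24).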
Classical electromagnetism and Maxwell's equations
Maxwell's equations (Maxwell 1861-1862) in the vacuum can be written (in local differential form) as:
\[\tag{25} \nabla\times \mathbf{E}+{\partial \mathbf{B}\over \partial t} =0 \quad \hbox{(Faraday's law)},\]
\[\tag{26} \nabla\cdot \mathbf{B}= 0 \quad \hbox{(Gauss's law for magnetism)},\]
\[\tag{27} \nabla\cdot \mathbf{E} =\rho \quad \hbox{(Gauss's law)},\]
\[\tag{28} \nabla\times \mathbf{B} -{\partial \mathbf{E}\over \partial t} = \mathbf{J} \quad \hbox{(Ampère-Maxwell's law)},\]
where \(\rho(t,\mathbf{x})\) is the charge density and \( \mathbf{J}(t,\mathbf{x})\) the current density, and, again, \(\mathbf{E}(t,\mathbf{x})\) is the electric field and \( \mathbf{B}(t,\mathbf{x})\) is the magnetic field. (Unless otherwise stated, in this article we work in the international system of units extended with the conditions of unitary electric and magnetic constants \(\epsilon_0=\mu_0=1 \ ,\) enforcing that the speed of light in these units is one, \( c=1\ .\))
Maxwell's equations are consistent with special relativity, that is, electromagnetism is a relativistic theory. The relativistic invariance of a theory is elegantly highlighted by expressing its observables in terms of quadrivectors and quadri-tensors, whose components are labeled by Greek indices \(\mu,\nu,\cdots\) running from \(0\) to \(3 \ .\) Due to their different behaviour under Lorentz transformations, upper indices and lower indices are named contravariant and covariant, respectively. Two quadrivectors of special interest are the coordinates \(x^\mu \) where \(x^0\equiv t\) is the time-component and \(x^1,x^2,x^3\) the space-components, and the corresponding derivatives \[ \partial_\mu=\frac{\partial}{\partial x^\mu}=\left(\frac{\partial}{\partial t},\frac{\partial}{\partial x^1},\frac{\partial}{\partial x^2},\frac{\partial}{\partial x^3}\right)\,. \] The Minkowski metric \(\eta_{\mu\nu}=\mathbf{diag}(+1,-1,-1,-1)\) (respectively, its inverse \(\eta^{\mu\nu}\)) is used to lower contravariant indices (to raise covariant indices) like, for example, \(V_\mu=\sum_{\nu=0}^3 \eta_{\mu\nu}V^\nu\ .\)
To express Maxwell's equations in quadri-covariant notation, one introduces the antisymmetric electromagnetic tensor \( F_{\mu\nu}\ ,\) defined by \[ \ F_{0i}=E_i\,,\ F_{ij}=-\sum_{k}\epsilon_{ijk}B_k\,, \ ( i,j,k=1,2,3 ) \,,\] as well as the quadri-current \(J^\mu=(\rho,J^i)\) (\(E_i, B_i, J^i\) being respectively the \(i\)-th component of the three-vectors \(\mathbf{E},\mathbf{B},\mathbf{J}\)).
In relativistic form, Faraday's law (25) and Gauss's law for magnetism (26) are combined into \[\tag{29} \partial_\lambda F_{\mu\nu}+\partial_\mu F_{\nu\lambda}+\partial_\nu F_{\lambda\mu}=0\,,\]
(known as Bianchi identities) while Gauss's and Ampère-Maxwell's laws (27), (28) give rise to \[\tag{30} \sum_{\mu}\partial_{\mu} F ^{\mu\nu}=J^\nu \ \Rightarrow\ 0=\sum_{\nu,\mu}\partial_\nu\partial_{\mu} F ^{\mu\nu}=\sum_\nu\partial_\nu J^\nu\,.\]
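The correspondence with (27), (28) can be checked component by component (a sketch using the metric \(\eta_{\mu\nu}\) defined above to raise indices, so that \(F^{0i}=-F_{0i}=-E_i\) and \(F^{ij}=F_{ij}\)). For \(\nu=0\ ,\) \[\sum_i\partial_iF^{i0}=\sum_i\partial_iE_i=\nabla\cdot\mathbf{E}=\rho\,,\] while for \(\nu=j\ ,\) \[\partial_0F^{0j}+\sum_i\partial_iF^{ij}=-{\partial E_j\over\partial t}-\sum_{i,k}\epsilon_{ijk}\partial_iB_k=-{\partial E_j\over\partial t}+\left(\nabla\times\mathbf{B}\right)_j=J^j\,.\] The second relation in (30), which follows from the antisymmetry of \(F^{\mu\nu}\ ,\) is the continuity equation \({\partial\rho\over\partial t}+\nabla\cdot\mathbf{J}=0\ .\)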
In a contractible manifold, thanks to Poincaré's lemma, Bianchi identities (29) can be integrated by introducing a gauge field \(A_\mu(x)\) such that \[\tag{31} F_{\mu\nu}=\partial_\mu A_\nu-\partial_\nu A_\mu\,.\]
One then verifies that two gauge fields \(A_\mu(x)\) and \(A_\mu^{\Omega}(x)\) related by the gauge transformation \[\tag{32} A^{\Omega}_\mu(x)=A_\mu(x)-\partial_\mu \Omega(x),\]
correspond to the same electromagnetic tensor (31). In terms of non-relativistic scalar and vector potentials, \(A_\mu(x)=(A_0(x), A_i(x))\ ,\) and (32) corresponds to (5) and (10).
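Indeed (a one-line verification, assuming \(\Omega\) is smooth so that partial derivatives commute), \[F_{\mu\nu}(A^\Omega)=\partial_\mu\left(A_\nu-\partial_\nu\Omega\right)-\partial_\nu\left(A_\mu-\partial_\mu\Omega\right)=F_{\mu\nu}(A)-\left(\partial_\mu\partial_\nu-\partial_\nu\partial_\mu\right)\Omega=F_{\mu\nu}(A)\,.\] As a check of conventions, the components \(F_{0i}=\partial_0A_i-\partial_iA_0\) reproduce the expression of the electric field obtained for the non-relativistic particle, \(\mathbf{E}=\partial\mathbf{A}/\partial t-\nabla A_0\ .\)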
As in the example of a particle in a magnetic field, Maxwell's equations (30) can be derived from the stationarity of an action expressed in terms of the vector potential, after the identification (31). In the presence of a conserved current \(J^\mu\ ,\) \(\sum_\mu \partial_\mu J^\mu =0\ ,\) the Lagrangian density reads \[\tag{33} \mathcal{L}(A,\dot{A};J)=-{1\over4}\sum_{\mu,\nu} \left(\partial_\mu A_\nu-\partial_\nu A_\mu\right)\left(\partial^\mu A^\nu-\partial^\nu A^\mu\right) -\sum_\mu J^\mu A_\mu \]
and the action then is \[\tag{34} \mathcal{S}( A;J)=\int\mathrm{d}^4 x\,\mathcal{L} (A,\dot{A};J).\]
As in the example of the non-relativistic particle in a magnetic field, the action changes by boundary terms under the gauge transformation (32); if \(\Omega(x)\) is a smooth function vanishing at space-time infinity, the action is invariant since then \[ \mathcal{S}^\Omega-\mathcal{S}=\int\mathrm{d}^4 x\,\sum_\mu J^\mu(x)\partial_\mu \Omega(x)=\int\mathrm{d}^4 x\,\sum_\mu\left[ \partial_\mu \left( J^\mu(x)\Omega(x)\right)-\left( \partial_\mu J^\mu(x)\right)\Omega(x)\right]=0 \, .\]
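The derivation of (30) from the action (34) can be sketched as follows (assuming variations \(\delta A_\nu\) that vanish at space-time infinity, so that the boundary terms produced by the integration by parts can be dropped). With \(F_{\mu\nu}\) given by (31), \[\delta\mathcal{S}=\int\mathrm{d}^4x\left[-{1\over2}\sum_{\mu,\nu}F^{\mu\nu}\left(\partial_\mu\delta A_\nu-\partial_\nu\delta A_\mu\right)-\sum_\nu J^\nu\delta A_\nu\right]=\int\mathrm{d}^4x\,\sum_\nu\left[\sum_\mu\partial_\mu F^{\mu\nu}-J^\nu\right]\delta A_\nu\,,\] where the antisymmetry of \(F^{\mu\nu}\) and an integration by parts have been used. Stationarity for arbitrary \(\delta A_\nu\) then yields (30).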
Gauge fixing in classical gauge theories
In classical electromagnetism, the gauge-fixing problem is simply the problem of choosing a representative in the class of equivalent potentials, convenient for practical calculations or most suited to physical intuition.
Among the most usual non-relativistic gauges, one may cite (see (Jackson, 2000) for more details):
- \(\nabla\cdot\mathbf{A}(t,\mathbf{x}) =0 \ ,\) known as Coulomb's gauge,
- \(A_0(t,\mathbf{x})=0 \ ,\) known as temporal gauge (or Hamiltonian or Weyl's gauge),
- \(\mathbf{n}\cdot\mathbf{A}(t,\mathbf{x})=0\ ,\) known as non-relativistic axial gauge,
- \(\mathbf{x}\cdot\mathbf{A}(t,\mathbf{x})=0 \ ,\) known as multipolar gauge (or non-relativistic Poincaré's gauge),
and the relativistically invariant gauges:
- \(\sum_\mu \partial^\mu A_\mu(x)=0 \ ,\) known as Lorenz's gauge or Landau's gauge,
- \(\sum_\mu x^\mu A_\mu(x)=0 \ ,\) known as relativistic Poincaré's gauge (or Fock-Schwinger's gauge),
- \(\sum_\mu n^\mu A_\mu(x)=0 \ ,\) where \(n\) is a space-like quadrivector, known as relativistic axial gauge,
- \(\sum_\mu n^\mu A_\mu(x)=0 \ ,\) where \(n\) is a light-like (null) quadrivector, known as light-cone gauge,
- \(\sum_\mu \partial^\mu A_\mu(x)=s(x) \ ,\) for some scalar function \(s(x)\) (this gauge is sometimes used in the quantization process).
Note that some of these conditions do not fix the gauge field representative completely. The form and the meaning of the residual invariance depend on the gauge condition chosen. Finally, these gauges have simple generalizations to the non-Abelian situation.
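For example (an illustration of the remark on residual invariance, obtained by inserting the gauge transformations (5), (10), (32) into the gauge conditions), the Coulomb and Lorenz conditions transform as \[\nabla\cdot\mathbf{A}^\Omega=\nabla\cdot\mathbf{A}-\nabla^2\Omega\,,\qquad \sum_\mu\partial^\mu A^\Omega_\mu=\sum_\mu\partial^\mu A_\mu-\sum_\mu\partial^\mu\partial_\mu\Omega\,,\] so that the Coulomb gauge is preserved by transformations with \(\nabla^2\Omega=0\) and the Lorenz gauge by transformations with \(\sum_\mu\partial^\mu\partial_\mu\Omega=0\ .\) Similarly, the temporal gauge \(A_0=0\) is preserved by all time-independent \(\Omega(\mathbf{x})\ .\)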
Quantum electrodynamics
In Quantum Electrodynamics (QED), time-evolution can be derived from an integral over fields, a simple generalization of the path integrals of non-relativistic quantum mechanics.
Gauge field coupled to a conserved current
In analogy with non-relativistic quantum mechanics, one expects the time-evolution operator to be given by an integral over all classical fields: \[\mathcal{U} = \int\prod_\mu \left[ \mathrm{d} A_\mu(x) \right]\exp\left({i\over \hbar}\mathcal{S} ( A;J) \right), \] where the action is given by (33), (34).
However, due to its gauge invariance, the action depends only on three (out of four) degrees of freedom of the gauge fields and, thus, the integral over the full space of gauge fields is not defined. To solve the problem it is necessary to fix the gauge (see the sections on classical gauge-fixing and on quantum gauge-fixing). Gauge fixing can be achieved by restricting the integration over fields on a gauge section fixed by a constraint \(G(A,x)=0\ ,\) or, more generally, by fixing \(G(A,x)=s(x)\) and by integrating over \( s(x) \) with a Gaussian field distribution. The latter method leads to the addition to the action of the non-gauge invariant contribution \[\tag{35} \mathcal{S}_\mathrm{gauge} ( A) ={1\over 2\xi}\int\mathrm{d}^4 x\,\left(G(A,x)\right)^2\,.\]
The relativistic-covariant choice \(G(A,x)=\sum_\mu \partial_\mu A^\mu(x)\) gives in the limit \(\xi\rightarrow0\) Landau's gauge \(\sum_\mu \partial_\mu A^\mu(x)=0\ .\) The special value \(\xi=1\) in (35) corresponds to the so-called Feynman gauge.
The quantum field theory constructed by this procedure does not explicitly satisfy the unitarity requirement (related to conservation of probabilities) and seems to depend on the gauge fixing function and the parameter \(\xi\ .\) The Ward–Takahashi identities, a set of relations among the Green's functions, make it possible to prove that physical observables do not depend on these specific choices (a property called gauge independence in this context) and also satisfy unitarity.
Charged matter fields
In a local field theory, the action is local, that is, it is a space-time integral of a Lagrangian density that is a function of the fields and their derivatives.
To construct a gauge-invariant local action describing the interaction of matter with the gauge field \(A_\mu(x)\ ,\) we start from an action for charged matter fields that is local and invariant under global \(U(1)\) transformations.
As an example of matter, we consider free spin 1/2 Dirac fermions with charge \(e_\chi\) and mass \(m\ .\) The fields are then 4-component conjugate complex anticommuting (i.e., belonging to a Grassmann algebra) vectors \(\chi\) and \(\bar\chi\) (called spinors). The Lagrangian density reads: \[\tag{36} \mathcal{L}_{\mathrm{matter}}=\bar\chi(x)\left(\sum_\mu\gamma^\mu \partial_\mu +im \right)\chi(x)\,,\]
where \(\gamma^\mu\) are the \(4\times 4\) Dirac matrices, satisfying \(\gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = 2 \eta^{\mu \nu}\ .\) The action \(\mathcal{S}_{\mathrm{matter}} =\int\mathrm{d}^4 x\,\mathcal{L}_{\mathrm{matter}}\) is invariant under the global \(U(1)\) group transformations \[\tag{37} \chi(x)\mapsto U\chi(x) , \quad \bar \chi(x)\mapsto U^*\bar\chi (x),\qquad U:=\mathrm{e}^{ie_\chi\Omega/\hbar}\,,\]
where \(\Omega\) is a space-time independent constant. Thanks to Noether's theorem (Noether 1918), this invariance implies (classically) the existence of a conserved current and of a conserved charge.
Gauge invariance requires invariance of the new action under the local group transformations obtained by replacing in (37) \(\Omega,U\) by space-time functions \(\Omega(x),U(x)\ .\) This is achieved by replacing, in the matter Lagrangian density (36), \(\partial_\mu\) by the covariant derivative \(D_\mu=\partial_\mu+i\frac{e_\chi}{\hbar} A_\mu(x)\) which transforms under a four-dimensional generalization of equation (21) as \(D_\mu(A^\Omega) [U(x)\chi(x)]=U(x)D_\mu(A)\chi(x)\ .\) Note that \( [ D_\mu, D_\nu]=i\frac{e_\chi}{\hbar}F_{\mu\nu}\ .\)
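The covariance property can be verified directly (a sketch, using \(\partial_\mu U(x)=i\frac{e_\chi}{\hbar}\left(\partial_\mu\Omega(x)\right)U(x)\) and the transformation (32) of the gauge field): \[D_\mu(A^\Omega)\left[U\chi\right]=\left(\partial_\mu+i\frac{e_\chi}{\hbar}A_\mu-i\frac{e_\chi}{\hbar}\partial_\mu\Omega\right)U\chi=U\left(\partial_\mu+i\frac{e_\chi}{\hbar}\partial_\mu\Omega+i\frac{e_\chi}{\hbar}A_\mu-i\frac{e_\chi}{\hbar}\partial_\mu\Omega\right)\chi=U\,D_\mu(A)\chi\,.\] Since \(\bar\chi\mapsto U^*(x)\bar\chi\ ,\) the combination \(\bar\chi\gamma^\mu D_\mu\chi\) is then invariant under local transformations, the phases \(U^*(x)U(x)=1\) cancelling at each point.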
Quantum evolution of the total system (matter and gauge field) is then given by the field integral \[\mathcal{U}=\int[\mathrm{d}A_\mu][\mathrm{d}\bar\chi][\mathrm{d}\chi] \exp{{i\over\hbar}\left[\mathcal{S}(A)+\int\mathrm{d}^4 x\,\bar\chi(x)\left(\sum_\mu\gamma^\mu D_\mu +im\right)\chi(x)\right]}.\]
Non-Abelian gauge theories
To conform with the standard usage, in this section we use SI units further constrained by the requirement \(\hbar=c=1\ .\)
Classical field theory
Yang and Mills (1954) have generalized the structure of quantum electrodynamics to a situation where the Abelian gauge group \(U(1)\) is replaced by some non-Abelian Lie group \(G\) of \(N\times N\) unitary matrices.
The construction of a gauge-invariant action describing the interactions of non-Abelian gauge fields with the matter fields goes as follows.
The starting point is a matter Lagrangian density invariant under global (i.e., space-time independent) transformations belonging to the group \(G\ .\) For example, we assume that matter fields \(\phi(x)\) form complex vectors that transform like \[\tag{38} \phi(x)\mapsto \mathbf{g}\phi(x),\quad \mathbf{g}\in G\,.\]
In addition we assume that the Lagrangian density \(\mathcal{L}_\mathrm{matter}(\phi,\partial_\mu\phi)\) is preserved by this global transformation, that is, that \[\tag{39} \mathcal{L}_\mathrm{matter}(\mathbf{g}\phi,\partial_\mu\mathbf{g}\phi)= \mathcal{L}_\mathrm{matter}(\mathbf{g}\phi,\mathbf{g}\partial_\mu\phi)= \mathcal{L}_\mathrm{matter}(\phi,\partial_\mu\phi)\,.\]
The goal is to promote the global invariance (38) of the action to an invariance under local transformations:
\[\tag{40}
\phi(x)\mapsto \mathbf{g}(x)\phi(x),\quad \mathbf{g}(x)\in G \ \forall x\,.\]
Again a problem arises with field derivatives. As in the Abelian example, the solution is to replace derivatives by covariant derivatives. Here, covariant derivatives are \(N\times N\) matrices of the form \[\tag{41} \mathbf{D}_{\mu}\equiv\mathbf{D}_{\mu}(\mathbf{A})= \mathbf{1}\,\partial_{\mu} + \mathbf{A}_{\mu}(x),\]
where the gauge field \(\mathbf{A}_\mu(x)\) belongs to the Lie algebra of the group \(G\ .\) (The non-Abelian gauge field should not be confused with the three-vector potential \(\mathbf{A}(x)\) used in previous sections.) By definition the covariant derivative \(\mathbf{D}_\mu\) is a tensor under general gauge transformations, that is, transforms linearly as \[\tag{42} \mathbf{g}(x)\left(\mathbf{1}\,\partial_{\mu} + \mathbf{A}_{\mu}(x)\right)\mathbf{g}^{-1}(x)=\mathbf{1}\partial_\mu +\mathbf{A}^{\mathbf{g}}_\mu(x) \ \Leftrightarrow\ \mathbf{g}(x)\mathbf{D}_\mu(\mathbf{A})= \mathbf{D}_\mu(\mathbf{A}^{\mathbf{g}})\, \mathbf{g}(x),\]
where, as a consequence, the gauge transform \(\mathbf{A}^{\mathbf{g}}_\mu\) of the gauge field is given by \[\tag{43} \mathbf{A}_{\mu}^{\mathbf g}(x)= \mathbf{g}(x) \mathbf{A}_{\mu}(x) \mathbf{g}^{-1}(x) + \mathbf{g}(x) \partial_{\mu} \mathbf{g}^{-1}(x),\quad \mathbf{g}(x)\in G\ \forall x\,. \]
The transformation is linear in the special case of a constant \( \mathbf{g}(x)=\mathbf{g}_0\) (global transformation), but in general is affine. From the property (42) of the covariant derivative and the global invariance (39), it follows that \[\tag{44} \mathcal{L}_\mathrm{matter}(\mathbf{g}(x)\phi,\mathbf{D}_\mu(\mathbf{A}^{\mathbf{g}})\mathbf{g}(x)\phi)= \mathcal{L}_\mathrm{matter}(\mathbf{g}(x)\phi,\mathbf{g}(x)\mathbf{D}_\mu(\mathbf{A})\phi)= \mathcal{L}_\mathrm{matter}(\phi,\mathbf{D}_\mu(\mathbf{A})\phi)\]
and, thus, the matter action is now gauge-invariant.
In general, the gauge field \(\mathbf{A}_\mu(x)\) has a mathematical interpretation as a Lie-algebra-valued connection and is used to construct covariant derivatives acting on fields, whose form depends on the representation of the group \(G\) under which the field transforms (for global transformations).
The commutator of covariant derivatives of type (41), \[\tag{45} \mathbf{F}_{\mu\nu}(x) = \left[ \mathbf{D}_{\mu},\mathbf{D}_{\nu}\right] = \partial_{\mu} \mathbf{A}_{\nu}(x) - \partial_{\nu} \mathbf{A}_{\mu}(x) + \left[ \mathbf{A}_{\mu}(x),\mathbf{A}_{\nu}(x)\right] ,\]
is no longer a differential operator and corresponds to the curvature of the connection, which is a tensor for gauge transformations (i.e., it transforms linearly): \[ \mathbf{F}_{\mu\nu}(x) \mapsto \mathbf{g}(x) \mathbf{F}_{\mu\nu}(x) \mathbf{g}^{-1}(x). \] The field-strength tensor \(\mathbf{F}_{\mu\nu}(x)\) is an element of the Lie algebra of \(G\ .\) Since \(\mathbf{F}_{\mu\nu}\) is a tensor, the local action for the gauge field \[\tag{46} \mathcal{S}(\mathbf{ A})= {1\over 4 e^{2}} \int \mathrm{d}^4 x\, \mathrm{tr}\sum_{\mu,\nu} \mathbf{F} _{\mu \nu} (x ) \mathbf{F}^{\mu \nu} (x ),\]
is gauge-invariant.
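The covariance of \(\mathbf{F}_{\mu\nu}\) and the invariance of the trace in (46) can be checked numerically. The following is a minimal sketch under stated assumptions, not a general implementation: the group is taken to be \(SU(2)\) with generators \(-i\sigma_a/2\ ,\) the gauge field is chosen linear in \(x\) so that its derivatives are known exactly, the transformation is \(\mathbf{g}(x)=\exp(\theta(x)T)\) with \(\theta(x)=k\cdot x\) along a single generator \(T\ ,\) and the metric signature \((+,-,-,-)\) is assumed. All names (`gauge_field`, `group_element`, and so on) are hypothetical.

```python
import numpy as np

# Antihermitian su(2) generators t_a = -i sigma_a / 2 (illustrative choice G = SU(2)).
sigma = np.array([[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]], dtype=complex)
t = -0.5j * sigma
rng = np.random.default_rng(0)
dim = 4  # number of space-time directions

def random_algebra_element():
    """Random real combination of the generators, i.e. an element of su(2)."""
    return np.einsum("a,aij->ij", rng.normal(size=3), t)

# Gauge field linear in x: A_mu(x) = a_mu + sum_nu b[mu, nu] x^nu, so d_nu A_mu = b[mu, nu].
a = np.array([random_algebra_element() for _ in range(dim)])
b = np.array([[random_algebra_element() for _ in range(dim)] for _ in range(dim)])

# g(x) = exp(theta(x) T) with theta(x) = k.x and T a fixed unit direction in the algebra.
n = rng.normal(size=3)
n /= np.linalg.norm(n)
T = np.einsum("a,aij->ij", n, t)
k = rng.normal(size=dim)
eta = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed metric signature

def comm(X, Y):
    return X @ Y - Y @ X

def gauge_field(x):
    return a + np.einsum("mnij,n->mij", b, x)

def group_element(x):
    theta = k @ x
    # exp(theta T) = cos(theta/2) 1 + 2 sin(theta/2) T, because T^2 = -1/4.
    return np.cos(theta / 2) * np.eye(2) + 2 * np.sin(theta / 2) * T

def field_strength(x):
    """F_{mu nu} = d_mu A_nu - d_nu A_mu + [A_mu, A_nu], equation (45)."""
    A = gauge_field(x)
    return np.array([[b[nu, mu] - b[mu, nu] + comm(A[mu], A[nu])
                      for nu in range(dim)] for mu in range(dim)])

def transformed_field_strength(x):
    """Field strength of A^g from equation (43), using exact derivatives."""
    g = group_element(x)
    g_inv = g.conj().T
    A_rot = np.array([g @ A_mu @ g_inv for A_mu in gauge_field(x)])
    # For this g(x): g d_mu g^{-1} = -k_mu T.
    A_g = A_rot - k[:, None, None] * T

    def d(nu, mu):
        # d_nu A^g_mu = g b[mu, nu] g^{-1} + k_nu [T, g A_mu g^{-1}]
        return g @ b[mu, nu] @ g_inv + k[nu] * comm(T, A_rot[mu])

    return np.array([[d(mu, nu) - d(nu, mu) + comm(A_g[mu], A_g[nu])
                      for nu in range(dim)] for mu in range(dim)])

def trace_FF(F):
    """tr sum_{mu,nu} F_{mu nu} F^{mu nu}, the integrand of (46) up to its prefactor."""
    F_up = np.einsum("ma,nb,abij->mnij", eta, eta, F)
    return np.einsum("mnij,mnji->", F, F_up)

x = rng.normal(size=dim)
g = group_element(x)
F = field_strength(x)
F_g = transformed_field_strength(x)
F_conj = np.einsum("ij,mnjk,kl->mnil", g, F, g.conj().T)

covariance_error = np.max(np.abs(F_g - F_conj))   # F^g versus g F g^{-1}
trace_error = abs(trace_FF(F_g) - trace_FF(F))    # change of the Lagrangian density
print(covariance_error, trace_error)
```

Both printed numbers should vanish up to floating-point rounding. The inhomogeneous term of (43) drops out of \(\mathbf{F}_{\mu\nu}\ ,\) and the cyclicity of the trace removes the remaining conjugation.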
When \( \mathbf{g}(x)\) is close to the identity, that is, \(\mathbf{g}(x)=\mathbf{1}+\omega(x) +O(\|\omega\|^2)\ ,\) the transformation (43) takes the form \[\tag{47} \mathbf{A}_{\mu}^{\mathbf g}(x)-\mathbf{A}_{\mu}(x)=-\mathbf{D}_\mu \omega(x) +O(\|\omega\|^2)\quad \mathrm{with}\quad \mathbf{D}_\mu \omega(x)\equiv \partial_\mu \omega(x)+[\mathbf{A}_{\mu}(x),\omega(x)],\]
where \( \mathbf{D}_\mu\) in (47) is the covariant derivative acting on fields belonging to the Lie algebra of the group \(G\ .\) This allows writing the field equations corresponding to the action (46) in the form \[\tag{48} \sum_\mu \mathbf{D}_\mu \mathbf{F}^{\mu\nu}(x)=0\,.\]
Since the field-strength tensor (45) is quadratic in the gauge field and the covariant derivative (47) is linear, the field equations (48) are cubic in the gauge field, indicating that non-Abelian gauge fields are self-interacting, in contrast to the Abelian case.
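The infinitesimal form (47) follows from (43) by a first-order expansion, sketched here for convenience. With \(\mathbf{g}=\mathbf{1}+\omega\) and \(\mathbf{g}^{-1}=\mathbf{1}-\omega+O(\|\omega\|^2)\ ,\) the two terms of (43) give \[\mathbf{g}\mathbf{A}_\mu\mathbf{g}^{-1}=\mathbf{A}_\mu+[\omega,\mathbf{A}_\mu]+O(\|\omega\|^2)\,,\qquad \mathbf{g}\,\partial_\mu\mathbf{g}^{-1}=-\partial_\mu\omega+O(\|\omega\|^2)\,,\] so that \[\mathbf{A}^{\mathbf{g}}_\mu-\mathbf{A}_\mu=-\partial_\mu\omega-[\mathbf{A}_\mu,\omega]+O(\|\omega\|^2)=-\mathbf{D}_\mu\omega+O(\|\omega\|^2)\,.\]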
In both the Abelian and non-Abelian case, physical observables are related to gauge-invariant polynomials in the fields (or gauge-invariant operators).
Component form
A basis of generators of the Lie algebra of the unitary matrix group can be chosen in the form of a set of \(N\times N\) antihermitian matrices \(\mathbf{t}_a\ .\) Both the gauge field and the field-strength tensor can be expanded on such a basis: \[\mathbf{A}_{\mu}(x)=\sum_a A_{\mu}^a(x)\,\mathbf{t}_a \,,\quad\mathbf{F}_{\mu\nu}(x)=\sum_a F_{\mu\nu}^a(x)\,\mathbf{t}_a\,.\] Introducing the structure constants of the Lie algebra \[[\mathbf{t}_a,\mathbf{t}_b]=\sum_c f_{abc}\,\mathbf{t}_c\,,\] the components of the tensor can be written more explicitly as \[\tag{49} F_{\mu\nu}^a(x)=\partial_\mu A_\nu^a(x)-\partial_\nu A_\mu^a(x)+\sum_{b,c}f_{bca}A_\mu^b(x) A_\nu^c(x).\]
The formulation of the Abelian gauge fields of the previous sections can be recovered (up to trivial normalization factors) for \(N=1\ ,\) \(\mathbf{t}_1 \mapsto i e_\phi \) and thus, \(f_{111}=0\ .\)
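As an illustration of the component form, the sketch below takes \(G=SU(2)\) with the antihermitian basis \(\mathbf{t}_a=-i\sigma_a/2\) (an assumed example, with \(\sigma_a\) the Pauli matrices), extracts the structure constants from the commutators, and checks that (49) reproduces the matrix expression (45) at a single point. The field values and their derivatives are random numbers used only for the check, and all variable names are hypothetical.

```python
import numpy as np

# Antihermitian basis t_a = -i sigma_a / 2 of su(2); tr(t_a t_b) = -delta_ab / 2.
sigma = np.array([[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]], dtype=complex)
t = -0.5j * sigma

# Structure constants from [t_a, t_b] = sum_c f_abc t_c, projected out with the trace.
f = np.zeros((3, 3, 3))
for a in range(3):
    for b in range(3):
        commutator = t[a] @ t[b] - t[b] @ t[a]
        for c in range(3):
            f[a, b, c] = (-2 * np.trace(commutator @ t[c])).real

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 3))       # A[mu, a]      = A_mu^a at one space-time point
dA = rng.normal(size=(4, 4, 3))   # dA[mu, nu, a] = d_mu A_nu^a at the same point

# Component form, equation (49).
F_comp = dA - dA.transpose(1, 0, 2) + np.einsum("bca,mb,nc->mna", f, A, A)

# Matrix form, equation (45).
A_mat = np.einsum("ma,aij->mij", A, t)
dA_mat = np.einsum("mna,aij->mnij", dA, t)
F_mat = (dA_mat - dA_mat.transpose(1, 0, 2, 3)
         + np.einsum("mij,njk->mnik", A_mat, A_mat)
         - np.einsum("nij,mjk->mnik", A_mat, A_mat))

mismatch = np.max(np.abs(F_mat - np.einsum("mna,aij->mnij", F_comp, t)))
print(f[0, 1, 2], mismatch)
```

For this basis the structure constants coincide with the totally antisymmetric symbol \(\epsilon_{abc}\ ,\) and the mismatch between the two forms is zero up to rounding.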
Gauge fields, geometry and lattice gauge theories
The concepts of gauge fields and covariant derivatives can be translated into the language of differential geometry, based on differential forms.
The gauge field \(\mathbf{A}_{\mu}\) is equivalent to a Lie-algebra-valued connection 1-form \(\mathbf{A}=\sum_\mu \mathbf{A}_\mu \mathrm{d} x^\mu\) and the field-strength tensor \(\mathbf{F}_{\mu\nu}\) to a Lie-algebra-valued curvature 2-form \(\mathbf{F}=\sum_{\mu,\nu} \mathbf{F}_{\mu\nu}\mathrm{d} x^\mu\wedge \mathrm{d} x^\nu\ .\) In terms of differential forms, the relation (45) reads \[\mathbf{F}=2\left(\mathrm{d}\mathbf{A}+\mathbf{A}\wedge\mathbf{A} \right).\]
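The factor 2 reflects the normalization of the 2-form, which is summed over all pairs \((\mu,\nu)\) without a factor \(1/2\ .\) Using the antisymmetry of \(\mathrm{d} x^\mu\wedge \mathrm{d} x^\nu\ ,\) one finds \[\mathrm{d}\mathbf{A}=\sum_{\mu,\nu}\partial_\mu\mathbf{A}_\nu\,\mathrm{d} x^\mu\wedge \mathrm{d} x^\nu={1\over 2}\sum_{\mu,\nu}\left(\partial_\mu\mathbf{A}_\nu-\partial_\nu\mathbf{A}_\mu\right)\mathrm{d} x^\mu\wedge \mathrm{d} x^\nu\,,\qquad \mathbf{A}\wedge\mathbf{A}=\sum_{\mu,\nu}\mathbf{A}_\mu\mathbf{A}_\nu\,\mathrm{d} x^\mu\wedge \mathrm{d} x^\nu={1\over 2}\sum_{\mu,\nu}[\mathbf{A}_\mu,\mathbf{A}_\nu]\,\mathrm{d} x^\mu\wedge \mathrm{d} x^\nu\,,\] and the sum of the two expressions is \({1\over 2}\sum_{\mu,\nu}\mathbf{F}_{\mu\nu}\,\mathrm{d} x^\mu\wedge \mathrm{d} x^\nu\) with \(\mathbf{F}_{\mu\nu}\) as in (45).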
Like all connections, the gauge connection generates parallel transport. Parallel transport becomes especially important in the framework of lattice gauge theories, where space-time is replaced by a discrete lattice and gauge fields by group elements associated with links. Let us consider, for example, the lattice \(\mathbb{Z}^4\) of points with integer coordinates (lattice sites). To each pair of neighbouring sites \((i,j)\) one associates a group element \(\mathbf{U}_{ij}=\mathbf{U}^\dagger_{ji}\ ,\) which in this context plays the role of the gauge field. A gauge transformation is then defined by a set of independent group elements \(\mathbf{g}_i\ ,\) one for each lattice site \(i\ ;\) its action on the gauge field is: \[\mathbf{U}_{ij}\mapsto \mathbf{U}_{ij} ^{\mathbf{g}}=\mathbf{g}_i \mathbf{U}_{ij} \mathbf{g}^\dagger_j\,.\] Parallel transport is then achieved by products of group elements \(\mathbf{U}_{ij}\) along curves on the lattice (i.e. following lattice links).
In the absence of matter, gauge-invariant quantities, including the lattice gauge action, take the form of traces of products of link variables along closed loops.
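The invariance of closed-loop traces can be illustrated on the smallest loop, the elementary square (plaquette). The sketch below is a toy example under explicit assumptions: a small periodic two-dimensional lattice stands in for \(\mathbb{Z}^4\ ,\) the group is \(SU(2)\ ,\) and link and site variables are drawn at random. The function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
L = 4  # sites per direction; a small periodic 2D lattice stands in for Z^4

def random_su2(shape):
    """Random SU(2) matrices a0*1 + i a.sigma with (a0, a) a unit 4-vector."""
    a = rng.normal(size=shape + (4,))
    a /= np.linalg.norm(a, axis=-1, keepdims=True)
    m = np.empty(shape + (2, 2), dtype=complex)
    m[..., 0, 0] = a[..., 0] + 1j * a[..., 3]
    m[..., 0, 1] = a[..., 2] + 1j * a[..., 1]
    m[..., 1, 0] = -a[..., 2] + 1j * a[..., 1]
    m[..., 1, 1] = a[..., 0] - 1j * a[..., 3]
    return m

def dagger(m):
    return np.conj(np.swapaxes(m, -1, -2))

def shift(field, mu):
    """Field evaluated at the neighbouring site in direction mu (periodic boundaries)."""
    return np.roll(field, -1, axis=mu)

# U[x, y, mu] is the link variable from site (x, y) to its neighbour in direction mu;
# the reversed link is its Hermitian conjugate.
U = random_su2((L, L, 2))
g = random_su2((L, L))  # one independent group element per site

def gauge_transform(U, g):
    """U_ij -> g_i U_ij g_j^dagger for every link (i, j)."""
    out = np.empty_like(U)
    for mu in range(2):
        out[:, :, mu] = g @ U[:, :, mu] @ dagger(shift(g, mu))
    return out

def plaquette_traces(U):
    """Trace of the ordered product of links around each elementary square."""
    U0, U1 = U[:, :, 0], U[:, :, 1]
    loop = U0 @ shift(U1, 0) @ dagger(shift(U0, 1)) @ dagger(U1)
    return np.trace(loop, axis1=-2, axis2=-1)

U_g = gauge_transform(U, g)
max_change = np.max(np.abs(plaquette_traces(U_g) - plaquette_traces(U)))
link_change = np.max(np.abs(U_g - U))
print(max_change, link_change)
```

The individual links change under the transformation, while the plaquette traces agree up to rounding: around a closed loop every \(\mathbf{g}_j\) meets its conjugate \(\mathbf{g}^\dagger_j\ ,\) and the remaining conjugation at the starting site is removed by the trace.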
Quantization of non-Abelian gauge theories
In both Abelian and non-Abelian gauge theories, gauge invariance implies that not all components of the gauge field are dynamical, so that a simple canonical quantization is impossible and a gauge fixing is required. However, in contrast to QED, the quantized form of non-Abelian gauge theories cannot be guessed from simple heuristic arguments, even in the absence of matter fields: a simple extension of the ideas that worked for QED fails here. A formal quantization in the temporal (or Weyl) gauge \(\mathbf{A}_0=0\) is still possible, but leads to a theory that is not explicitly relativistically covariant and has some singular properties. Transformations that can be easily explained only in the language of field integrals allow going over to explicitly covariant formulations.
Gauge fixing in gauge field integrals
In contrast to classical field theory, in quantum gauge theories gauge fixing is a basic issue. Indeed, since the integrand in field integrals giving physical observables is constant along gauge orbits (the set of all gauge fields obtained from one representative by gauge transformations), the naive field integrals (sums over all field configurations) are not defined. It becomes necessary to fix the gauge, that is, to integrate only over a section of gauge field space that, ideally, contains only one representative per orbit. Generalizing the formalism of QED, one defines the gauge section by a set of equations of the form \(G^a(A)=0\ ,\) where the index \(a\) runs over the generators of the Lie algebra. Frequently, one actually integrates not only over a gauge section, but over a whole neighbourhood of a section, by fixing \(G^a(A)=s^a(x)\) and by integrating over \(s^a(x)\) with a (functional) Gaussian distribution. This procedure amounts to adding to the action a (non-gauge-invariant) gauge-fixing term, which is a generalization of the term (35).
Moreover, it is necessary to integrate over gauge fields with a measure ensuring that physical results are independent of the choice of the section. Such a measure has been identified by Faddeev and Popov (1967) and, in the case of Landau's gauge, contains as a factor the absolute value of the determinant of a differential operator (which in QED reduces to an inessential constant). For small gauge fields, the determinant can be chosen positive and can be written in local form by the introduction of unphysical spinless fermion fields, the Faddeev-Popov ghost fields. BRST invariance (see Becchi, Rouet and Stora (1976) and references therein) emerges in this context as a substitute for the broken gauge invariance. It can then be proved that this construction yields a consistent theory in the sense of a perturbative expansion: in perturbation theory, the integrand in the field integral is expanded in the form of a Gaussian density multiplied by a series of powers of the interaction term.
However, the problem of integration over suitable gauge sections is more subtle at a non-perturbative level. Indeed, one would like the section to cut all gauge orbits once or, at least, the same number of times, but this is not generally the case in non-Abelian gauge theories, as first pointed out by Gribov (1978). (One then speaks of Gribov copies.) In particular, the Faddeev-Popov determinant changes sign when two Gribov copies merge and, if the absolute value of the determinant is not properly taken into account, the integration measure is no longer positive. One idea is then to restrict the integration to a region enclosing only one Gribov copy, but this is not easy to achieve in practice. The only known non-perturbative definition of gauge theories is obtained by replacing continuum space-time by a lattice, leading to lattice gauge theories. Then, at least in a finite volume, gauge fixing is not necessary and the Gribov problem can be ignored.
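The way the Gaussian average produces a gauge-fixing term can be sketched schematically. The conventions below are assumptions of this sketch and are not fixed by the text: a real Gaussian weight with a width parameter \(\xi\ ,\) and a functional \(\delta\)-function imposing the condition \(G^a(A)=s^a\ .\) Then \[\int[\mathrm{d}s]\,\exp\left[-{1\over 2\xi}\int\mathrm{d}^4x\sum_a\left(s^a(x)\right)^2\right]\prod_{x,a}\delta\left(G^a(A)(x)-s^a(x)\right)=\exp\left[-{1\over 2\xi}\int\mathrm{d}^4x\sum_a\left(G^a(A)(x)\right)^2\right],\] so that the constraint is traded for a term quadratic in \(G^a(A)\) in the exponent. In the real-time field integral the weight is oscillatory rather than damped, and the factors of \(i\) and the normalization depend on conventions not specified here.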
General relativity
Einstein's relativistic theory of gravitation, also known as General relativity, has properties somewhat related to gauge theories. Here, invariance under diffeomorphisms \(x \mapsto x(y)\) (locally regular changes of coordinates) in a (pseudo-) Riemannian manifold \(\mathcal{M}\ ,\) replaces gauge invariance. In this context, the Levi-Civita connection \(\mathbf{\Gamma}_\mu\) (with components the Christoffel symbol \(\Gamma^\lambda_{\nu\mu}\)), which generates parallel transport, plays the role of the gauge field. Some elements are given in Appendix "More on general relativity".
The similarity with gauge theories is even more striking when the vielbein formalism is introduced, which is required in the case of matter fields with spin. A vielbein is a locally flat frame in the space tangent to the manifold. A gauge transformation corresponds to a change of local frame (a local Lorentz transformation). Gauge invariance corresponds to the independence of field equations from the choice of the local frame. The so-called spin connection, which can be expressed in terms of the vielbein, plays the role of the gauge field.
Brief history of gauge invariance
- Gustav Kirchhoff uses the components of the vector potential in (Kirchhoff 1857, p.530) where he extends the work of Wilhelm Weber (1848) on electromagnetic induction. In this article, Kirchhoff also notices that the vector potential and the scalar potential satisfy a relation that in modern language can be called a gauge-fixing condition.
- The component form of Maxwell's equations can be found in equations (54), (56), (112), (115) of James Clerk Maxwell's article "On physical lines of force" (Maxwell 1861-62). Maxwell, influenced by William Thomson (later Lord Kelvin) and by George G. Stokes, expresses, in equation (55) of the article, \(\mu\,(\alpha,\beta,\gamma)\) (in modern language the magnetic field \(\mathbf{B}=\mu\mathbf{H}\ ,\) as appropriate for linear materials) in terms of the curl of the vector potential, whose components are denoted there by \((F,G,H)\ .\) The components of the vector potential are chosen to satisfy the constraint (57), which in modern language is the Coulomb gauge condition \(\nabla\cdot\mathbf{A}=0\ .\) Maxwell also uses the components of the vector potential in his earlier article "On Faraday's lines of force" (Maxwell 1855, p. 202), where they are called electro-tonic functions and denoted by \((\alpha_0,\beta_0,\gamma_0)\ .\)
- In a series of articles starting with (Helmholtz 1870), Hermann von Helmholtz analyzes previous work of Wilhelm Weber, F.E. Neumann and others on the electromagnetic forces among current elements. Helmholtz exhibits a family of vector potentials dependent on a real parameter and makes the seminal observation that, while the differential form of the magnetic energy among current elements depends on the value of the parameter, the resulting integrated energy does not or, in modern language, is gauge-invariant. See (Jackson and Okun 2001) for more explicit expressions.
- Hermann Weyl introduces in three papers of 1918-1919 the concept of local scale invariance, called Massstab Invarianz in the first two articles and Eich Invarianz in the third one. The term Eich Invarianz, first translated as calibration invariance, was later translated as gauge invariance.
- Weyl's attempt to unify gravity and electromagnetism, which culminated in 1919 in his book Raum, Zeit und Materie, was unsuccessful. It turned out from the works of Schrödinger (1922), Fock (1927), London (1927) and Weyl (1928) that quantum wave functions should vary by a phase factor under a gauge transformation, and that Weyl's original real scale factor should be replaced by a phase change involving the usual four-dimensional electromagnetic vector potential.
- The years 1930-1950 saw the birth of Quantum Electrodynamics, the quantum theory extending Maxwell's classical theory. Many physicists contributed to this effort, including P.A.M. Dirac, V. Weisskopf, J. Schwinger, S. Tomonaga, F.J. Dyson and R.P. Feynman. The interested reader will find a collection of the most relevant papers in Schwinger's book (1958).
- In 1954 Yang and Mills introduce non-Abelian gauge fields (Yang and Mills 1954). Ronald Shaw (Shaw 1955) deals with the same topic in his unpublished PhD thesis (under the direction of A. Salam).
Interested readers can find more information, for example, in (Darrigol 1999), (Jackson and Okun 2001), (Taylor 2001).
Appendix: More on general relativity
References
- Aharonov, Y and Bohm, D (1959). Significance of Electromagnetic Potentials in the Quantum Theory. Physical Review 115: 485-491.
- Becchi, C; Rouet, A and Stora, R (1976). Renormalization of gauge theories. Annals of Physics 98: 287-321.
- Darrigol, O (1999). Electrodynamics from Ampère to Einstein. Oxford University Press, Oxford. ISBN 0-19-850593-0
- Ehrenberg, W and Siday, R E (1949). The Refractive Index in Electron Optics and the Principles of Dynamics. Proc. Phys. Soc. B62: 8-21. doi:10.1088/0370-1301/62/1/303.
- Faddeev, L D and Popov, V N (1967). Feynman diagrams for the Yang-Mills field. Physics Letters B25: 29-30.
- Feynman, R P (1948). Space-Time Approach to Non-Relativistic Quantum Mechanics. Reviews of Modern Physics 20(2): 367-387.
- Fock, V (1927). Über die invariante Form der Wellen- und der Bewegungsgleichungen für einen geladenen Massenpunkt. Zeitschrift für Physik 39: 226-232. doi:10.1007/bf01321989. English translation in (Taylor 2001).
- Gribov, V N (1978). Quantization of non-Abelian gauge theories. Nuclear Physics B 139: 1-19.
- Helmholtz, H (1870). Ueber die Bewegungsgleichungen der Elektricität für ruhende leitende Körper. Journal für die reine und angewandte Mathematik 72: 57-129. doi:10.1515/crll.1870.72.57.
- Jackson, J D and Okun, L B (2001). Historical roots of gauge invariance. Reviews of Modern Physics 73: 663. arXiv:hep-ph/0012061
- Jackson, J D (2002). From Lorenz to Coulomb and other explicit gauge transformations. American Journal of Physics 70: 917. arXiv:physics/0204034
- Kirchhoff, G (1857). II. Ueber die Bewegung der Elektricität in Leitern. Annalen der Physik und Chemie 102: 529-544. doi:10.1002/andp.18571781203. Reprinted in Gesammelte Abhandlungen von G. Kirchhoff (J. A. Barth, Leipzig 1882), p. 154-168.
- Lagrange J L (1788). Mécanique analytique. Chez La Veuve Desaint, Libraire (Paris) reedited in Œuvres de Lagrange, J-A Serret ed., Paris (1867).
- London, F (1927). Quantenmechanische Deutung der Theorie von Weyl. Zeitschrift für Physik 42: 375-389. doi:10.1007/bf01397316. English translation in (Taylor 2001).
- Maxwell, J C (1855). On Faraday's Lines of Force Transactions of the Cambridge Philosophical Society X(I): 1455-228. Scanned copy from blazelabs.com.
- Maxwell, J C (1861-62). On Physical Lines of Force. Philosophical Magazine. Scanned copy from Wikimedia commons.
- Noether, E (1918). Invariante Variationsprobleme. Nachr. d. König. Gesellsch. d. Wiss. zu Göttingen, Math-phys. Klasse, 235–257. English translation by Tavel MA (1971), Transport Theory and Statistical Physics. 1(3):183–207. arxiv.org/abs/physics/0503066.
- Schrödinger, E (1922). Über eine bemerkenswerte Eigenschaft der Quantenbahnen eines einzelnen Elektrons. Zeitschrift für Physik 12: 13-23. doi:10.1007/bf01328077.
- Schrödinger, E (1926). An Undulatory Theory of the Mechanics of Atoms and Molecules. Physical Review 28: 1049-1070.
- Schwinger, J editor (1958). Selected papers on Quantum Electrodynamics. Dover Publications, New York.
- Shaw, R (1955). The problem of particle types and other contributions to the theory of elementary particles. Cambridge Ph. D. Thesis. Unpublished. Advisor: A. Salam. A partial reprint can be found in (Taylor 2001).
- Taylor, J C editor (2001). Gauge Theories in the Twentieth Century. Imperial College Press, London.
- Weber, W (1848). I. Elektrodynamische Maassbestimmungen. Annalen der Physik und Chemie 73: 193-240. Shortened version of the 1846 paper published in the Abhandlungen der Königlichen Sächsischen Gesellschaft der Wissenschaften, Leipzig.
- Weyl, H (1918). Sitzber. Preuss Akad. Wiss. : 465. Weyl, H (1918). Math. Z. 2: 384. Weyl, H (1919). Ann. Phys. 59: 101.
- Weyl, H (1919). Raum, Zeit und Materie. Verlag von Julius Springer, Berlin. Scanned copy from Internet Archive.
- Weyl, H (1928). Gruppentheorie und Quantenmechanik. Hirzel, Leipzig.
- Yang, C N and Mills, R L (1954). Conservation of Isotopic Spin and Isotopic Gauge Invariance. Physical Review 96: 191-195.
Further reading
- Faddeev, L D and Slavnov, A A (1991). Gauge Fields. Introduction to quantum theory. (2nd edition). Addison-Wesley Publishing Company, T. ISBN 0201524724.
- Frankel, T (2003). The Geometry of Physics: An Introduction (2nd edition). Cambridge University Press, Cambridge. ISBN 0521539277.
- Itzykson, C and Zuber, J B (2006). Quantum Field Theory. Dover Publications, New York. ISBN 0486445682
- Lai, C H ed. (1981). Gauge Theory of Weak and Electromagnetic Interactions. World Scientific Publishing, Singapore. ISBN 978-9971830236
- 't Hooft, G (2005). 50 years of Yang-Mills theory. World Scientific, Singapore. ISBN 978-981-256-007-0.
- Weinberg, S (1996). The quantum theory of fields. Vol. 2: Modern Applications. Cambridge University Press, Cambridge. ISBN 0521550025
- Zinn-Justin, J (2002). Quantum Field Theory and Critical Phenomena (4th edition). Oxford University Press, Oxford. ISBN 0198509235
See also
Becchi-Rouet-Stora-Tyutin symmetry, Gauge theories, Gribov problem, Slavnov-Taylor identities, Zinn-Justin equation