The Peierls argument: a proof of spontaneous magnetization

For some time after 1924, when Ising first solved the 1D version of the model that now bears his name, a popular opinion held that Ising-type models could never exhibit a phase transition from a disordered, paramagnetic state to a ferromagnetic state. This erroneous belief was propped up by arguments about analytic functions: the partition function of any finite system is a finite sum of analytic functions, hence is itself analytic, so it cannot exhibit any sharp behavior like spontaneous magnetization. The claim about analytic functions is indeed true, but beside the point: a large finite sum of analytic functions can have behavior that no physically realistic measurement apparatus could distinguish from non-analytic.

It wasn't until 1936 that the first proof of spontaneous magnetization in the Ising model was given by Rudolf Peierls. In the lectures, we've seen how the mean-field approximation estimates the sharp onset of magnetization, but that argument was only suggestive, since we cannot rule out that corrections to the mean-field approximation would destroy the phase transition. Peierls established rigorously that there is a finite temperature \(T^+\) above which \(\langle M\rangle = 0\) in the thermodynamic limit, and another temperature \(T^-\) below which \(|\langle M\rangle| > \epsilon\) in the thermodynamic limit for some positive constant \(\epsilon\).

We now give a sketch of the Peierls argument. Recall that the Helmholtz free energy is given by \(F = E - TS\). In thermal equilibrium, we must be at a free energy minimum or else we would be able to extract useful work, contradicting the very notion of equilibrium. We will show that for the 1D Ising model, the entropy term dominates and always leads to a disordered state, no matter the temperature. In the 2D Ising model, by contrast, there is a genuine competition between energy and entropy, and therefore a transition temperature below which there are two stable states that both minimize the free energy.

Consider first the 1D Ising model. Suppose we start with a very long chain, and we fix the boundary spins to be in the +1 configuration (say). Now suppose that we flip a contiguous domain of L spins containing the origin, far from the boundary. Flipping this block creates two domain walls, one at its left edge and one at its right edge. Each wall turns one satisfied bond (energy \(-J\)) into an unsatisfied bond (energy \(+J\)), so the total energy cost is \(\Delta E = 2 \times 2J = 4J\), irrespective of the length L of the domain. The free energy difference, however, depends also on the entropy. To compute the change in entropy, we simply use \(S = k \log W\) to count the number of configurations that contain the origin spin inside such a flipped domain. There are just L such configurations for a domain of length L, so the free energy change is \(\Delta F = 4 J - k T \log L\). Thus, no matter what the temperature T, there is some value of L beyond which inserting the domain lowers the free energy. We conclude that thermal fluctuations can freely insert domain walls at arbitrarily small temperatures, leading to disorder and no net magnetization.
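To make the 1D conclusion concrete, here is a minimal numerical check of \(\Delta F = 4J - kT \log L\) (a sketch in Python; the function name and the choice of units with \(k = 1\) are ours):

```python
import math

def delta_F_1d(L, J=1.0, kT=1.0):
    """Free energy change for flipping a domain of length L in the 1D chain,
    in units where Boltzmann's constant k = 1."""
    return 4 * J - kT * math.log(L)

# At any fixed temperature, Delta F turns negative once L exceeds e^(4J/kT),
# so a long enough flipped domain is always thermodynamically favorable.
kT, J = 0.5, 1.0
L_star = math.exp(4 * J / kT)  # crossover length where Delta F = 0
assert delta_F_1d(0.5 * L_star, J, kT) > 0 > delta_F_1d(2 * L_star, J, kT)
```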

The situation in 2D is very different for both the energy and the entropy terms. Again we imagine starting in a state of a large 2D system with the spins on the boundary frozen in (say) the +1 configuration, and we again wish to compute the free energy difference of inserting a flipped domain containing the origin. Now the domain wall is not just a pair of points but a closed curve of some perimeter L. Each of the L bonds crossing the wall is flipped from energy \(-J\) to \(+J\), an energy penalty of \(2J\) per bond, so the total change in energy is \(\Delta E = 2 J L\).

The entropy change is again \(k \log W\), but how can we count the number of connected domain walls? I claim that there are at most \(L^2 3^L\) such domain walls. Since the domain is a connected component, its boundary is a closed, self-avoiding curve. The curve has perimeter L, so it must fit inside a box of side length L centered at the origin, which gives at most \(L^2\) lattice points at which to start the curve. Each step along the perimeter then has at most 3 choices for where to go next (stepping backward would immediately retrace a bond). With L steps in total, there are at most \(L^2 3^L\) such domain walls. Conversely, it is easy to see that there are at least \(c^L\) such domain walls for some \(c \ge \sqrt{2}\). We conclude that the entropy difference scales like \(k L \log c + O(\log L)\).
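The \(3^L\) factor in this bound is just the number of non-backtracking walks per starting point, which is \(4 \cdot 3^{L-1}\). A brute-force enumeration for small L confirms the count (a quick sanity-check script; the function name is ours):

```python
def count_nonbacktracking(L):
    """Count walks of length L on the square lattice that never
    immediately reverse the previous step."""
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    def extend(last, remaining):
        if remaining == 0:
            return 1
        total = 0
        for s in steps:
            if last is not None and s == (-last[0], -last[1]):
                continue  # an immediate backtrack would retrace a bond
            total += extend(s, remaining - 1)
        return total
    return extend(None, L)

# 4 choices for the first step, at most 3 for each subsequent one:
for L in range(1, 7):
    assert count_nonbacktracking(L) == 4 * 3 ** (L - 1)
```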

We therefore see that the free energy difference in the 2D Ising model is a competition between two terms that both scale as L: \[\Delta F = 2 J L - k T L \log c .\] When \(k T / J \gg 1\), the entropy term dominates, as in the 1D Ising model, and it is thermodynamically advantageous to insert a domain wall. We conclude that there is no net magnetization because there is only one stable equilibrium state. By contrast, when \(kT/J \ll 1\), the cost of inserting a flipped domain at the origin grows without bound with L, and flipping a large domain cannot be achieved by any thermal fluctuation in the thermodynamic limit. We reached this conclusion by assuming, arbitrarily, that the flipped domain at the origin was in the -1 state with the boundary in the +1 state, but we could equally well have started with a boundary in the -1 state and a +1 domain. Therefore there are at least two stable equilibrium configurations that cannot be disordered by inserting a domain of flipped spins; the free energy cost is simply too high. These are precisely the two magnetically ordered states that we've observed in the low-temperature phase in our Monte Carlo studies. This proves that the 2D Ising model indeed has a phase transition to a ferromagnetic state with spontaneous net magnetization.
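As a rough sanity check on where the competition tips, we can set \(\Delta F = 0\) and solve for the temperature: \[ kT^* = \frac{2J}{\log c}\,. \] Plugging in the crude value \(c = 3\) from the counting bound above gives \(kT^* = 2J/\log 3 \approx 1.82\, J\). This is only an estimate, since \(c = 3\) over-counts the domain walls, but it lands at the same order of magnitude as the exact critical temperature of the 2D Ising model.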

Wolff sampling and critical slowdown

One of the main reasons we study the Ising model is the existence of a critical point in the phase diagram where spontaneous magnetization occurs. The critical temperature for this phase transition is given by \[k T_c = \frac{2 \lvert J\rvert}{\ln(1+\sqrt{2})} \approx 2.27 \lvert J\rvert\,.\]
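This exact value is easy to evaluate numerically (a one-line check in Python, with \(J = 1\) and \(k = 1\) assumed):

```python
import math

J = 1.0  # coupling strength, in units where k = 1
kTc = 2 * abs(J) / math.log(1 + math.sqrt(2))  # Onsager's exact result
print(round(kTc, 4))  # 2.2692
```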

Near the critical point, algorithms like Metropolis that flip only a single spin at a time begin to suffer from a phenomenon called critical slowing down: the time needed to generate statistically independent samples by repeating the update rule starts to grow much faster than the size of the lattice.

Recall that the average correlation function is defined as \[C(x,y) = \frac{1}{N^2}\sum_{i,j}\avg{s_{i,j} s_{i+x,j+y}} - \avg{M}^2\,,\] where \(\avg{M}\) is the average magnetization density. If we change variables to radial coordinates, and further average over the angular variable, then we can put this in terms of a distance \(r\). This radial correlation function \(C(r)\) can be parameterized in the following manner in the limit of a large lattice (and \(r \gg 1\)): \[C(r) \sim \frac{\e^{-r/\xi}}{r^{d-2+\eta}}\,.\] Here \(\xi\) is called the correlation length, \(d\) is the dimension of the model (2 for the 2D Ising model), and \(\eta\) is called the correlation function critical exponent (\(\eta = 1/4\) for the 2D Ising model).
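As an illustration, \(C(r)\) can be estimated from a single spin configuration by averaging over all sites and over the two axis directions (a simplified NumPy sketch assuming periodic boundary conditions; a real measurement would also average over many equilibrated samples and over all angles, and the function name is ours):

```python
import numpy as np

def radial_correlation(spins, r_max):
    """Rough estimate of C(r) for one N x N configuration of +/-1 spins,
    averaging over sites and over the two lattice axes."""
    m = spins.mean()  # magnetization density of this sample
    C = np.empty(r_max)
    for r in range(1, r_max + 1):
        # periodic shifts along each axis stand in for the angular average
        cx = (spins * np.roll(spins, r, axis=0)).mean()
        cy = (spins * np.roll(spins, r, axis=1)).mean()
        C[r - 1] = 0.5 * (cx + cy) - m ** 2
    return C

rng = np.random.default_rng(0)
random_spins = rng.choice([-1, 1], size=(64, 64))
C = radial_correlation(random_spins, r_max=10)
# For uncorrelated spins, C(r) should be near zero at every r >= 1;
# near T_c one would instead see a slow algebraic decay.
```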

Near the critical point, the correlation length starts to diverge, and the exponential decay of correlations starts to become merely an algebraic, long-range decay. A specific prediction from renormalization theory is that \[\xi \sim \frac{1}{\lvert T-T_c \rvert^\nu}\,\] where \(\nu = 1\) for the 2D Ising model. This divergence and the long-range correlations that result cause difficulty in simulating physics near the critical point.

The integrated autocorrelation time \(\tau\) is a measure of this difficulty. Suppose we have an order parameter, i.e. a quantity like the average magnetization density in the 2D Ising model, which quantifies the phase transition. After a suitably long simulation time, this quantity will decorrelate from its initial value and fluctuate about its equilibrium value. This time scale can be expressed as \[ \tau = \sum_{t=0}^\infty\frac{\bigl(\avg{M(t)M(0)} - \avg{M(0)}^2\bigr) }{\bigl(\avg{M(0)^2} - \avg{M(0)}^2\bigr)}\,.\] Near the critical point, and for an infinite system, this quantity too diverges \[ \tau \sim \xi^z \sim \frac{1}{\lvert T-T_c \rvert^{z\nu}}\,. \] The quantity \(z\) is called the dynamical critical exponent associated to the observable \(M\), and \(z \approx 2\) for the magnetization in the 2D Ising model under local dynamics.
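In practice \(\tau\) is estimated from a finite time series of measurements, truncating the sum once the empirical autocorrelation has decayed into noise (a sketch, assuming the series is much longer than \(\tau\); the truncation window and function name are ours):

```python
import numpy as np

def integrated_autocorrelation(series, window):
    """Estimate tau = sum_t rho(t) from a time series of an observable,
    truncating the sum at `window` lags, beyond which rho is pure noise."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n = len(x)
    var = np.dot(x, x) / n
    tau = 1.0  # the t = 0 term of the sum is exactly 1
    for t in range(1, window):
        rho = np.dot(x[:-t], x[t:]) / ((n - t) * var)
        tau += rho
    return tau

# For uncorrelated measurements, tau should come out close to 1;
# a slowly-decorrelating Markov chain gives tau >> 1.
noise = np.random.default_rng(1).normal(size=2000)
tau = integrated_autocorrelation(noise, window=20)
```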

Now consider a simulation of the Ising model on a finite \(N \times N\) grid. The correlation length cannot be greater than \(N\) for this system. This means that the characteristic timescale for equilibration is about \[\tau \sim N^{z} = N^2\,.\]

For an algorithm like Metropolis that flips only a single spin at a time, this is a disaster. Because Metropolis is a local algorithm, a full sweep of \(N^2\) single-spin updates moves the configuration by only a constant number of lattice sites. But because of the long-range correlations, the configuration must move by about \(N\) sites before it becomes uncorrelated with where it started. This motion is roughly a random walk, so it takes about \(N^2\) sweeps to travel this distance. Thus, sampling independent spin configurations takes roughly \(N^4\) single-spin updates for a local algorithm like Metropolis.

Cluster flip algorithms

An obvious idea is to try to flip many spins at once. However, it isn't clear that this can be done while still fulfilling the key ingredients of a successful Monte Carlo algorithm: ergodicity, the Markov property, and detailed balance. It turns out that such a method does exist. It is due to Wolff, who built on initial ideas of Swendsen and Wang.

The Wolff cluster flip algorithm is the following:

  1. Choose a spin at random and remember its direction.
  2. For each neighboring spin, if it is parallel to the seed spin, add it to the cluster with probability \(p\).
  3. For each newly added neighbor, repeat 2 recursively until the cluster stops growing.
  4. Flip all the spins in the cluster.
  5. Repeat.

For the very special value of \(p = 1-\e^{-2J/kT}\), the Wolff algorithm satisfies the detailed balance condition with Boltzmann weights. At the level of single clusters, it is Markovian. And any spin can flip, so it is ergodic. Therefore, it has the Boltzmann probability distribution as its unique stationary distribution.
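The steps above can be sketched in a few lines (a minimal Python implementation on a periodic \(N \times N\) lattice; the function name and data layout are ours, and no attempt is made at efficiency):

```python
import math
import random

def wolff_step(spins, J, kT, rng=random):
    """One Wolff cluster update on an N x N periodic lattice of +/-1 spins
    (a list of lists). Returns the size of the flipped cluster."""
    N = len(spins)
    p_add = 1.0 - math.exp(-2.0 * J / kT)  # the special bond probability
    i, j = rng.randrange(N), rng.randrange(N)  # step 1: random seed spin
    seed = spins[i][j]
    cluster = {(i, j)}
    frontier = [(i, j)]
    while frontier:  # steps 2-3: grow the cluster bond by bond
        x, y = frontier.pop()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nx % N, ny % N  # periodic boundary conditions
            if (nx, ny) not in cluster and spins[nx][ny] == seed:
                if rng.random() < p_add:
                    cluster.add((nx, ny))
                    frontier.append((nx, ny))
    for x, y in cluster:  # step 4: flip the whole cluster at once
        spins[x][y] = -seed
    return len(cluster)
```

Note how at low temperature \(p \to 1\), so an entire aligned domain is flipped in a single move, which is exactly the kind of update a local algorithm cannot make.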

The advantage of the Wolff algorithm is that the correlation time is much shorter near the critical point. Because the Wolff algorithm makes nonlocal moves, it can change to a statistically independent configuration after very few moves.

More fun with simulations

Here are some more exercises for you to explore, including some that we did with Metropolis sampling. Download the latest version of ising.zip and try to answer the questions below.

Perfect sampling and coupling from the past

There is another method to deal with the long autocorrelation times for certain families of critical models, including the ferromagnetic Ising model. Ideally, we would like to obtain a "perfect" sample from the stationary distribution, but this would seem to involve running our Markov chain for an infinite amount of time. A beautiful idea due to Propp and Wilson showed that we can in fact sample perfectly if we run our Markov chain starting in the distant past and go back to the future!

This sounds fantastical at first, and maybe even nonsensical. But we can break it down into a few simple ideas, each of which is well motivated, and together the Propp-Wilson algorithm for perfect sampling emerges naturally.