Ising model and its mean field approximation

The Ising model is a mathematical model that explains the physical behavior of magnets. It has been instrumental in understanding phase transitions.

Consider an Ising model with $N$ spins. The $i$-th spin is $S_i$, and each spin can take only one of two states: $+1$ or $-1$. $B$ is the external magnetic field and $J$ is the interaction strength between two neighboring spins. The total energy is:

$$E = \sum_i B S_i + J \sum_{\langle ij \rangle} S_i S_j \tag{1}$$

The second sum runs over nearest-neighbor pairs of spins $i$ and $j$, each pair counted once. The first term is the interaction between the external field $B$ and the individual spins, which is easy to deal with. The second term is hard: it is the interaction between the spins, a product of two unknowns. In other words, the second term couples the spins to each other.
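As a concrete illustration of Eq. (1), the following Python sketch evaluates the energy of a single spin configuration. It assumes a one-dimensional ring of spins (so every spin has two neighbors) and counts each nearest-neighbor pair once. The helper names `ring_pairs` and `ising_energy` and the values of `B` and `J` are illustrative choices, not part of the original text.

```python
def ring_pairs(n):
    """Nearest-neighbor pairs on a ring of n spins (n >= 3), each pair listed once."""
    return [(i, (i + 1) % n) for i in range(n)]


def ising_energy(spins, pairs, B, J):
    """Energy as written in Eq. (1): field term plus nearest-neighbor coupling term."""
    field_term = B * sum(spins)                                     # B * sum_i S_i
    coupling_term = J * sum(spins[i] * spins[j] for i, j in pairs)  # J * sum_<ij> S_i S_j
    return field_term + coupling_term


spins = [1, 1, -1, 1]            # one of the 2**4 possible configurations
pairs = ring_pairs(len(spins))
energy = ising_energy(spins, pairs, B=0.5, J=1.0)
print(energy)                    # prints 1.0
```

The coupling term is the part that prevents the sum over configurations from factorizing spin by spin, which is what the mean field argument addresses.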

To simplify the problem, we can imagine that each spin $S_i$ feels an average spin $\bar{S}$ coming from each of its neighbors, much like the external field $B$. By doing that, the second term of Eq. (1) depends on a single spin instead of two. It is now just like the first term, involving a single summation.

The total energy Eq.(1) becomes

$$E_{MF} = \sum_i B S_i + J \sum_i S_i \, z\bar{S} = (B + zJ\bar{S}) \sum_i S_i \tag{2}$$

We can define the mean field

$$\Delta B = zJ\bar{S} \tag{3}$$

where $z$ is the number of nearest neighbors (e.g. 4 for a square lattice).

Note that although the mean field energy Eq. (2) is simpler than Eq. (1), it is still a function of all the spins, i.e. $E_{MF}(s_1, s_2, \ldots, s_N)$. The difference is that the energy is now a linear function of the spins.
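The linearity of Eq. (2) can be sketched in a few lines of Python. The function name `mean_field_energy` and the parameter values are illustrative assumptions. The point is that flipping one spin changes the mean field energy by the same amount regardless of what the other spins are doing.

```python
def mean_field_energy(spins, B, J, z, s_bar):
    """Mean field energy of Eq. (2): every spin sees the same effective field B + delta_B."""
    delta_B = z * J * s_bar              # mean field of Eq. (3)
    return (B + delta_B) * sum(spins)


B, J, z, s_bar = 0.5, 1.0, 4, 0.25       # illustrative values; z = 4 for a square lattice
before = mean_field_energy([1, 1, -1, 1], B, J, z, s_bar)
after = mean_field_energy([1, 1, 1, 1], B, J, z, s_bar)   # third spin flipped to +1

# Flipping one spin from -1 to +1 changes the energy by 2 * (B + delta_B),
# independent of the state of the other spins.
print(after - before, 2 * (B + z * J * s_bar))
```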

Deriving the mean field

Gibbs-Bogoliubov-Feynman inequality

The mean field approximation is a way to simplify the partition function so that it can be treated analytically. In this section, we will show that the mean field approximation yields a lower bound on the partition function. (The partition function is defined as the sum of the Boltzmann weights, i.e. the unnormalized probabilities, of all possible states. It is an important expression in statistical mechanics: from it we can calculate many physical quantities, such as the average spin.)

The partition function of the Ising model is

$$Z = \sum_{s_1, s_2, \ldots, s_N} \exp[-\beta E(s_1, s_2, \ldots, s_N)]$$

where $\beta = 1/(k_B T)$. The sum is over all spin configurations, i.e. $s_1 = \pm 1$, $s_2 = \pm 1$, etc.
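For a handful of spins the sum can be carried out by brute force, which makes the definition concrete. The sketch below assumes a ring of `n` spins with each nearest-neighbor pair counted once and the Boltzmann weight $\exp(-\beta E)$; the function names and parameter values are illustrative.

```python
import itertools
import math


def ring_pairs(n):
    """Nearest-neighbor pairs on a ring of n spins (n >= 3), each pair listed once."""
    return [(i, (i + 1) % n) for i in range(n)]


def ising_energy(spins, pairs, B, J):
    """Energy as written in Eq. (1)."""
    return B * sum(spins) + J * sum(spins[i] * spins[j] for i, j in pairs)


def partition_function(n, B, J, beta):
    """Sum exp(-beta * E) over all 2**n spin configurations of the ring."""
    pairs = ring_pairs(n)
    return sum(
        math.exp(-beta * ising_energy(s, pairs, B, J))
        for s in itertools.product((-1, 1), repeat=n)
    )


Z = partition_function(n=8, B=0.3, J=0.5, beta=1.0)
print(Z)   # 2**8 = 256 terms; the number of terms doubles with every added spin
```

Without coupling (`J = 0`) the sum factorizes into $(2\cosh\beta B)^N$, which is a convenient sanity check for the enumeration.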

This partition function is hard to evaluate. We can try to simplify it to make it computationally tractable. One way to do that is to replace the energy $E$ with its mean field value.

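Let $E = E_{MF} + \Delta E$, where $\Delta E$ is the difference between the energy and the mean field energy $E_{MF}$. The partition function can be rewritten as

$$Z = \sum_{s_1, \ldots, s_N} \exp[-\beta(E_{MF} + \Delta E)] = Z_{MF} \frac{\sum_{s_1, \ldots, s_N} \exp[-\beta(E_{MF} + \Delta E)]}{Z_{MF}} = Z_{MF} \langle \exp(-\beta \Delta E) \rangle_{MF}$$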

Here $Z_{MF} = \sum_{s_1, \ldots, s_N} \exp(-\beta E_{MF})$ is the mean field partition function, and $\langle \cdot \rangle_{MF}$ denotes the expectation value weighted by the mean field probabilities $\exp(-\beta E_{MF})/Z_{MF}$. The expectation value is still hard to evaluate. We can move the expectation operator into the exponent of the exponential function by

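$$\langle \exp(-\beta \Delta E) \rangle_{MF} = \exp(-\beta \langle \Delta E \rangle_{MF}) \, \langle \exp(-\beta(\Delta E - \langle \Delta E \rangle_{MF})) \rangle_{MF} \ge \exp(-\beta \langle \Delta E \rangle_{MF})$$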

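where we used

  • $\exp(x) \ge 1 + x$ (you can see this as a first-order Taylor expansion or as Jensen’s inequality).
  • $\exp(-\beta \langle \Delta E \rangle_{MF})$ is just a number and is not subject to the average.
  • The second exponential is bounded below by 1 after the first-order expansion $\exp(x) \ge 1 + x$, because the average of $\Delta E - \langle \Delta E \rangle_{MF}$ is zero.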

The result is the Gibbs-Bogoliubov-Feynman inequality:

$$Z \ge Z_{MF} \exp(-\beta \langle \Delta E \rangle_{MF}) \tag{10}$$

We can use the right-hand side in place of $Z$. The trade-off is that we now have a lower bound instead of the actual partition function, which introduces a systematic (biased) error. But it makes the function computable, so we accept the trade.
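The inequality can be checked numerically on a small system, where both sides are computable by enumeration. The sketch below assumes a ring of `n` spins, the energy of Eq. (1), the Boltzmann weight $\exp(-\beta E)$, and treats the mean field `delta_B` as a free trial parameter. The function name `check_gbf` and all parameter values are illustrative.

```python
import itertools
import math


def check_gbf(n, B, J, beta, delta_B):
    """Return (Z, lower_bound) for a ring of n spins and a trial mean field delta_B."""
    pairs = [(i, (i + 1) % n) for i in range(n)]
    Z = 0.0             # exact partition function
    Z_mf = 0.0          # mean field partition function
    dE_weighted = 0.0   # accumulates exp(-beta * E_MF) * (E - E_MF)
    for s in itertools.product((-1, 1), repeat=n):
        E = B * sum(s) + J * sum(s[i] * s[j] for i, j in pairs)
        E_mf = (B + delta_B) * sum(s)
        w_mf = math.exp(-beta * E_mf)
        Z += math.exp(-beta * E)
        Z_mf += w_mf
        dE_weighted += w_mf * (E - E_mf)
    mean_dE = dE_weighted / Z_mf                    # <Delta E>_MF
    return Z, Z_mf * math.exp(-beta * mean_dE)


for delta_B in (-1.0, -0.2, 0.0, 0.7):
    Z, bound = check_gbf(n=6, B=0.3, J=0.5, beta=1.0, delta_B=delta_B)
    print(delta_B, Z, bound, Z >= bound)
```

The bound holds for every trial value of `delta_B`; some values simply give a tighter bound than others, which is what the variational argument later exploits.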

Explicit expression of the lower bound

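We can further calculate $\langle \Delta E \rangle_{MF}$ for the Ising model.

$$\langle \Delta E \rangle_{MF} = \langle E \rangle_{MF} - \langle E_{MF} \rangle_{MF}$$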

Recall

$$E = \sum_i B S_i + J \sum_{\langle ij \rangle} S_i S_j$$

The mean field energy is

$$E_{MF} = (B + zJ\bar{S}) \sum_i S_i$$

So

$$\langle E_{MF} \rangle_{MF} = N(B + \Delta B)\bar{S}$$

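For the energy $E$, the factor $1/2$ in the interaction term accounts for double counting: each of the $N$ spins has $z$ neighbors, but every bond is shared by two spins, so there are $Nz/2$ distinct pairs.

$$\langle E \rangle_{MF} = \sum_i B \bar{S}_i + J \sum_{\langle ij \rangle} \bar{S}_i \bar{S}_j = N\left(B\bar{S} + \frac{J}{2} z \bar{S}^2\right)$$

Here the overbar denotes the average over the mean field distribution. Note that we can write $\overline{S_i S_j} = \bar{S}_i \bar{S}_j$ because the mean field energy that we average over is a linear sum of the spins, so the mean field distribution factorizes into independent single-spin distributions.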

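$$\langle \Delta E \rangle_{MF} = \langle E \rangle_{MF} - \langle E_{MF} \rangle_{MF} = N\left(\frac{J}{2} z \bar{S}^2 - \Delta B \bar{S}\right)$$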

Note that the mean spin $\bar{S}$ is a function of the mean field $\Delta B$, because the field is created by the spins self-consistently.

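Explicitly, the lower bound of the partition function is:

$$Z \ge Z_{MF} \exp(-\beta \langle \Delta E \rangle_{MF}) = Z_{MF} \exp\left(-\beta N\left(\frac{J}{2} z \bar{S}^2 - \Delta B \bar{S}\right)\right)$$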

Note that $Z_{MF}$ and $\bar{S}$ depend on $\Delta B$.
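The explicit bound can be evaluated in closed form, because under the mean field energy the spins are independent. The sketch below is written under stated assumptions rather than taken from the original text: with the sign convention of Eq. (1) and the Boltzmann weight $\exp(-\beta E)$, a single spin in an effective field $h = B + \Delta B$ has partition function $2\cosh(\beta h)$ and mean $-\tanh(\beta h)$. The function name `log_lower_bound` and the parameter values are illustrative.

```python
import math


def log_lower_bound(n, B, J, z, beta, delta_B):
    """Log of Z_MF * exp(-beta * <Delta E>_MF) for n spins with z neighbors each."""
    h = B + delta_B                                       # effective field on every spin
    log_Z_mf = n * math.log(2.0 * math.cosh(beta * h))    # n independent spins
    s_bar = -math.tanh(beta * h)                          # mean spin under weight exp(-beta * h * S)
    mean_dE = n * (0.5 * J * z * s_bar**2 - delta_B * s_bar)
    return log_Z_mf - beta * mean_dE


# Ring of 6 spins (z = 2) with a trial mean field of -0.2.
print(log_lower_bound(n=6, B=0.3, J=0.5, z=2, beta=1.0, delta_B=-0.2))
```

Working with the logarithm avoids overflow for large `n` and is the quantity that enters the free energy directly.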

Variational mean field

What is the physical interpretation of the mean field Eq. (3)? In this section, we show that it is the field that minimizes the free energy. That makes sense, because otherwise it would not be the mean value we observe!

The Helmholtz free energy is given by

$$H = -k_B T \ln Z$$

We can find the mean field $\Delta B$ by minimizing the free energy $H$. Instead of working with $H$ directly, we minimize its upper bound (which follows from the lower bound of the partition function $Z$):

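$$\begin{aligned} H_{UB} &= -k_B T \ln\left[Z_{MF} \exp\left(-\beta N\left(\frac{J}{2} z \bar{S}^2 - \Delta B \bar{S}\right)\right)\right] \\ &= -k_B T \ln Z_{MF} + N \frac{J}{2} z \bar{S}^2 - N \Delta B \bar{S} \end{aligned}$$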

Minimizing the free energy upper bound w.r.t. the mean field $\Delta B$ (this is where the term variational comes from):

$$\frac{\partial H_{UB}}{\partial \Delta B} = 0$$

Using

$$\frac{\partial Z_{MF}}{\partial \Delta B} = -N \beta \bar{S} Z_{MF}$$

It follows:

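$$\frac{\partial H_{UB}}{\partial \Delta B} = 0 \;\Rightarrow\; \frac{\partial \bar{S}}{\partial \Delta B}\left(\Delta B - Jz\bar{S}\right) = 0$$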

Therefore the mean field is

$$\Delta B = Jz\bar{S}$$

which is the same result as the mean field argument in the first section.
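The variational result can be checked numerically by scanning the trial field and locating the minimum of the upper bound. The sketch below is an illustration under assumptions not stated in the original text: it works per spin, uses the closed forms $Z_{MF} = (2\cosh\beta h)^N$ and $\bar{S} = -\tanh\beta h$ with $h = B + \Delta B$ (which follow from the sign convention of Eq. (1) with weight $\exp(-\beta E)$), and picks parameter values for which the minimum is unique. The function names are hypothetical.

```python
import math


def free_energy_upper_bound(delta_B, B, J, z, beta):
    """Per-spin H_UB / N as a function of the trial mean field (k_B T = 1 / beta)."""
    h = B + delta_B
    s_bar = -math.tanh(beta * h)     # mean spin of independent spins under exp(-beta * h * S)
    return (-math.log(2.0 * math.cosh(beta * h)) / beta
            + 0.5 * J * z * s_bar**2 - delta_B * s_bar)


def minimize_on_grid(B, J, z, beta, points=20001):
    """Scan delta_B over [-z|J|, z|J|]; z * J * s_bar must lie there since |s_bar| <= 1."""
    half_width = z * abs(J)
    grid = [-half_width + 2.0 * half_width * k / (points - 1) for k in range(points)]
    return min(grid, key=lambda d: free_energy_upper_bound(d, B, J, z, beta))


B, J, z, beta = 0.3, 0.5, 4, 1.0     # illustrative values
best = minimize_on_grid(B, J, z, beta)
s_bar = -math.tanh(beta * (B + best))
print(best, z * J * s_bar)           # the two numbers agree to within the grid resolution
```

A grid scan is used instead of a root finder on purpose: it minimizes the bound directly, so the agreement with $zJ\bar{S}$ is an outcome of the minimization rather than something imposed.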

Connection to KL divergence

We wanted to approximate the true probability distribution

$$p(s) = \exp(-\beta E(s))/Z \tag{26}$$

with the mean field distribution

$$q(s) = \exp(-\beta E_{MF}(s))/Z_{MF}$$

where $s = (s_1, s_2, \ldots, s_N)$ is a vector that defines the configuration of all spins.

One way to measure how well $q(s)$ approximates $p(s)$ is the Kullback-Leibler divergence:

$$KL(q(s) \,\|\, p(s)) = \int ds \left[ q(s) \log \frac{q(s)}{p(s)} \right]$$

Putting in p(s) and q(s) gives

$$KL(q(s) \,\|\, p(s)) = \int ds \, \frac{\exp(-\beta E_{MF})}{Z_{MF}} \left( \beta (E - E_{MF}) + \log \frac{Z}{Z_{MF}} \right)$$

$Z$ and $Z_{MF}$ do not depend on the integration variables and can be taken out of the integral.

$$KL(q(s) \,\|\, p(s)) = \beta \langle E - E_{MF} \rangle_{MF} + \log \frac{Z}{Z_{MF}}$$

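Since the KL divergence is always $\ge 0$, this gives

$$\log \frac{Z}{Z_{MF}} \ge -\beta \langle E - E_{MF} \rangle_{MF} \;\Rightarrow\; Z \ge Z_{MF} \exp(-\beta \langle \Delta E \rangle_{MF})$$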

which recovers the Gibbs-Bogoliubov-Feynman inequality Eq. (10). So one interpretation of the mean field approximation is that it finds a distribution that closely approximates the true distribution Eq. (26). For this reason, variational approaches to Bayesian inference problems are sometimes called mean-field theory.
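The identity between the KL divergence and the two terms of the bound can be verified by enumeration on a small system. For discrete spins the integral over $s$ is a sum over configurations. The sketch below assumes a ring of `n` spins, the energy of Eq. (1) with weight $\exp(-\beta E)$, and a trial mean field `delta_B`; the function name `kl_and_identity` and the parameter values are illustrative.

```python
import itertools
import math


def kl_and_identity(n, B, J, beta, delta_B):
    """Return KL(q||p) computed directly and via beta * <E - E_MF>_MF + log(Z / Z_MF)."""
    pairs = [(i, (i + 1) % n) for i in range(n)]
    configs = list(itertools.product((-1, 1), repeat=n))
    E = [B * sum(s) + J * sum(s[i] * s[j] for i, j in pairs) for s in configs]
    E_mf = [(B + delta_B) * sum(s) for s in configs]
    Z = sum(math.exp(-beta * e) for e in E)
    Z_mf = sum(math.exp(-beta * e) for e in E_mf)
    p = [math.exp(-beta * e) / Z for e in E]          # true distribution
    q = [math.exp(-beta * e) / Z_mf for e in E_mf]    # mean field distribution
    kl_direct = sum(qi * math.log(qi / pi) for qi, pi in zip(q, p))
    mean_diff = sum(qi * (e - em) for qi, e, em in zip(q, E, E_mf))
    kl_identity = beta * mean_diff + math.log(Z / Z_mf)
    return kl_direct, kl_identity


print(kl_and_identity(n=6, B=0.3, J=0.5, beta=1.0, delta_B=-0.2))
```

Both numbers agree up to rounding and are non-negative, so maximizing the lower bound on $Z$ is the same as minimizing the KL divergence from $q$ to $p$.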

Additional information

Mean Field Theory Solution of the Ising Model is an excellent write-up of the mean field solution to the Ising model. Variational Inference: A Review for Statisticians is an excellent review of applications of variational methods.