Mean field theory in physics and machine learning
- Ising model and its mean field approximation
- Deriving the mean field
- Connection to KL divergence
- Additional information
Ising model and its mean field approximation
An Ising model is a mathematical model that describes the physical behavior of magnets. It was instrumental in understanding phase transitions.
Consider an Ising model with $N$ spins, where the $i$-th spin is $S_i$. Each spin can take only one of two states: $+1$ or $-1$. $B$ is the external magnetic field and $J$ is the interaction strength between two neighboring spins. The total energy is:
$$E = \sum_i B S_i + J \sum_{\langle ij \rangle} S_i S_j \tag{1}$$

In the second term, the sum runs over nearest-neighbor pairs $\langle ij \rangle$. The first term is the interaction between the external field $B$ and the individual spins, which is easy to deal with. The second term is hard: it is the interaction between the spins, a product of two unknowns. In other words, the second term couples the spins.
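To make this concrete, here is a minimal Python sketch (my own illustration; the function name and lattice setup are assumptions, not from the original) that evaluates Eq. (1) for a spin configuration on a small square lattice with periodic boundaries:

```python
import numpy as np

def ising_energy(spins, B, J):
    """Total energy of Eq. (1) on a 2D square lattice with periodic boundaries.

    Uses this article's sign convention E = sum_i B*S_i + J*sum_<ij> S_i*S_j,
    counting each nearest-neighbor bond exactly once.
    spins: 2D numpy array of +1/-1 values.
    """
    field_term = B * spins.sum()
    # Count each bond once: pair every site with its right and down neighbor.
    bond_sum = (spins * np.roll(spins, -1, axis=0)).sum() \
             + (spins * np.roll(spins, -1, axis=1)).sum()
    return field_term + J * bond_sum

# Example: energy of a random 4x4 configuration.
rng = np.random.default_rng(0)
spins = rng.choice([-1, 1], size=(4, 4))
print(ising_energy(spins, B=0.1, J=1.0))
```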
To simplify the problem, we can imagine that each spin $S_i$ feels an average spin $\bar{S}$ coming from each of its neighbors, much like the external field $B$. With this replacement, the second term of Eq. (1) depends on a single spin instead of two; like the first term, it now involves only a single summation.
The total energy Eq. (1) becomes
$$E_{\mathrm{MF}} = \sum_i B S_i + J \sum_i z \bar{S} S_i = (B + zJ\bar{S})\sum_i S_i \tag{2}$$

We can define the mean field
$$\Delta B = zJ\bar{S} \tag{3}$$

where $z$ is the number of nearest neighbors (e.g. 4 for a square lattice).
Note that although the mean field energy Eq. (2) is simpler than Eq. (1), it is still a function of all the spins, i.e. $E_{\mathrm{MF}}(s_1, s_2, \ldots, s_N)$; the difference is that the energy is now a linear function of the spins.
Deriving the mean field
Gibbs-Bogoliubov-Feynman inequality
The mean field approximation is a way to simplify the partition function and make it amenable to analytical treatment. In this section, we will show that the mean field approximation yields a lower bound on the partition function. (The partition function is defined as the sum of the Boltzmann factors of all possible states. It is an important quantity in statistical mechanics; from it we can calculate many physical quantities, such as the average spin.)
The partition function of the Ising model is

$$Z = \sum_{s_1, s_2, \ldots, s_N} \exp[-\beta E(s_1, s_2, \ldots, s_N)]$$

The sum runs over all spin configurations, i.e. $s_1 = \pm 1$, $s_2 = \pm 1$, etc.
This partition function is hard to evaluate: the sum runs over $2^N$ configurations. We can try to simplify it to make it computationally tractable. One way to do so is to replace the energy $E$ with the simpler mean field energy $E_{\mathrm{MF}}$.
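For a tiny lattice we can still evaluate $Z$ by brute force, which provides a ground truth to compare the approximation against. A sketch (reusing `ising_energy` from the first section; the $2^N$ enumeration is exactly what becomes intractable for large $N$):

```python
import itertools
import numpy as np

def exact_partition_function(shape, B, J, beta):
    """Brute-force Z by summing exp(-beta*E) over all 2^N spin configurations."""
    N = shape[0] * shape[1]
    Z = 0.0
    for config in itertools.product([-1, 1], repeat=N):
        spins = np.array(config).reshape(shape)
        Z += np.exp(-beta * ising_energy(spins, B, J))
    return Z

# 3x3 lattice: 2^9 = 512 configurations, already painful to scale further.
print(exact_partition_function((3, 3), B=0.1, J=1.0, beta=0.5))
```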
Let $E = E_{\mathrm{MF}} + \Delta E$, where $\Delta E$ is the energy difference from the mean field energy $E_{\mathrm{MF}}$. The partition function can be rewritten as
$$Z = \sum \exp[-\beta(E_{\mathrm{MF}} + \Delta E)] = Z_{\mathrm{MF}} \sum \frac{\exp(-\beta E_{\mathrm{MF}})}{Z_{\mathrm{MF}}}\exp(-\beta\Delta E) = Z_{\mathrm{MF}}\,\langle \exp(-\beta \Delta E) \rangle_{\mathrm{MF}}$$

Here $Z_{\mathrm{MF}} = \sum \exp(-\beta E_{\mathrm{MF}})$ is the mean-field partition function and $\langle \cdot \rangle_{\mathrm{MF}}$ denotes an expectation value weighted by the mean field probabilities $\exp(-\beta E_{\mathrm{MF}})/Z_{\mathrm{MF}}$. The expectation value is still hard to evaluate. We can move the expectation operator into the exponent:
$$\langle \exp(-\beta\Delta E) \rangle = \left\langle \exp(-\beta\langle\Delta E\rangle)\,\exp\!\big(-\beta(\Delta E - \langle\Delta E\rangle)\big) \right\rangle \ge \exp(-\beta\langle\Delta E\rangle)$$

where we used:
- $e^x \ge 1 + x$ (this follows from a first-order Taylor expansion, or from Jensen's inequality)
- $\exp(-\beta\langle\Delta E\rangle)$ is just a number and is not subject to the average, so it can be pulled out of the expectation.
- The remaining factor satisfies $\langle \exp(-\beta(\Delta E - \langle\Delta E\rangle))\rangle \ge 1 - \beta\langle\Delta E - \langle\Delta E\rangle\rangle = 1$, because the deviation $\Delta E - \langle\Delta E\rangle$ averages to zero.
The result is the Gibbs-Bogoliubov-Feynman inequality:
$$Z \ge Z_{\mathrm{MF}}\exp(-\beta\langle\Delta E\rangle_{\mathrm{MF}})$$

We can use the right-hand side instead of $Z$. The trade-off is that we now have a lower bound rather than the actual partition function, which introduces a biased error. But this makes the quantity computable, so we take the trade.
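We can verify the inequality numerically on a lattice small enough for exact enumeration. The sketch below (again my own, reusing `ising_energy` from earlier) computes both sides for an arbitrary trial field $\Delta B$; the bound holds for any choice of $\Delta B$:

```python
import itertools
import numpy as np

def gbf_bound_check(shape, B, J, beta, dB):
    """Return (exact Z, lower bound Z_MF*exp(-beta*<dE>_MF)) by enumeration."""
    N = shape[0] * shape[1]
    Z = Z_mf = weighted_dE = 0.0
    for config in itertools.product([-1, 1], repeat=N):
        spins = np.array(config).reshape(shape)
        E = ising_energy(spins, B, J)
        E_mf = (B + dB) * spins.sum()     # mean field energy, Eq. (2)
        w = np.exp(-beta * E_mf)          # unnormalized mean field weight
        Z += np.exp(-beta * E)
        Z_mf += w
        weighted_dE += w * (E - E_mf)
    avg_dE = weighted_dE / Z_mf           # <dE>_MF
    return Z, Z_mf * np.exp(-beta * avg_dE)

Z, bound = gbf_bound_check((3, 3), B=0.1, J=1.0, beta=0.5, dB=0.3)
print(Z >= bound, Z, bound)               # True: Z never falls below the bound
```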
Explicit expression of the lower bound
We can further calculate $\langle\Delta E\rangle$ for the Ising model:

$$\Delta E = E - E_{\mathrm{MF}}$$

Recall
$$E = \sum_i B S_i + J \sum_{\langle ij \rangle} S_i S_j$$

The mean field energy is
$$E_{\mathrm{MF}} = (B + zJ\bar{S})\sum_i S_i$$

So

$$\langle E_{\mathrm{MF}} \rangle = N(B + \Delta B)\bar{S}$$

and

$$\langle E \rangle = \sum_i B\,\overline{S_i} + J\sum_{\langle ij \rangle} \overline{S_i}\;\overline{S_j} = N\left(B\bar{S} + \frac{J}{2} z \bar{S}^2\right)$$

The factor $1/2$ in the interaction term accounts for double counting: a lattice of $N$ spins with $z$ neighbors each has $Nz/2$ bonds. Note that we can write $\overline{S_i S_j} = \overline{S_i}\;\overline{S_j}$ because the mean field energy over which we average is a linear summation of the spins, so the spins are statistically independent. Therefore

$$\langle \Delta E \rangle = \langle E \rangle - \langle E_{\mathrm{MF}} \rangle = N\left(\frac{J}{2} z \bar{S}^2 - \Delta B\,\bar{S}\right)$$

Note that the mean spin $\bar{S}$ is a function of the mean field $\Delta B$, because the field is created by the spins self-consistently.
Explicitly, the lower bound on the partition function is:

$$Z \ge Z_{\mathrm{MF}}\exp(-\beta\langle\Delta E\rangle_{\mathrm{MF}}) = Z_{\mathrm{MF}}\exp\left[-\beta N\left(\frac{J}{2} z \bar{S}^2 - \Delta B\,\bar{S}\right)\right]$$

Note that both $Z_{\mathrm{MF}}$ and $\bar{S}$ depend on $\Delta B$.
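Because $E_{\mathrm{MF}}$ is a sum over independent spins, both $Z_{\mathrm{MF}}$ and $\bar{S}$ have closed forms: $Z_{\mathrm{MF}} = [2\cosh(\beta(B+\Delta B))]^N$ and, with this article's sign convention, $\bar{S} = -\tanh(\beta(B+\Delta B))$. A sketch (my own) that evaluates the lower bound without any enumeration:

```python
import numpy as np

def mf_lower_bound(N, z, B, J, beta, dB):
    """Closed-form lower bound Z_MF * exp(-beta*N*((J/2)*z*S_bar**2 - dB*S_bar)).

    E_MF is a sum of independent single-spin terms, so Z_MF factorizes.
    S_bar = -tanh(beta*(B + dB)) follows the article's sign convention
    (the minus sign disappears with the usual E = -B*sum(S) - J*sum(S*S)).
    """
    S_bar = -np.tanh(beta * (B + dB))
    Z_mf = (2.0 * np.cosh(beta * (B + dB)))**N
    avg_dE = N * (0.5 * J * z * S_bar**2 - dB * S_bar)
    return Z_mf * np.exp(-beta * avg_dE)

# Same parameters as the brute-force check above (3x3 periodic lattice, z = 4):
print(mf_lower_bound(N=9, z=4, B=0.1, J=1.0, beta=0.5, dB=0.3))
```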
Variational mean field
What is the physical interpretation of the mean field, Eq. (3)? In this section, we show that it is the field that minimizes the free energy. That makes sense, because otherwise it would not be the mean observed value!
The Helmholtz free energy is given by

$$H = -k_B T \ln Z$$

We can find the mean field $\Delta B$ by minimizing the free energy $H$. Instead of working with $H$ directly, we minimize its upper bound (which corresponds to the lower bound of the partition function $Z$):
$$H_{\mathrm{UB}} = -k_B T \ln\left[Z_{\mathrm{MF}}\exp\left(-\beta N\left(\frac{J}{2} z \bar{S}^2 - \Delta B\,\bar{S}\right)\right)\right] = -k_B T \ln Z_{\mathrm{MF}} + N\frac{J}{2} z \bar{S}^2 - N\Delta B\,\bar{S}$$

Minimizing the free energy upper bound with respect to the mean field $\Delta B$ (this is where the term "variational" comes from):
$$\frac{\partial H_{\mathrm{UB}}}{\partial \Delta B} = 0$$

Using
$$\frac{\partial Z_{\mathrm{MF}}}{\partial \Delta B} = -N\beta\bar{S}\,Z_{\mathrm{MF}}$$

it follows that
$$\frac{\partial H_{\mathrm{UB}}}{\partial \Delta B} = N\,\frac{\partial \bar{S}}{\partial \Delta B}\,(Jz\bar{S} - \Delta B) = 0$$

Therefore (assuming $\partial\bar{S}/\partial\Delta B \neq 0$), the mean field is
$$\Delta B = Jz\bar{S}$$

the same result as the mean field argument in the first section.
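Numerically, the self-consistent pair $(\bar{S}, \Delta B)$ can be found by fixed-point iteration. A minimal sketch (my own; the single-spin mean $\bar{S} = -\tanh(\beta(B + \Delta B))$ again follows this article's sign convention):

```python
import numpy as np

def solve_mean_field(z, B, J, beta, S0=0.1, damping=0.5, tol=1e-12, max_iter=100_000):
    """Iterate S_bar = -tanh(beta*(B + z*J*S_bar)) to a fixed point.

    Damping (mixing old and new iterates) keeps the iteration from
    oscillating when beta*z*J is large.
    Returns (S_bar, dB) with dB = z*J*S_bar, per the result just derived.
    """
    S = S0
    for _ in range(max_iter):
        S_new = -np.tanh(beta * (B + z * J * S))
        if abs(S_new - S) < tol:
            break
        S = (1 - damping) * S + damping * S_new
    return S, z * J * S

S_bar, dB = solve_mean_field(z=4, B=0.1, J=1.0, beta=0.5)
print(S_bar, dB)
```

Feeding the returned `dB` back into `mf_lower_bound` from the previous sketch evaluates the bound at the stationary point found above. With the more common convention $E = -B\sum_i S_i - J\sum_{\langle ij\rangle} S_i S_j$, the same iteration gives the familiar self-consistency equation $\bar{S} = \tanh(\beta(B + zJ\bar{S}))$, whose nonzero solutions at $B = 0$ below $k_B T_c = zJ$ are the mean-field signature of the ferromagnetic phase transition.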
Connection to KL divergence
We want to approximate the true probability distribution

$$p(s) = \frac{\exp(-\beta E(s))}{Z}$$

with the mean field distribution

$$q(s) = \frac{\exp(-\beta E_{\mathrm{MF}}(s))}{Z_{\mathrm{MF}}}$$

where $s = (s_1, s_2, \ldots, s_N)$ is a vector specifying the configuration of all spins.
A way to measure how well $q(s)$ approximates $p(s)$ is the Kullback-Leibler divergence:
$$\mathrm{KL}(q(s)\,\|\,p(s)) = \int ds\; q(s)\log\frac{q(s)}{p(s)}$$

(For discrete spins, the integral is shorthand for a sum over all $2^N$ configurations.)
Substituting $p(s)$ and $q(s)$ gives
$$\mathrm{KL}(q(s)\,\|\,p(s)) = \int ds\; \frac{\exp(-\beta E_{\mathrm{MF}})}{Z_{\mathrm{MF}}}\left(\beta(E - E_{\mathrm{MF}}) + \log\frac{Z}{Z_{\mathrm{MF}}}\right)$$
$Z$ and $Z_{\mathrm{MF}}$ do not depend on the integration variables and can be taken out of the integral:
$$\mathrm{KL}(q(s)\,\|\,p(s)) = \beta\langle E - E_{\mathrm{MF}}\rangle_{\mathrm{MF}} + \log\frac{Z}{Z_{\mathrm{MF}}}$$
Since the KL divergence is always $\ge 0$, this gives

$$\log\frac{Z}{Z_{\mathrm{MF}}} \ge -\beta\langle E - E_{\mathrm{MF}}\rangle_{\mathrm{MF}} \quad\Longrightarrow\quad Z \ge Z_{\mathrm{MF}}\exp(-\beta\langle\Delta E\rangle_{\mathrm{MF}})$$

which recovers the Gibbs-Bogoliubov-Feynman inequality. So one interpretation of the mean field approximation is that it finds a tractable distribution that closely approximates the true distribution $p(s)$. For this reason, variational approaches to Bayesian inference problems are sometimes called mean-field methods.
Additional information
Mean Field Theory Solution of the Ising Model is an excellent write-up of the mean field solution to the Ising model. Variational Inference: A Review for Statisticians is an excellent review of applications of variational methods.