Statistical Mechanics

Uwe-Jens Wiese

Albert Einstein Center for Fundamental Physics
Institute for Theoretical Physics
Bern University
December 23, 2010
Contents

1 Introduction

2 Kinetic Theory of the Classical Ideal Gas
  2.1 Atoms and Molecules
  2.2 Pressure and Temperature of the Ideal Gas

3 Microcanonical and Canonical Ensemble
  3.1 The Hamilton Function
  3.2 The Concept of an Ensemble
  3.3 The Microcanonical Ensemble
  3.4 The Canonical Ensemble
  3.5 Particle on an Energy Ladder
  3.6 Model for a Heat Bath
  3.7 Canonical Ensemble for Particles on a Ladder
  3.8 Microcanonical Ensemble for Particles on a Ladder

4 Information and Entropy
  4.1 Information and Information Deficit
  4.2 The Concept of Entropy
  4.3 Entropy and Free Energy in the Canonical Ensemble
  4.4 Entropy of Particles on a Ladder
  4.5 The Principle of Maximum Entropy
  4.6 The Arrow of Time

5 Canonical Ensemble for the Ideal Gas
  5.1 The Maxwell-Boltzmann Distribution
  5.2 Ideal Gas in a Gravitational Field
  5.3 Distinguishability of Classical Particles
  5.4 The Entropy of the Classical Ideal Gas
  5.5 Gibbs' Paradox
  5.6 Mixing Entropy

6 Grand Canonical Ensemble
  6.1 Introduction of the Grand Canonical Ensemble
  6.2 Grand Canonical Ensemble of Particles on a Ladder
  6.3 Chemical Potential of Particles on a Ladder
  6.4 Chemical Potential of the Classical Ideal Gas
  6.5 Grand Canonical Ensemble for the Ideal Gas

7 Pressure Ensemble
  7.1 Introduction of the Pressure Ensemble
  7.2 The Pressure of the Classical Ideal Gas
  7.3 The Pressure Ensemble for the Classical Ideal Gas
  7.4 Overview of Different Ensembles

8 Equilibrium Thermodynamics
  8.1 The First Law of Thermodynamics
  8.2 Expansion of a Classical Ideal Gas
  8.3 Heat and Entropy Change
  8.4 Equations of State
  8.5 Thermodynamic Coefficients

9 Nonequilibrium Thermodynamics
  9.1 Extremality of the Entropy in Equilibrium
  9.2 Time Evolution and Poisson Brackets
  9.3 Conservation of Probability
  9.4 Conservation of Entropy
  9.5 A Model with Entropy Increase
  9.6 A Model for Diffusion
  9.7 Approach to Equilibrium

10 The Ising Model
  10.1 Definition and Basic Properties
  10.2 Mean Field Theory
  10.3 Exact Results for the 1-dimensional Ising Model
  10.4 Exact Results for the 2-dimensional Ising Model
  10.5 Cluster Representation

11 The Monte Carlo Method
  11.1 The Concept of a Markov Chain
  11.2 Ergodicity and Detailed Balance
  11.3 The Metropolis Algorithm
  11.4 Error Analysis
  11.5 The Swendsen-Wang Cluster Algorithm
  11.6 The Wolff Cluster Algorithm

12 Quantum Statistical Mechanics
  12.1 Canonical Ensemble in Quantum Statistics
  12.2 Canonical Ensemble for the Harmonic Oscillator

13 Hot Quantum Gases in the Early Universe
  13.1 Bose-Einstein Statistics and Background Radiation
  13.2 Thermodynamical Distributions
  13.3 Entropy Conservation and Neutrino Temperature

14 Lattice Vibrations
  14.1 A 1-dimensional Model for Ions Forming a Crystal
  14.2 Phonon Creation and Annihilation Operators
  14.3 Phonons in One Dimension
  14.4 From Particles to "Wavicles"
  14.5 Specific Heat of a 1-dimensional "Solid"
  14.6 Fluctuations in a 1-dimensional "Solid"
  14.7 A 3-dimensional Model for Ions in a Crystal
  14.8 Specific Heat of 3-dimensional Solids

15 Electrons in Solids
  15.1 Electron Creation and Annihilation Operators
  15.2 A Model for Electrons Hopping on a Lattice
  15.3 Grand Canonical Ensemble and Fermi Surface
  15.4 Electrons and the Specific Heat of Metals
  15.5 Repulsive Hubbard Model at Half-Filling

16 Magnons in Ferro- and Antiferromagnets
  16.1 Antiferromagnetic Heisenberg Model
  16.2 Ferromagnetic Heisenberg Model
  16.3 Magnon Dispersion Relation
  16.4 Specific Heat of a Ferromagnet
Chapter 1
Introduction
Macroscopic amounts of matter manifest themselves in a vast variety of different phases. Besides gases, liquids, and solids, there are Bose-Einstein condensates, gels, liquid crystals, superfluids, and superconductors (including both the traditional metallic low-temperature superconductors and the more recently discovered ceramic high-temperature superconductors). There are other even more exotic forms of condensed matter such as nuclear matter inside dense neutron stars, and perhaps quark matter deep inside their cores, the quark-gluon plasma that existed during the first microsecond after the big bang, and the plasma of other elementary particles that filled the early universe.
The fundamental constituents of all these forms of matter are quite well understood in the standard model of particle physics: they are the quarks that are bound inside protons and neutrons, which in turn form the atomic nucleus, the electrons that surround the nucleus, as well as more exotic elementary particles such as the short-lived muons or the very weakly interacting neutrinos. Also the fundamental forces are quite well understood: the strong interactions that bind quarks inside protons and neutrons are mediated by gluons, the weak interactions responsible e.g. for radioactive decay are mediated by heavy W and Z bosons, electromagnetic interactions are mediated by massless photons, and then there is gravity, the weakest force in the universe.
Although the basic constituents of matter as well as the fundamental forces between them are well understood, it is in general extremely difficult to derive the complex properties of condensed matter from first principles. The enormous number of basic degrees of freedom participating in the dynamics of condensed matter makes it practically impossible to understand them in detail from first principles. For example, a gas of N particles moving in 3-dimensional space has 3N microscopic degrees of freedom. A macroscopic sample of gas easily contains Avogadro's number of particles, N_A ≈ 6 × 10^23, an enormous number of degrees of freedom. If we know all forces that the gas particles exert on each other, and if we know the initial positions and velocities of all gas particles, then, e.g., classical mechanics can be used to describe the gas. The time evolution of the gas particles then results as the solution of Newton's equations, a system of 3N coupled ordinary second-order differential equations. In practice, it is impossible to solve such a gigantic system of equations. Already the determination of the initial conditions alone is practically impossible. Even if we were able to overcome all these problems, the solution of Newton's equations would provide us with a set of 3N functions describing the particle coordinates as functions of time. Hence, although the fundamental theories of physics, classical or quantum mechanics, in principle allow us to deal explicitly with all microscopic degrees of freedom, in practice this is neither possible nor particularly useful. Who wants to know exactly how each individual degree of freedom evolves as a function of time?
It is much more appropriate to describe the gas by macroscopic quantities that are averages over many microscopic degrees of freedom. Such averaged quantities are, for example, the energy or the particle number density. Remarkably, we don't need to know in detail how the system works microscopically before we can make statements about those macroscopic quantities. For example, in the nineteenth century physicists knew much less about the basic constituents of matter than we do today, and still they were able to develop the very successful theory of thermodynamics. The large number of particles in a gas undergoes chaotic motion. The kinetic energy of this motion manifests itself as heat. To a large extent, thermodynamics is the science of heat. No matter how complicated a system is microscopically, it will always obey energy conservation. Consequently, heat appears as a form of energy in the first law of thermodynamics, which just represents energy conservation.
While it is easy to convert mechanical energy into heat (e.g. by using the friction between a rotating stick and a wooden plate), it is impossible to completely convert heat back into mechanical energy (e.g. by heating a stick and a wooden plate and waiting for the stick to start rotating). A machine that would completely convert heat into mechanical energy is known as a perpetuum mobile of the second kind. Although it would obey energy conservation (and hence the first law of thermodynamics), it would violate the second law of thermodynamics, which states that entropy (a measure of disorder) never decreases. Hence, a perpetuum mobile of the second kind cannot exist. In the nineteenth century, engineers were interested in converting heat into mechanical energy, for example, using steam engines. Indeed, originally one motivation for the development of thermodynamics was the need to understand the dynamics of steam engines. Even though heat cannot be converted completely into mechanical energy, it can be converted to some extent. The most efficient engine for doing this goes through a so-called Carnot process. Remarkably, even if we did not understand how the gas works at microscopic scales (which was true for nineteenth-century engineers), we can compute the efficiency of a Carnot machine.
Thermodynamics is limited to understanding the most basic features of a macroscopic system which follow from basic principles like energy conservation and entropy increase. Statistical mechanics is more ambitious and tries to derive thermodynamic properties from the underlying microscopic physics. At microscopic scales physics is governed by quantum mechanics. Hence, to some extent we will be dealing with quantum statistical mechanics. In general it will be difficult to derive the averaged thermodynamic quantities exactly from the microscopic dynamics. This is particularly hard if the averages of microscopic quantities still vary in space or with time. In that case, the system is not in thermodynamic equilibrium. If, on the other hand, the system is in equilibrium, its macroscopic features are characterized by a small number of averages which are themselves constant. Even then, for interacting systems it is in general impossible to derive those averages exactly from the microscopic degrees of freedom. We will hence start with simple systems like ideal, i.e. non-interacting, gases and proceed to interacting systems later.
Chapter 2
Kinetic Theory of the Classical Ideal Gas
As an introduction to the subject of statistical mechanics, let us study gases consisting of weakly interacting atoms or molecules. The atoms or molecules of a gas move more or less independently of each other. One can use an idealization to describe this situation: in an ideal gas, the atoms or molecules move independently as free particles, except during collisions, which are assumed to be completely elastic. In practice it is completely impossible (and fortunately unnecessary) to describe all the degrees of freedom (of order 10^23) in detail. It is much more practical to describe the gas particles from a statistical point of view. Their average force on the walls of a container determines the pressure of the gas, and the average kinetic energy of the particles determines the temperature. Pressure and temperature are directly measurable physical quantities which characterize the gas much better than a list of all positions and momenta of the gas particles. Pressure, temperature, and density of a gas are related to each other by the ideal gas law. When a gas is heated, its internal energy (and hence its temperature) increases, and it may also expand and thereby do work. The energy balance of a gas is summarized in the first law of thermodynamics, which reflects nothing but energy conservation. To get acquainted with gases, we will first consider their elementary constituents, atoms and molecules, and study some of their properties.
2.1 Atoms and Molecules
Atoms consist of an atomic nucleus and a number of electrons. The nucleus consists of Z positively charged protons and N neutrons and has a size of the order of 10^-14 m. The whole atom is electrically neutral because there are also Z negatively charged electrons forming a cloud surrounding the nucleus. The size of the electron cloud (and of the entire atom) is of the order of 10^-10 m. The mass of a neutron or proton is one atomic mass unit, M_n ≈ M_p = 1u = 1.66 × 10^-24 g, while the electron mass is much smaller (M_e = 9.11 × 10^-28 g). Hence, the mass of the atom is almost entirely concentrated in the nucleus and we can write it as M_A = (Z + N)u.
To understand the physics of the electron cloud as well as of the nucleus we need quantum mechanics. However, at moderate temperatures the energy is insufficient to ionize the atoms and we can treat the atoms as point-like and structureless. This is exactly what we do for a monatomic ideal gas. Of course, this is an idealization. In particular, at very high temperatures the electrons could be removed from the atomic nucleus and the system becomes a plasma (such as the matter that exists in the sun). At yet much higher temperatures even the atomic nucleus itself would dissolve into quarks and gluons and we would end up in a quark-gluon plasma, the state of matter that existed during the first microsecond after the big bang.
The simplest atom is the hydrogen atom H. It consists of one proton (the nucleus) and one electron in the cloud and has a mass M_H = 1u. The electrons are bound to the atomic nucleus by electromagnetic forces. These forces also bind atoms into molecules. For example, two hydrogen atoms may form a hydrogen molecule H_2 by sharing their electron cloud. The diatomic hydrogen molecule has a mass M_{H2} = 2u. Table 2.1 summarizes some properties of atoms and molecules.
How many water molecules are contained in 1 cm^3 of water? We know that 1 liter = 10^3 cm^3 of water weighs 1 kg. Hence, 1 cm^3 of water weighs 1 g. One water molecule weighs 18u, so the number of water molecules in 1 cm^3 of water is

    N = 1 g / 18u = 1 g / (2.98 × 10^-23 g) = 3.36 × 10^22.          (2.1.1)

Consequently, 18 g of water contain Avogadro's number

    N_A = 18 N = 6.02 × 10^23          (2.1.2)
of water molecules. An amount of matter that contains N_A basic units (atoms or molecules) of some substance is called one mole of that substance. One mole of water weighs 18 g, and in general one mole of a substance weighs the number of its basic units in grams. For example, one mole of oxygen gas (O_2) weighs 32 g, while one mole of helium gas (He) weighs 4 g. The first contains N_A oxygen molecules, the second N_A helium atoms.

    Particle             notation    Z     N     M_A
    hydrogen atom        H           1     0     1u
    helium atom          He          2     2     4u
    nitrogen atom        N           7     7     14u
    oxygen atom          O           8     8     16u
    hydrogen molecule    H_2         2     0     2u
    nitrogen molecule    N_2         14    14    28u
    oxygen molecule      O_2         16    16    32u
    water molecule       H_2O        10    8     18u

    Table 2.1: Basic properties of some atoms and molecules.
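The arithmetic in Eqs. (2.1.1) and (2.1.2) is easy to verify with a few lines of Python (a sketch; the rounded values of the text are used as inputs):

```python
# Check of Eqs. (2.1.1)-(2.1.2): molecules in 1 cm^3 of water and in one mole.
u = 1.66e-24           # atomic mass unit in grams (value used in the text)
m_water = 18 * u       # one water molecule weighs 18u
N = 1.0 / m_water      # 1 cm^3 of water weighs 1 g
N_A = 18 * N           # 18 g of water contain Avogadro's number of molecules

print(f"N   = {N:.3g}")    # close to 3.36e+22
print(f"N_A = {N_A:.3g}")  # close to 6.02e+23
```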
2.2 Pressure and Temperature of the Ideal Gas
Let us consider a container of volume V = L_x × L_y × L_z containing an ideal gas consisting of N particles (atoms or molecules) of mass M. The number density of gas particles is then given by n = N/V and the mass density is ρ = N M/V = M n. The gas particles perform a random, chaotic motion, and each has its own velocity v_a. The particles collide with each other and with the walls of the container. For an ideal gas we assume that all these collisions are completely elastic.
Let us consider the average force that the gas exerts on the walls of the container. This will lead to an expression for the pressure of the gas. When particle number a collides with the wall (perpendicular to the x-direction), its velocity v_a changes to v_a'. Since the collision is elastic we have v'_ax = -v_ax. Hence, during the collision particle a transfers an impulse J_ax = 2 M v_ax to the wall (in the perpendicular x-direction). What is the probability for particle a to hit the wall during a time interval ∆t? In order to be able to reach the wall within ∆t, the particle must be at most a distance ∆x_a = v_ax ∆t away from the wall. Since the wall has area A = L_y × L_z, the particle must be inside a volume A ∆x_a. Since the total volume is V, the probability to be within the volume A ∆x_a is A ∆x_a / V. Still, the particle will not necessarily hit the wall, even if it is within
the volume A ∆x_a. In half of the cases it will move away from the wall. Hence, the probability for particle a to hit the wall during the time ∆t is only

    (1/2) A ∆x_a / V = (1/2) A v_ax ∆t / V.          (2.2.1)
The force (impulse per time) exerted on the wall by particle a is hence given by

    F_ax = (2 M v_ax / ∆t) (1/2) (A v_ax ∆t / V) = (A/V) M v_ax^2,          (2.2.2)
and the total force exerted by all particles is

    F_x = Σ_{a=1}^{N} F_ax = (A/V) Σ_{a=1}^{N} M v_ax^2 = (A/V) N ⟨M v_x^2⟩.          (2.2.3)
We have introduced the average over all particles

    ⟨M v_x^2⟩ = (1/N) Σ_{a=1}^{N} M v_ax^2.          (2.2.4)
The force (perpendicular to the wall) per unit area of the wall is the pressure, which is hence given by

    p = F_x / A = (N/V) ⟨M v_x^2⟩ = (N/3V) ⟨M v^2⟩.          (2.2.5)
Here we have introduced the velocity squared v^2 = v_x^2 + v_y^2 + v_z^2 and we have used symmetry to argue that ⟨v_x^2⟩ = ⟨v_y^2⟩ = ⟨v_z^2⟩. Now we can write

    pV = (2/3) N ⟨(M/2) v^2⟩.          (2.2.6)
In other words, the pressure of the gas is proportional to the average kinetic energy of the gas particles ⟨(1/2) M v^2⟩. Pressure is measured in Pascal (1 Pa = 1 N/m^2) and typical atmospheric pressure is about 10^5 Pa.
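Equation (2.2.5) can be illustrated numerically. The following sketch is not part of the text: particles with Gaussian x-velocities (in arbitrary units) bounce elastically between walls at x = 0 and x = L, the impulse 2M|v_ax| per hit is accumulated on the right wall, and the resulting pressure is compared with (N/V)⟨M v_x^2⟩:

```python
import random

# Minimal kinetic-theory check: impulse delivered to one wall per unit time
# and area, versus the kinetic formula p = (N/V) <M vx^2> of Eq. (2.2.5).
random.seed(1)
N, L, M, dt, steps = 100, 1.0, 1.0, 1e-3, 20000
x = [random.uniform(0, L) for _ in range(N)]
v = [random.gauss(0, 1) for _ in range(N)]     # x-velocities only

impulse = 0.0
for _ in range(steps):
    for a in range(N):
        x[a] += v[a] * dt
        if x[a] > L:                           # elastic reflection at the right wall
            x[a], v[a] = 2 * L - x[a], -v[a]
            impulse += 2 * M * abs(v[a])       # momentum transferred to the wall
        elif x[a] < 0:                         # elastic reflection at the left wall
            x[a], v[a] = -x[a], -v[a]

A, V = L * L, L**3                             # wall area and box volume
p_measured = impulse / (A * steps * dt)
p_kinetic = (N / V) * M * sum(vi * vi for vi in v) / N
print(p_measured, p_kinetic)                   # agree within a few percent
```

Since elastic reflections only flip the sign of v_ax, the speeds (and hence ⟨v_x^2⟩) are conserved, which is why the comparison at the end is legitimate.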
The absolute temperature T of the gas is defined by

    (3/2) k_B T = ⟨(M/2) v^2⟩.          (2.2.7)

Up to the numerical factor (3/2) k_B, the temperature is just the average kinetic energy of the gas particles. The Boltzmann constant k_B is present to match the different units of temperature and energy. If we (or Boltzmann and his colleagues)
had decided to measure temperature in Joules (J), k_B could have been dropped. However, temperature is traditionally measured in degrees Kelvin (K) and

    k_B = 1.38 × 10^-23 J/K.          (2.2.8)
From its definition it is clear that the absolute temperature must be positive, i.e. T ≥ 0, because v^2 ≥ 0. Only if all gas particles are at rest do we have T = 0. This corresponds to the absolute zero of temperature (0 K). In degrees Celsius this corresponds to -273.15 °C.
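Equation (2.2.7) fixes the typical thermal speed of a gas particle, ⟨v^2⟩ = 3 k_B T / M. As a worked example (not from the text), the root-mean-square speed of nitrogen molecules at room temperature:

```python
# Thermal speed from Eq. (2.2.7): <v^2> = 3 k_B T / M, for N2 (M = 28u) at 300 K.
k_B = 1.38e-23          # J/K
u = 1.66e-27            # atomic mass unit in kg
T = 300.0               # K
M = 28 * u              # mass of an N2 molecule
v_rms = (3 * k_B * T / M) ** 0.5
print(f"v_rms = {v_rms:.0f} m/s")    # a few hundred m/s
```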
With the above definition of temperature we now obtain

    pV = N k_B T.          (2.2.9)

This is the ideal gas law. Sometimes it is also written as

    pV = 𝒩 R T,          (2.2.10)

where 𝒩 = N/N_A is the number of moles of gas, and R = k_B N_A = 8.3 J/K is the so-called gas constant.
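The ideal gas law (2.2.9) can be rearranged to give the number density n = N/V = p/(k_B T). A quick numerical example (values chosen for illustration, not taken from the text):

```python
# Number density of an ideal gas at atmospheric pressure and room temperature,
# from pV = N k_B T (Eq. (2.2.9)) rearranged to n = p/(k_B T).
k_B = 1.38e-23            # J/K
p = 1.0e5                 # Pa (typical atmospheric pressure)
T = 300.0                 # K
n = p / (k_B * T)         # particles per m^3
print(f"n = {n:.2e} per m^3")    # of order 1e25 per m^3
```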
A monatomic ideal gas has no internal degrees of freedom. In contrast to diatomic gases, the particles in a monatomic ideal gas are considered as point-like and cannot rotate or vibrate. The average energy E of a monatomic ideal gas is hence just its kinetic energy, i.e.

    E = N ⟨(M/2) v^2⟩ = (3/2) N k_B T,          (2.2.11)

and we can also write the ideal gas law as

    pV = (2/3) E.          (2.2.12)
Chapter 3
Microcanonical and Canonical Ensemble
In this chapter we introduce some powerful formalism of statistical mechanics.
First we formulate classical mechanics using the Hamilton formalism. Then we
introduce the concept of an ensemble and we discuss the microcanonical and the
canonical ensembles. To illustrate these ideas we use the model of a particle on
an energy ladder and we introduce a simple model for a heat bath.
3.1 The Hamilton Function
The previous discussion of the ideal gas was intuitive and did not use more than we know from Newtonian mechanics. It also led to the concepts of pressure and temperature. We need additional powerful concepts that will be useful also beyond ideal gases. In particular, as a preparation for quantum statistical mechanics (the quantum mechanical version of statistical mechanics) it is useful to introduce classical mechanics (and then classical statistical mechanics) in the so-called Hamilton formulation. Although perhaps less intuitive, the new formulation will turn out to be extremely useful. Later we will use the ideal gas to show explicitly that the new formalism is completely equivalent to what we derived before.
The configuration of a general classical system of particles is characterized by specifying the coordinates x_a and the momenta p_a of all the particles. The coordinates and momenta define the so-called phase space of the system. The total energy of any configuration of the system is given by the classical Hamilton function

    H[x, p] = Σ_a p_a^2 / (2 M_a) + Σ_{a>b} V(x_a - x_b).          (3.1.1)
The first term is the total kinetic energy. Here M_a is the mass of particle number a. The second term is the total potential energy, which we have assumed to be a sum of pair-potentials depending on the distance of the interacting particles. The Hamilton function describes the dynamics of the system through the Hamilton equations of motion

    dx_ai/dt = ∂H/∂p_ai,   dp_ai/dt = -∂H/∂x_ai.          (3.1.2)

For the Hamilton function from above these equations take the form

    dx_ai/dt = p_ai/M_a,   dp_ai/dt = Σ_b F_abi.          (3.1.3)
We conclude that

    p_a = M_a dx_a/dt,          (3.1.4)

i.e. momentum is the product of mass and velocity. Furthermore,

    F_abi = -∂V(x_a - x_b)/∂x_ai          (3.1.5)
is the force that particle b exerts on particle a. Taking the time derivative of the momentum and combining both equations we obtain

    M_a d^2x_a/dt^2 = dp_a/dt = Σ_b F_ab.          (3.1.6)

This is nothing but Newton's equation, which is hence completely equivalent to the pair of Hamilton equations of motion.
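The equivalence can be made concrete with a small numerical sketch (not from the text; for simplicity the pair potential of Eq. (3.1.1) is replaced by an external harmonic potential V(x) = k x^2/2 acting on a single particle). Integrating the Hamilton equations (3.1.2) with the symplectic Euler scheme reproduces the Newtonian oscillation x(t) = cos(t) for M = k = 1:

```python
import math

# Symplectic-Euler integration of Hamilton's equations (3.1.2) for
# H = p^2/(2M) + k x^2/2 (single particle in a spring potential, a sketch).
M, k, dt = 1.0, 1.0, 1e-3
x, p = 1.0, 0.0                       # initial condition x(0) = 1, p(0) = 0
E0 = p * p / (2 * M) + 0.5 * k * x * x

for _ in range(10000):                # integrate up to t = 10
    p -= k * x * dt                   # dp/dt = -dH/dx = -k x
    x += p / M * dt                   # dx/dt = +dH/dp = p/M

E = p * p / (2 * M) + 0.5 * k * x * x
print(x, math.cos(10.0))              # trajectory follows Newton's x(t) = cos(t)
print(E, E0)                          # energy is conserved to high accuracy
```

Updating p before x makes the scheme symplectic, which is why the energy stays close to its initial value instead of drifting.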
Physical quantities O[x, p] are functions over configuration (or phase) space. Examples of relevant physical quantities are the total energy H[x, p], the kinetic energy

    T[x, p] = Σ_a p_a^2 / (2 M_a),          (3.1.7)

the potential energy

    V[x, p] = Σ_{a>b} V(x_a - x_b),          (3.1.8)

or the total momentum

    P[x, p] = Σ_a p_a.          (3.1.9)
3.2 The Concept of an Ensemble
Let us consider a gas in a container with a removable wall that separates the left from the right half of the container. Let us assume that all gas particles are initially in the left half of the container, while the right half is empty. When we remove the wall, the gas will immediately expand into the right half and will eventually reach an equilibrium state. In equilibrium, we will never find all the gas particles in the left half again, because without the wall such configurations are extremely improbable.

A classical gas evolves through configuration (or phase) space following the Hamilton (or equivalently Newton's) classical equations of motion, starting from its initial conditions. In order to understand the probabilities with which different configurations are realized during the time evolution, one might attempt to solve the equations of motion. However, due to the enormous number of degrees of freedom and due to our limited knowledge of the initial conditions, this is completely impossible in practice. Fortunately, it is also not necessary.
Statistical mechanics describes the probabilities of the various configurations using the concept of an ensemble. Instead of considering a single system during its time evolution, one considers a large number of independent identical systems (an ensemble) at one moment in time. The hypothesis of ergodicity states that in equilibrium the time average of a single system is the same as the ensemble average over many independent identical systems. An ensemble is characterized by a probability distribution

    ρ[x, p] = ρ(x_1, p_1, x_2, p_2, ..., x_N, p_N) ≥ 0,          (3.2.1)

which describes the probability to find a system in the ensemble that is in the configuration (x_1, p_1), (x_2, p_2), ..., (x_N, p_N). The total probability is normalized as

    ∫ Dx Dp ρ[x, p] = 1.          (3.2.2)

Here the integrations extend over all of phase space, i.e.

    ∫ Dx Dp = (1/∆^N) ∫ d^3x_1 d^3p_1 d^3x_2 d^3p_2 ... d^3x_N d^3p_N.          (3.2.3)

Here ∆ is an arbitrarily chosen volume of phase space which is introduced in order to make the integration measure dimensionless. An ensemble average of a physical quantity O is given by

    ⟨O⟩ = ∫ Dx Dp O[x, p] ρ[x, p].          (3.2.4)
Note that the arbitrarily chosen factor ∆ drops out of this expression for the
average.
3.3 The Microcanonical Ensemble
Depending on the physical conditions, different systems are described by different ensembles. The simplest case is the so-called microcanonical ensemble, for which

    ρ[x, p] = (1/Z(E)) δ(H[x, p] - E).          (3.3.1)

This ensemble describes an isolated system with total energy E. For example, a gas in a container with perfectly reflecting walls can be described in this way. It is completely isolated from the rest of the world and cannot exchange energy with it. Hence, the total energy of all the gas particles is necessarily conserved. Instead of considering a single isolated system during its time evolution, the microcanonical ensemble describes a large number of independent identical systems, all with the same total energy E. In the microcanonical ensemble all configurations are equally probable, as long as they have the correct total energy E. The energy-dependent partition function Z(E) of the microcanonical ensemble is a normalization factor for the total probability and is given by

    Z(E) = ∫ Dx Dp δ(H[x, p] - E).          (3.3.2)
The average of the total energy in the microcanonical ensemble is given by

    ⟨H⟩ = ∫ Dx Dp H[x, p] ρ[x, p]
        = (1/Z(E)) ∫ Dx Dp H[x, p] δ(H[x, p] - E)
        = E (1/Z(E)) ∫ Dx Dp δ(H[x, p] - E)
        = E ∫ Dx Dp ρ[x, p] = E.          (3.3.3)
It is interesting to consider the variance

    ∆O = √(⟨O^2⟩ - ⟨O⟩^2)          (3.3.4)

of a physical quantity O. In the microcanonical ensemble the variance of the energy vanishes because

    ⟨H^2⟩ = E^2 = ⟨H⟩^2.          (3.3.5)
Other physical quantities like the kinetic or potential energy will in general have
a nonzero variance in the microcanonical ensemble.
3.4 The Canonical Ensemble
Many interesting physical systems are not totally isolated from the rest of the world. For example, we could put our gas container inside a bigger tank of gas at some temperature T. Now the gas inside the container can exchange energy with the surrounding gas in the tank, which serves as a so-called heat bath. As a result, the gas in the container will eventually reach a new equilibrium at the same temperature as the heat bath. In the following it will be important that the heat bath is very large and can exchange an unlimited amount of energy with our physical system. In particular, when energy is absorbed by the system this will not lead to a decrease of the temperature of the heat bath.
The canonical ensemble describes a system in thermal equilibrium with a heat bath of temperature T. Since the system can exchange energy with the heat bath, the energy of the system now fluctuates (it has a variance). The distribution function of the canonical ensemble is given by

    ρ[x, p] = (1/Z(β)) exp(-β H[x, p]),          (3.4.1)

where β = 1/(k_B T) is the inverse temperature (in units of k_B). The partition function of the canonical ensemble is hence given by

    Z(β) = ∫ Dx Dp exp(-β H[x, p]),          (3.4.2)

and the thermal average of a physical quantity is given by

    ⟨O⟩ = (1/Z(β)) ∫ Dx Dp O[x, p] exp(-β H[x, p]).          (3.4.3)
It is interesting to note that the thermal average of the energy can be written as

    ⟨H⟩ = -∂ log Z(β)/∂β = -(1/Z(β)) ∂Z(β)/∂β
        = (1/Z(β)) ∫ Dx Dp H[x, p] exp(-β H[x, p]).          (3.4.4)
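Equation (3.4.4) can be checked on any explicitly known partition function. As a small illustration (not worked out in the text), the Gaussian momentum integral for a single free particle in one dimension gives Z(β) ∝ β^(-1/2), so the formula should reproduce the equipartition value ⟨H⟩ = 1/(2β) = k_B T/2:

```python
import math

# Finite-difference check of Eq. (3.4.4), <H> = -d log Z / d beta, for a free
# particle in one dimension, where log Z(beta) = -0.5*log(beta) + const.
beta, h = 2.0, 1e-6
logZ = lambda b: -0.5 * math.log(b)           # beta-independent constant dropped
H_avg = -(logZ(beta + h) - logZ(beta - h)) / (2 * h)   # central difference
print(H_avg, 1 / (2 * beta))                  # both equal the equipartition value
```

Any additive constant in log Z (from the x-integration and the factor ∆) drops out of the derivative, which is why it can be omitted here.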
Similarly, the variance of the energy takes the form

    (∆H)^2 = ⟨H^2⟩ - ⟨H⟩^2 = ∂^2 log Z(β)/∂β^2
           = (1/Z(β)) ∂^2 Z(β)/∂β^2 - (1/Z(β)^2) (∂Z(β)/∂β)^2.          (3.4.5)
The partition functions of the canonical and microcanonical ensembles are related by a so-called Laplace transform

    Z(β) = ∫_{-∞}^{∞} dE Z(E) exp(-βE)
         = ∫_{-∞}^{∞} dE ∫ Dx Dp δ(H[x, p] - E) exp(-βE)
         = ∫ Dx Dp exp(-β H[x, p]).          (3.4.6)
The Laplace transform is closely related to the more familiar Fourier transform. Indeed, if we analytically continue the inverse temperature to purely imaginary values and write β = it, the above equation turns into a Fourier transform

    Z(it) = ∫_{-∞}^{∞} dE Z(E) exp(-iEt).          (3.4.7)

The corresponding inverse Fourier transform takes the form

    Z(E) = (1/2π) ∫_{-∞}^{∞} dt Z(it) exp(iEt).          (3.4.8)
Using the Fourier representation of the δ-function

    δ(E) = (1/2π) ∫_{-∞}^{∞} dt exp(-iEt),          (3.4.9)

one can write

    Z(E) = ∫ Dx Dp δ(H[x, p] - E)
         = (1/2π) ∫_{-∞}^{∞} dt ∫ Dx Dp exp(-i(H[x, p] - E)t)
         = (1/2πi) ∫_{-i∞}^{i∞} dβ ∫ Dx Dp exp(-β H[x, p]) exp(βE)
         = (1/2πi) ∫_{-i∞}^{i∞} dβ Z(β) exp(βE).          (3.4.10)
In analogy to the inverse Fourier transform of eq.(3.4.8) this is the inverse Laplace
transform. It should be noted that β is to be integrated over purely imaginary
values.
3.5 Particle on an Energy Ladder
To illustrate the ideas introduced before, let us consider a simple physical system:
a single particle on a ladder. A classical physics realization of this model is
a particle that climbs a ladder in the gravitational potential of the earth. The
ladder has steps of height zₙ = z₀n, n = 0, 1, 2, ..., with a corresponding
potential energy

    H[n] = Eₙ = Mgzₙ = Mgz₀n = ǫn,   ǫ = Mgz₀.   (3.5.1)
Here g is the gravitational acceleration and M is the particle mass.¹ The particle
in this model has only potential (and no kinetic) energy, and hence its total energy
is given by Eₙ. It will turn out that the total energy of a quantum mechanical
harmonic oscillator is also described by an energy ladder, with Eₙ = ℏω(n + 1/2),
where ω is the frequency of the oscillator and ℏ is Planck's constant. Until we
are familiar enough with quantum mechanics, we can imagine the classical
particle on the ladder. Still, the most interesting physical applications of the
energy ladder emerge from the quantum harmonic oscillator.
Let us consider the canonical partition function of the particle on the ladder.
Since the particle only has potential energy, its configuration is entirely specified
by the step n. Hence the partition function is a sum (and not an integral) over
configuration space, and we obtain

    Z(β) = Σ_{n=0}^{∞} exp(−βEₙ) = Σ_{n=0}^{∞} exp(−βǫn) = 1/(1 − exp(−βǫ)).   (3.5.2)
Here we have used the formula for a geometric series,

    Σ_{n=0}^{∞} xⁿ = 1/(1 − x),   (3.5.3)

which converges for |x| < 1; here x = exp(−βǫ) < 1 at any positive temperature.
¹Strictly speaking, this formula for the potential energy is valid only close to the earth's
surface.
The average energy takes the form

    ⟨H⟩ = −∂ log Z(β)/∂β = ∂ log[1 − exp(−βǫ)]/∂β
        = ǫ exp(−βǫ)/[1 − exp(−βǫ)] = ǫ/[exp(βǫ) − 1].   (3.5.4)
Similarly, the variance of the energy is given by

    (ΔH)² = ⟨H²⟩ − ⟨H⟩² = ∂² log Z(β)/∂β² = ǫ² exp(βǫ)/[exp(βǫ) − 1]².   (3.5.5)
Hence the ratio of the energy fluctuation and the average energy is given by

    ΔH/⟨H⟩ = exp(βǫ/2).   (3.5.6)
Such a large variance is typical for small systems (we are dealing with a single
particle), while for large systems the variance is typically much smaller than the
average.
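The closed-form results above are easy to verify against truncated sums. The following Python sketch (illustrative, not from the original text; ǫ = 1, β = 0.5, and the truncation point are arbitrary choices) checks eqs. (3.5.2), (3.5.4), (3.5.5), and the fluctuation ratio (3.5.6):

```python
import math

eps, beta = 1.0, 0.5   # step energy and inverse temperature, arbitrary units
N_MAX = 2000           # truncation; the geometric tail beyond this is negligible

Z = sum(math.exp(-beta * eps * n) for n in range(N_MAX))
avg = sum(eps * n * math.exp(-beta * eps * n) for n in range(N_MAX)) / Z
avg2 = sum((eps * n) ** 2 * math.exp(-beta * eps * n) for n in range(N_MAX)) / Z
var = avg2 - avg ** 2

# Closed forms (3.5.2), (3.5.4), (3.5.5)
Z_exact = 1.0 / (1.0 - math.exp(-beta * eps))
avg_exact = eps / (math.exp(beta * eps) - 1.0)
var_exact = eps ** 2 * math.exp(beta * eps) / (math.exp(beta * eps) - 1.0) ** 2

assert abs(Z - Z_exact) < 1e-10
assert abs(avg - avg_exact) < 1e-10
assert abs(var - var_exact) < 1e-10
# Fluctuation ratio, eq. (3.5.6): Delta H / <H> = exp(beta*eps/2)
assert abs(math.sqrt(var) / avg - math.exp(0.5 * beta * eps)) < 1e-8
```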
3.6 Model for a Heat Bath
Let us use the particle on the energy ladder to discuss a simple model for a heat
bath. The heat bath consists of a large number of particles, each with energy zero or
ǫ. The fraction with energy ǫ is q ∈ [0, 1/2]. The single particle on the ladder is
coupled to the heat bath. At each time step it interacts with one of the bath
particles and exchanges energy according to the following rules. If the particle on
the ladder is hit by a bath particle of energy ǫ, it absorbs the energy and jumps
from step n up to step n + 1. We assume that the bath is infinitely large,
such that the fraction q does not change when a bath particle transfers its energy
to the particle on the ladder. If the particle on the ladder is hit by a bath particle
of energy zero, on the other hand, the particle on the ladder steps down from step
n to step n − 1 and transfers the energy ǫ to the bath. The only exception to
this rule arises when the ladder particle is initially at the ground level n = 0.
Then it cannot lower its energy further; it stays at n = 0 and does not
transfer energy to the bath. The interaction of the particle on the ladder with
the heat bath can be characterized by a transition probability between steps n
and n + 1,

    w[n → n + 1] = q,   (3.6.1)
(with probability q the particle on the ladder is hit by a bath particle of energy
ǫ and steps up the ladder), and between steps n and n − 1,

    w[n → n − 1] = 1 − q,   n ≥ 1,   (3.6.2)

(with probability 1 − q the particle on the ladder is hit by a bath particle of
energy zero and steps down the ladder). Finally, if initially n = 0,

    w[0 → 0] = 1 − q.   (3.6.3)

All other transition probabilities are zero, i.e.

    w[n → n′] = 0,   |n′ − n| ≥ 2.   (3.6.4)
For any given initial n ≥ 1 the transition probability is correctly normalized
because

    Σ_{n′} w[n → n′] = w[n → n + 1] + w[n → n − 1] = q + 1 − q = 1,   (3.6.5)

and also for n = 0,

    Σ_{n′} w[0 → n′] = w[0 → 1] + w[0 → 0] = q + 1 − q = 1.   (3.6.6)
Let us imagine that the particle on the ladder is initially characterized by
an ensemble with probability distribution ρ₀[n]. For example, if the particle
is initially certainly on the ground we have ρ₀[n] = δ_{n0}, but any other initial
probability distribution would also be possible. After one time step, i.e. after one
interaction with a bath particle, the new probability distribution will be

    ρ₁[n′] = Σ_n ρ₀[n] w[n → n′].   (3.6.7)

Similarly, after the i-th time step, i.e. after i interactions with bath particles,

    ρᵢ[n′] = Σ_n ρᵢ₋₁[n] w[n → n′].   (3.6.8)

After a large number of time steps, we expect that the particle on the ladder will
get into thermal equilibrium with the heat bath, i.e. its probability distribution
will converge to a fixed ρ∞[n] which is characterized by

    ρ∞[n′] = Σ_n ρ∞[n] w[n → n′].   (3.6.9)
In particular, for n′ ≥ 1 this implies

    ρ∞[n′] = ρ∞[n′ − 1] w[n′ − 1 → n′] + ρ∞[n′ + 1] w[n′ + 1 → n′]
           = q ρ∞[n′ − 1] + (1 − q) ρ∞[n′ + 1],   (3.6.10)

and for n′ = 0,

    ρ∞[0] = ρ∞[0] w[0 → 0] + ρ∞[1] w[1 → 0] = (1 − q) ρ∞[0] + (1 − q) ρ∞[1].   (3.6.11)

This equation implies

    ρ∞[1] = [q/(1 − q)] ρ∞[0].   (3.6.12)

Using

    ρ∞[1] = q ρ∞[0] + (1 − q) ρ∞[2],   (3.6.13)

this leads to

    ρ∞[2] = [ρ∞[1] − q ρ∞[0]]/(1 − q) = [q/(1 − q)]² ρ∞[0].   (3.6.14)

Generally one obtains

    ρ∞[n] = [q/(1 − q)]ⁿ ρ∞[0].   (3.6.15)
Remarkably, the equilibrium distribution ρ∞[n] resembles the canonical ensemble
ρ[n], for which

    ρ[n] = (1/Z(β)) exp(−βH[n]) = (1/Z(β)) exp(−βEₙ) = exp(−βǫn) ρ[0],   (3.6.16)

if one identifies

    q/(1 − q) = exp(−βǫ)   ⇒   q = 1/[exp(βǫ) + 1].   (3.6.17)
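The convergence to this equilibrium distribution can be watched directly by simulating the update rules of the model. The Python sketch below is illustrative and not from the original text; the value βǫ = 1 and the run length are arbitrary choices. It runs the Markov chain and compares the measured occupation ratios with (q/(1 − q))ⁿ = exp(−βǫn):

```python
import math
import random
from collections import Counter

random.seed(1)
beta_eps = 1.0                          # beta * epsilon, an arbitrary choice
q = 1.0 / (math.exp(beta_eps) + 1.0)    # bath parameter from eq. (3.6.17)

n = 0                                   # ladder particle starts at ground level
counts = Counter()
STEPS, BURN_IN = 400_000, 10_000
for t in range(STEPS):
    if random.random() < q:             # hit by a bath particle of energy eps
        n += 1                          # -> step up, eq. (3.6.1)
    elif n > 0:                         # hit by a zero-energy bath particle
        n -= 1                          # -> step down, eq. (3.6.2)
    # at n = 0 the particle simply stays put, eq. (3.6.3)
    if t >= BURN_IN:
        counts[n] += 1

# In equilibrium rho[n]/rho[0] -> (q/(1-q))**n = exp(-beta*eps*n), eq. (3.6.15)
for k in range(1, 4):
    measured = counts[k] / counts[0]
    expected = math.exp(-beta_eps * k)
    assert abs(measured - expected) < 0.05
```

The burn-in period discards the transient before the chain has forgotten its initial condition.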
3.7 Canonical Ensemble for Particles on a Ladder
We will now consider not just one but N distinguishable particles on the energy
ladder. The configuration [n] of the system is then specified by the heights
zₐ = nₐz₀ (with a ∈ {1, 2, ..., N}) of each of the N particles. The corresponding
Hamilton function is given by

    H[n] = Σ_{a=1}^{N} Eₐ = ǫ Σ_{a=1}^{N} nₐ,   (3.7.1)
and the canonical partition function

    Z(β) = Σ_{[n]} exp(−βH[n]) = Π_{a=1}^{N} Σ_{nₐ=0}^{∞} exp(−βǫnₐ) = z(β)^N   (3.7.2)

factorizes into single-particle partition functions

    z(β) = Σ_{n=0}^{∞} exp(−βǫn) = 1/(1 − exp(−βǫ)).   (3.7.3)
The average energy of the particles is given by

    ⟨H⟩ = −∂ log Z(β)/∂β = −N ∂ log z(β)/∂β = N ǫ/[exp(βǫ) − 1],   (3.7.4)
which is just N times the average energy in the single-particle case. Let us also
consider the variance of the energy,

    (ΔH)² = ∂² log Z(β)/∂β² = N ∂² log z(β)/∂β² = N ǫ² exp(βǫ)/[exp(βǫ) − 1]².   (3.7.5)
The ratio of the energy fluctuation and the average energy now takes the form

    ΔH/⟨H⟩ = (1/√N) exp(βǫ/2).   (3.7.6)

In the so-called thermodynamic limit N → ∞ of a large number of particles, the
fluctuations of the energy are suppressed as 1/√N.
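The 1/√N suppression can also be seen by sampling. In the canonical ensemble each particle's step number is geometrically distributed, P(n) = (1 − e^{−βǫ}) e^{−βǫn}, which follows from (3.5.2). The sketch below (illustrative; βǫ = 1, the sample size, and the values of N are arbitrary choices) estimates ΔH/⟨H⟩ for growing N and compares it with eq. (3.7.6):

```python
import math
import random

random.seed(2)
beta_eps = 1.0                  # beta * epsilon, an arbitrary choice
x = math.exp(-beta_eps)         # Boltzmann ratio between adjacent steps

def sample_n():
    # Single-particle step number in the canonical ensemble is geometric:
    # P(n) = (1 - x) * x**n  with  x = exp(-beta*eps), cf. eq. (3.5.2)
    n = 0
    while random.random() < x:
        n += 1
    return n

def relative_fluctuation(N, samples=10_000):
    # Total energy of N independent ladder particles, in units of eps
    totals = [sum(sample_n() for _ in range(N)) for _ in range(samples)]
    mean = sum(totals) / samples
    var = sum((t - mean) ** 2 for t in totals) / samples
    return math.sqrt(var) / mean

# Eq. (3.7.6): Delta H / <H> = exp(beta*eps/2) / sqrt(N)
for N in (4, 16, 64):
    predicted = math.exp(0.5 * beta_eps) / math.sqrt(N)
    assert abs(relative_fluctuation(N) - predicted) / predicted < 0.1
```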
3.8 Microcanonical Ensemble for Particles on a Ladder
Let us again consider N distinguishable particles on the energy ladder. However,
we will now use the microcanonical ensemble, i.e. the N particles will share a fixed
amount of energy E = Mǫ. Let us first determine the microcanonical partition
function Z(E) = Z(M, N). How many ways are there for distributing M basic
units of energy ǫ among N distinguishable particles? Let us begin with the trivial
case of a single particle (N = 1). Then that particle gets all the energy and

    Z(M, 1) = 1.   (3.8.1)
Next we consider two particles. Then we can give n units of energy to particle 1
and the remaining M − n units to particle 2, such that

    Z(M, 2) = M + 1.   (3.8.2)
With three particles, things get more interesting. We can give n units of energy
to particle 1 and then distribute the remaining M − n units among particles 2
and 3. There are Z(M − n, 2) = M − n + 1 ways of doing that, and hence

    Z(M, 3) = Σ_{n=0}^{M} Z(M − n, 2) = Σ_{n=0}^{M} (M − n + 1) = (M + 2)(M + 1)/2.   (3.8.3)
The general formula for N particles takes the form

    Z(M, N) = (M + N − 1)!/[M!(N − 1)!],   (3.8.4)

which indeed satisfies

    Z(M, N) = Σ_{n=0}^{M} Z(M − n, N − 1).   (3.8.5)
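The counting formula is the standard "stars and bars" result, Z(M, N) = C(M + N − 1, M). A short Python check of eqs. (3.8.1)-(3.8.5) (illustrative; the small values of M and N are arbitrary):

```python
from math import comb

def Z(M, N):
    # 'Stars and bars': number of ways to distribute M identical energy units
    # among N distinguishable particles, eq. (3.8.4)
    return comb(M + N - 1, M)

# The special cases worked out in the text, here with M = 7
assert Z(7, 1) == 1                        # eq. (3.8.1)
assert Z(7, 2) == 7 + 1                    # eq. (3.8.2): M + 1
assert Z(7, 3) == (7 + 2) * (7 + 1) // 2   # eq. (3.8.3): (M+2)(M+1)/2

# Recursion (3.8.5): give n units to particle 1, distribute the rest
for M, N in [(5, 3), (10, 4), (12, 6)]:
    assert Z(M, N) == sum(Z(M - n, N - 1) for n in range(M + 1))
```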
What is the density of particles at step n? In other words, what is the
probability for particle 1 (or any other particle) to reach step n? The previous
discussion immediately implies

    ρ[n] = Z(M − n, N − 1)/Z(M, N)
         = [(M − n + N − 2)!/((M − n)!(N − 2)!)] · [M!(N − 1)!/(M + N − 1)!].   (3.8.6)
Let us consider a system of many particles (N → ∞) with a large amount of
energy (M → ∞), such that the energy per particle, i.e. M/N, remains fixed.
We now use the Stirling formula

    log N! = N log N − N,   (3.8.7)

which holds in the limit of large N. This implies

    log ρ[n] = log Z(M − n, N − 1) − log Z(M, N)
             = (M − n + N − 2) log(M − n + N − 2) − (M − n) log(M − n)
               − (N − 2) log(N − 2) + M log M
               + (N − 1) log(N − 1) − (M + N − 1) log(M + N − 1)
             = log ρ[0] − n log(M + N) + n log M + O(n²),   (3.8.8)
and hence

    ρ[n] = ρ[0] [M/(M + N)]ⁿ = ρ[0] exp(−βǫn).   (3.8.9)
Remarkably, this is the same result as for the canonical ensemble, provided that
we identify

    exp(βǫ) = 1 + N/M = 1 + Nǫ/E   ⇒   E = Nǫ/[exp(βǫ) − 1].   (3.8.10)

The average energy in the canonical ensemble is indeed given by

    ⟨H⟩ = Nǫ/[exp(βǫ) − 1].   (3.8.11)
Hence, in the thermodynamical limit of many particles (N → ∞) both the canonical and the microcanonical ensemble give the same physical results. In particular,
just as for any N in the microcanonical ensemble, for N → ∞ the relative fluctuation of the
total energy vanishes even in the canonical ensemble.
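The agreement of the two ensembles can be checked numerically. In eq. (3.8.6) the factorials telescope, so the ratio ρ[n]/ρ[0] can be evaluated as a short product without forming huge numbers. The sketch below is illustrative, not from the original text; the values of M and N are arbitrary but keep M/N fixed:

```python
import math

# Large microcanonical system: N particles sharing M units of energy
M, N = 30_000, 10_000
beta_eps = math.log(1 + N / M)   # identification (3.8.10): exp(beta*eps) = 1 + N/M

def micro_ratio(n):
    # rho[n]/rho[0] from eq. (3.8.6); the factorials telescope into
    # prod_{k=0}^{n-1} (M - k) / (M + N - 2 - k)
    r = 1.0
    for k in range(n):
        r *= (M - k) / (M + N - 2 - k)
    return r

# The microcanonical ratios approach the canonical Boltzmann factors (3.8.9)
for n in range(1, 6):
    canonical = math.exp(-beta_eps * n)
    assert abs(micro_ratio(n) - canonical) / canonical < 1e-3
```

The residual deviation is of order 1/(M + N) and shrinks as the system grows, which is exactly the content of the thermodynamical limit.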
Chapter 4
Information and Entropy
In this chapter we introduce other useful concepts — information and information
deﬁcit as well as the closely related entropy.
4.1 Information and Information Deficit
Let us imagine that somebody places a queen at some unknown square on the
chess board. How many binary questions do we need to ask in order to figure
out where the queen is? First of all, the question "Where is the queen?" is
not binary, because a possible answer would be "On B7", and not just "Yes" or
"No". Binary questions have only two possible answers: "Yes" or "No". The
question "Is the queen on F4?" is binary, but not necessarily the best question
to ask. For example, following the strategy of checking out individual squares,
in the worst case it requires 63 questions to figure out where the queen is.¹ On
the other hand, if we are lucky we find the queen already after the first question.
On average, following this strategy, we would need 32 questions to locate the
queen. Obviously, there is a much better strategy. First, we ask "Is the queen
in the left side of the board?". If the answer is "No", we concentrate on the
right side and ask "Is the queen in the top half of the right side?". Depending on
the answer, we check out one half of the remaining section of the board. In this
way, after log₂ 64 = 6 questions we will always find the queen. Hence, although
following the second strategy we will never find the queen after the first question,
¹Of course, we don't need to ask the 64th question: if we got the answer "No" 63
times, we know that the queen is on the one square we did not check.
on average we will ﬁnd it more than ﬁve times faster than using the ﬁrst strategy.
Indeed the second strategy is optimal, because it minimizes the average number
of binary questions.
How much information is encoded in the position of the queen? Hence, how
big is our information deﬁcit when we don’t know where the queen is? By positioning the queen we can store the answer to 6 binary questions, hence the
information content is 6 bits. Consequently, initially our information deﬁcit is
also 6 bits. With every binary question we reduce the deﬁcit by one bit, until
ﬁnally, after 6 questions, we have extracted all the information.
As we just defined it, information is measured in bits which encode answers
to binary questions. What if the information content is not an integer number
of bits? For example, let us imagine that the queen can only be placed in three
possible positions. Then, following the logic from before, the information deficit
should be I₃ = log₂ 3 ≈ 1.585 bits. Following an optimal strategy, how many
binary questions do we need to ask on average in order to reveal the position
of the queen? We can ask "Is the queen in position 1?". In 1/3 of the cases, the
answer will be "Yes" and we are done in one step. In the other 2/3 of the cases the
answer is "No" and we ask "Is the queen in position 2?". In any case, no matter
whether the answer is "Yes" or "No", after the second question we know where the
queen is. Hence, on average we need to ask 1/3 + (2/3)·2 = 5/3 ≈ 1.667 binary questions.
Although this is not exactly equal to I₃, it is quite close ((5/3)/I₃ ≈ 1.05). What if we
place the queen in one out of five possible positions? In that case, the information
deficit should be I₅ = log₂ 5 ≈ 2.322 bits. We can then ask "Is the queen
in position 1 or 2?". In 2/5 of the cases the answer will be "Yes", and we need one
more question to figure out where the queen is. In 3/5 of the cases, on the other
hand, the answer will be "No". Then we can proceed as in the three-position
case and will on average need 5/3 more questions. Hence, in total on average we
need to ask (2/5)·2 + (3/5)(1 + 5/3) = 12/5 questions, such that (12/5)/I₅ ≈ 1.03, which is even
closer to 1 than in the three-position case. In the following we will be dealing
with very large configuration spaces of size Z(E). Then the difference between
log₂ Z(E) and the average number of binary questions required to figure out the
configuration is totally negligible. Hence, in the following we will define

    I = log₂ Z(E)   (4.1.1)

as the information deficit, even if in general it is not an integer number of bits.
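The question-counting argument above can be reproduced with a small program: an optimal questioning strategy corresponds to a Huffman tree over the equally likely positions. The sketch below is illustrative and not from the original text; it computes the average number of binary questions and compares it with the information deficit log₂ k:

```python
import heapq
import math

def avg_questions(k):
    # Average number of binary questions needed to locate the queen among
    # k equally likely positions, using an optimal (Huffman-tree) strategy:
    # each merge of two subtrees adds one question for all leaves below it.
    heap = [1.0 / k] * k
    heapq.heapify(heap)
    cost = 0.0
    while len(heap) > 1:
        p = heapq.heappop(heap) + heapq.heappop(heap)
        cost += p
        heapq.heappush(heap, p)
    return cost

assert abs(avg_questions(64) - 6.0) < 1e-12     # the full board: log2(64)
assert abs(avg_questions(3) - 5 / 3) < 1e-12    # three positions, as in the text
assert abs(avg_questions(5) - 12 / 5) < 1e-12   # five positions, as in the text
# The information deficit log2(k) is a lower bound on the average
for k in (3, 5, 7, 64):
    assert avg_questions(k) >= math.log2(k) - 1e-12
```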
4.2 The Concept of Entropy
Entropy is another concept very closely related to the information deficit. It is a
measure of disorder. Entropy is defined for any ensemble, i.e. for any probability
distribution ρ[n], not only for ensembles describing thermal equilibrium such as
the canonical or microcanonical ones. For a general ensemble the entropy is
defined as

    S = −kB Σ_{[n]} ρ[n] log ρ[n].   (4.2.1)
The second law of thermodynamics states that the total entropy of the Universe
always increases with time. This does not mean that the entropy of a subsystem
cannot decrease. For example, we can establish some order in one part of the
Universe (for example, in our brain) at the expense of increasing the entropy of
the rest of the world.
To illustrate the basic idea behind entropy, let us again consider a simple
example related to chess. We divide the chess board into three regions: the left
half of the board (region 1), and the top and bottom halves of the right half
(regions 2 and 3).
The queen is placed on one of the 64 squares with equal probabilities. Once
the queen is placed, we no longer distinguish between the squares within a given
region and only ask in which region the queen is. On average, what is the minimum
number of binary questions we need to ask in order to locate the queen?
First, we ask if the queen is on the left side of the board. In one half of the
cases the answer will be "Yes" and we are done. In the other half of the cases
we ask once more "Is the queen in the top half of the right side?". The answer
will then allow us to locate the queen after two questions. Hence, on average we
need (1/2)·1 + (1/2)·2 = 3/2 binary questions to locate the queen, and the information
deficit is thus I = 3/2. What is the corresponding entropy? The queen on the divided
board is described by the following ensemble. With probability 1/2 it is in region
1 (the left half of the board) and hence ρ[1] = 1/2. With probability 1/4 each, the queen
is in region 2 or 3 (the top and bottom halves of the right half of the board), and
hence ρ[2] = ρ[3] = 1/4. According to the previous definition, the corresponding
entropy is hence given by

    S = −kB (ρ[1] log ρ[1] + ρ[2] log ρ[2] + ρ[3] log ρ[3])
      = −kB [(1/2) log(1/2) + 2 (1/4) log(1/4)]
      = −kB [(1/2) log₂(1/2) + (1/2) log₂(1/4)] log 2
      = kB (1/2 + 1) log 2 = (3/2) kB log 2 = kB I log 2.   (4.2.2)
The ﬁnal equality is not just a coincidence. As we will see later, entropy and
information deﬁcit are generally related by S = kB I log 2.
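The little calculation in eq. (4.2.2) can be repeated numerically. The sketch below is illustrative (kB = 1 is an arbitrary choice of units); it computes the entropy of the three-region ensemble and the information deficit, and checks S = kB I log 2:

```python
import math

kB = 1.0                    # arbitrary choice of units
rho = [0.5, 0.25, 0.25]     # region probabilities for the queen

# Entropy, eq. (4.2.1)
S = -kB * sum(p * math.log(p) for p in rho)

# Information deficit: for these dyadic probabilities the optimal number of
# questions for a region of probability p is exactly -log2(p), so on average
I = sum(p * -math.log2(p) for p in rho)

assert abs(I - 1.5) < 1e-12                    # 3/2 questions, as in the text
assert abs(S - kB * I * math.log(2)) < 1e-12   # eq. (4.2.2): S = kB I log 2
```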
4.3 Entropy and Free Energy in the Canonical Ensemble
In the canonical ensemble the entropy is given by

    S = −kB Σ_{[n]} ρ[n] log ρ[n]
      = kB Σ_{[n]} (1/Z(β)) exp(−βH[n]) [βH[n] + log Z(β)]
      = kB [β⟨H⟩ + log Z(β)].   (4.3.1)
Introducing the so-called free energy F as

    Z(β) = exp(−βF),   (4.3.2)

and putting ⟨H⟩ = E, the entropy takes the form

    S = kB β(E − F) = (1/T)(E − F)   ⇒   F = E − TS.   (4.3.3)

4.4 Entropy of Particles on a Ladder
Let us again consider N distinguishable particles on an energy ladder. In the
canonical ensemble the free energy of N distinguishable particles on the ladder
is given by

    F = −(1/β) log Z(β) = −(N/β) log z(β) = (N/β) log[1 − exp(−βǫ)],   (4.4.1)
and the entropy takes the form

    S = (1/T)(E − F) = kB N {βǫ/[exp(βǫ) − 1] − log[1 − exp(−βǫ)]}.   (4.4.2)
Let us also consider the system in the microcanonical ensemble, in which
the N particles share M basic units of energy ǫ. As we argued before, the
corresponding microcanonical partition function is given by

    Z(E) = Z(M, N) = (M + N − 1)!/[M!(N − 1)!].   (4.4.3)
How many binary questions do we need to pose in order to figure out the configuration [n] of the system? According to our definition, the information deficit is
given by

    I = log₂ Z(E) = log₂ {(M + N − 1)!/[M!(N − 1)!]} = log{(M + N − 1)!/[M!(N − 1)!]}/log 2.   (4.4.4)
Again, going to the thermodynamical limit M, N → ∞ and using Stirling's
formula, one obtains

    I = [(M + N) log(M + N) − M log M − N log N]/log 2
      = [(M + N)(log M + log(1 + N/M)) − M log M − N log N]/log 2
      = [(M + N) log(1 + N/M) + N log(M/N)]/log 2
      = N [(1 + M/N) log(1 + N/M) + log(M/N)]/log 2
      = N {βǫ/[exp(βǫ) − 1] − log[1 − exp(−βǫ)]}/log 2.   (4.4.5)
We have again used

    1 + N/M = exp(βǫ),   1 + M/N = exp(βǫ)/[exp(βǫ) − 1],
    log(M/N) = log{1/[exp(βǫ) − 1]} = −βǫ − log[1 − exp(−βǫ)].   (4.4.6)
Remarkably, up to a factor kB log 2, the information deficit in the microcanonical
ensemble is just the entropy in the canonical ensemble. In particular, in the
thermodynamical limit we can identify

    S = kB I log 2.   (4.4.7)

This further clarifies the physical meaning of the entropy: it is the analog of the
information deficit in the microcanonical ensemble and, in fact, it is a measure
of disorder.
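The identification (4.4.7) can be checked numerically without invoking Stirling's approximation, by evaluating the exact log-factorials with log-gamma functions. The sketch below is illustrative, not from the original text; kB = 1 and the values of M and N are arbitrary choices:

```python
import math
from math import lgamma

# Thermodynamical limit: both M and N large, at fixed M/N (here M/N = 2)
M, N = 200_000, 100_000
eps, kB = 1.0, 1.0                   # arbitrary units
beta = math.log(1 + N / M) / eps     # identification (3.8.10)

# Canonical entropy, eq. (4.4.2)
x = beta * eps
S_canonical = kB * N * (x / (math.exp(x) - 1) - math.log(1 - math.exp(-x)))

# Microcanonical information deficit, eq. (4.4.4), via exact log-factorials:
# log[(M+N-1)! / (M! (N-1)!)] = lgamma(M+N) - lgamma(M+1) - lgamma(N)
logZ = lgamma(M + N) - lgamma(M + 1) - lgamma(N)
I = logZ / math.log(2)

# Eq. (4.4.7): S = kB * I * log 2, up to corrections subleading in M and N
assert abs(S_canonical - kB * I * math.log(2)) / S_canonical < 1e-3
```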
4.5 The Principle of Maximum Entropy
The second law of thermodynamics states that entropy never decreases. Hence,
as a system approaches thermal equilibrium it should increase its entropy and
finally reach an entropy maximum in thermal equilibrium. Let us assume the second law of thermodynamics and thus postulate a principle of maximum entropy.
Which ensemble then has the maximum amount of entropy? In order to answer
this question, we assume an arbitrary ensemble characterized by a probability
distribution ρ[n] and we maximize the entropy

    S = −kB Σ_{[n]} ρ[n] log ρ[n].   (4.5.1)
Of course, while we maximize the entropy, we cannot violate energy conservation.
Hence, we should maximize the entropy under the constraint that at least the
average energy

    ⟨H⟩ = Σ_{[n]} H[n] ρ[n] = E   (4.5.2)

is fixed. Also, the probability distribution itself must be correctly normalized,

    Σ_{[n]} ρ[n] = 1.   (4.5.3)
The above constraints can be incorporated via two Lagrange multipliers, i.e. we
should maximize

    S′ = −kB Σ_{[n]} ρ[n] log ρ[n] − λ(Σ_{[n]} H[n] ρ[n] − E) − λ′(Σ_{[n]} ρ[n] − 1),   (4.5.4)

such that

    ∂S′/∂ρ[n] = −kB log ρ[n] − kB − λH[n] − λ′ = 0,   (4.5.5)

and we obtain

    ρ[n] = exp(−λH[n]/kB − λ′/kB − 1) = (1/Z(β)) exp(−βH[n]).   (4.5.6)
Remarkably, this is just the canonical ensemble if we identify λ = 1/T. The other
Lagrange multiplier is given by

    1/Z(β) = exp(−λ′/kB − 1) = exp(βF)   ⇒   λ′ = −F/T − kB.   (4.5.7)
As we have seen earlier, the coupling to a heat bath leads to the canonical ensemble. As we see now, the canonical ensemble also arises from the principle of
maximum entropy, which in turn relies on the second law of thermodynamics.
Let us also consider the principle of entropy maximization in the context of
the microcanonical ensemble. In that case, energy is automatically conserved and
we only need to impose Σ_{[n]} ρ[n] = 1, such that

    S′ = −kB Σ_{[n]} ρ[n] log ρ[n] − λ′(Σ_{[n]} ρ[n] − 1),   (4.5.8)

and

    ∂S′/∂ρ[n] = −kB log ρ[n] − kB − λ′ = 0.   (4.5.9)
Hence, we now obtain

    ρ[n] = exp(−λ′/kB − 1) = 1/Z(E).   (4.5.10)

This is indeed the microcanonical ensemble if we identify

    λ′ = kB [log Z(E) − 1] = kB I log 2 − kB = S − kB.   (4.5.11)

We see that both the canonical and the microcanonical ensemble can
be derived from the principle of maximum entropy.
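That the canonical distribution really maximizes the entropy at fixed average energy can be probed numerically: any perturbation preserving both the normalization and ⟨H⟩ must not increase S. The sketch below is illustrative and not from the original text; the four-level spectrum and the value of β are arbitrary choices:

```python
import math
import random

random.seed(3)
energies = [0.0, 1.0, 2.0, 3.0]   # arbitrary toy spectrum, kB = 1
beta = 0.8

# Canonical distribution, the claimed entropy maximizer at fixed <H>
Z = sum(math.exp(-beta * E) for E in energies)
rho = [math.exp(-beta * E) / Z for E in energies]
E_avg = sum(p * E for p, E in zip(rho, energies))

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p)

S_max = entropy(rho)

# Perturbations along the direction (+1, -2, +1, 0) preserve both the
# normalization and the average energy (for the levels 0, 1, 2), so any
# such perturbation must not increase the entropy.
for _ in range(200):
    d = random.uniform(-1.0, 1.0) * 0.01
    trial = [rho[0] + d, rho[1] - 2 * d, rho[2] + d, rho[3]]
    assert abs(sum(trial) - 1.0) < 1e-12
    assert abs(sum(p * E for p, E in zip(trial, energies)) - E_avg) < 1e-12
    assert entropy(trial) <= S_max + 1e-12
```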
4.6 The Arrow of Time
Why does time flow from the past to the future? This is a very profound question
that we may hope to understand by studying physics. First of all, is there a
fundamental difference between the past and the future? In other words, do
the fundamental laws of Nature single out a preferred direction of time, or are
they invariant against time-reversal? Indeed, time-reversal is a symmetry of most
fundamental forces in Nature. In particular, the laws of classical mechanics and
of electrodynamics are time-reversal invariant. For example, if we change the
direction of all momenta of an isolated system of particles, the particles trace
back the same paths they originally went through. The resulting motion is the
same as the one obtained by reversing the direction of time. Indeed, Newton's
laws and Maxwell's equations make no fundamental difference between the past and
the future. Also the Schrödinger equation of quantum mechanics is time-reversal
invariant. Yet in this context there are subtle issues related to the quantum
mechanical measurement process which we will not touch upon here.

Our present understanding of fundamental physics is based on relativistic
quantum field theories which incorporate the basic principles of quantum mechanics as well as special relativity. In the context of relativistic quantum field
theories one can prove the so-called CPT theorem, which states that the fundamental forces described by such theories are invariant under the combined
transformations of charge conjugation C, parity P, and time-reversal T. While T
changes the direction of time, parity changes the sign of the three spatial coordinates, and charge conjugation exchanges all particles with their antiparticles. All
we know about elementary particle physics today is summarized in the so-called
standard model. As a relativistic quantum field theory, the standard model necessarily obeys the CPT theorem. The strongest fundamental forces of the standard
model, namely the strong interactions (responsible for quark confinement and the
binding of the atomic nucleus) and the electromagnetic interactions, are even invariant under the individual symmetries C, P, and T, not only under the
combination CPT. The weak interactions (e.g. responsible for radioactive decay), on the other hand, are not invariant separately under C and P, but only
under the combination CP. Hence, using the CPT theorem, even the weak interactions are invariant under T. Still, there is a very weak force (whose physical
origin is presently not very well understood) which indeed violates CP and thus
(due to the CPT theorem) also T. This interaction involves the so-called Higgs
field, whose quantum, the Higgs particle, has not yet been directly verified
experimentally. It is one of the physics objectives of the Large Hadron Collider
(LHC) at CERN to find the Higgs particle. Although the Higgs particle itself has not yet
been found, the CP- and thus T-violating interaction mediated by the Higgs field
has been verified experimentally in the decays of neutral K mesons, elementary particles consisting of a quark and an antiquark. It should be noted that
T-violation in the standard model involves all three quark generations. Ordinary
matter is made almost entirely of the up (u) and down (d) quarks of the first generation, with only small admixtures of the charm (c) and strange (s) quarks of the
second generation. The third generation consists of the very heavy top (t) and
bottom (b) quarks, which can only be produced in particle accelerators. Hence,
although Nature does make a fundamental difference between the past and the
future (e.g. in the decay of neutral K mesons), the corresponding processes are
extremely rare and have practically no effect on the matter that surrounds us
today. It should be noted that the weakest fundamental force, gravity, is not
part of the standard model, because we presently don't understand how to treat
general relativity (Einstein's classical theory of gravity) quantum mechanically.
Some attempts to go in this direction involve string theory (a possible extension
of relativistic quantum field theory), which may even violate CPT. Still, classical
gravity is perfectly T-invariant. We may hence conclude that, although there is
indeed an extremely weak interaction that breaks time-reversal, all fundamental
forces that govern the matter that surrounds us today do not distinguish the past
from the future.
Still, as weak as they may be today, CP- and thus T-violating forces may have
played an important role immediately after the big bang. In particular, without
T-violation there would be no baryon asymmetry, i.e. the young Universe would
have contained exactly equal amounts of matter and antimatter. In our Universe,
due to T-violating forces, there was a very tiny surplus of matter compared to
antimatter. About one second after the big bang all the antimatter annihilated
with almost all the matter, thus generating the cosmic microwave background
radiation which carries almost all of the entropy of the Universe today. Only
the tiny surplus of matter escaped the mass extinction with the antimatter and
constitutes all the matter that exists in the Universe today. In particular, it is
the stuff that condensed matter (including us) is made of. Without T-violating
forces (and thus without a baryon asymmetry) the Universe would consist entirely
of cosmic background radiation. In particular, there would be nobody there to
worry about things like the arrow of time.
If the fundamental forces relevant for the dynamics of the matter that surrounds us today are invariant under time-reversal, every physical process should
be reversible. In other words, the phenomena that we observe when playing
a movie backwards should be consistent with the fundamental laws of Nature.
However, this seems obviously not to be the case. We are familiar with an egg
falling down from the kitchen table, cracking open, and making a big mess; but
we have never seen the mess reassemble itself into an intact egg and then jump back
up on the table. This is a consequence of the second law of thermodynamics: the
total entropy of the Universe is always increasing. Ordered configurations are
extremely rare compared to disordered ones. Hence, when a system starts from
an ordered configuration (the intact egg on the table) and evolves in time, even
if the underlying fundamental dynamics are invariant under time-reversal, it will
most likely evolve towards the much more probable disordered configurations (the
mess on the floor).
If this is true, why was the hen able to produce the egg in the ﬁrst place?
Or, for example, why are we able to put together the pieces of a puzzle? Living
creatures are out of thermal equilibrium. They eat sugar (a highly ordered form
of energy) and do some more or less useful things with it (like laying an egg or
putting together a puzzle). However, along with such activities they sweat and
thus produce heat. In total the entropy of the Universe still increases, but a small
part of the world (the egg or the puzzle) gets more ordered. Where did the sugar
come from? Ultimately, the sugar is energy from sunlight converted into chemical
energy through photosynthesis. So why is there sunlight? Well, there is hydrogen
gas in the Universe which collapses under its own gravity and makes stars like the
sun. The hydrogen gas itself formed about 400,000 years after the big bang. At
that time, the Universe had expanded so much that the hot plasma of positively
charged protons and negatively charged electrons had cooled down sufficiently
that neutral atoms could form. Today the age of the Universe is about 13 × 10⁹
years. After 5 × 10⁹ more years or so, the sun will go through a red giant phase
and will eventually burn out. Even if the earth survives the red giant phase and
does not evaporate, this will be the end of energy supply to the earth and thus
the end of life on it. At some point (in about 10¹⁴ years or so) even the oldest
stars will be burnt out, all hydrogen in the Universe will be used up, and no new
stars can be born. By then all eggs will have turned into a big mess and no new
ones will be laid. There is no way out: entropy will win the game. Fortunately
for us, these events will take place only in the very distant future.
What will happen to the matter of the burnt-out stars and the entropy contained in them? Through gravitational attraction, the matter will ultimately
collapse into black holes. Already today, the center of our own and of other galaxies
contains a supermassive black hole. What happens to the entropy of the matter
when it falls into a black hole? Or alternatively, what happens to information,
e.g. when you throw your laptop into a black hole? The solution of the black
hole entropy and information puzzle is still controversial. In any case, entropy
or information may not be lost, but may ultimately be returned to the rest of
the world in the form of Hawking radiation: photons (and other particles) produced quantum mechanically at the black hole horizon. Even the largest black
hole should eventually evaporate into Hawking radiation (after 10¹⁰⁰ years or so)
and the Universe may ultimately be filled entirely with photons. This is probably
the absolute maximum of the entropy and a true sign of death of the Universe at
the end of time.
If, as far as the underlying fundamental forces are concerned, there is essentially no difference between the past and the future, why do we experience
time as flowing in one direction? For example, why do we remember the past
and not the future? This may just be a matter of definition: we may simply
define a biological arrow of time by declaring the future to be that part of time
we cannot remember. Alternatively, we may define a thermodynamical arrow
of time by declaring the future to be the time-direction in which the total entropy
increases. Then an interesting question arises: why are the biological and the
thermodynamical arrows pointing in the same direction? In other words, why do
we only remember things that happened when the entropy was lower? In order
to remember the past, we must store information in our brain. Obviously, this
creates some order in the brain, and thus decreases its entropy. According to
the second law of thermodynamics, this is possible only if we use some energy
(we eat sugar), produce some heat (we sweat), and thus contribute to the overall
entropy increase of the rest of the world. Computers work in a similar way: they
consume energy to store and process information, and they produce a lot of heat.
Consequently, we can only remember (or store information about) those times
at which the entropy of the rest of the world was lower. Entropy increase also
explains why we are born before and not after we die. It's again the same story
that leads from the egg to the big mess. Perhaps not surprisingly, when we age
it becomes harder to maintain a low entropy in our brain (i.e. we forget things),
and finally, when we die, our entropy has increased so much that we literally
approach thermal equilibrium with the rest of the world.
Chapter 5
Canonical Ensemble for the Ideal Gas
In this chapter we apply the concept of the canonical ensemble to the classical
ideal gas. This leads to the Maxwell-Boltzmann distribution as well as to the
barometric height formula. While classical particles are distinguishable, due to
their elementary quantum nature identical atoms or molecules are not. This has
drastic consequences for the conﬁguration space of a gas.
5.1 The Maxwell-Boltzmann Distribution
Let us now consider the canonical ensemble for the classical ideal gas. The ideal
gas has only kinetic energy and its Hamilton function is given by
\[
H[x, p] = \sum_{a=1}^{N} \frac{p_a^2}{2M}. \tag{5.1.1}
\]
Hence the distribution function of the canonical ensemble takes the form
\[
\rho[x, p] = \frac{1}{Z(\beta)} \exp(-\beta H[x, p]) = \frac{1}{Z(\beta)} \prod_{a=1}^{N} \exp\left(-\beta \frac{p_a^2}{2M}\right), \tag{5.1.2}
\]
which is known as the Maxwell-Boltzmann distribution. Since the gas particles
are independent, the distribution function factorizes into contributions from the
individual particles. Hence, the canonical partition function takes the form
\[
Z(\beta) = \int \mathcal{D}x \, \mathcal{D}p \, \exp(-\beta H[x, p]) = \prod_{a=1}^{N} \frac{1}{\Delta} \int d^3x_a \, d^3p_a \, \exp\left(-\beta \frac{p_a^2}{2M}\right) = z(\beta)^N. \tag{5.1.3}
\]
Here
\[
z(\beta) = \frac{1}{\Delta} \int d^3x \, d^3p \, \exp\left(-\beta \frac{p^2}{2M}\right) = \frac{V}{\Delta} \left(\frac{2\pi M}{\beta}\right)^{3/2} \tag{5.1.4}
\]
is the partition function of a single gas particle, and V is the spatial volume ﬁlled
by the gas.
Let us determine the average energy of the gas particles
\[
\overline{H} = -\frac{\partial \log Z(\beta)}{\partial \beta} = -N \frac{\partial \log z(\beta)}{\partial \beta} = N \frac{3}{2\beta} = \frac{3}{2} N k_B T. \tag{5.1.5}
\]
This conﬁrms that the temperature of the canonical ensemble is indeed identical
with the temperature that we deﬁned before as the average kinetic energy for an
ideal gas. Let us also consider the variance of the energy
\[
(\Delta H)^2 = \frac{\partial^2 \log Z(\beta)}{\partial \beta^2} = N \frac{\partial^2 \log z(\beta)}{\partial \beta^2} = N \frac{3}{2\beta^2}. \tag{5.1.6}
\]
The ratio of the width \Delta H and the average of the energy then takes the form
\[
\frac{\Delta H}{\overline{H}} = \sqrt{\frac{2}{3N}}. \tag{5.1.7}
\]
In the thermodynamic limit N → ∞ of a large number of particles, the fluctuations of the energy are thus suppressed as 1/√N.
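These relations are easy to check numerically. The following sketch (illustrative only; it works in units where k_B = T = M = 1, and the particle number and sample count are arbitrary choices) samples momenta from the Maxwell-Boltzmann distribution and compares the measured mean and relative fluctuation of the kinetic energy with eqs.(5.1.5) and (5.1.7).

```python
import math
import random

random.seed(1)
kB = T = M = 1.0            # illustrative units: kB = T = M = 1, so beta = 1
beta = 1.0 / (kB * T)
N = 30                      # particles per ensemble member (arbitrary)
samples = 20000             # number of ensemble members (arbitrary)

energies = []
for _ in range(samples):
    # each momentum component is Gaussian with variance M/beta (Maxwell-Boltzmann)
    H = sum(random.gauss(0.0, math.sqrt(M / beta)) ** 2
            for _ in range(3 * N)) / (2.0 * M)
    energies.append(H)

mean_H = sum(energies) / samples
var_H = sum((H - mean_H) ** 2 for H in energies) / samples

# compare with <H> = (3/2) N kB T and Delta H / <H> = sqrt(2/(3N))
print(mean_H / (1.5 * N * kB * T))
print(math.sqrt(var_H) / mean_H, math.sqrt(2.0 / (3.0 * N)))
```

Both printed ratios should come out close to the predicted values, with the agreement improving as the sample count grows.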
5.2 Ideal Gas in a Gravitational Field
Let us consider the earth’s atmosphere in the gravitational potential M gz of
the earth. The real atmosphere is a complicated dynamical system that is very
nontrivial to understand. In particular, there is weather which means that the
atmosphere is not in global thermal equilibrium. For example, the temperature is
not constant everywhere in the atmosphere but usually decreases with the height.
Here we idealize the atmosphere as an ideal gas in thermal equilibrium, i.e. with
a ﬁxed constant temperature independent of the height. The Hamilton function
of the ideal gas in the gravitational ﬁeld of the earth is given by
\[
H[x, p] = \sum_{a=1}^{N} \frac{p_a^2}{2M} + \sum_{a=1}^{N} M g z_a. \tag{5.2.1}
\]
a=1
The distribution function of the canonical ensemble now takes the form
\[
\rho[x, p] = \frac{1}{Z(\beta)} \exp(-\beta H[x, p]) = \frac{1}{Z(\beta)} \prod_{a=1}^{N} \exp\left(-\beta \frac{p_a^2}{2M} - \beta M g z_a\right), \tag{5.2.2}
\]
and the canonical partition function is given by
\[
Z(\beta) = \int \mathcal{D}x \, \mathcal{D}p \, \exp(-\beta H[x, p]) = \prod_{a=1}^{N} \frac{1}{\Delta} \int d^3x_a \, d^3p_a \, \exp\left(-\beta \frac{p_a^2}{2M} - \beta M g z_a\right) = z(\beta)^N. \tag{5.2.3}
\]
Again the gas particles are independent and the single-particle partition function is now given by
\[
z(\beta) = \frac{1}{\Delta} \int d^3x \, d^3p \, \exp\left(-\beta \frac{p^2}{2M} - \beta M g z\right) = \frac{A}{\Delta} \frac{1}{\beta M g} \left(\frac{2\pi M}{\beta}\right)^{3/2}. \tag{5.2.4}
\]
Here A is the area over which the atmosphere is considered.
Let us again determine the average energy of the gas particles
\[
\overline{H} = -\frac{\partial \log Z(\beta)}{\partial \beta} = -N \frac{\partial \log z(\beta)}{\partial \beta} = N \frac{3}{2\beta} + \frac{N}{\beta} = \frac{5}{2} N k_B T. \tag{5.2.5}
\]
Similarly, the variance of the energy is given by
\[
(\Delta H)^2 = \frac{\partial^2 \log Z(\beta)}{\partial \beta^2} = N \frac{\partial^2 \log z(\beta)}{\partial \beta^2} = N \frac{5}{2\beta^2}, \tag{5.2.6}
\]
and hence
\[
\frac{\Delta H}{\overline{H}} = \sqrt{\frac{2}{5N}}. \tag{5.2.7}
\]
What is the density of the atmosphere as a function of the height z? To answer this question we simply identify the probability density ρ(z) for a gas particle to reach the height z, which is given by
\[
\rho(z) = \frac{1}{z(\beta)\Delta} \int dx \, dy \, d^3p \, \exp\left(-\beta \frac{p^2}{2M} - \beta M g z\right) = \beta M g \exp(-\beta M g z) = \rho(0) \exp(-\beta M g z). \tag{5.2.8}
\]
This expression — known as the barometric height formula — describes the
density decrease in the atmosphere.
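As a rough numerical illustration of the barometric height formula (the molecular mass, temperature, and g below are assumed values for a nitrogen-like isothermal atmosphere, not taken from the text), the scale height 1/(βMg) over which the density drops by a factor 1/e comes out near 8-9 km:

```python
import math

kB = 1.380649e-23        # J/K, Boltzmann constant
M = 4.65e-26             # kg, approximate mass of an N2 molecule (assumption)
g = 9.81                 # m/s^2
T = 290.0                # K, assumed constant with height (the idealization above)
beta = 1.0 / (kB * T)

scale_height = 1.0 / (beta * M * g)               # height where rho falls by 1/e
rho_ratio_8km = math.exp(-beta * M * g * 8000.0)  # density ratio at ~8 km

print(scale_height)      # roughly 8-9 km
print(rho_ratio_8km)     # fraction of sea-level density near Mount Everest's summit
```

The real atmosphere is not isothermal, so this is only an order-of-magnitude sketch of the idealization used above.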
5.3 Distinguishability of Classical Particles
Actual gases consist of atoms and molecules whose dynamics are governed by
quantum mechanics. When we work classically, some things are radically different from quantum mechanics. For example, a classical physicist (like Boltzmann) may have thought of atoms as tiny classical billiard balls. Classical objects are distinguishable. For example, we can imagine labeling each billiard ball with a number. Even if we don't enumerate the particles, we can distinguish them by
their initial positions (“This is the particle that started with initial momentum
p1 from the initial position x1 ”). If we have two particles a and b with positions
and momenta (x1 , p1 ) and (x2 , p2 ) they can be in two possible conﬁgurations.
In the ﬁrst conﬁguration the particle with label a has position and momentum
(x1 , p1 ) and the particle with label b has position and momentum (x2 , p2 ), while
in the second conﬁguration the particle with label b has position and momentum
(x1 , p1 ) and the particle with label a has position and momentum (x2 , p2 ). The
two conﬁgurations are diﬀerent because the two particles are distinguishable (by
their labels a and b). Similarly, N distinguishable particles with positions and
momenta (x1 , p1 ), (x2 , p2 ), ..., (xN , pN ) can exist in N ! diﬀerent conﬁgurations.
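This counting can be made concrete with a small enumeration (a toy sketch; the particle labels and the (position, momentum) slots below are hypothetical):

```python
import math
from itertools import permutations

# N distinguishable particles assigned to N fixed (position, momentum) slots:
# each assignment is a distinct classical configuration, N! in total.
particles = ['a', 'b', 'c', 'd']                       # labeled billiard balls
slots = [('x1', 'p1'), ('x2', 'p2'), ('x3', 'p3'), ('x4', 'p4')]

configs = set()
for order in permutations(particles):
    configs.add(tuple(zip(order, slots)))  # which particle occupies which slot

print(len(configs), math.factorial(len(particles)))  # 24 24
```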
The previous discussion is completely correct for a gas of classical objects,
e.g. the “gas” of balls on a billiard table or in a lottery machine. However, actual
atoms or molecules are not classical billiard balls. Instead, they follow the rules
of quantum mechanics. Most importantly, identical atoms or molecules are completely
indistinguishable. We cannot paint a label a or b on an atom (the paint itself
would consist of other atoms) or say something like “this is the atom with the
blue hair”. An elementary object like an atom has only a certain number of
distinguishing features. It has a momentum, an excitation energy, an angular
momentum, and that’s it. In particular, besides those basic physical properties,
it has no label or other identifying features that could distinguish it from other
identical atoms.
Let us imagine that we remove the labels from a set of classical billiard balls
and paint them all with the same color. Let us further assume that they are all
perfectly round, have the same radius, and have exactly the same content. Then
it would also become diﬃcult to distinguish them. However, we could still keep
track of the particles by following the individual balls along their trajectories and
say something like “this is the same ball that started originally with momentum
p1 at position x1 ”. Interestingly, this again does not work for actual atoms which
are quantum objects. In contrast to classical physics, due to the Heisenberg uncertainty principle, in quantum mechanics one cannot simultaneously measure
both momentum and position with inﬁnite precision. A quantum particle does
not even have simultaneously a welldeﬁned position and momentum. This has
far-reaching physical consequences. In particular, this means that the concept of
a classical particle trajectory no longer makes sense quantum mechanically (after
all, the classical trajectory simultaneously speciﬁes both position and momentum
of a particle). If the concept of a trajectory no longer makes sense, we cannot
keep track of individual atoms and distinguish them in that way. It is an inescapable consequence of quantum mechanics that identical particles cannot be
distinguished. Hence, the combinatorial factor N ! of a classical “gas” of billiard
balls is absent for a system of identical atoms. Of course, diﬀerent atoms (like H
and O) can still be distinguished.
5.4 The Entropy of the Classical Ideal Gas
Let us consider the entropy for the classical ideal gas. First of all, the deﬁning
formula for the entropy
\[
S = -k_B \sum_{[n]} \rho[n] \log \rho[n] \tag{5.4.1}
\]
is not readily applicable in this case because the configuration space is not discrete but continuous. In this case, it is natural to define the entropy as
\[
S = -k_B \int \mathcal{D}x \, \mathcal{D}p \, \rho[x, p] \log \rho[x, p]. \tag{5.4.2}
\]
In order to see that these two expressions for the entropy are consistent, let us divide the 6N-dimensional phase space into elementary hypercubic cells c_n of arbitrarily chosen volume \Delta^N. The cells c_n are enumerated by a discrete label n. We can then average the probability distribution over an elementary cell and introduce
\[
\rho[n] = \int_{c_n} \mathcal{D}x \, \mathcal{D}p \, \rho[x, p], \tag{5.4.3}
\]
which is properly normalized by
\[
\sum_{[n]} \rho[n] = \sum_{[n]} \int_{c_n} \mathcal{D}x \, \mathcal{D}p \, \rho[x, p] = \int \mathcal{D}x \, \mathcal{D}p \, \rho[x, p] = 1. \tag{5.4.4}
\]
In the limit of small cells (i.e. \Delta \to 0) we have
\[
\rho[n] = \rho[x, p]. \tag{5.4.5}
\]
Now the original definition of the entropy is applicable and one obtains
\[
S = -k_B \sum_{[n]} \rho[n] \log \rho[n] = -k_B \int \mathcal{D}x \, \mathcal{D}p \, \rho[x, p] \log \rho[x, p]. \tag{5.4.6}
\]
One should ultimately take the limit ∆ → 0. In this limit the entropy diverges.
This may not be too surprising, since it indeed requires an inﬁnite number of binary questions to ﬁgure out the location of a particle in a continuous conﬁguration
space.
The divergence of the entropy of the classical ideal gas is an “ultraviolet
catastrophe”, similar to the Jeans catastrophe in classical black body radiation.
In that case, a black body would radiate an inﬁnite amount of energy — obviously
a nonsensical result that puzzled physicists more than a hundred years ago. The
puzzle was finally solved by Planck who introduced the quantum h. The fact
that h ≠ 0 cuts off the radiation in the ultraviolet and thus prevents the Jeans
catastrophe. As we will see later, quantum mechanics also prevents the ultraviolet
catastrophe of the divergent entropy of a classical gas. Indeed, we will be able
to relate the volume of an elementary cell of phase space \Delta = h^3 with Planck's
quantum h.
In order to proceed further without switching to a quantum mechanical treatment, we will keep the phase space volume ∆ small but nonzero. For the classical
ideal gas we then obtain
\[
S = -k_B \int \mathcal{D}x \, \mathcal{D}p \, \rho[x, p] \log \rho[x, p]
= k_B \frac{1}{Z(\beta)} \int \mathcal{D}x \, \mathcal{D}p \, \exp(-\beta H[x, p]) \left\{\log Z(\beta) + \beta H[x, p]\right\}
= \frac{1}{T}(E - F). \tag{5.4.7}
\]
We have identified the free energy from
\[
Z(\beta) = \exp(-\beta F). \tag{5.4.8}
\]
Using the previous result for Z(\beta) of eqs.(5.1.3, 5.1.4) we obtain
\[
F = -k_B T \log Z(\beta) = -N k_B T \log z(\beta) = -N k_B T \log\left[\frac{V}{\Delta}\left(\frac{2\pi M}{\beta}\right)^{3/2}\right]. \tag{5.4.9}
\]
Also using E = \overline{H} = \frac{3}{2} N k_B T we finally get
\[
S = \frac{3}{2} N k_B + N k_B \log\left[\frac{V}{\Delta}\left(\frac{2\pi M}{\beta}\right)^{3/2}\right]. \tag{5.4.10}
\]
Again, we see that the entropy diverges in the limit ∆ → 0.
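A small numeric sketch makes the divergence visible (illustrative units with k_B = M = β = 1; N and V are arbitrary choices): each reduction of Δ by a factor 10³ adds N k_B log 10³ to the entropy, so S grows without bound as Δ → 0.

```python
import math

kB = M = beta = 1.0      # illustrative units
N, V = 100.0, 1.0        # arbitrary particle number and volume

def S_classical(Delta):
    """Eq. (5.4.10): entropy of the distinguishable classical ideal gas."""
    return 1.5 * N * kB + N * kB * math.log(
        (V / Delta) * (2 * math.pi * M / beta) ** 1.5)

for Delta in (1.0, 1e-3, 1e-6, 1e-9):
    print(Delta, S_classical(Delta))   # grows without bound as Delta -> 0
```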
5.5 Gibbs' Paradox
The previous result for the entropy of the classical ideal gas is related to the
so-called Gibbs paradox. The paradox is related to the fact that in the above
expression the entropy is not an extensive quantity. Extensive quantities are
proportional to the system size. For example, if we increase both the number
of particles N and the volume V by the same factor λ, an extensive quantity
should also increase by the same factor λ. Hence, if the entropy were an extensive
quantity, it should obey S (λN, λV ) = λS (N, V ). Since the volume V is contained
in the logarithmic term of eq.(5.4.10), in this case the entropy is indeed not
extensive.
Why is this paradoxical? Let us consider a container of volume V which is
divided into two parts of volume V /2 by a removable wall. Both parts are ﬁlled
with N/2 particles of an ideal gas at the same temperature T on both sides. Let
us now remove the wall. From a macroscopic point of view not much happens. In
particular, the gas remains in thermal equilibrium. When we slide the wall back
in, again not much happens and we essentially return to the initial situation.1
We say that removing the wall is a reversible (or adiabatic) process. In such
a process the system stays in thermal equilibrium. According to the rules of
thermodynamics, entropy is conserved in reversible (adiabatic) processes. On
the other hand, after removing the wall, according to eq.(5.4.10) the entropy
increases by
\[
S(N, V) - 2 S(N/2, V/2) = N k_B \log\left[\frac{V}{\Delta}\left(\frac{2\pi M}{\beta}\right)^{3/2}\right] - N k_B \log\left[\frac{V}{2\Delta}\left(\frac{2\pi M}{\beta}\right)^{3/2}\right] = N k_B \log 2. \tag{5.5.1}
\]
1 Of course, after sliding the wall back in, we cannot expect that exactly N/2 particles will be in each half of the container. However, for sufficiently large N this will still be the case to very high accuracy.
Indeed this seems to make sense, because after removing the wall our information
deﬁcit has increased. Compared to the situation with the wall, we now need to
ask one binary question for each particle in order to ﬁgure out in which half of
the container it presently is. However, this obviously contradicts the reversible
nature of the above process performed with an actual gas.
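The entropy increase of eq.(5.5.1) is easy to verify numerically (illustrative units with k_B = M = β = 1; the cell size Δ drops out of the difference):

```python
import math

kB = M = beta = 1.0
Delta = 1.0                 # arbitrary cell size; it cancels in the difference
N, V = 1000.0, 2.0          # arbitrary illustrative values

def S(N, V):
    """Eq. (5.4.10): entropy of N distinguishable ideal-gas particles in volume V."""
    return 1.5 * N * kB + N * kB * math.log(
        (V / Delta) * (2 * math.pi * M / beta) ** 1.5)

increase = S(N, V) - 2 * S(N / 2, V / 2)
print(increase, N * kB * math.log(2))   # the two values agree, as in eq. (5.5.1)
```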
What is wrong with our model of the ideal gas? And thus, how can we resolve
Gibbs’ paradox? As we have pointed out before, actual atoms and molecules are
indistinguishable objects, while we treated the constituents of the classical ideal
gas as distinguishable. What would change if we treated these constituents as
indistinguishable? Since there are N ! ways of enumerating N objects, we would
simply have to divide the partition function by the factor N !. The corresponding
partition function then takes the form
\[
Z'(\beta) = \frac{1}{N!} Z(\beta) = \frac{1}{N!} z(\beta)^N. \tag{5.5.2}
\]
When we determine the average energy E ′ = H (or other thermal expectation
values) the additional factor N ! drops out since it is absorbed into the normalization of the total probability and hence E ′ = E . However, it aﬀects the free
energy F ′ and thus the entropy S ′ . In particular, we now obtain
\[
Z'(\beta) = \frac{1}{N!} Z(\beta) = \frac{1}{N!} \exp(-\beta F) = \exp(-\beta F') \;\Rightarrow\;
\beta F' = \beta F + \log N! \;\Rightarrow\; \frac{F'}{T} = \frac{F}{T} + k_B (N \log N - N), \tag{5.5.3}
\]
where we have used Stirling's formula. Hence, one finds
\[
S' = \frac{1}{T}(E' - F') = S - k_B (N \log N - N). \tag{5.5.4}
\]
Inserting the result of eq.(5.4.10) we now obtain
\[
S' = \frac{5}{2} N k_B + N k_B \log\left[\frac{V}{N\Delta}\left(\frac{2\pi M}{\beta}\right)^{3/2}\right]. \tag{5.5.5}
\]
Interestingly, this expression for the entropy is indeed extensive, i.e.
\[
S'(\lambda N, \lambda V) = \lambda S'(N, V). \tag{5.5.6}
\]
Consequently, taking into account the fundamental indistinguishability of the
true elementary constituents of the gas — namely atoms or molecules — resolves
Gibbs’ paradox.
5.6 Mixing Entropy
In order to further clarify these issues, let us also discuss a physically diﬀerent
situation. We consider again a container of total volume V divided into two
parts of volume V /2. However, this time the gas particles in subvolume 1 are
distinguishable from those in subvolume 2. For example, we can imagine that
the particles in subvolume 1 have mass M1 , while those in subvolume 2 have
mass M2 and are thus distinguishable from the particles in subvolume 1. Still,
we want to assume that the particles in each subvolume are indistinguishable
among themselves. After removing the wall, the two gases mix and both types
of particles can now occupy the entire volume. If we reintroduce the wall after a
while, the situation will be very diﬀerent from before. In particular, gas particles
of both types will occupy both subvolumes. Hence, the process of mixing the
two gases is irreversible. Indeed, the corresponding entropy increase is now given
by
\[
S_1'(N/2, V) + S_2'(N/2, V) - S_1'(N/2, V/2) - S_2'(N/2, V/2) =
\frac{N}{2} k_B \log\left[\frac{2V}{N\Delta}\left(\frac{2\pi M_1}{\beta}\right)^{3/2}\right]
- \frac{N}{2} k_B \log\left[\frac{V}{N\Delta}\left(\frac{2\pi M_1}{\beta}\right)^{3/2}\right]
+ \frac{N}{2} k_B \log\left[\frac{2V}{N\Delta}\left(\frac{2\pi M_2}{\beta}\right)^{3/2}\right]
- \frac{N}{2} k_B \log\left[\frac{V}{N\Delta}\left(\frac{2\pi M_2}{\beta}\right)^{3/2}\right]
= N k_B \log 2. \tag{5.6.1}
\]
(5.6.1)
We see that the distinguishability of the two particle types leads to a nonzero
mixing entropy.
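Again this is easy to check numerically (illustrative masses M1, M2 and units with k_B = β = Δ = 1); the species masses drop out of the final result:

```python
import math

kB = beta = Delta = 1.0
M1, M2 = 1.0, 2.0           # two distinguishable particle species (arbitrary)
N, V = 1000.0, 2.0          # arbitrary illustrative values

def S_indist(N, V, M):
    """Eq. (5.5.5) for indistinguishable particles of mass M."""
    return 2.5 * N * kB + N * kB * math.log(
        (V / (N * Delta)) * (2 * math.pi * M / beta) ** 1.5)

mixing = (S_indist(N / 2, V, M1) + S_indist(N / 2, V, M2)
          - S_indist(N / 2, V / 2, M1) - S_indist(N / 2, V / 2, M2))
print(mixing, N * kB * math.log(2))   # the mixing entropy of eq. (5.6.1)
```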
Chapter 6
Grand Canonical Ensemble
Until now we have considered the microcanonical and canonical ensembles in
which the particle number of the system was ﬁxed. In this chapter we introduce
a new ensemble in which the particle number is not ﬁxed but is controlled by
a new parameter — the chemical potential. The corresponding ensemble is the
so-called grand canonical ensemble.
6.1 Introduction of the Grand Canonical Ensemble
Let us imagine an empty container in which we punch a small hole. When the
container is surrounded by air, some air molecules will ﬁnd the small hole and will
enter the container. Soon, the container will be ﬁlled with air and an equilibrium
will be reached. Still, some air molecules will enter the container but others will
leave. On the average, the number of entering and exiting air molecules per unit
of time is the same, but still the number of air molecules inside the container
is not always the same. The atmosphere surrounding the container provides a
particle reservoir, just like a heat bath provides an energy reservoir. Just like the
temperature controls the average energy, a new parameter controls the average
particle number. This parameter is the chemical potential. Temperature and
chemical potential are so-called thermodynamical potentials. While energy and
particle number are extensive quantities (they increase with the system size),
the thermodynamical potentials are intensive (and thus are independent of the
system size).
Let us assume that a system can exist in a countable number of conﬁgurations
[n]. The total energy of the system is given by H[n] and the total number
of particles is given by N [n]. In particular, not all conﬁgurations contain the
same number of particles. The grand canonical ensemble is then deﬁned by the
probability distribution
\[
\rho[n] = \frac{1}{Z(\beta, \mu)} \exp(-\beta [H[n] - \mu N[n]]). \tag{6.1.1}
\]
Here µ is the chemical potential, an energy that we hand to each particle that is
willing to leave the reservoir and enter the system. The grand canonical partition
function
\[
Z(\beta, \mu) = \sum_{[n]} \exp(-\beta [H[n] - \mu N[n]]) \tag{6.1.2}
\]
guarantees that the probability distribution is properly normalized to
\[
\sum_{[n]} \rho[n] = 1. \tag{6.1.3}
\]
The thermal average of some physical quantity O is given by
\[
\overline{O} = \frac{1}{Z(\beta, \mu)} \sum_{[n]} O[n] \exp(-\beta [H[n] - \mu N[n]]). \tag{6.1.4}
\]
As in the canonical ensemble, the partition function can be used as a generating
functional for the thermal average and the variance of the total energy and the
particle number. At this point it is useful to treat β and βµ as independent
quantities. Then one can write
\[
\overline{N} = \frac{\partial \log Z(\beta, \mu)}{\partial (\beta\mu)} = \frac{1}{Z(\beta, \mu)} \sum_{[n]} N[n] \exp(-\beta [H[n] - \mu N[n]]),
\]
\[
(\Delta N)^2 = \overline{N^2} - \overline{N}^2 = \frac{\partial^2 \log Z(\beta, \mu)}{\partial (\beta\mu)^2},
\]
\[
\overline{H} = -\frac{\partial \log Z(\beta, \mu)}{\partial \beta} = \frac{1}{Z(\beta, \mu)} \sum_{[n]} H[n] \exp(-\beta [H[n] - \mu N[n]]),
\]
\[
(\Delta H)^2 = \overline{H^2} - \overline{H}^2 = \frac{\partial^2 \log Z(\beta, \mu)}{\partial \beta^2}. \tag{6.1.5}
\]
In the grand canonical ensemble the entropy is given by
\[
S = -k_B \sum_{[n]} \rho[n] \log \rho[n]
= -\frac{k_B}{Z(\beta, \mu)} \sum_{[n]} \exp(-\beta [H[n] - \mu N[n]]) \left[-\beta [H[n] - \mu N[n]] - \log Z(\beta, \mu)\right]
= \frac{1}{T}\left[E - \mu \overline{N} + k_B T \log Z(\beta, \mu)\right]. \tag{6.1.6}
\]
The analog of the free energy F in the canonical ensemble is the so-called grand canonical potential J, which is defined by
\[
Z(\beta, \mu) = \exp(-\beta J), \tag{6.1.7}
\]
such that
\[
J = E - \mu \overline{N} - T S. \tag{6.1.8}
\]
6.2 Grand Canonical Ensemble of Particles on a Ladder
For illustrative purposes, let us again consider the system of particles on the
energy ladder, now coupled to a particle reservoir. We can view this as a simple
model for the atmosphere. As before we will consider distinguishable particles.
The grand canonical partition function takes the form
\[
Z(\beta, \mu) = \sum_{[n]} \exp(-\beta [H[n] - \mu N[n]]) = \sum_{N=0}^{\infty} Z(\beta) \exp(\beta\mu N) = \sum_{N=0}^{\infty} [z(\beta) \exp(\beta\mu)]^N
= \frac{1}{1 - z(\beta) \exp(\beta\mu)} = \frac{1 - \exp(-\beta\epsilon)}{1 - \exp(-\beta\epsilon) - \exp(\beta\mu)}. \tag{6.2.1}
\]
Here z (β ) is again the single particle partition function of the canonical ensemble.
It should be noted that we have implicitly assumed that exp(βµ) < 1 − exp(−βǫ).
Otherwise the geometric series diverges and the partition function is inﬁnite. We
now obtain
\[
\overline{N} = \frac{\partial \log Z(\beta, \mu)}{\partial (\beta\mu)} = \frac{\exp(\beta\mu)}{1 - \exp(-\beta\epsilon) - \exp(\beta\mu)}, \tag{6.2.2}
\]
(6.2.2)
which implies
\[
\exp(\beta\mu) = \frac{1 - \exp(-\beta\epsilon)}{1 + 1/\overline{N}}. \tag{6.2.3}
\]
Similarly,
\[
E = \overline{H} = -\frac{\partial \log Z(\beta, \mu)}{\partial \beta} = \frac{\exp(\beta\mu)}{1 - \exp(-\beta\epsilon) - \exp(\beta\mu)} \, \frac{\epsilon \exp(-\beta\epsilon)}{1 - \exp(-\beta\epsilon)}, \tag{6.2.4}
\]
and therefore one finds
\[
E = \overline{N} \frac{\epsilon}{\exp(\beta\epsilon) - 1}, \tag{6.2.5}
\]
which is exactly the same result as in the canonical ensemble. Hence, we see that, just like the canonical and microcanonical ensembles, the canonical and grand canonical ensembles are physically equivalent.
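This equivalence can be checked by summing the geometric series of eq.(6.2.1) directly (the values of β, ε, and µ below are arbitrary illustrations, chosen so that the series converges):

```python
import math

beta, eps, mu = 1.0, 1.0, -1.5   # must satisfy exp(beta*mu) < 1 - exp(-beta*eps)
z = 1.0 / (1.0 - math.exp(-beta * eps))   # single-particle ladder partition function
w = z * math.exp(beta * mu)               # ratio of the geometric series
assert w < 1.0                            # convergence condition from the text

# truncated grand canonical sums over the particle number N
Z = sum(w ** N for N in range(2000))
Nbar = sum(N * w ** N for N in range(2000)) / Z

# closed forms, eqs. (6.2.1) and (6.2.2)
Z_closed = (1 - math.exp(-beta * eps)) / (1 - math.exp(-beta * eps) - math.exp(beta * mu))
Nbar_closed = math.exp(beta * mu) / (1 - math.exp(-beta * eps) - math.exp(beta * mu))

print(Z, Z_closed)
print(Nbar, Nbar_closed)
```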
Finally, let us also calculate the grand canonical potential
\[
J = -\frac{1}{\beta} \log Z(\beta, \mu) = k_B T \log \frac{1 - \exp(-\beta\epsilon) - \exp(\beta\mu)}{1 - \exp(-\beta\epsilon)}, \tag{6.2.6}
\]
which then yields the entropy
\[
S = \frac{1}{T}(E - \mu \overline{N} - J)
= k_B \frac{\exp(\beta\mu)}{1 - \exp(-\beta\epsilon) - \exp(\beta\mu)} \left[\frac{\beta\epsilon}{\exp(\beta\epsilon) - 1} - \beta\mu\right]
- k_B \log \frac{1 - \exp(-\beta\epsilon) - \exp(\beta\mu)}{1 - \exp(-\beta\epsilon)}. \tag{6.2.7}
\]

6.3 Chemical Potential of Particles on a Ladder
Let us also derive the chemical potential from the canonical ensemble. For this
purpose we consider the first law of thermodynamics, which takes the form
\[
dE = T \, dS + \mu \, dN. \tag{6.3.1}
\]
From this we can read off the relations
\[
\mu = \left.\frac{\partial E}{\partial N}\right|_S, \tag{6.3.2}
\]
as well as
\[
\mu = -T \left.\frac{\partial S}{\partial N}\right|_E. \tag{6.3.3}
\]
Let us consider the entropy of N particles on the ladder
\[
S = k_B N \left[\frac{\beta\epsilon}{\exp(\beta\epsilon) - 1} - \log(1 - \exp(-\beta\epsilon))\right]. \tag{6.3.4}
\]
The average energy is given by
\[
E = N \frac{\epsilon}{\exp(\beta\epsilon) - 1} \;\Rightarrow\; \exp(\beta\epsilon) = 1 + \frac{N\epsilon}{E}, \tag{6.3.5}
\]
and one then obtains
\[
S = k_B \left(N + \frac{E}{\epsilon}\right) \log\left(1 + \frac{N\epsilon}{E}\right) - k_B N \log \frac{N\epsilon}{E}. \tag{6.3.6}
\]
It is then straightforward to show that
\[
\mu = -T \left.\frac{\partial S}{\partial N}\right|_E = k_B T \log(1 - \exp(-\beta\epsilon)). \tag{6.3.7}
\]
Indeed, in the thermodynamical limit N → ∞ this agrees with the result of eq.(6.2.3). Hence, we see that the canonical and grand canonical ensembles yield the same physical results for a large number of particles.
We can even go back to the microcanonical ensemble. Then the entropy is given in terms of the information deficit I as
\[
S = k_B I \log 2 = k_B \log Z(E) = k_B \log Z(M, N) = k_B \log \frac{(M + N - 1)!}{M!(N - 1)!}. \tag{6.3.8}
\]
Using the Stirling formula in the thermodynamical limit M, N → ∞ implies
\[
S = k_B [(M + N) \log(M + N) - M \log M - N \log N]. \tag{6.3.9}
\]
Keeping M (and thus the energy E = M\epsilon) fixed, the chemical potential is then given by
\[
\mu = -T \left.\frac{\partial S}{\partial N}\right|_M = -k_B T \log\left(1 + \frac{M}{N}\right). \tag{6.3.10}
\]
Keeping in mind the earlier result
\[
1 + \frac{M}{N} = \frac{1}{1 - \exp(-\beta\epsilon)}, \tag{6.3.11}
\]
this again implies
\[
\exp(\beta\mu) = 1 - \exp(-\beta\epsilon). \tag{6.3.12}
\]
Thus, as one would expect, the grand canonical and the microcanonical ensemble
are also equivalent in the thermodynamical limit.
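The quality of the Stirling approximation used in eq.(6.3.9) improves with system size, as a quick check shows (the sizes below are arbitrary; `math.lgamma` supplies the exact log-factorials, since (M+N-1)! = Γ(M+N)):

```python
import math

kB = 1.0

def S_exact(M_quanta, N):
    """Eq. (6.3.8): microcanonical entropy from the exact state count."""
    return kB * (math.lgamma(M_quanta + N) - math.lgamma(M_quanta + 1)
                 - math.lgamma(N))

def S_stirling(M_quanta, N):
    """Eq. (6.3.9): the Stirling approximation of the same entropy."""
    return kB * ((M_quanta + N) * math.log(M_quanta + N)
                 - M_quanta * math.log(M_quanta) - N * math.log(N))

for size in (10, 100, 10000):
    M_quanta, N = 3 * size, size          # fixed energy per particle: M/N = 3
    rel = abs(S_stirling(M_quanta, N) - S_exact(M_quanta, N)) / S_exact(M_quanta, N)
    print(size, rel)                      # relative error shrinks with system size
```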
We have produced quite a number of results for the system of particles on
the energy ladder for the microcanonical, the canonical, and the grand canonical
ensemble. Since these results are scattered through various chapters, here we
summarize the main results in table 6.1.
Microcanonical ensemble:
  particle number:           N
  energy:                    E = M\epsilon
  inverse temperature:       \beta = (1/\epsilon) \log(1 + N/M)
  chemical potential:        \beta\mu = -\log(1 + M/N)
  thermodynamical potential: information deficit I = \log_2[(M+N-1)!/(M!(N-1)!)]
  entropy:                   S = k_B \log[(M+N-1)!/(M!(N-1)!)]

Canonical ensemble:
  particle number:           N
  energy:                    \overline{H} = N\epsilon/(e^{\beta\epsilon} - 1)
  inverse temperature:       \beta = 1/(k_B T)
  chemical potential:        \beta\mu = \log(1 - e^{-\beta\epsilon})
  thermodynamical potential: free energy F = (N/\beta) \log(1 - e^{-\beta\epsilon})
  entropy:                   S = N k_B \{\beta\epsilon/(e^{\beta\epsilon} - 1) - \log(1 - e^{-\beta\epsilon})\}

Grand canonical ensemble:
  particle number:           \overline{N} = [e^{-\beta\mu} - e^{-\beta(\mu+\epsilon)} - 1]^{-1}
  energy:                    \overline{H} = \overline{N}\epsilon/(e^{\beta\epsilon} - 1)
  inverse temperature:       \beta = 1/(k_B T)
  chemical potential:        \beta\mu = \mu/(k_B T)
  thermodynamical potential: grand canonical potential J = (1/\beta) \log[(1 - e^{-\beta\epsilon} - e^{\beta\mu})/(1 - e^{-\beta\epsilon})]
  entropy:                   S = k_B [e^{\beta\mu}/(1 - e^{-\beta\epsilon} - e^{\beta\mu})](\beta\epsilon/(e^{\beta\epsilon} - 1) - \beta\mu) - k_B \log[(1 - e^{-\beta\epsilon} - e^{\beta\mu})/(1 - e^{-\beta\epsilon})]

Table 6.1: Summary of results for particles on the energy ladder.
6.4 Chemical Potential of the Classical Ideal Gas
Just as we calculated the pressure of the ideal gas before, we will now calculate
its chemical potential. As before, the starting point is the first law of thermodynamics, which now reads
\[
dE = T \, dS - p \, dV + \mu \, dN. \tag{6.4.1}
\]
From this we can read off the relations
\[
\mu = \left.\frac{\partial E}{\partial N}\right|_{S,V}, \tag{6.4.2}
\]
as well as
\[
\mu = -T \left.\frac{\partial S}{\partial N}\right|_{E,V}. \tag{6.4.3}
\]
Let us consider the entropy of the classical ideal gas consisting of indistinguishable
particles and express the inverse temperature as
\[
\beta = \frac{1}{k_B T} = \frac{3N}{2E}. \tag{6.4.4}
\]
Then we obtain
\[
S = \frac{5}{2} N k_B + N k_B \log\left[\frac{V}{N\Delta}\left(\frac{4\pi M E}{3N}\right)^{3/2}\right], \tag{6.4.5}
\]
and hence we find
\[
\mu = -T \left.\frac{\partial S}{\partial N}\right|_{E,V}
= -\frac{5}{2} k_B T - k_B T \log\left[\frac{V}{N\Delta}\left(\frac{4\pi M E}{3N}\right)^{3/2}\right] + N k_B T \left(\frac{1}{N} + \frac{3}{2N}\right)
= -k_B T \log\left[\frac{V}{N\Delta}\left(\frac{2\pi M}{\beta}\right)^{3/2}\right]. \tag{6.4.6}
\]
Indeed this is an intensive quantity. It is independent of the system size because
it depends only on the density n = N/V and the temperature (which are both
also intensive).
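A short numerical sketch (illustrative units with k_B = M = β = Δ = 1 and arbitrary N, V) confirms that µ is intensive: scaling N and V together at fixed density n = N/V leaves it unchanged.

```python
import math

kB = beta = Delta = M = 1.0   # illustrative units

def mu(N, V):
    """Eq. (6.4.6): chemical potential of the classical ideal gas."""
    return -(kB / beta) * math.log(
        (V / (N * Delta)) * (2 * math.pi * M / beta) ** 1.5)

# doubling and quadrupling the system at fixed density n = N/V leaves mu unchanged
print(mu(100.0, 5.0), mu(200.0, 10.0), mu(400.0, 20.0))
```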
6.5 Grand Canonical Ensemble for the Ideal Gas
Let us again consider a classical ideal gas of indistinguishable particles. The
grand canonical partition function can be written as
\[
Z(\beta, \mu) = \sum_{N=0}^{\infty} Z(\beta) \exp(\beta\mu N) = \sum_{N=0}^{\infty} \frac{z(\beta)^N}{N!} \exp(\beta\mu N)
= \exp[z(\beta) \exp(\beta\mu)] = \exp\left[\frac{V}{\Delta}\left(\frac{2\pi M}{\beta}\right)^{3/2} \exp(\beta\mu)\right]. \tag{6.5.1}
\]
This implies
\[
\overline{N} = \frac{\partial \log Z(\beta, \mu)}{\partial (\beta\mu)} = \frac{V}{\Delta}\left(\frac{2\pi M}{\beta}\right)^{3/2} \exp(\beta\mu), \tag{6.5.2}
\]
which then yields
\[
\mu = k_B T \log\left[\frac{N\Delta}{V}\left(\frac{\beta}{2\pi M}\right)^{3/2}\right] \tag{6.5.3}
\]
in agreement with the result obtained in the canonical ensemble. Similarly, one obtains
\[
E = \overline{H} = -\frac{\partial \log Z(\beta, \mu)}{\partial \beta} = \exp(\beta\mu) \frac{V}{\Delta} \frac{3}{2} \left(\frac{2\pi M}{\beta}\right)^{1/2} \frac{2\pi M}{\beta^2} = \frac{3}{2} N k_B T, \tag{6.5.4}
\]
again in agreement with the canonical ensemble result.
The grand canonical potential is obtained as
\[
J = -\frac{1}{\beta} \log Z(\beta, \mu) = -k_B T \frac{V}{\Delta}\left(\frac{2\pi M}{\beta}\right)^{3/2} \exp(\beta\mu), \tag{6.5.5}
\]
and the entropy is hence given by
\[
S = \frac{1}{T}(E - \mu \overline{N} - J) = \frac{5}{2} N k_B + N k_B \log\left[\frac{V}{N\Delta}\left(\frac{2\pi M}{\beta}\right)^{3/2}\right]. \tag{6.5.6}
\]
Chapter 7
Pressure Ensemble
In this chapter we discuss the pressure ensemble in which the volume is not ﬁxed
but controlled by an intensive thermodynamical parameter — the pressure.
7.1 Introduction of the Pressure Ensemble
In the canonical ensemble the average energy is controlled by the temperature.
In the grand canonical ensemble, in addition, the particle number is controlled by
the chemical potential. In both cases, the physical system exchanges an extensive
quantity (the energy or the particle number) with a reservoir. This exchange is
controlled by an intensive thermodynamical potential (the temperature or the
chemical potential). Another extensive quantity is the volume of the system. In
the pressure ensemble, the system can exchange volume with a reservoir. This
exchange is controlled by another intensive thermodynamical potential — the
pressure. The pressure ensemble is, for example, realized by a gas in a container
that is closed by a movable piston. By moving the piston, the volume of the
system is traded against the volume of the rest of the world, controlled by the
ambient pressure.
The pressure ensemble is deﬁned by the probability distribution
\[
\rho[n] = \frac{1}{Z(\beta, p)} \exp(-\beta [H[n] + p V[n]]). \tag{7.1.1}
\]
Here p is the pressure and V [n] is the volume occupied by a given conﬁguration
[n]. The partition function of the pressure ensemble takes the form
\[
Z(\beta, p) = \sum_{[n]} \exp(-\beta [H[n] + p V[n]]). \tag{7.1.2}
\]
As usual, the thermal average of a physical quantity O is given by
\[
\overline{O} = \frac{1}{Z(\beta, p)} \sum_{[n]} O[n] \exp(-\beta [H[n] + p V[n]]). \tag{7.1.3}
\]
The partition function can be used as a generating functional for the thermal
average and the variance of the volume. Then it is useful to treat β and βp as
independent quantities, and one obtains
\[
\overline{V} = -\frac{\partial \log Z(\beta, p)}{\partial (\beta p)} = \frac{1}{Z(\beta, p)} \sum_{[n]} V[n] \exp(-\beta [H[n] + p V[n]]),
\]
\[
(\Delta V)^2 = \overline{V^2} - \overline{V}^2 = \frac{\partial^2 \log Z(\beta, p)}{\partial (\beta p)^2}. \tag{7.1.4}
\]
In the pressure ensemble the entropy is given by
\[
S = -k_B \sum_{[n]} \rho[n] \log \rho[n]
= -\frac{k_B}{Z(\beta, p)} \sum_{[n]} \exp(-\beta [H[n] + p V[n]]) \left[-\beta [H[n] + p V[n]] - \log Z(\beta, p)\right]
= \frac{1}{T}\left[E + p\overline{V} + k_B T \log Z(\beta, p)\right]. \tag{7.1.5}
\]
The analog of the free energy F in the pressure ensemble is the so-called free enthalpy G, which is defined by
\[
Z(\beta, p) = \exp(-\beta G), \tag{7.1.6}
\]
such that
\[
G = E + p\overline{V} - T S. \tag{7.1.7}
\]
7.2 The Pressure of the Classical Ideal Gas
We have seen earlier that the kinetic theory of the classical ideal gas yields the
ideal gas law pV = N kB T . Now we want to reproduce this result using the
canonical ensemble. In order to compress a gas at pressure p by a volume dV one must do the work −p dV. If the compression is adiabatic (i.e. reversible, without entropy change) the work is entirely converted into internal energy of the gas and
\[
dE = -p \, dV. \tag{7.2.1}
\]
This is a special case (with dS = 0) of the first law of thermodynamics
\[
dE = T \, dS - p \, dV, \tag{7.2.2}
\]
which expresses nothing but energy conservation. The pressure can now be determined as
\[
p = -\left.\frac{\partial E}{\partial V}\right|_{S,N}, \tag{7.2.3}
\]
or alternatively as
\[
p = T \left.\frac{\partial S}{\partial V}\right|_{E,N}. \tag{7.2.4}
\]
The subscripts S, N, and E specify which quantities are kept fixed during the compression. Let us use the previous equation to determine the pressure of the ideal gas. We then indeed obtain
\[
p = N k_B T \frac{\partial \log V}{\partial V} \;\Rightarrow\; p V = N k_B T. \tag{7.2.5}
\]

7.3 The Pressure Ensemble for the Classical Ideal Gas
Let us once more consider a classical ideal gas of indistinguishable particles, this time in the pressure ensemble. The corresponding partition function then takes the form
\[
Z(\beta, p) = \beta p \int_0^{\infty} dV \, Z(\beta) \exp(-\beta p V) = \beta p \int_0^{\infty} dV \, \frac{z(\beta)^N}{N!} \exp(-\beta p V)
\]
\[
= \beta p \int_0^{\infty} dV \, \frac{1}{N!} \left(\frac{V}{\Delta}\right)^N \left(\frac{2\pi M}{\beta}\right)^{3N/2} \exp(-\beta p V)
= \frac{1}{N!} \left(\frac{1}{\beta p \Delta}\right)^N \left(\frac{2\pi M}{\beta}\right)^{3N/2} \Gamma(N + 1). \tag{7.3.1}
\]
Here \Gamma(N + 1) = N! is the gamma function, such that
\[
Z(\beta, p) = \left[\frac{1}{\beta p \Delta}\left(\frac{2\pi M}{\beta}\right)^{3/2}\right]^N. \tag{7.3.2}
\]
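The integral behind eq.(7.3.1) can be spot-checked numerically (the values of βp and N below are arbitrary illustrations); a crude trapezoidal quadrature reproduces Γ(N+1)/(βp)^{N+1} = N!/(βp)^{N+1}:

```python
import math

# check: integral over V from 0 to infinity of V^N exp(-beta*p*V)
# equals N! / (beta*p)^(N+1)
beta_p = 0.5     # the combination beta*p (illustrative value)
N = 5            # illustrative particle number

# trapezoidal integration on a finite but sufficiently long interval
upper, steps = 80.0, 400000
h = upper / steps
integral = sum(((i * h) ** N) * math.exp(-beta_p * i * h)
               for i in range(1, steps)) * h
integral += 0.5 * (upper ** N) * math.exp(-beta_p * upper) * h  # endpoint (V=0 term vanishes)

exact = math.factorial(N) / beta_p ** (N + 1)
print(integral, exact)    # both close to 120 / 0.5^6
```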
This implies
\[
\overline{V} = -\frac{\partial \log Z(\beta, p)}{\partial (\beta p)} = \frac{N}{\beta p}, \tag{7.3.3}
\]
and thus leads again to the ideal gas law pV = N k_B T. Similarly, we obtain
\[
E = \overline{H} = -\frac{\partial \log Z(\beta, p)}{\partial \beta} = \frac{3}{2} N k_B T, \tag{7.3.4}
\]
again in agreement with the canonical ensemble result.
The free enthalpy is given by
\[
G = -\frac{1}{\beta} \log Z(\beta, p) = \frac{N}{\beta} \log\left[\beta p \Delta \left(\frac{\beta}{2\pi M}\right)^{3/2}\right], \tag{7.3.5}
\]
such that the entropy takes the form
\[
S = \frac{1}{T}(E + p\overline{V} - G) = \frac{5}{2} N k_B + N k_B \log\left[\frac{1}{\beta p \Delta}\left(\frac{2\pi M}{\beta}\right)^{3/2}\right]. \tag{7.3.6}
\]
Scattered through various chapters, we have derived a variety of relations for
the classical ideal gas for indistinguishable particles. Table 7.1 summarizes these
results.
7.4 Overview of Different Ensembles
If necessary, one can deﬁne further ensembles. Basically any other conserved
extensive quantity besides energy, particle number, or the volume of the available
space can be exchanged with a reservoir and thus be controlled by an appropriate
intensive parameter analogous to T , µ, or p. Along with any new ensemble come
new physical quantities analogous to the free energy F , like the grand canonical
potential J and the free enthalpy G. The subject of statistical mechanics may
sometimes seem rather formal because one may be overwhelmed by a zoo of
diﬀerent quantities whose physical meaning may not always be clear immediately.
However, as one continues to work with these concepts, one will appreciate their
physical relevance. Table 7.2 summarizes the ensembles and associated physical
quantities introduced so far.
Physical quantities for the classical ideal gas of indistinguishable particles:
  Energy:                            E = (3/2) N k_B T
  Inverse temperature:               \beta = 1/(k_B T) = 3N/(2E)
  Volume:                            \overline{V} = N k_B T / p
  Pressure:                          p = N k_B T / V
  Entropy (canonical):               S = N k_B \{5/2 + \log[(V/(N\Delta))(2\pi M/\beta)^{3/2}]\}
  Free energy:                       F = -N k_B T \{1 + \log[(V/(N\Delta))(2\pi M/\beta)^{3/2}]\}
  Chemical potential:                \mu = -k_B T \log[(V/(N\Delta))(2\pi M/\beta)^{3/2}]
  Particle number (grand canonical): \overline{N} = (V/\Delta)(2\pi M/\beta)^{3/2} \exp(\beta\mu)
  Entropy (grand canonical):         S = (V/\Delta)(2\pi M/\beta)^{3/2} \exp(\beta\mu) \, k_B (5/2 - \beta\mu)
  Grand canonical potential:         J = -(V/\Delta)(2\pi M/\beta)^{3/2} \exp(\beta\mu) \, k_B T
  Entropy (pressure):                S = (5/2) N k_B + N k_B \log[(1/(\beta p \Delta))(2\pi M/\beta)^{3/2}]
  Free enthalpy:                     G = -N k_B T \log[(1/(\beta p \Delta))(2\pi M/\beta)^{3/2}]

Table 7.1: Physical quantities for the classical ideal gas of indistinguishable particles.
Microcanonical ensemble: fixed parameters E, N, V; Boltzmann factor 1; partition function Z(E); thermal averages taken at fixed E; entropy S = k_B I \log 2 with information deficit I = \log_2 Z(E), which plays the role of the thermodynamical potential.

Canonical ensemble: fixed parameters \beta, N, V; Boltzmann factor \exp(-\beta H); partition function Z(\beta); thermal average \overline{H} = -\partial \log Z(\beta)/\partial\beta; entropy S = E/T + k_B \log Z(\beta); thermodynamical potential: free energy F = E - TS.

Grand canonical ensemble: fixed parameters \beta, \mu, V; Boltzmann factor \exp(-\beta[H - \mu N]); partition function Z(\beta, \mu); thermal average \overline{N} = \partial \log Z(\beta, \mu)/\partial(\beta\mu); entropy S = (1/T)(E - \mu\overline{N}) + k_B \log Z(\beta, \mu); thermodynamical potential: grand canonical potential J = E - \mu\overline{N} - TS.

Pressure ensemble: fixed parameters \beta, N, p; Boltzmann factor \exp(-\beta[H + pV]); partition function Z(\beta, p); thermal average \overline{V} = -\partial \log Z(\beta, p)/\partial(\beta p); entropy S = (1/T)(E + p\overline{V}) + k_B \log Z(\beta, p); thermodynamical potential: free enthalpy G = E + p\overline{V} - TS.

Table 7.2: Comparison of various ensembles.
Chapter 8
Equilibrium Thermodynamics
Thermodynamics is a phenomenological science developed in the 19th century
before the microscopic structure of matter was understood. With our present
understanding of statistical mechanics based on microscopic systems of atoms or
molecules we can attempt to derive thermodynamics from the underlying microscopic physics. Here we consider statistical systems in thermal equilibrium.
8.1 The First Law of Thermodynamics
The ﬁrst law of thermodynamics is just a consequence of energy conservation.
Already in the 19th century physicists knew that the internal energy of a gas can
essentially change in two ways. Either the gas does some work (the goal of the
steam engine designers of those days) or it is heated (a necessary prerequisite to
let the engine work). The energy balance of a gas can hence be written as
dE = δW + δQ. (8.1.1)
Here dE is an infinitesimal change of the internal energy and
δW = −p dV (8.1.2)
is the associated work term: when the gas does the work p dV on its surroundings, its internal energy changes by −p dV.
volume dV > 0 at some pressure p it exerts a force on a movable piston. For
example, if the piston is moved by a distance dx and has a cross sectional area
A we have dV = Adx. The pressure p = F/A is a force per area, and hence the
work (force times distance) is given by
F dx = (F/A) A dx = p dV. (8.1.3)
The gas is doing work pdV for us. The corresponding energy is lost to the gas
and shows up as −pdV in the internal energy balance of the gas itself. In order
to make the gas do work for us we must supply some energy by heating it. The
corresponding term δQ in the energy balance of the gas is the heat supplied from
outside.
8.2 Expansion of a Classical Ideal Gas
Let us ﬁrst consider the expansion of a gas without heat transfer (δQ = 0). This
is called adiabatic expansion. The ﬁrst law of thermodynamics then takes the
form
dE = −pdV.
(8.2.1)
For a classical ideal gas we have
E = (3/2) N kB T = (3/2) pV, (8.2.2)
such that
dE = (3/2)(p dV + V dp) = −p dV ⇒ (3/2) V dp = −(5/2) p dV ⇒ dp/p = −(5/3) dV/V. (8.2.3)
Integrating both sides of this equation we obtain
∫_{pi}^{pf} dp/p = −(5/3) ∫_{Vi}^{Vf} dV/V ⇒ log(pf/pi) = −(5/3) log(Vf/Vi) ⇒ pf/pi = (Vi/Vf)^(5/3). (8.2.4)
Here pi and Vi are the initial pressure and volume (before the expansion) and
pf and Vf are the ﬁnal pressure and volume (after the expansion). It should be
noted that if Vf > Vi then pf < pi , i.e. adiabatic expansion implies a pressure
decrease. Using
pi Vi = N kB Ti , pf Vf = N kB Tf ,
(8.2.5)
the initial and final temperatures Ti and Tf are related by
Tf/Ti = (Vi/Vf)^(2/3), (8.2.6)
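As a quick numerical sanity check, the two adiabatic relations can be verified together with the ideal gas law. The sketch below uses illustrative initial values, not numbers from the text:

```python
def adiabatic_final_state(p_i, V_i, T_i, V_f):
    """Adiabatic expansion of a classical ideal gas, eqs. (8.2.4) and (8.2.6)."""
    p_f = p_i * (V_i / V_f) ** (5.0 / 3.0)   # p V^(5/3) = const
    T_f = T_i * (V_i / V_f) ** (2.0 / 3.0)   # T V^(2/3) = const
    return p_f, T_f

p_i, V_i, T_i = 1.0e5, 1.0e-3, 300.0   # illustrative values in SI units
V_f = 2.0 * V_i
p_f, T_f = adiabatic_final_state(p_i, V_i, T_i, V_f)

# the ideal gas law p V = N kB T holds before and after, so p V / T is unchanged
assert abs(p_f * V_f / T_f - p_i * V_i / T_i) < 1e-9
# expansion (V_f > V_i) implies both a pressure decrease and cooling
assert p_f < p_i and T_f < T_i
```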
such that adiabatic expansion also implies cooling.
Now we want to consider expansion without cooling. This is possible only
if we transfer heat from outside. We want to supply enough energy to keep
the temperature constant. Such a process is known as isothermal expansion. In
order to maintain a constant temperature (according to E = (3/2) N kB T) the internal energy E must remain unchanged. In that case the first law of thermodynamics takes the form
dE = δW + δQ = 0 ⇒ δQ = p dV = N kB T dV/V. (8.2.7)
The total work done by the gas then equals the heat transferred to the gas and is given by
W = N kB T ∫_{Vi}^{Vf} dV/V = N kB T log(Vf/Vi). (8.2.8)
8.3 Heat and Entropy Change
Let us consider the pressure ensemble. Then we have
ρ[n] = (1/Z(β, p)) exp(−β[H[n] + pV[n]]), (8.3.1)
and
E = ⟨H⟩ = Σ_[n] H[n] ρ[n] ⇒ dE = Σ_[n] H[n] dρ[n],
V = ⟨V⟩ = Σ_[n] V[n] ρ[n] ⇒ dV = Σ_[n] V[n] dρ[n]. (8.3.2)
It should be noted that H[n] and V[n] are completely fixed for a given configuration [n] and hence dH[n] = dV[n] = 0. For the heat we thus obtain
δQ = dE + p dV = Σ_[n] (H[n] + pV[n]) dρ[n]. (8.3.3)
Heat is a disordered form of energy contained in the chaotic motion of the
atoms or molecules of a gas. When we heat a gas, the motion of its constituents
becomes more chaotic which should increase the entropy. Hence, we expect a
relation between heat and entropy change. Let us consider the entropy
S = −kB Σ_[n] ρ[n] log ρ[n]. (8.3.4)
In the pressure ensemble an infinitesimal entropy change is then given by
dS = −kB Σ_[n] (dρ[n] log ρ[n] + ρ[n] d log ρ[n])
   = −kB Σ_[n] dρ[n] (log ρ[n] + 1)
   = −kB Σ_[n] dρ[n] (−log Z(β, p) − β(H[n] + pV[n]))
   = (1/T) Σ_[n] dρ[n] (H[n] + pV[n]). (8.3.5)
Here we have used
Σ_[n] dρ[n] = 0, (8.3.6)
which results from the normalization condition Σ_[n] ρ[n] = 1. Hence, we can now identify
δQ = T dS, (8.3.7)
i.e. heat transfer indeed implies a change of the entropy. The ﬁrst law of thermodynamics can now be written as
dE = T dS − pdV.
(8.3.8)
A change of the thermodynamical state of a system is called adiabatic if δQ =
T dS = 0, i.e. for adiabatic processes the entropy is conserved.
For a generalized ensemble which allows both volume and particle exchange
the ﬁrst law of thermodynamics is given by
dE = T dS − pdV + µdN.
(8.3.9)
The last term represents a contribution to the internal energy due to the addition of particles with energy µ (the chemical potential) per particle. The above
equation immediately implies other relations of a similar form. In particular, the
changes of the free energy, the grand canonical potential, and the free enthalpy
take the form
dF = d(E − T S ) = −SdT − pdV + µdN,
dJ = d(E − T S − µN ) = −SdT − pdV − N dµ,
dG = d(E − T S + pV ) = −SdT + V dp + µdN.
(8.3.10)
8.4 Equations of State
The thermodynamical behavior of some substance can be characterized by its
equations of state. The so-called caloric equation of state takes the form
E = E (T, N, V ).
(8.4.1)
For a classical ideal gas the caloric equation of state takes the form
E = (3/2) N kB T. (8.4.2)
Similarly, the so-called thermal equation of state is given by
p = p(T, N, V ).
(8.4.3)
For the classical ideal gas the thermal equation of state takes the form
p = N kB T / V. (8.4.4)
Finally, the so-called chemical equation of state is given by
µ = µ(T, N, V ),
(8.4.5)
which for a classical ideal gas takes the form
µ = kB T log[(N∆/V) (β/(2πM))^(3/2)]. (8.4.6)
In general, these equations of state can be derived from the expression for the
entropy S = S (E, N, V ). Using the ﬁrst law of thermodynamics in the form
dS = (1/T)(dE + p dV − µ dN), (8.4.7)
In view of the second law of thermodynamics (dS/dt ≥ 0), this result is very puzzling.
First, one might argue that the above calculation is meaningless, because the
above expression for the entropy is rather formal and indeed inﬁnite in the limit
of vanishing volume element of phase space (∆ → 0). As was mentioned before,
this ultraviolet catastrophe of the classical theory will be automatically cured by
going to quantum mechanics. Then the volume of an elementary cell of phase
space ∆ = h3 > 0 is indeed nonzero and given in terms of Planck’s quantum
h. As we will see later, the above result of entropy conservation persists at the
quantum level. We should hence indeed conclude that, taking into account all its
microscopic constituents, the total entropy of a closed system is indeed conserved.
What does the second law of thermodynamics then really mean?
9.5 A Model with Entropy Increase
Let us consider a simple model in which the second law of thermodynamics, i.e. entropy increase (dS/dt ≥ 0), can indeed be proved. The model consists of a single particle moving on a line of discrete points x = na. Here n ∈ Z labels the point
and a is the distance between neighboring points. The possible conﬁgurations [n]
are thus labeled by the integer n. At all positions x = na, i.e. in all conﬁgurations
[n], the particle has the same energy. Hence, in thermodynamical equilibrium we
are dealing with a microcanonical ensemble. Instead of imposing a deterministic
continuous Hamiltonian time evolution, we assume that the particle moves from
point to point by a probabilistic dynamics in discrete time steps. In each time
step, the particle can either stay where it is (i.e. n′ = n) or it can hop to a
neighboring point (i.e. n′ = n + 1 or n′ = n − 1). The probabilities for the hops
are described by a transition probability w[n → n′ ] as follows
w[n → n + 1] = w[n → n − 1] = p, w[n → n] = 1 − 2p.
(9.5.1)
CHAPTER 9. NONEQUILIBRIUM THERMODYNAMICS
All other transition probabilities are zero and p ∈ [0, 1/2]. Let us now assume that there is an ensemble of particles with some initial probability distribution ρ0[n].
In a discrete time step the probability distribution changes according to
ρi[n′] = Σ_n ρ_{i−1}[n] w[n → n′] = p(ρ_{i−1}[n′ + 1] + ρ_{i−1}[n′ − 1]) + (1 − 2p)ρ_{i−1}[n′]. (9.5.2)
Introducing
ρ̃_{i−1}[n] = (1/2)(ρ_{i−1}[n + 1] + ρ_{i−1}[n − 1]), (9.5.3)
we thus obtain
ρi[n] = 2p ρ̃_{i−1}[n] + (1 − 2p)ρ_{i−1}[n]. (9.5.4)
After i − 1 time steps the entropy is given by
S_{i−1} = −kB Σ_[n] ρ_{i−1}[n] log ρ_{i−1}[n]. (9.5.5)
We now want to show that Si ≥ Si−1 . For this purpose we use the concavity of
the function
s(ρ) = −ρ log ρ,
(9.5.6)
which is guaranteed by
d²s/dρ² = −(d/dρ)(log ρ + 1) = −1/ρ ≤ 0. (9.5.7)
The concavity of s(ρ) implies that
s(ρi[n]) = s(2p ρ̃_{i−1}[n] + (1 − 2p)ρ_{i−1}[n]) ≥ 2p s(ρ̃_{i−1}[n]) + (1 − 2p)s(ρ_{i−1}[n]), (9.5.8)
such that
Si ≥ 2p S̃_{i−1} + (1 − 2p)S_{i−1}. (9.5.9)
Here we have defined
S̃_{i−1} = −kB Σ_[n] ρ̃_{i−1}[n] log ρ̃_{i−1}[n]. (9.5.10)
In the next step we again use the concavity of s(ρ) which also implies
s(ρ̃_{i−1}[n]) = s((ρ_{i−1}[n + 1] + ρ_{i−1}[n − 1])/2) ≥ (1/2)(s(ρ_{i−1}[n + 1]) + s(ρ_{i−1}[n − 1])), (9.5.11)
such that
S̃_{i−1} ≥ (1/2)(S_{i−1} + S_{i−1}) = S_{i−1}, (9.5.12)
and hence
Si ≥ 2p S̃_{i−1} + (1 − 2p)S_{i−1} ≥ S_{i−1}. (9.5.13)
Indeed, in this model with probabilistic dynamics the entropy is never decreasing. Obviously, the evolution of our ensemble is irreversible — the entropy will never decrease again. This is perhaps surprising because the elementary time steps are reversible, i.e. the probability for the particle to jump backward is the same as the probability to jump forward.
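The probabilistic dynamics of eq. (9.5.2) and the entropy inequality (9.5.13) are easy to check numerically. The following sketch evolves an initially localized distribution and verifies that the entropy (in units of kB) never decreases; the array size and hopping probability are illustrative choices:

```python
import math

def step(rho, p):
    """One time step of eq. (9.5.2); sites outside the array keep probability zero."""
    n = len(rho)
    return [(1.0 - 2.0 * p) * rho[i]
            + (p * rho[i + 1] if i + 1 < n else 0.0)
            + (p * rho[i - 1] if i - 1 >= 0 else 0.0)
            for i in range(n)]

def entropy(rho):
    """S / kB = -sum_n rho[n] log rho[n]; terms with rho[n] = 0 contribute nothing."""
    return -sum(r * math.log(r) for r in rho if r > 0.0)

p = 0.25
rho = [0.0] * 101
rho[50] = 1.0          # initially the particle is localized on one site
entropies = [entropy(rho)]
for _ in range(40):    # 40 steps: probability never reaches the array edges
    rho = step(rho, p)
    entropies.append(entropy(rho))

assert all(s_new >= s_old - 1e-12 for s_old, s_new in zip(entropies, entropies[1:]))
assert abs(sum(rho) - 1.0) < 1e-12    # normalization is preserved
```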
9.6 A Model for Diffusion
The previous model describes a simple diﬀusion process. We will now derive the
diﬀusion equation by taking a continuum limit of the model. In particular, we
relate the discrete time steps i with a continuous time variable t = iǫ, where ǫ
is the duration of a single discrete time step. We can then rewrite the evolution
equation for the probability distribution as
(ρi[n′] − ρ_{i−1}[n′])/ǫ = (p a²/ǫ) (ρ_{i−1}[n′ + 1] + ρ_{i−1}[n′ − 1] − 2ρ_{i−1}[n′])/a². (9.6.1)
Taking the space and time continuum limits a → 0 and ǫ → 0 and keeping the so-called diffusion coefficient
γ = p a²/ǫ (9.6.2)
fixed, the above differences turn into derivatives and we obtain the continuum diffusion equation
∂ρ/∂t = γ ∂²ρ/∂x². (9.6.3)
Here we have identified ρ(x, t) = ρi[n] with x = na and t = iǫ. The continuum probability distribution is normalized as
(1/a) ∫ dx ρ(x, t) = Σ_n ρi[n] = 1. (9.6.4)
Indeed the distribution remains properly normalized during the diffusion process because
(d/dt) ∫ dx ρ(x, t) = ∫ dx ∂ρ/∂t = γ ∫ dx ∂²ρ/∂x² = γ ∂ρ/∂x |_{−∞}^{∞} = 0. (9.6.5)
Also one can write a continuity equation
∂ρ/∂t + ∂j/∂x = 0, (9.6.6)
where the probability current density is given by
j = −γ ∂ρ/∂x. (9.6.7)
In the continuum limit the entropy can be identified as
S(t) = Si = −kB Σ_n ρi[n] log ρi[n] = −(kB/a) ∫ dx ρ(x, t) log ρ(x, t). (9.6.8)
Hence, one obtains
dS
dt
kB
a
kB
=−
a
=−
kB
∂ρ
∂j
(log ρ + 1) =
(log ρ + 1)
dx
∂t
a
∂x
kB γ
∂
1 ∂ρ 2
≥ 0.
dx j (log ρ + 1) =
dx
∂x
a
ρ ∂x
dx
(9.6.9)
As expected from the discrete model, we hence conﬁrm that also in the continuum
diﬀusion model the entropy always increases.
9.7 Approach to Equilibrium
It is straightforward to verify the following solution of the diffusion equation
ρ(x, t) = (a/√(4πγt)) exp(−x²/(4γt)). (9.7.1)
In particular, at t = 0 the probability distribution reduces to ρ(x, 0) = a δ(x). Afterward the distribution is Gaussian with a width increasing proportional to √t. The average distance squared that a particle reaches after a time t is given by
⟨x(t)²⟩ = 2γt. (9.7.2)
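The relation ⟨x(t)²⟩ = 2γt can be checked directly in the discrete hopping model, where γ = pa²/ǫ, so that in lattice units (a = ǫ = 1) the second moment after i steps is exactly 2pi. A minimal sketch with illustrative parameters:

```python
def evolve(rho, p, steps):
    """Evolve the hopping-model distribution, eq. (9.5.2), for a number of steps."""
    for _ in range(steps):
        n = len(rho)
        rho = [(1.0 - 2.0 * p) * rho[i]
               + (p * rho[i + 1] if i + 1 < n else 0.0)
               + (p * rho[i - 1] if i - 1 >= 0 else 0.0)
               for i in range(n)]
    return rho

p, steps, center = 0.25, 30, 50
rho = [0.0] * 101
rho[center] = 1.0                     # delta distribution at x = 0
rho = evolve(rho, p, steps)
x2 = sum(r * (i - center) ** 2 for i, r in enumerate(rho))

# each step adds variance 2p, so <x^2> = 2 p * steps = 2 gamma t exactly
assert abs(x2 - 2.0 * p * steps) < 1e-9
```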
Let us also calculate the entropy as a function of time. It is convenient to first compute
dS/dt = (kB γ/a) ∫ dx (1/ρ)(∂ρ/∂x)² = kB/(2t), (9.7.3)
which then implies a logarithmic behavior of the entropy itself
S(t) − S(ǫ) = (kB/2) log(t/ǫ). (9.7.4)
In particular, for late times the entropy increases without bound. This indeed
makes sense, since the information deﬁcit keeps increasing as the particles diﬀuse
to inﬁnity. Hence, this system never reaches thermodynamical equilibrium.
Let us now consider the same system in a ﬁnite volume. In that case, the
particles cannot diﬀuse to inﬁnity and we indeed expect to approach thermodynamical equilibrium. For simplicity we impose periodic boundary conditions over
the interval x ∈ [0, L]. The solution of the diﬀusion equation then takes the form
ρ(x, t) = (a/√(4πγt)) Σ_{m∈Z} exp(−(x − mL)²/(4γt)). (9.7.5)
For t → ∞ the solution approaches ρ(x, t) = a/L and the entropy turns into
S(∞) = −(kB/a) ∫_0^L dx ρ log ρ = kB log(L/a). (9.7.6)
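The finite-volume result can be illustrated with the discrete model on a ring of N sites, where the uniform equilibrium distribution has entropy S/kB = log N, the lattice analogue of kB log(L/a). Parameters below are illustrative:

```python
import math

def ring_step(rho, p):
    """One hopping step on a ring of len(rho) sites (periodic boundary conditions)."""
    n = len(rho)
    return [(1.0 - 2.0 * p) * rho[i]
            + p * rho[(i + 1) % n] + p * rho[(i - 1) % n]
            for i in range(n)]

N, p = 16, 0.25
rho = [0.0] * N
rho[0] = 1.0                 # start localized on a single site
for _ in range(2000):        # far longer than the slowest relaxation time
    rho = ring_step(rho, p)

S = -sum(r * math.log(r) for r in rho if r > 0.0)
assert abs(S - math.log(N)) < 1e-6                 # entropy saturates at log N
assert all(abs(r - 1.0 / N) < 1e-6 for r in rho)   # uniform distribution
```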
Chapter 10
The Ising Model
The Ising model is one of the simplest models in classical statistical mechanics, and yet it is applicable to a wide variety of physical systems. It describes idealized magnets, the mixing of fluids, critical opalescence in boiling water at high pressure, and even special features of the quark-gluon plasma that filled the early Universe.
10.1 Definition and Basic Properties
Let us consider the simplest classical spin model — the so-called Ising model. Here the word spin does not mean that we deal with quantized angular momenta. All we do is work with classical spin variables that take values sx = ±1. The Ising spins are located on the sites of a d-dimensional spatial cubic lattice. The Ising model is characterized by its classical Hamilton function (not a quantum Hamilton operator) which simply specifies the energy of any configuration of spins. The Ising Hamilton function includes a sum of nearest-neighbor contributions
H[s] = −J Σ_⟨xy⟩ sx sy − µB Σ_x sx, (10.1.1)
with a ferromagnetic coupling constant J > 0 that favors parallel spins, plus a coupling to an external magnetic field B. The classical partition function of this system is given by
Z = Σ_[s] exp(−βH[s]) = Π_x Σ_{sx=±1} exp(−βH[s]). (10.1.2)
The sum over all spin conﬁgurations corresponds to an independent summation
over all possible orientations of individual spins. Thermal averages are computed
by inserting appropriate observables O[s]. For example, the magnetization is
given by
M[s] = Σ_x sx, (10.1.3)
and its thermal expectation value is given by
⟨M⟩ = (1/Z) Σ_[s] M[s] exp(−βH[s]) = ∂ log Z/∂(βµB). (10.1.4)
The magnetic susceptibility is defined as
χ = (1/L^d)(⟨M²⟩ − ⟨M⟩²) = (1/L^d) ∂² log Z/∂(βµB)², (10.1.5)
where L^d is the spatial volume. Similarly, the spin correlation function is defined as
⟨sx sy⟩ = (1/Z) Σ_[s] sx sy exp(−βH[s]). (10.1.6)
At large distances the so-called connected spin correlation function typically decays exponentially,
⟨sx sy⟩_c = ⟨sx sy⟩ − ⟨sx⟩⟨sy⟩ ∼ exp(−|x − y|/ξ), (10.1.7)
where ξ is the correlation length. The susceptibility is the connected correlation function summed over all pairs of points x and y, i.e.
χ = (1/L^d) Σ_{x,y} ⟨sx sy⟩_c. (10.1.8)
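On a lattice small enough for exact enumeration one can check that the fluctuation formula (10.1.5) and the sum over connected correlators (10.1.8) agree. The sketch below uses an illustrative 3×3 periodic lattice with J = 1 and β = 0.4:

```python
import itertools, math

L, J, beta = 3, 1.0, 0.4
sites = [(x, y) for x in range(L) for y in range(L)]

def energy(s):
    """Nearest-neighbor Hamilton function H[s] = -J sum_<xy> s_x s_y (periodic)."""
    e = 0.0
    for (x, y) in sites:
        e -= J * s[(x, y)] * s[((x + 1) % L, y)]
        e -= J * s[(x, y)] * s[(x, (y + 1) % L)]
    return e

Z = 0.0
M1 = M2 = 0.0
single = {a: 0.0 for a in sites}                      # accumulates <s_x>
pair = {(a, b): 0.0 for a in sites for b in sites}    # accumulates <s_x s_y>
for spins in itertools.product([1, -1], repeat=L * L):
    s = dict(zip(sites, spins))
    w = math.exp(-beta * energy(s))
    m = sum(spins)
    Z += w
    M1 += w * m
    M2 += w * m * m
    for a in sites:
        single[a] += w * s[a]
        for b in sites:
            pair[(a, b)] += w * s[a] * s[b]

vol = L * L
chi_fluct = (M2 / Z - (M1 / Z) ** 2) / vol               # eq. (10.1.5)
chi_corr = sum(pair[(a, b)] / Z - (single[a] / Z) * (single[b] / Z)
               for a in sites for b in sites) / vol      # eq. (10.1.8)
assert abs(chi_fluct - chi_corr) < 1e-9
```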
At general temperatures the correlation length is typically just a few lattice spacings. When one models real materials, the Ising model would generally be a great oversimplification, because real magnets, for example, do not have only nearest-neighbor couplings. Still, the details of the Hamilton function at the scale of the lattice spacing are not always important. There is a critical temperature Tc at which ξ diverges and universal behavior arises. At this temperature a second order phase transition occurs. Then the details of the model at the scale of the lattice spacing are irrelevant for the long-range physics that takes place at the scale of ξ. In fact, at their critical temperatures real materials behave just like the simple Ising model. This is why the Ising model is so interesting. It is just
a very simple member of a large universality class of diﬀerent models, which all
share the same critical behavior. This does not mean that they have the same
values of their critical temperatures. However, their correlation lengths diverge
at the critical temperature with the same exponent ν , i.e.
ξ ∝ |T − Tc|^(−ν), (10.1.9)
their magnetizations go to zero at the critical temperature with the same exponent β,
M ∝ |T − Tc|^β, T ≤ Tc, (10.1.10)
and their susceptibilities diverge with the same exponent γ,
χ ∝ |T − Tc|^(−γ), (10.1.11)
i.e. the critical exponents ν , β , and γ are identical for diﬀerent systems in the
same universality class.
10.2 Mean Field Theory
Understanding critical behavior is a highly nontrivial issue. For example, there is no generally applicable analytic method that allows us to determine critical exponents exactly. Exceptions are theories in one or two dimensions. As we will see, in one dimension the Ising model can be solved easily. In two dimensions the so-called conformal symmetry allows exact calculations. In the 3-dimensional case the so-called ǫ-expansion at least provides a systematic expansion whose convergence is, however, not always guaranteed. In the next chapter we will learn about the Monte Carlo method which provides an alternative numerical tool to understand critical phenomena. To illustrate critical behavior we want to use an approximate method — mean field theory. Its results become exact in the limit of infinitely many dimensions, but should not be trusted quantitatively in the physically most relevant case d = 3. The idea behind mean field theory is to assume that a spin sx that interacts with its fluctuating nearest neighbors sy can simply be coupled to a constant mean value ⟨s⟩ of the spin.
mean ﬁeld approximation to the Hamilton function takes the form
Hm [s] = −J
sx d s
x
m
− µB
x
sx = −µBeﬀ
sx ,
(10.2.1)
x
where
µBeﬀ = µB + dJ s
m,
(10.2.2)
is an eﬀective magnetic ﬁeld acting on the spin sx . In the full theory the magnetic
ﬁeld generated by the neighboring spins ﬂuctuates with the values of the spins
sy . In mean ﬁeld theory one neglects these ﬂuctuations and treats the magnetic
ﬁeld as an averaged constant.
In the mean field approximation the partition function can now be evaluated easily,
Zm = Σ_[s] exp(−βHm[s]) = Π_x Σ_{sx=±1} exp(βµBeff sx)
   = [Σ_{sx0=±1} exp(βµBeff sx0)]^V = [2 cosh(βµBeff)]^V, (10.2.3)
where x0 is any arbitrarily chosen lattice point and V = L^d is the volume (the total number of lattice sites). The factor d is just the number of bonds per spin in d dimensions. Since Beff depends on the mean value ⟨s⟩ of the spin, the calculation is not yet complete. The next step is to determine the average spin from a consistency condition. In mean field theory we have
⟨s⟩ = (1/Zm) Σ_[s] sx0 exp(−βHm[s])
    = (1/Zm) [Σ_{sx0=±1} sx0 exp(βµBeff sx0)] [2 cosh(βµBeff)]^(V−1)
    = tanh(βµBeff) = tanh(β(µB + dJ⟨s⟩)). (10.2.4)
This is the consistency condition for ⟨s⟩.
Next we assume that the external magnetic ﬁeld is switched oﬀ, i.e. B = 0.
Then the consistency condition takes the form
⟨s⟩ = tanh(βdJ⟨s⟩). (10.2.5)
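The consistency condition can be solved by simple fixed-point iteration. The sketch below, with illustrative values of βdJ, shows the trivial solution for βdJ < 1 and the nontrivial one for βdJ > 1:

```python
import math

def mean_field_spin(beta_dJ, seed=0.5, iterations=1000):
    """Iterate <s> -> tanh(beta d J <s>) towards a fixed point of eq. (10.2.5)."""
    s = seed
    for _ in range(iterations):
        s = math.tanh(beta_dJ * s)
    return s

# symmetric phase: for beta d J < 1 the iteration collapses to <s> = 0
assert abs(mean_field_spin(0.8)) < 1e-6
# broken phase: for beta d J > 1 a nontrivial solution <s> != 0 appears
s_broken = mean_field_spin(1.5)
assert s_broken > 0.5
# the nontrivial solution satisfies the consistency condition
assert abs(s_broken - math.tanh(1.5 * s_broken)) < 1e-12
```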
For dβJ < 1 this has only the trivial solution ⟨s⟩ = 0. However, for dβJ > 1 there is, in addition, a solution with ⟨s⟩ ≠ 0. This solution describes spontaneous symmetry breaking: without any bias by an external magnetic field, the spins decide collectively to point in a common direction. The nontrivial solution appears at a critical temperature Tc that is given by
dβc J = 1 ⇒ Tc = dJ/kB. (10.2.6)
Above this temperature the spin system is in the so-called symmetric (or unbroken) phase with ⟨s⟩ = 0, while for T < Tc the system is in the broken (or ordered) phase in which spontaneous symmetry breaking occurs and ⟨s⟩ ≠ 0. The two phases are separated by a phase transition. The value of the average spin determines whether the system is in the unbroken or in the ordered phase. Hence, ⟨s⟩ is known as an order parameter.
It remains to be shown that the nontrivial solution is indeed physically realized. For that purpose we compare the free energies of the two phases. In the symmetric phase we have Beff = 0 and hence
Zm = [2 cosh(βµBeff)]^V = 2^V = exp(−βFs). (10.2.7)
In the broken phase, on the other hand, we have |Beff| > 0, such that Zm = [2 cosh(βµBeff)]^V > 2^V, and hence
exp(−βFb) > exp(−βFs) ⇒ Fb < Fs. (10.2.8)
Since the free energy of the broken phase is smaller than the one of the symmetric phase, the broken phase is thermodynamically stable for T < Tc.
Next we compute the average spin (magnetization per lattice site) for temperatures below but close to Tc. Expanding the consistency condition for small ⟨s⟩,
⟨s⟩ = tanh(βdJ⟨s⟩) ≈ βdJ⟨s⟩ − (1/3)(βdJ⟨s⟩)³, (10.2.9)
one obtains
βdJ⟨s⟩ = √(3(βdJ − 1)) ⇒ ⟨s⟩ ∝ √(Tc − T) = |T − Tc|^β. (10.2.10)
Here β is the critical exponent introduced before. Its mean field value β = 1/2 is not exact. Since the order parameter goes to zero continuously at Tc, the phase transition is of second order.
At a first order phase transition, on the other hand, the order parameter makes a discontinuous jump. The Ising model undergoes a first order phase transition at T < Tc when one changes the external magnetic field B from positive to negative values. At B = 0 the magnetization changes abruptly from +⟨s⟩ to −⟨s⟩.
10.3 Exact Results for the 1-dimensional Ising Model
Mean field theory nicely illustrates the qualitative behavior of the Ising model but it yields only approximate results. In this section we will derive some exact results for the Ising model in one dimension. The 1-dimensional Ising model is easy to solve analytically. Its partition function (for B = 0) is given by
Z = Π_x Σ_{sx=±1} exp(βJ Σ_⟨xy⟩ sx sy). (10.3.1)
We consider a lattice with L sites and with periodic boundary conditions (i.e. sx+L = sx). Introducing bond variables
b⟨xy⟩ = sx sy = ±1, (10.3.2)
the partition function can be rewritten as
Z = 2 Π_⟨xy⟩ Σ_{b⟨xy⟩=±1} exp(βJ b⟨xy⟩) δ_{Π_⟨xy⟩ b⟨xy⟩, 1}. (10.3.3)
xy
The constraint
b xy = 1,
(10.3.4)
xy
that is enforced by the Kronecker δfunction is a consequence of periodic boundary
conditions. The δfunction can be rewritten as
1
(
b xy )m .
(10.3.5)
δQ xy b xy ,1 =
2 m=0,1
xy
Hence, the partition function takes the form
Z = Σ_{m=0,1} Π_⟨xy⟩ Σ_{b⟨xy⟩=±1} exp(βJ b⟨xy⟩) b⟨xy⟩^m = [2 cosh(βJ)]^L + [2 sinh(βJ)]^L. (10.3.6)
This exact result diﬀers from the mean ﬁeld result of eq.(10.2.3).
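The exact result (10.3.6) is easy to confirm by brute-force summation over all 2^L spin configurations of a small periodic chain; the values of L and βJ below are illustrative:

```python
import itertools, math

def partition_function(L, betaJ):
    """Brute-force Z for a periodic chain of L Ising spins at B = 0."""
    Z = 0.0
    for spins in itertools.product([1, -1], repeat=L):
        bond_sum = sum(spins[i] * spins[(i + 1) % L] for i in range(L))
        Z += math.exp(betaJ * bond_sum)
    return Z

L, betaJ = 8, 0.7    # illustrative values
Z_exact = (2.0 * math.cosh(betaJ)) ** L + (2.0 * math.sinh(betaJ)) ** L
assert abs(partition_function(L, betaJ) - Z_exact) < 1e-8 * Z_exact
```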
Let us also compute the correlation function ⟨sx sy⟩. For this purpose we write the spin correlation as a string of bond variables
sx sy = Π_⟨wz⟩ b⟨wz⟩. (10.3.7)
The product extends over all bonds ⟨wz⟩ connecting the points x and y. The correlation function is then given by
⟨sx sy⟩ = (1/Z) Σ_{m=0,1} Π_⟨xy⟩ Σ_{b⟨xy⟩=±1} Π_⟨wz⟩ b⟨wz⟩ exp(βJ b⟨xy⟩) b⟨xy⟩^m
        = (1/Z) {[2 cosh(βJ)]^(L−n) [2 sinh(βJ)]^n + [2 sinh(βJ)]^(L−n) [2 cosh(βJ)]^n}. (10.3.8)
Here n = |x − y| is the distance between the points x and y.
The susceptibility is now obtained from
χ = (1/L) Σ_{x,y} ⟨sx sy⟩ = Σ_{n=0}^{L−1} (1/Z) {[2 cosh(βJ)]^(L−n) [2 sinh(βJ)]^n + [2 sinh(βJ)]^(L−n) [2 cosh(βJ)]^n}. (10.3.9)
Applying the formula for an incomplete geometric series
Σ_{n=0}^{L−1} x^n = (1 − x^L)/(1 − x) (10.3.10)
for x = tanh(βJ) as well as x = coth(βJ) one obtains
χ = (1/Z) {[2 cosh(βJ)]^L (1 − tanh^L(βJ))/(1 − tanh(βJ)) + [2 sinh(βJ)]^L (1 − coth^L(βJ))/(1 − coth(βJ))}
  = ((1 − tanh^L(βJ))/(1 + tanh^L(βJ))) exp(2βJ). (10.3.11)
10.4 Exact Results for the 2-dimensional Ising Model
Next we consider the 2-dimensional Ising model. In that case the bond variables around an elementary lattice square with four sites w, x, y, and z satisfy the constraint
b⟨wx⟩ b⟨xy⟩ b⟨yz⟩ b⟨zw⟩ = 1. (10.4.1)
For each lattice square we introduce a variable m̃ that implements this constraint via
δ_{b⟨wx⟩ b⟨xy⟩ b⟨yz⟩ b⟨zw⟩, 1} = (1/2) Σ_{m̃=0,1} (b⟨wx⟩ b⟨xy⟩ b⟨yz⟩ b⟨zw⟩)^m̃. (10.4.2)
We now introduce the dual lattice with sites x̃ at the centers of the squares. The variable m̃ can then be interpreted as a spin variable,
sx̃ = 1 − 2m̃ = ±1, (10.4.3)
on the dual lattice. Summing over the bond variable b⟨xy⟩ on the original lattice then induces an interaction between the dual spins sx̃ and sỹ at the centers of the two squares x̃ and ỹ that share the bond ⟨xy⟩. We have
Σ_{b⟨xy⟩=±1} exp(βJ b⟨xy⟩) b⟨xy⟩^(m̃x̃ + m̃ỹ) = exp(−β h̃(sx̃, sỹ)), (10.4.4)
which defines a Hamilton function
H̃[s̃] = Σ_⟨x̃ỹ⟩ h̃(sx̃, sỹ). (10.4.5)
One obtains
exp(−β h̃(1, 1)) = exp(−β h̃(−1, −1)) = 2 cosh(βJ),
exp(−β h̃(1, −1)) = exp(−β h̃(−1, 1)) = 2 sinh(βJ). (10.4.6)
In the original Ising model the ratio of the two Boltzmann factors was
exp(−βh(1, −1))/ exp(−βh(1, 1)) = exp(−2βJ ).
(10.4.7)
Similarly, the ratio of the two dual Boltzmann factors is
exp(−β h̃(1, −1))/exp(−β h̃(1, 1)) = tanh(βJ) = exp(−2β̃J̃). (10.4.8)
This equation determines the coupling constant J̃ of a dual Ising model. When the original Ising model is in the high-temperature phase (βJ small) the dual Ising model is in the low-temperature phase (β̃J̃ large) and vice versa. The exact critical temperature Tc = 1/(kB βc) of the 2-dimensional Ising model follows from the self-duality condition
tanh(βc J) = exp(−2βc J), (10.4.9)
which again diﬀers from the mean ﬁeld result of eq.(10.2.6).
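The self-duality condition can be solved numerically, e.g. by bisection; the root agrees with the known closed form βc J = log(1 + √2)/2 ≈ 0.4407 (a standard result, quoted here as an assumption rather than derived in the text above):

```python
import math

def f(x):
    """f(x) = tanh(x) - exp(-2x) vanishes at the self-dual coupling x = beta_c J."""
    return math.tanh(x) - math.exp(-2.0 * x)

lo, hi = 0.1, 1.0     # f(lo) < 0 < f(hi), so the root is bracketed
for _ in range(200):  # f is monotonically increasing, so bisection converges
    mid = 0.5 * (lo + hi)
    if f(mid) >= 0.0:
        hi = mid
    else:
        lo = mid

beta_c_J = 0.5 * (lo + hi)
assert abs(beta_c_J - math.log(1.0 + math.sqrt(2.0)) / 2.0) < 1e-12
```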
10.5 Cluster Representation
In this section we will rewrite the Ising model in terms of spin and bond variables.
Parallel spins connected by activated bonds form correlated clusters, while spins
in diﬀerent clusters are uncorrelated. The susceptibility can be expressed in terms
of the cluster sizes. The cluster representation of the Ising model gives rise to an
extremely efficient Monte Carlo algorithm — the so-called cluster algorithm.
We begin by introducing bond variables b⟨xy⟩ = 0, 1 which are different from the ones introduced before. An activated bond has b⟨xy⟩ = 1 while a deactivated bond has b⟨xy⟩ = 0. We can now write
exp(−βh(sx, sy)) = Σ_{b⟨xy⟩=0,1} exp(−βh(sx, sy, b⟨xy⟩)), (10.5.1)
with
exp(−βh(s, s, 1)) = exp(βJ ) − exp(−βJ ),
exp(−βh(s, s, 0)) = exp(−βJ ),
exp(−βh(s, −s, 1)) = 0,
exp(−βh(s, −s, 0)) = exp(−βJ ),
(10.5.2)
for s = ±1, such that indeed
exp(−βh(s, s)) = exp(βJ )
= exp(−βh(s, s, 1)) + exp(−βh(s, s, 0)),
exp(−βh(s, −s)) = exp(−βJ )
= exp(−βh(s, −s, 1)) + exp(−βh(s, −s, 0)). (10.5.3)
Note that a bond can be activated only if the two connected spins are parallel.
It should also be noted that the Boltzmann weight of a deactivated bond is
independent of the spin conﬁguration, i.e.
exp(−βh(1, 1, 0)) = exp(−βh(1, −1, 0)) =
exp(−βh(−1, 1, 0)) = exp(−βh(−1, −1, 0)) = exp(−βJ ).
(10.5.4)
This implies that the spin conﬁguration decomposes into clusters of parallel spins.
The spins connected by activated bonds belong to the same cluster. Spins in the
same cluster are hence parallel, while spins in diﬀerent clusters are uncorrelated.
It should be noted that spins completely unconnected to other spins form a cluster
by themselves. In this way, each spin belongs to exactly one cluster.
The cluster decomposition of a spin configuration has interesting consequences. In particular, when all spins of a given cluster are flipped, the Boltzmann weight of the spin and bond configuration remains the same. This means that there are 2^(N_C) equally probable configurations, which are obtained by independently flipping all N_C clusters in the configuration. From this fact one can derive a cluster representation of the susceptibility. First, the total magnetization is a sum of cluster magnetizations,
M[s] = Σ_x sx = Σ_C M_C, (10.5.5)
where the cluster magnetization is given by
M_C = Σ_{x∈C} sx. (10.5.6)
In a finite volume the average magnetization always vanishes (even in the broken phase), i.e.
⟨M⟩ = Σ_C ⟨M_C⟩ = 0, (10.5.7)
since each cluster of spins can be flipped which leads to a change of sign of the magnetization. Similarly, the susceptibility can be expressed as
χ = (1/L^d) ⟨M²⟩ = (1/L^d) Σ_{C1,C2} ⟨M_{C1} M_{C2}⟩ = (1/L^d) Σ_C ⟨M_C²⟩. (10.5.8)
In the last step we have used the fact that two different clusters C1 and C2 are uncorrelated, i.e. M_{C1} M_{C2} averages to zero under cluster flip. Consequently, only the square of the magnetization of the individual clusters M_C² determines the susceptibility. Since all spins of a cluster are parallel, up to a sign the cluster magnetization is given by the cluster size |C|, i.e.
M_C = ±|C| = ± Σ_{x∈C} 1. (10.5.9)
Hence, the susceptibility can also be written as
χ = (1/L^d) Σ_C ⟨|C|²⟩. (10.5.10)
This shows that the cluster size is directly related to a physical quantity. In this
sense the clusters are indeed physical objects.
Chapter 11
The Monte Carlo Method
A powerful numerical technique to solve problems in statistical mechanics is the so-called Monte Carlo method. The idea is to compute expectation values by generating spin configurations numerically. Of course, the partition function is an extremely large sum, such that evaluating it by numerical brute force is completely hopeless. In the Monte Carlo method predominantly those spin configurations are generated that have the largest contribution to the partition function. In fact, the Boltzmann factor exp(−βH[s]) is used as the probability to generate the spin configuration [s]. This method is also known as importance sampling.
11.1 The Concept of a Markov Chain
In a Monte Carlo simulation one generates a sequence of spin conﬁgurations
[s(1) ] → [s(2) ] → ... → [s(N ) ],
(11.1.1)
which form a so-called Markov chain, by applying an algorithm that turns the
conﬁguration [s(i) ] into [s(i+1) ]. The initial conﬁguration [s(1) ] is either picked at
random or selected otherwise. Ultimately, nothing should depend on this choice.
After a (possibly large) number M of Monte Carlo iterations (applications of
the algorithm) an equilibrium is reached, and the system has forgotten about
the initial conﬁgurations. Only the conﬁgurations generated after equilibration
are used in the actual calculation. To estimate the expectation value of some
observable one averages its values over all conﬁgurations of the Monte Carlo
sample,
⟨O⟩ = lim_{N→∞} (1/(N − M)) Σ_{i=M+1}^{N} O[s^(i)]. (11.1.2)
In the limit N → ∞ the calculation becomes exact. At finite N − M one makes a calculable statistical error that decreases in proportion to 1/√(N − M − 1). Hence, to increase the numerical accuracy by a factor of two one must run the Monte Carlo algorithm four times as long. The Boltzmann factor exp(−βH[s]) is not explicitly included in the above sum. It is implicitly included, because the configurations in the Markov chain occur with probability exp(−βH[s]).
11.2 Ergodicity and Detailed Balance
To demonstrate that a particular Monte Carlo algorithm converges to the correct
equilibrium distribution it is suﬃcient to show that it is ergodic and obeys detailed
balance. Ergodicity means that, starting from an arbitrary initial conﬁguration,
the algorithm can, at least in principle, reach any other spin conﬁguration. This
condition is obviously necessary, because the correct value for the expectation
value can be obtained only if all spin conﬁgurations are included. Detailed balance
means that
exp(−β H[s])w[s, s′ ] = exp(−β H[s′ ])w[s′ , s].
(11.2.1)
Here w[s, s′] is the transition probability for the algorithm to turn the configuration [s] into [s′]. A Monte Carlo algorithm is completely characterized by the corresponding w[s, s′]. Since the algorithm definitely generates a new configuration the proper normalization is
Σ_{[s′]} w[s, s′] = 1. (11.2.2)
When the Monte Carlo algorithm converges to an equilibrium distribution p[s] of spin configurations, this distribution is an eigenvector of w[s, s′] with eigenvalue 1, i.e.
Σ_{[s]} p[s] w[s, s′] = p[s′]. (11.2.3)
Now we want to show that the canonical Boltzmann distribution
p[s] = exp(−β H[s])
(11.2.4)
is indeed an eigenvector of w[s, s′] if the algorithm obeys detailed balance. We find
Σ_{[s]} exp(−βH[s]) w[s, s′] = Σ_{[s]} exp(−βH[s′]) w[s′, s] = exp(−βH[s′]) Σ_{[s]} w[s′, s] = exp(−βH[s′]). (11.2.5)
Assuming ergodicity one can show that only one eigenvector with eigenvalue 1
exists, and that the equilibrium distribution is therefore unique.
11.3 The Metropolis Algorithm
A simple example of an algorithm that is ergodic and obeys detailed balance
is the so-called Metropolis algorithm. In this algorithm a new configuration
[s′ ] is randomly chosen based on the old conﬁguration [s]. If the energy of the
new conﬁguration is smaller than the energy of the old conﬁguration, the new
conﬁguration is accepted, i.e.
H[s′ ] < H[s] ⇒ w[s, s′ ] = 1.
(11.3.1)
On the other hand, if the new energy is larger, the new conﬁguration is accepted
only with a certain probability, i.e.
H[s′ ] > H[s] ⇒ w[s, s′ ] = exp(−β (H[s′ ] − H[s])).
(11.3.2)
Otherwise the old conﬁguration is kept. This algorithm obeys detailed balance.
Let us consider two conﬁgurations [s] and [s′ ]. We can assume that H[s′ ] <
H[s] such that w[s, s′ ] = 1. Then of course, H[s] > H[s′ ] such that w[s′ , s] =
exp(−β (H[s] − H[s′ ])), and hence
exp(−β H[s])w[s, s′ ] = exp(−β H[s])
= exp(−β H[s′ ]) exp(−β (H[s] − H[s′ ]))
= exp(−β H[s′ ])w[s′ , s].
(11.3.3)
We still need to specify how a new conﬁguration is proposed. In the Ising
model one visits the spins one by one and proposes to ﬂip them. The resulting
change of the energy is calculated by investigating the neighboring spins. Then
following the Metropolis algorithm, it is decided whether a given spin is ﬂipped
or not. When all spins on the lattice have been updated in this way, one has
completed one Metropolis sweep. It is obvious that any spin conﬁguration can,
at least in principle, be reached in this way, i.e. the Metropolis algorithm is indeed
ergodic. A typical Monte Carlo simulation consists of a large number of sweeps,
say 1 million, for example.
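The sweep just described can be sketched in a few lines of Python. This is an illustrative sketch for the 2D Ising model with H = −J Σ s s′; the lattice size, temperature, coupling J = 1, and run length are arbitrary choices, not taken from the text:

```python
import math
import random

def metropolis_sweep(spins, L, beta, J=1.0):
    """One sweep: visit every site of the L x L Ising lattice and propose a flip."""
    for x in range(L):
        for y in range(L):
            s = spins[x][y]
            # Sum of the four nearest neighbors (periodic boundary conditions).
            nn = (spins[(x + 1) % L][y] + spins[(x - 1) % L][y]
                  + spins[x][(y + 1) % L] + spins[x][(y - 1) % L])
            # Energy change for flipping s -> -s with H = -J sum over bonds.
            dH = 2.0 * J * s * nn
            # Metropolis acceptance: always if dH <= 0, else with exp(-beta*dH).
            if dH <= 0.0 or random.random() < math.exp(-beta * dH):
                spins[x][y] = -s

L, beta = 16, 0.3
spins = [[random.choice([-1, 1]) for _ in range(L)] for _ in range(L)]
for _ in range(100):                    # a (very short) Monte Carlo run
    metropolis_sweep(spins, L, beta)
```

A realistic run would, as stated above, use far more sweeps and discard the first M of them for equilibration.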
11.4 Error Analysis
Since any practical Monte Carlo simulation has a ﬁnite length, the results are not
exact but are aﬀected by statistical errors. Hence, an important part of every
Monte Carlo calculation is the error analysis. An ideal Monte Carlo algorithm
(which does not really exist in practice) would generate a Markov chain of statistically independent conﬁgurations. If the Monte Carlo data for an observable
O are Gaussian distributed, the standard deviation from their average (i.e. their
statistical error) is given by
∆O = √(⟨(O − ⟨O⟩)²⟩/(N − M − 1)) = √((⟨O²⟩ − ⟨O⟩²)/(N − M − 1)).   (11.4.1)
In order to reduce the statistical error by a factor of two, the number of independent equilibrated conﬁgurations N − M must hence be increased by a factor of
four.
Practical Monte Carlo algorithms (like the Metropolis algorithm) are not
ideal, i.e. they do not generate statistically independent conﬁgurations. In particular, the Metropolis algorithm is rather simple, but not very eﬃcient. Since
the new conﬁguration is generated from the previous conﬁguration in the Markov
chain, subsequent conﬁgurations are correlated. This implies that the actual statistical error is larger than the above naive estimate of the standard deviation.
In order to detect the autocorrelation of the Monte Carlo data, it is useful to
bin these data. For this purpose one averages a number Nb of subsequent measurements and treats this average as a statistically independent result. One then
computes the standard deviation based on the (N − M )/Nb statistically independent averages. Of course, if the bin size Nb is too small, the averages are still
correlated and the corresponding standard deviation still underestimates the true
statistical error. When one increases the bin size Nb , the corresponding standard
deviation increases until subsequent bin averages are indeed statistically independent. Once the standard deviation has reached a plateau (by increasing Nb ), one
has obtained a reliable estimate of the true statistical error.
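The binning procedure might be sketched as follows. The artificially autocorrelated test series (each value keeps 90 percent of the previous one) is an illustrative assumption, standing in for the correlated measurements of a real Markov chain:

```python
import math
import random

def binned_error(data, bin_size):
    """Error estimate from averages over bins of bin_size subsequent values."""
    n_bins = len(data) // bin_size
    bins = [sum(data[i * bin_size:(i + 1) * bin_size]) / bin_size
            for i in range(n_bins)]
    mean = sum(bins) / n_bins
    var = sum((b - mean) ** 2 for b in bins) / (n_bins - 1)
    return math.sqrt(var / n_bins)

# Artificially autocorrelated data, mimicking subsequent Monte Carlo measurements.
random.seed(1)
data, x = [], 0.0
for _ in range(10_000):
    x = 0.9 * x + random.gauss(0.0, 1.0)
    data.append(x)

# The naive estimate (bin size 1) underestimates the true error; the estimate
# grows with the bin size until subsequent bins are effectively independent.
for nb in (1, 10, 100):
    print(nb, binned_error(data, nb))
```

For this series the error estimate at bin size 100 is several times larger than the naive one at bin size 1, which is exactly the plateau behavior described above.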
In order to estimate the number τ of Monte Carlo iterations that separate
statistically independent spin configurations, it is also useful to determine the
autocorrelation function of some observable O:

⟨O^{(i)} O^{(i+t)}⟩ = lim_{N→∞} (1/(N − M − t)) Σ_{i=M+1}^{N−t} O[s^{(i)}] O[s^{(i+t)}] ∝ exp(−t/τ).   (11.4.2)
The autocorrelation time τ of the Metropolis algorithm actually increases when
one approaches a second order phase transition. At a second order phase transition the correlation length ξ diverges. One finds so-called critical slowing down
τ ∝ ξz ,
(11.4.3)
where z is a dynamical critical exponent characterizing the eﬃciency of a Monte
Carlo algorithm. For the Metropolis algorithm one ﬁnds z ≈ 2, which leads to
a very bad critical slowing down behavior. This is a good motivation to turn to
the much more eﬃcient cluster algorithms which can reach z ≈ 0.
11.5 The Swendsen-Wang Cluster Algorithm
The Swendsen-Wang cluster algorithm is a Monte Carlo method based on the
cluster representation of the Ising model, i.e. it operates on both spins and bonds.
First, an initial spin conﬁguration is selected. Then each bond is activated or
deactivated, depending on the orientation of the adjacent spins. If the two spins
connected by the bond are antiparallel the bond is deactivated. On the other
hand, if the two spins are parallel the bond can be activated or deactivated. The
corresponding Boltzmann weights are given by
exp(−βh(s, s, 1)) = exp(βJ ) − exp(−βJ ),
exp(−βh(s, s, 0)) = exp(−βJ ),
(11.5.1)
where s = ±1. Hence, the probability for activating the bond is
p=
exp(−βh(s, s, 1))
= 1 − exp(−2βJ ).
exp(−βh(s, s, 1)) + exp(−βh(s, s, 0))
(11.5.2)
Once each bond is activated or deactivated, one identiﬁes the clusters of sites
connected by activated bonds. By construction all spins in a common cluster are
parallel. Each cluster is ﬂipped (i.e. all spins in the cluster change sign) with 50
percent probability. This completes one sweep of the cluster algorithm. Then the
procedure is repeated, i.e. the bonds are updated again.
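One sweep of the algorithm just described might be sketched as follows in Python. The use of a union-find structure to identify clusters is an implementation choice, not prescribed by the text, and the lattice size and temperature in the usage example are arbitrary:

```python
import math
import random

def swendsen_wang_sweep(spins, L, beta, J=1.0):
    """One Swendsen-Wang sweep: activate bonds between parallel spins with
    probability p = 1 - exp(-2 beta J), then flip each cluster with prob. 1/2."""
    p = 1.0 - math.exp(-2.0 * beta * J)          # eq. (11.5.2)
    parent = list(range(L * L))                  # union-find forest over sites

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]        # path halving
            i = parent[i]
        return i

    for x in range(L):
        for y in range(L):
            i = x * L + y
            for xn, yn in (((x + 1) % L, y), (x, (y + 1) % L)):
                # A bond between antiparallel spins is always deactivated.
                if spins[x][y] == spins[xn][yn] and random.random() < p:
                    parent[find(i)] = find(xn * L + yn)

    # Flip each cluster (all sites sharing a root) with 50 percent probability.
    flip = {}
    for x in range(L):
        for y in range(L):
            root = find(x * L + y)
            if root not in flip:
                flip[root] = (random.random() < 0.5)
            if flip[root]:
                spins[x][y] = -spins[x][y]

L, beta = 8, 0.5
spins = [[random.choice([-1, 1]) for _ in range(L)] for _ in range(L)]
for _ in range(20):
    swendsen_wang_sweep(spins, L, beta)
```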
As a benefit of the cluster algorithm we can make use of the cluster representation of the susceptibility

χ = (1/L^d) ⟨Σ_C |C|²⟩   (11.5.3)

in order to obtain a so-called improved estimator. Instead of measuring just
M[s]² for the given spin configuration, we sum the squares of all cluster sizes |C|.
Effectively, this increases the statistics by a factor 2^{N_C}, where N_C is the number
of clusters in the configuration.
Let us ﬁrst show that the cluster algorithm is ergodic. There is a ﬁnite
(although perhaps very small) probability that no bonds are activated. Then
each spin forms its own cluster. By ﬂipping these individual spin clusters one can
obviously reach any possible spin conﬁguration.
We still need to show that the cluster algorithm obeys detailed balance, i.e.
exp(−β H[s])w[s, s′ ] = exp(−β H[s′ ])w[s′ , s].
(11.5.4)
It is suﬃcient to consider just one pair of neighboring spins. If the two spins are
antiparallel they necessarily belong to diﬀerent clusters. After cluster ﬂip they
will be parallel with 50 percent probability. In the next sweep the bond between
them will then be activated with probability p and deactivated with probability
(1 − p). The probability to turn back into the original antiparallel conﬁguration
is then 1 (1 − p). The corresponding detailed balance relation then takes the form
2
1
1
exp(−βh(s, −s)) = exp(−βh(s, s)) (1 − p) ⇒
2
2
1
1
exp(−βJ ) = exp(βJ ) exp(−2βJ ),
2
2
(11.5.5)
which is indeed satisﬁed. With the other 50 percent probability the originally
antiparallel spins will remain antiparallel. In that case, the bond between them
cannot be activated and thus with 50 percent probability we return to the original
conﬁguration. Detailed balance is then trivially satisﬁed. Finally, let us assume
that the two spins are originally parallel. Then we need to distinguish between
two cases. First, we assume that the two spins are already indirectly connected
through other activated bonds. In that case, it is irrelevant if the direct bond
between them is activated or not. The two spins remain parallel and detailed
balance is trivially satisﬁed. Next, we assume that the two parallel spins are not
indirectly connected by activated bonds. Then the direct bond between them
is activated with probability p and the spins remain parallel. With probability
(1 − p) the bond is deactivated and the spins remain parallel only with 50 percent
probability. Again, detailed balance is then trivially satisﬁed. With the other 50
percent probability the two spins will become antiparallel. Then they cannot be
connected by an activated bond and they return into the parallel conﬁguration
with 50 percent probability. The detailed balance relation then again takes the
form
exp(−βh(s, s)) (1/2)(1 − p) = exp(−βh(s, −s)) (1/2) ⇒ exp(βJ) (1/2) exp(−2βJ) = exp(−βJ) (1/2).   (11.5.6)
Thus, in all cases detailed balance is indeed satisﬁed.
11.6 The Wolff Cluster Algorithm
The Wolff cluster algorithm is an interesting (and sometimes even more efficient)
variant of the Swendsen-Wang algorithm. The Swendsen-Wang algorithm is a
multi-cluster algorithm, i.e. all clusters in a configuration are identified and on
average half of them are flipped. The Wolff cluster algorithm, on the other hand,
is a single-cluster algorithm, i.e. a site is selected at random and only the one
cluster attached to this site is identified in this configuration. In contrast to the
Swendsen-Wang algorithm, the single cluster is then flipped with 100 percent
probability. In this case, no effort is wasted on identifying clusters which are
then not flipped. After the single cluster is flipped, a new bond configuration is
generated and the whole procedure is repeated.
As for the Swendsen-Wang algorithm, for the Wolff cluster algorithm one can
also construct an improved estimator for the susceptibility. While in the multi-cluster algorithm all clusters contribute |C|² to the susceptibility, in the single-cluster algorithm a cluster is selected with probability |C|/L^d (proportional
to its size |C|). Hence, bigger clusters are selected more frequently than smaller
ones and one must correct for this bias. In the single-cluster algorithm
the improved estimator for the susceptibility therefore takes the form

χ = (1/L^d) ⟨|C|² L^d/|C|⟩ = ⟨|C|⟩.   (11.6.1)
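A single-cluster update along these lines might be sketched as follows in Python. The returned cluster size is the per-update measurement whose average ⟨|C|⟩ gives the improved susceptibility estimator; the lattice size and temperature in the usage example are arbitrary choices:

```python
import math
import random

def wolff_update(spins, L, beta, J=1.0):
    """Grow one cluster from a random seed site, flip it, return its size |C|."""
    p = 1.0 - math.exp(-2.0 * beta * J)       # bond activation probability
    x0, y0 = random.randrange(L), random.randrange(L)
    seed = spins[x0][y0]
    cluster = {(x0, y0)}
    stack = [(x0, y0)]
    while stack:
        x, y = stack.pop()
        for xn, yn in (((x + 1) % L, y), ((x - 1) % L, y),
                       (x, (y + 1) % L), (x, (y - 1) % L)):
            # Only parallel neighbors can join, each bond tested once with prob. p.
            if (xn, yn) not in cluster and spins[xn][yn] == seed \
                    and random.random() < p:
                cluster.add((xn, yn))
                stack.append((xn, yn))
    for x, y in cluster:                      # flip with 100 percent probability
        spins[x][y] = -seed
    return len(cluster)                       # averaging |C| estimates chi

L, beta = 8, 0.5
spins = [[random.choice([-1, 1]) for _ in range(L)] for _ in range(L)]
sizes = [wolff_update(spins, L, beta) for _ in range(50)]
```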
Chapter 12
Quantum Statistical Mechanics
Until now we have investigated classical statistical mechanics. From now on we
will incorporate the principles of quantum mechanics into our considerations,
and hence we will move on to quantum statistical mechanics. Remarkably, the
structure of statistical mechanics is suﬃciently general that, unlike other physical
theories, it does not undergo radical changes upon quantization. In particular, the
same ensembles of thermodynamical equilibrium also make sense at the quantum
level.
12.1 Canonical Ensemble in Quantum Statistics
In classical statistical mechanics we have introduced the canonical ensemble via
its probability distribution

ρ[n] = (1/Z) exp(−βH[n]),   (12.1.1)
where H[n] is the classical Hamilton function of a system in the conﬁguration [n].
The probability distribution is normalized to

Σ_{[n]} ρ[n] = 1,   (12.1.2)
such that the canonical partition function is given by

Z = Σ_{[n]} exp(−βH[n]).   (12.1.3)
The thermal average of a physical quantity O is then given by

⟨O⟩ = (1/Z) Σ_{[n]} O[n] exp(−βH[n]).   (12.1.4)
While a classical system is characterized by its Hamilton function H, a quantum system is characterized by its Hamilton operator H. The Hamilton operator
can be viewed as a matrix in the Hilbert space of the theory. The solution of the
time-independent Schrödinger equation

H χ_n(x) = E_n χ_n(x),   (12.1.5)
determines the energy eigenvalues E_n and the corresponding energy eigenstates
χ_n(x). Quantum statistical mechanics is defined via the so-called statistical operator, or density matrix,

ρ = (1/Z) exp(−βH),   (12.1.6)
where the exponential of the Hamiltonian is deﬁned by the corresponding power
series. In quantum statistical mechanics the density matrix ρ plays the same role
as the probability distribution ρ[n] in classical statistical mechanics. In particular,
we can define the matrix element

ρ[n] = ∫ d³x χ_n(x)* ρ χ_n(x) = (1/Z) ∫ d³x χ_n(x)* exp(−βH) χ_n(x)
     = (1/Z) exp(−βE_n) ∫ d³x χ_n(x)* χ_n(x) = (1/Z) exp(−βE_n).   (12.1.7)
Here we have used the normalization condition for the quantum mechanical wave
function

∫ d³x χ_n(x)* χ_n(x) = 1.   (12.1.8)
The quantity ρ[n] shall be interpreted as the probability to ﬁnd the quantum
system (which is again a member of an ensemble) in the state n. In analogy to
classical statistical mechanics we now demand that

Σ_n ρ[n] = 1,   (12.1.9)

which implies

Z = Σ_n exp(−βE_n)   (12.1.10)
for the quantum statistical partition function. In particular, we see that the
quantum mechanical state n plays the same role as the classical conﬁguration
[n]. Similarly, the quantum mechanical Hilbert space is analogous to the classical
conﬁguration or phase space. The sum over all states in the Hilbert space can
also be written as a trace, i.e.
Z = Tr exp(−βH ).
(12.1.11)
In this notation the normalization condition for the density matrix reads
Trρ = 1,
(12.1.12)
and the thermal average of a quantum mechanical observable (described by a
Hermitean operator O) takes the form

⟨O⟩ = Tr[Oρ] = (1/Z) Tr[O exp(−βH)].   (12.1.13)
If O is diagonal in the basis of the energy eigenstates, i.e. if the observable O and
the energy are simultaneously measurable with arbitrary precision, one can write
⟨O⟩ = (1/Z) Σ_n O[n] exp(−βE_n),   (12.1.14)

where

O[n] = ∫ d³x χ_n(x)* O χ_n(x)   (12.1.15)

is the matrix element of the operator O in the state n.
In analogy to classical statistical mechanics, for the thermal average of the
energy we can again write

E = ⟨H⟩ = Tr[Hρ] = (1/Z) Tr[H exp(−βH)] = −∂ log Z/∂β.   (12.1.16)
Also the entropy is defined analogously to the classical case and is given by
S = −kB Trρ log ρ.
(12.1.17)
In the canonical ensemble we hence obtain

S = −k_B (1/Z) Tr[exp(−βH)(−βH − log Z)] = E/T + k_B log Z.   (12.1.18)
Again introducing the free energy F as

Z = exp(−βF) ⇒ F = −(1/β) log Z,   (12.1.19)
we obtain

S = E/T − F/T ⇒ F = E − TS,   (12.1.20)

exactly as in classical statistical mechanics.
In classical mechanics we have seen that the time evolution of a general probability distribution is given by the Poisson bracket

dρ/dt = {ρ, H}.   (12.1.21)
Similarly, the quantum mechanical evolution equation for a general density matrix
is

iℏ dρ/dt = [ρ, H].   (12.1.22)
In the canonical ensemble with ρ = exp(−βH )/Z we have [ρ, H ] = 0 such that
dρ/dt = 0. This shows that the canonical ensemble describes a stationary, time-independent distribution, which is what one expects in thermal equilibrium.
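These relations are easy to check numerically for a system with a diagonal Hamiltonian. The following Python sketch uses a two-level system with arbitrarily chosen energies (an illustrative assumption) and verifies the normalization Tr ρ = 1 and the canonical relation S = k_B(βE + log Z) between entropy, energy, and partition function:

```python
import math

# Two-level system with (arbitrarily chosen) energies E0 = 0 and E1 = 1.
# In the energy eigenbasis the density matrix rho = exp(-beta H)/Z is diagonal,
# so it is represented here by its two eigenvalues.
beta = 2.0
energies = [0.0, 1.0]
Z = sum(math.exp(-beta * En) for En in energies)
rho = [math.exp(-beta * En) / Z for En in energies]

# Normalization Tr rho = 1:
print(abs(sum(rho) - 1.0) < 1e-12)                     # -> True

# Energy E = Tr[H rho] and entropy S = -kB Tr[rho log rho]; in the canonical
# ensemble they satisfy S/kB = beta*E + log Z, i.e. S = E/T + kB log Z.
E_avg = sum(p * En for p, En in zip(rho, energies))
S_over_kB = -sum(p * math.log(p) for p in rho)
print(abs(S_over_kB - (beta * E_avg + math.log(Z))) < 1e-12)   # -> True
```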
12.2 Canonical Ensemble for the Harmonic Oscillator
Let us illustrate quantum statistical mechanics using the harmonic oscillator. All
we need to know at this point is its quantum mechanical energy spectrum
E_n = ℏω(n + 1/2) = ε(n + 1/2),   (12.2.1)
where ω is the angular frequency of the oscillator, ε = ℏω, and n = 0, 1, 2, ... Just as for
the classical particle on the energy ladder, the energies of the quantum mechanical
eigenstates of the harmonic oscillator are equally spaced. The quantum statistical
partition function is given by
Z = Tr exp(−βH) = Σ_{n=0}^{∞} exp(−βE_n) = exp(−βε/2)/(1 − exp(−βε)).   (12.2.2)
Up to the irrelevant constant factor exp(−βε/2) this is identical with the classical
partition function of the particle on the ladder. Hence, we can use everything
we have learned earlier about the classical particle on the energy ladder and
reinterpret it in the context of the quantum mechanical harmonic oscillator.
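The equivalence of the closed form above with the direct sum over states is easy to verify numerically. A Python sketch (β and ε = 1 are arbitrary choices for illustration):

```python
import math

def Z_exact(beta, eps=1.0):
    """Closed form of the oscillator partition function (geometric series)."""
    return math.exp(-beta * eps / 2.0) / (1.0 - math.exp(-beta * eps))

def Z_truncated(beta, n_max, eps=1.0):
    """Direct sum over the lowest n_max oscillator states E_n = eps*(n + 1/2)."""
    return sum(math.exp(-beta * eps * (n + 0.5)) for n in range(n_max))

beta = 1.0
# The truncated sum converges rapidly to the closed form.
print(abs(Z_exact(beta) - Z_truncated(beta, 200)) < 1e-12)   # -> True
```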
Chapter 13
Hot Quantum Gases in the Early Universe
Quantum mechanical particles come as two fundamentally diﬀerent types —
bosons and fermions — which diﬀer in both their spin and their statistics. Bosons
have integer and fermions have half-integer spin (an intrinsic angular momentum of a particle). Furthermore, bosons have a wave function that is symmetric
against particle permutations, while fermions have antisymmetric wave functions.
As a consequence, an arbitrary number of bosons but at most one fermion can occupy a given quantum state. This gives rise to two fundamentally diﬀerent forms
of quantum statistics. Bosons obey the so-called Bose-Einstein statistics, while
fermions obey Fermi-Dirac statistics. To illustrate these issues we will consider
hot quantum gases in the early Universe.
13.1 Bose-Einstein Statistics and Background Radiation
Historically the ﬁrst theoretical formula of quantum physics was Planck’s law
for black body radiation. For example, the cosmic background radiation — a
remnant of the big bang — is of the black body type. Planck was able to explain
the energy distribution of black body radiation by assuming that light of angular
frequency ω = 2πν is absorbed and emitted by matter only in discrete amounts
of energy

E = ℏω.   (13.1.1)
Nowadays, we would associate Planck’s radiation formula with a rather advanced
ﬁeld of quantum physics. Strictly speaking, it belongs to the quantum statistical
mechanics of the theory of the free electromagnetic ﬁeld, i.e. it is part of quantum
ﬁeld theory — clearly a subject of graduate courses. It is amazing that theoretical quantum physics started with such an advanced topic. Still, we will try to
understand Planck’s radiation formula in what follows.
Thermodynamics deals with systems of many particles that are in thermal
equilibrium with each other and with a thermal bath. The energies of the individual states are statistically distributed, following a Boltzmann distribution for
a given temperature T . The thermal statistical ﬂuctuations are of a diﬀerent nature than those related to quantum uncertainty. Thermal ﬂuctuations are present
also at the classical level, and e.g. reﬂect our inability to treat a system with a
large number of particles exactly. Following the classical concept of reality, this is
possible, at least in principle. In practice, it is, however, much more appropriate
to use a classical statistical description. In the thermodynamics of photons, i.e. in
quantum statistical mechanics, we deal with thermal and quantum ﬂuctuations
at the same time.
A system of photons in thermal equilibrium has been with us from the beginning of our Universe. Immediately after the big bang the energy density —
and hence the temperature — was extremely high, and all kinds of elementary
particles (among them photons, electrons and their antiparticles — positrons
— as well as neutrinos and antineutrinos) have existed as an extremely hot gas
ﬁlling all of space. These particles interacted with each other e.g. via Compton
scattering. As the Universe expanded, the temperature decreased and electrons
and positrons annihilated into a lot of photons. A very small fraction of the
electrons (actually all the ones in the Universe today) exceeded the number of
positrons and thus survived annihilation. At this time — a few seconds after the
big bang — no atom had ever been formed. As a consequence, there were no
characteristic colors of selected spectral lines. This is what we mean when we
talk about the cosmic photons as black body radiation. About 300000 years after
the big bang the Universe had expanded and cooled so much that electrons and
atomic nuclei could settle down to form neutral atoms. At that time the Universe
became transparent. The photons that emerged from the mass extinction of electrons and positrons were left alone, and are still ﬂoating through our Universe.
Of course, in the last 14 billion years the Universe has expanded further and the
cosmic photon gas has cooled down accordingly. Today the temperature of the
cosmic background radiation is 2.735 K. It is amazing that this temperature is to
very high accuracy the same, no matter what corner of the Universe the photons
come from. This was ﬁrst explained by Alan Guth from MIT using the idea of
13.1. BOSEEINSTEIN STATISTICS AND BACKGROUND RADIATION 111
the inﬂationary Universe.
How does one measure the temperature of a system of photons? The temperature is deﬁned via the Boltzmann distribution, in our case by the intensity
of radiation with a certain frequency. Hence, by measuring the photon spectrum
one can determine the temperature. This is exactly what the antennae of the WMAP
satellite (the Wilkinson Microwave Anisotropy Probe) are doing. Equipped with
the idea of Planck, let us now derive this spectrum theoretically. For simplicity
we replace the Universe by a large box of spatial size L × L × L with periodic
boundary conditions. This is only a technical trick that will allow us to simplify
the calculation. At the end we let L → ∞. We will proceed in three steps. First
we work classically, and classify all possible modes of the electromagnetic ﬁeld
in the box. Then we switch to quantum mechanics and populate these modes
with photons. Finally, we do quantum statistical mechanics by summing over all
quantum states using Boltzmann’s distribution.
What are the modes of the electromagnetic ﬁeld in an L3 periodic box? First
of all, we can classify them by their wave vector k, which is now restricted to
discrete values

k = (2π/L) m,   m_i ∈ ℤ.   (13.1.2)
The frequency of this mode is given by
ω = kc.
(13.1.3)
Each of the modes can exist in two polarization states.
Now we turn to quantum mechanics and populate the classical modes with
photons. As we have learned, a mode of frequency ω can host photons of energy

E(k) = ℏω = ℏkc   (13.1.4)
only. Photons are bosons. This means that an arbitrary number of them can
occupy a single mode of the electromagnetic ﬁeld. Electrons and neutrinos, for
example, behave very diﬀerently. They are fermions, i.e. at most one of them
can occupy a single mode. All elementary particles we know are either fermions
or bosons. We can completely classify a quantum state of the electromagnetic
ﬁeld by specifying the number of photons n(k) ∈ {0, 1, 2, ...} occupying each
mode (characterized by wave vector k and polarization, which we suppress in
our notation). It is important to note that it does not matter “which photon”
occupies which mode. Individual photons are indistinguishable from each other,
they are like perfect twins. Hence specifying their number per mode determines
their state completely.
Now that we have classiﬁed all quantum states of the electromagnetic ﬁeld by
specifying the photon occupation numbers for all modes, we can turn to quantum
statistical mechanics. Then we must evaluate the partition function by summing
over all states. Since the modes are completely independent of one another, the
partition function

Z = Π_k Z(k)   (13.1.5)
factorizes into partition functions for each individual mode. Here we consider a
single mode partition function
Z(k) = Σ_{n(k)=0}^{∞} exp(−n(k)E(k)/k_B T) = Σ_{n(k)=0}^{∞} exp(−βn(k)ℏkc).   (13.1.6)
Each mode state is weighted by its Boltzmann factor exp(−n(k)E (k )/kB T ),
which is determined by its total energy of photons n(k)E (k ) occupying that
mode and by the temperature T. Now we make use of the well-known summation formula for a geometric series
Σ_{n=0}^{∞} xⁿ = 1/(1 − x).   (13.1.7)
Using x = exp(−βℏkc) we obtain the partition function corresponding to Bose-Einstein statistics

Z(k) = 1/(1 − exp(−βℏkc)).   (13.1.8)
We are interested in the statistical average of the energy in a particular mode,
which is given by

⟨n(k)E(k)⟩ = (1/Z(k)) Σ_{n(k)=0}^{∞} n(k)E(k) exp(−βn(k)E(k)) = −∂ log Z(k)/∂β = ℏkc/(exp(βℏkc) − 1).   (13.1.9)
Finally, we are interested in the average total energy as a sum over all modes

⟨E⟩ = 2 Σ_k ⟨n(k)E(k)⟩ → 2 (L/(2π))³ ∫ d³k ⟨n(k)E(k)⟩.   (13.1.10)
Here a factor 2 arises due to the two polarization states. In the last step we have
performed the inﬁnite volume limit L → ∞. Then the sum over discrete modes
turns into an integral. It is no surprise that our result grows in proportion to the
volume L3 . We should simply consider the energy density ρ = E /L3 . We can
now perform the angular integration and also replace k = ω/c to obtain
ρ = (1/(π²c³)) ∫₀^∞ dω ℏω³/(exp(βℏω) − 1).   (13.1.11)
Before we do the integral we read off the energy density per unit frequency for
modes of a given angular frequency ω:

dρ(ω)/dω = (1/(π²c³)) ℏω³/(exp(ℏω/k_B T) − 1).   (13.1.12)
This is Planck’s formula that was at the origin of quantum mechanics. If you
have followed the above arguments: congratulations, you have just mastered a
calculation in quantum ﬁeld theory!
Since we have worked quite hard to produce this important result, let us
discuss it in some detail. Let us first consider the classical limit ℏ → 0. Then we
obtain the classical Rayleigh-Jeans law

dρ(ω)/dω = ω² k_B T/(π²c³).   (13.1.13)
Integrating this over all frequencies gives a divergent result from the high-frequency end of the spectrum. This is the so-called ultraviolet Jeans catastrophe.
The classical thermodynamics of the electromagnetic ﬁeld gives an unphysical
result. Now we go back to Planck’s quantum result and perform the integral over
all frequencies ω. This gives the Stefan-Boltzmann law

ρ = π²(k_B T)⁴/(15 ℏ³c³),   (13.1.14)
which is not only finite, but also agrees with experiment. Again, in the classical
limit the result would be divergent. It is interesting that for high temperatures
and for low frequencies, i.e. for k_B T ≫ ℏω, Planck’s formula also reduces to the
classical result. Quantum effects become important only in the low-temperature and
high-frequency regimes.
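The integral behind the Stefan-Boltzmann law can be checked numerically: substituting x = βℏω in the energy density integral leaves the dimensionless integral ∫₀^∞ x³/(eˣ − 1) dx = π⁴/15. A Python sketch (the integration cutoffs and step count are arbitrary choices):

```python
import math

# Check  I = ∫_0^∞ x^3/(e^x - 1) dx = π^4/15 ≈ 6.4939,
# which turns Planck's formula into the Stefan-Boltzmann law.
def integrand(x):
    return x**3 / math.expm1(x)   # expm1(x) = e^x - 1, accurate for small x

# Simple trapezoidal rule on [1e-6, 50]; the integrand vanishes like x^2 at
# the origin and decays like x^3 e^{-x} at large x, so the cutoffs are harmless.
a, b, n = 1e-6, 50.0, 200_000
h = (b - a) / n
total = 0.5 * (integrand(a) + integrand(b))
total += sum(integrand(a + i * h) for i in range(1, n))
total *= h

print(abs(total - math.pi**4 / 15) < 1e-4)   # -> True
```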
Now we can understand how WMAP measures the temperature of the cosmic
background radiation. The energy density is measured for various frequencies,
and is then compared with Planck’s formula, which leads to a high precision
determination of T . The WMAP data tell us a lot about how our Universe
began. In fact, the early history of the Universe is encoded in the photons left
over from the big bang. Sometimes one must understand the very small, before
one can understand the very large.
13.2 Thermodynamical Distributions
Let us define the Bose-Einstein distribution function

f(k) = 1/(exp(βℏkc) − 1).   (13.2.1)
Then we can write the energy and number density of a free photon gas as

ρ = E/L³ = (2/(2π)³) ∫ d³k ℏkc f(k),
n = N/L³ = (2/(2π)³) ∫ d³k f(k).   (13.2.2)
Let us now consider a more general system of free relativistic particles of mass
m. The energy of a particle then is

E = √((pc)² + (mc²)²) = √((ℏkc)² + (mc²)²).   (13.2.3)
Here the momentum p = ℏk characterizes the state of the particle. The particle
may also have a spin S. The component of the spin in the direction of the momentum
is the so-called helicity, which is given by S_k = −S, ..., S, i.e. for a particle with
spin S there are g = 2S + 1 helicities. The spin determines the statistics of the
particles. Particles with integer spin are bosons, while particles with half-integer
spin are fermions. A particle state can hence be characterized by k and S_k.
Photons (and other massless particles) are slightly diﬀerent. Photons are bosons
with spin S = 1 but they only have two helicities Sk = ±1, i.e. g = 2. These are
the two polarization states that entered the derivation of Planck’s formula.
Now let us consider massive bosons. An arbitrary number of bosons can
occupy the same state, and the grand canonical partition function thus takes the
form
Z (β, µ) = Tr exp(−β (H − µN )).
(13.2.4)
The chemical potential is a useful concept only if particle number is conserved.
This is not always the case in relativistic theories. In particular, in an interacting
photon gas particle number is not conserved. For example, we have said that the
cosmic background radiation originated from a mass extinction of particles —
namely the annihilation of electrons and their antiparticles (positrons). In this
process an enormous number of photons has been created. Photon number hence
was not conserved in this process. Therefore a chemical potential for photons is
in general not a useful concept. The number of other bosons, however, may be
conserved. In such cases, it is appropriate to introduce a chemical potential µ.
The trace in eq. (13.2.4) extends over all possible states:

Z(β, µ) = Π_{k,S_k} Σ_{n(k,S_k)=0}^{∞} exp(−β(√((ℏkc)² + (mc²)²) − µ) n(k, S_k))
        = Π_{k,S_k} 1/(1 − exp(−β(√((ℏkc)² + (mc²)²) − µ))).   (13.2.5)
Let us compute the expectation value of the particle number:

⟨N⟩ = (1/Z(β, µ)) Tr[N exp(−β(H − µN))] = ∂ log Z(β, µ)/∂(βµ)
    = Σ_{k,S_k} ∂/∂(βµ) log[1/(1 − exp(−β(√((ℏkc)² + (mc²)²) − µ)))]
    = g Σ_k exp(−β(√((ℏkc)² + (mc²)²) − µ)) / (1 − exp(−β(√((ℏkc)² + (mc²)²) − µ)))
    = g Σ_k 1/(exp(β(√((ℏkc)² + (mc²)²) − µ)) − 1).   (13.2.6)
Up to now we have summed over momenta, i.e. we have assumed that we are in
a finite volume V = L³, e.g. with periodic boundary conditions, and thus with
wave numbers

k = (2π/L) n,   n ∈ ℤ³.   (13.2.7)
In the infinite volume limit we obtain

Σ_k → (L/(2π))³ ∫ d³k,   (13.2.8)
such that the particle density takes the form

n = ⟨N⟩/L³ = (g/(2π)³) ∫ d³k f(k),   (13.2.9)

where

f(k) = 1/(exp(β(√((ℏkc)² + (mc²)²) − µ)) − 1)   (13.2.10)
is a generalization of the Bose-Einstein distribution of eq. (13.2.1) to bosons with
nonzero mass m at nonzero chemical potential µ. Correspondingly, one finds
for the energy density

ρ = ⟨H⟩/L³ = (g/(2π)³) ∫ d³k √((ℏkc)² + (mc²)²) f(k).   (13.2.11)
Let us now repeat the calculation for fermions. Since at most one fermion
can occupy a given quantum state, one obtains

Z(β, µ) = Π_{k,S_k} Σ_{n(k,S_k)=0}^{1} exp(−β(√((ℏkc)² + (mc²)²) − µ) n(k, S_k))
        = Π_{k,S_k} (1 + exp(−β(√((ℏkc)² + (mc²)²) − µ))).   (13.2.12)
The corresponding expectation value of the particle number is

⟨N⟩ = ∂/∂(βµ) Σ_{k,S_k} log(1 + exp(−β(√((ℏkc)² + (mc²)²) − µ)))
    = g Σ_k exp(−β(√((ℏkc)² + (mc²)²) − µ)) / (1 + exp(−β(√((ℏkc)² + (mc²)²) − µ)))
    = g Σ_k 1/(exp(β(√((ℏkc)² + (mc²)²) − µ)) + 1).   (13.2.13)
From this we obtain the particle density

n = ⟨N⟩/L³ = (g/(2π)³) ∫ d³k f(k),   (13.2.14)
now with the so-called Fermi-Dirac distribution

f(k) = 1/(exp(β(√((ℏkc)² + (mc²)²) − µ)) + 1).   (13.2.15)
Let us also consider the non-relativistic limit of the Bose-Einstein and Fermi-Dirac distributions. Then k_B T ≪ mc² and hence

f(k) = 1/(exp(β(√((ℏkc)² + (mc²)²) − µ)) ± 1) ∼ exp(−β(mc² + ℏ²k²/(2m) − µ)),   (13.2.16)

which is just the Boltzmann distribution well-known from classical statistical
mechanics. This implies

n = g (m k_B T/(2πℏ²))^{3/2} exp(−β(mc² − µ)),   ρ = mc² n.   (13.2.17)
Let us now consider massless fermions with a negligible chemical potential
(µ ≪ k_B T), and with spin 1/2 (g = 2). Then

n = (2/(2π)³) 4π ∫₀^∞ dk k²/(exp(βℏkc) + 1) = (8π/(2π)³) (1/(β³ℏ³c³)) (1 − 1/4) Γ(3) ζ(3) = 3ζ(3)(k_B T)³/(2π²ℏ³c³),

ρ = (2/(2π)³) 4π ∫₀^∞ dk k² ℏkc/(exp(βℏkc) + 1) = (8π/(2π)³) (1/(β⁴ℏ³c³)) (1 − 1/8) Γ(4) ζ(4) = 7π²(k_B T)⁴/(120ℏ³c³).   (13.2.18)
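The fermionic integrals used here are special cases of ∫₀^∞ x^{s−1}/(eˣ + 1) dx = (1 − 2^{1−s}) Γ(s) ζ(s), which can be checked numerically. A Python sketch (the integration cutoff and the series truncation for ζ(3) are arbitrary choices):

```python
import math

# Check  ∫_0^∞ x^{s-1}/(e^x + 1) dx = (1 - 2^{1-s}) Γ(s) ζ(s)
# for s = 3 and s = 4, the two integrals entering the massless-fermion densities.
def fermi_integral(power, a=1e-6, b=50.0, n=200_000):
    """Trapezoidal estimate of ∫ x^power/(e^x + 1) dx over [a, b]."""
    f = lambda x: x**power / (math.exp(x) + 1.0)
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return total * h

zeta3 = sum(1.0 / k**3 for k in range(1, 2000))        # ζ(3) ≈ 1.2020569
# (1 - 1/4) Γ(3) ζ(3) = (3/2) ζ(3);  (1 - 1/8) Γ(4) ζ(4) = 7π^4/120
print(abs(fermi_integral(2) - 1.5 * zeta3) < 1e-4)             # -> True
print(abs(fermi_integral(3) - 7.0 * math.pi**4 / 120.0) < 1e-4)  # -> True
```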
13.3 Entropy Conservation and Neutrino Temperature
The second law of thermodynamics states that entropy never decreases — and
indeed usually increases. In general, an expanding system — for example, an
exploding gas — will not be in thermodynamical equilibrium. The expansion
of the Universe, however, is slow and the system remains in thermodynamical
equilibrium while it expands and cools. Such processes are called adiabatic. For
them the total entropy is conserved, and they are, in fact, reversible. The entropy
in a volume V = L3 is given by
S = E/T − F/T = ((ρ + p)/T) L³.   (13.3.1)
For a gas of relativistic particles (either bosons or fermions) the energy and
pressure are related by the equation of state
p = ρ/3,   (13.3.2)

and hence

S = (4ρ/(3T)) L³.   (13.3.3)
Different particle species will remain in thermal equilibrium only if they interact
with each other often enough. Since the Universe expands, particle densities
become smaller and smaller, and ultimately the various particle species decouple
from each other. Still, one may assign a temperature Ti to each particle species
i. We want to compare these temperatures with the one of the photons T = Tγ ,
which today is 2.735 K. If the various particle species have diﬀerent temperatures
we obtain
S = (2π²k_B⁴/(45ℏ³c³)) (Σ_{bosons} g_i T_i³ + (7/8) Σ_{fermions} g_i T_i³) L³.   (13.3.4)
We will use the conservation of entropy to determine the temperature of the neutrinos in the Universe. Let us go back to about 1 sec after the big bang. At that time we have a system of electrons, positrons, photons, and neutrinos. The neutrinos are no longer in thermal equilibrium with the other particles at that moment, because they interact only weakly. Before that time neutrinos, electrons, positrons, and photons had the same temperature T. When electrons and positrons annihilate, their entropy goes into the photons, which thus get heated up to a higher temperature T_γ. Before the electron-positron annihilation we have
we have
4
4
2π 2 kB
2π 2 kB
7
7
(gγ + (ge + ge + gνe + gνµ + gντ ))(T L)3 =
(4 + 5 × 2)(T L)3 .
3 c3
3 c3
45
8
45
8
(13.3.5)
After the electronpositron annihilation all entropy is in photons and neutrinos.
The neutrino temperature has decreased to Tν and the size of the Universe is now
L′ such that
Tν L′ = T L.
(13.3.6)
S=
The entropy is then given by

S = (2π²k_B⁴/(45ℏ³c³)) (g_γ (T_γ L′)³ + (7/8)(g_νe + g_νµ + g_ντ)(T_ν L′)³)
  = (2π²k_B⁴/(45ℏ³c³)) (2(T_γ L′)³ + (7/8) × 3 × 2 (T_ν L′)³).   (13.3.7)
Using entropy conservation we obtain

(T_γ L′)³ + (7/8) 3 (T_ν L′)³ = (1 + (7/8) 5)(T_ν L′)³ ⇒
(T_γ/T_ν)³ = 1 + (7/8) 2 = 11/4 ⇒ T_ν = (4/11)^{1/3} T_γ.   (13.3.8)
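The arithmetic of Eq. (13.3.8) is a one-line evaluation. The sketch below uses 2.725 K for the photon temperature (a commonly quoted modern CMB value; the text rounds to 2.7 K):

```python
# Entropy bookkeeping of Eq. (13.3.8): before annihilation 1 + (7/8)*5,
# afterwards (T_gamma/T_nu)^3 + (7/8)*3, which fixes the ratio to 11/4.
ratio_cubed = (1.0 + 7.0 / 8.0 * 5.0) - 7.0 / 8.0 * 3.0  # = 11/4

T_gamma = 2.725  # photon temperature today, in Kelvin
T_nu = (4.0 / 11.0) ** (1.0 / 3.0) * T_gamma
# T_nu ≈ 1.95 K, the expected cosmic neutrino background temperature
```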
During the following expansion of the Universe these temperatures are simply
redshifted, but their ratio remains ﬁxed. The photons still interact with charged
matter until they decouple about 300000 years after the big bang, when neutral atoms are formed. The number of photons is, however, much larger than the number of charged particles, such that the interactions cannot change the temperature of the photons. Since one observes the cosmic background radiation at a temperature T_γ = 2.7 K, we expect a cosmic neutrino background of temperature T_ν = 1.9 K. Unfortunately, neutrinos interact so weakly that the cosmic neutrino background radiation has not yet been detected. Detecting it would be very interesting, because it would tell us something about the so-called lepton number asymmetry of the Universe.
Chapter 14
Lattice Vibrations
In this chapter we discuss the vibrations of the crystal lattice of a solid. We
will model the vibrations of a cubic lattice of heavy atomic nuclei (actually ions)
with coupled harmonic oscillators. The resulting quantum mechanical model
can be solved analytically and gives rise to quantized lattice vibrations, known
as phonons. Just like photons, phonons are bosons and may form an ideal Bose gas. However, unlike photons, they do not have a linear energy-momentum dispersion relation for large values of the momentum. At low temperature T, the phonon gas determines the specific heat of solids, which is proportional to T³.
This behavior was ﬁrst derived by Peter Debye in 1912. In our considerations,
we will completely ignore the electrons that propagate in the background of the
crystal lattice of ions. However, there are interesting eﬀects that result from
the coupling of electrons and phonons. In particular, at very low temperatures
the electromagnetic repulsion between two electrons, which is mediated by the
exchange of photons, may be overcome by an attractive interaction mediated by
phonons. This attraction gives rise to the formation of Cooper pairs of electrons,
i.e. two electrons form a boson. The condensation of Cooper pair bosons gives rise
to superconductivity of metals at very low temperatures. In this chapter, ignoring
the electrons, we concentrate entirely on the physics of phonons. For simplicity,
we will ﬁrst consider a model in one dimension. In this context, we will also
discuss in which sense phonons may be considered as “particles”. Since they are
actually quantized waves, following Frank Wilczek, it may be more appropriate to
call them “wavicles”. In particular, in contrast to particles, a priori, phonons and other wavicles (such as photons or electrons) do not have a well-defined position and are thus not localizable in the usual sense. Remarkably, the model that we
use to describe a vibrating solid has both a particle and a wavicle interpretation.
The resulting particle-wavicle complementarity should not be confused with the particle-wave duality sometimes discussed in the quantum mechanics textbook literature.
14.1  A 1-dimensional Model for Ions Forming a Crystal
The ions inside a solid form a regular crystal lattice. We will consider a simple
analytically solvable model in which the ions are described as point particles of
mass M . As a warmup exercise, let us ﬁrst consider the problem in one spatial
dimension. The ions are then described by their positions xn . Here the index
n ∈ Z enumerates the ions according to their order in the crystal. In particular,
the equilibrium position of the ion with label n is na, where a is the crystal lattice
spacing. In the model the ions are coupled only to their nearest neighbors on the
lattice via a harmonic oscillator potential. The Hamiltonian of the lattice model
is then given by
H = Σ_{n∈Z} [p_n²/(2M) + (M ω₀²/2)(x_{n+1} − x_n − a)²].   (14.1.1)
As a next step, we introduce the displacement of each ion from its equilibrium
position
y_n = x_n − na,   (14.1.2)
such that

H = Σ_{n∈Z} [p_n²/(2M) + (M ω₀²/2)(y_{n+1} − y_n)²],   (14.1.3)
and we perform a discrete Fourier transform of the variables y_n with respect to their index n, i.e.

ỹ(k) = Σ_{n∈Z} y_n exp(−ikna).   (14.1.4)

The wave number k takes values in the periodic Brillouin zone k ∈ ]−π/a, π/a]. Note that indeed ỹ(k + 2π/a) = ỹ(k). Since the original variables y_n are real-valued (and the corresponding quantum mechanical operator is thus Hermitean), one obtains ỹ(−k) = ỹ(k)†. The inverse Fourier transform takes the form

y_n = (a/2π) ∫_{−π/a}^{π/a} dk ỹ(k) exp(ikna).   (14.1.5)
Let us now rewrite the potential energy of our model in terms of the Fourier transformed variables ỹ(k), i.e.

V = Σ_{n∈Z} (M ω₀²/2)(y_{n+1} − y_n)²
  = (M ω₀²/2) (a/2π)² ∫_{−π/a}^{π/a} dk ∫_{−π/a}^{π/a} dk′ ỹ(k)† ỹ(k′)
    × Σ_{n∈Z} [exp(ik(n+1)a) − exp(ikna)][exp(−ik′(n+1)a) − exp(−ik′na)]
  = (a/2π) ∫_{−π/a}^{π/a} dk (M ω₀²/2) k̂² a² ỹ(k)† ỹ(k).   (14.1.6)
Here we have used

(a/2π) Σ_{n∈Z} exp(i(k − k′)na) = δ(k − k′),   (14.1.7)

as well as

2 − exp(ika) − exp(−ika) = 2[1 − cos(ka)] = [2 sin(ka/2)]² = k̂² a².   (14.1.8)

In the last step we have introduced

k̂ = (2/a) sin(ka/2).   (14.1.9)
Similarly, we consider the kinetic energy

T = Σ_{n∈Z} p_n²/(2M),   p_n = −iℏ d/dy_n.   (14.1.10)
By again applying the Fourier transform, we obtain

p̃(k) = Σ_{n∈Z} p_n exp(−ikna),   p̃(−k) = p̃(k)†,   (14.1.11)

and correspondingly

p_n = (a/2π) ∫_{−π/a}^{π/a} dk p̃(k) exp(ikna).   (14.1.12)
The kinetic energy can now be written as

T = Σ_{n∈Z} p_n²/(2M)
  = (a/2π)² ∫_{−π/a}^{π/a} dk ∫_{−π/a}^{π/a} dk′ (1/2M) p̃(k)† p̃(k′) Σ_{n∈Z} exp(i(k − k′)na)
  = (a/2π) ∫_{−π/a}^{π/a} dk (1/2M) p̃(k)† p̃(k).   (14.1.13)
Finally, the Hamilton operator takes the form

H = (a/2π) ∫_{−π/a}^{π/a} dk [p̃(k)† p̃(k)/(2M) + (M ω₀²/2) k̂² a² ỹ(k)† ỹ(k)].   (14.1.14)
This represents a set of harmonic oscillators (one for each value of the wave number k) with the k-dependent frequency

ω(k) = ω₀ k̂ a = 2ω₀ |sin(ka/2)|.   (14.1.15)

While two individual ions would oscillate against each other with a fixed frequency ω₀, the crystal as a whole supports vibrations of arbitrary frequency ω(k) between 0 and 2ω₀.
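A minimal numerical sketch (working in the hypothetical units ω₀ = a = 1) confirms this band structure: the frequencies ω(k) fill the range [0, 2ω₀], vanishing at k = 0 and reaching the maximum at the Brillouin zone edge, with a linear slope ω₀a at small k.

```python
import math

def omega(k, omega0=1.0, a=1.0):
    """Dispersion relation of Eq. (14.1.15): ω(k) = 2 ω₀ |sin(ka/2)|."""
    return 2.0 * omega0 * abs(math.sin(k * a / 2.0))

# sample the Brillouin zone ]-π/a, π/a] with a = 1
ks = [math.pi * i / 1000.0 for i in range(-999, 1001)]
band = [omega(k) for k in ks]
# min(band) = 0 at k = 0; max(band) = 2ω₀ at the zone edge k = π/a;
# for small k, ω(k) ≈ ω₀ a |k|
```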
14.2
Phonon Creation and Annihilation Operators
The quantized vibrations of a solid are known as phonons. In the context of a
single harmonic oscillator, we are familiar with raising and lowering operators.
This suggests introducing

a(k) = (1/√2)(α(k) ỹ(k) + i p̃(k)/(ℏ α(k))),   a(k)† = (1/√2)(α(k) ỹ(k)† − i p̃(k)†/(ℏ α(k))),   (14.2.1)

with

α(k) = √(M ω(k)/ℏ).   (14.2.2)
We then obtain

[a(k), a(k′)†] = −(i α(k)/(2ℏ α(k′))) [ỹ(k), p̃(k′)†] + (i α(k′)/(2ℏ α(k))) [p̃(k), ỹ(k′)†]
              = −(i α(k)/(2ℏ α(k′))) [ỹ(k), p̃(−k′)] + (i α(k′)/(2ℏ α(k))) [p̃(k), ỹ(−k′)].   (14.2.3)
The commutation relations of the coordinates and momenta are given by

[y_n, y_m] = 0,   [p_n, p_m] = 0,   [y_n, p_m] = iℏ δ_nm,   (14.2.4)

which implies

[ỹ(k), p̃(−k′)] = Σ_{n,m∈Z} [y_n, p_m] exp(−ikna + ik′ma) = iℏ Σ_{n∈Z} exp(i(k′ − k)na) = iℏ (2π/a) δ(k − k′).   (14.2.5)

This finally leads to

[a(k), a(k′)†] = (2π/a) δ(k − k′).   (14.2.6)
Similarly, we also have
[a(k), a(k′ )] = 0, [a(k)† , a(k′ )† ] = 0.
(14.2.7)
Using

a(k)† a(k) = (1/2)(α(k) ỹ(k)† − i p̃(k)†/(ℏ α(k)))(α(k) ỹ(k) + i p̃(k)/(ℏ α(k)))
           = (1/2)(α(k)² ỹ(k)† ỹ(k) + p̃(k)† p̃(k)/(ℏ² α(k)²)) + (i/2ℏ)[ỹ(−k) p̃(k) − p̃(−k) ỹ(k)],   (14.2.8)

the Hamilton operator now takes the form

H = (a/2π) ∫_{−π/a}^{π/a} dk ℏω(k) (a(k)† a(k) − (i/2ℏ)[ỹ(k), p̃(−k)])
  = (a/2π) ∫_{−π/a}^{π/a} dk ℏω(k) (a(k)† a(k) + N/2)
  = (a/2π) ∫_{−π/a}^{π/a} dk ℏω(k) (n(k) + N/2).   (14.2.9)
The number of ions N (which is infinite in the infinite volume limit) arises from

(2π/a) δ(k) = Σ_{n∈Z} exp(−ikna) ⇒ (2π/a) δ(0) = Σ_{n∈Z} 1 = N.   (14.2.10)
We have introduced the phonon number operator

n(k) = a(k)† a(k),   (14.2.11)
which determines the number of phonons occupying a given mode with wave
number k.
14.3  Phonons in One Dimension
Just as a quantum state of the electromagnetic field is characterized by a photon number n(k) for each mode k, a quantum state of the vibrating solid is characterized by a phonon number n(k) for each mode k. The ground state |0⟩ of the solid has n(k) = 0 for all modes, i.e. for all k

a(k)|0⟩ = 0.   (14.3.1)
It is interesting to note that even in the absence of phonons, i.e. when n(k) = 0 for all k, the solid has a non-vanishing zero-point energy. Let us calculate the total ground state energy of a solid consisting of N ions, which then has a finite length L = Na. Just as in the derivation of Planck’s formula, before we take the limit L → ∞, we may introduce periodic boundary conditions which quantize the allowed k-values to k = 2πm/L with m ∈ Z, such that

E₀ = Σ_k ℏω(k)/2 → (L/2π) ∫_{−π/a}^{π/a} dk ℏω(k)/2.   (14.3.2)
In the infinite volume limit, the ground state energy density thus takes the form

ρ₀ = E₀/L = (1/2π) ∫_{−π/a}^{π/a} dk ℏω(k)/2 = 2ℏω₀/(πa).   (14.3.3)
In the following we will ignore the ground state energy because it just corresponds to an overall constant energy shift.

When one performs a similar calculation for the electromagnetic field using quantum field theory, one finds that it also has a non-zero energy density in its ground state. The state of the electromagnetic field without any photons is known as the vacuum. Hence, quantum field theory predicts that there is a non-zero vacuum energy density. Since the analog of the lattice spacing a, which plays the role of a short-distance cutoff, is sent to zero in quantum field theory, the vacuum energy in quantum field theory even diverges. This is related to the cosmological constant problem, one of the big unsolved puzzles in theoretical physics. Why is the observed vacuum energy, which manifests itself on cosmic scales, about 120 orders of magnitude smaller than any calculation in quantum field theory would suggest?
The single-phonon states describing a phonon of momentum p = ℏk result from

|k⟩ = a(k)†|0⟩.   (14.3.4)
These states have an energy difference to the ground state of

E(k) = ℏω(k) = 2ℏω₀ |sin(ka/2)|.   (14.3.5)
Interestingly, at least for small momenta p = ℏk, just like photons, phonons also have a linear energy-momentum dispersion relation, i.e. E(k) = ℏ|k|c. In this case, however, c is not the velocity of light. Instead

c = ω₀ a   (14.3.6)

is the velocity of sound. After all, phonons are just quantized sound waves propagating inside a solid. Unlike for photons, the phonon energy-momentum dispersion relation becomes flat for large momenta at the edge of the Brillouin zone.
As a result, the velocity of sound

c(k) = dE/dp = ω₀ a cos(ka/2)   (14.3.7)

is k-dependent.
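A small finite-difference check (again in the hypothetical units ω₀ = a = 1, so c = ω₀a = 1) confirms that the group velocity dω/dk equals ω₀a cos(ka/2) and approaches the sound velocity at small k:

```python
import math

def omega(k):
    """ω(k) = 2 ω₀ |sin(ka/2)| with ω₀ = a = 1."""
    return 2.0 * abs(math.sin(k / 2.0))

def group_velocity(k, h=1e-6):
    """dω/dk by a central finite difference (ħ cancels in dE/dp = dω/dk)."""
    return (omega(k + h) - omega(k - h)) / (2.0 * h)

# compare with c(k) = ω₀ a cos(ka/2) at a few momenta inside the zone;
# near k = 0 the group velocity approaches the sound velocity c = 1
checks = [(k, group_velocity(k), math.cos(k / 2.0)) for k in (0.01, 0.5, 1.5, 3.0)]
```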
Let us also consider states describing two phonons with momenta p₁ = ℏk₁ and p₂ = ℏk₂, which are given as

|k₁ k₂⟩ = a(k₁)† a(k₂)†|0⟩.   (14.3.8)

Since [a(k₁)†, a(k₂)†] = 0 one obtains

|k₁ k₂⟩ = |k₂ k₁⟩,   (14.3.9)

i.e. the state is symmetric under the exchange of the two phonons. This means that, just like photons, phonons are bosons. Consequently, the phonon number of a given mode is unrestricted and takes values n(k) = 0, 1, 2, …, ∞.
14.4  From Particles to “Wavicles”
Just as photons are quantized electromagnetic waves, phonons are quantized lattice vibrations. It is common to talk about photons and phonons as “particles”. However, they are not genuine particles in the sense of Newton that can be described by their position and momentum. Of course, as we learned from Heisenberg, at the quantum level the position and momentum of a particle cannot be measured simultaneously with arbitrary precision. It is important to understand that quantized wave “particles”, such as photons and phonons, are
qualitatively different. A priori, their position is not even defined. While we have started with a model of ions, which are genuine particles in the sense of Newton, the phonons are collective vibrational excitations of the ion system. While the position of a phonon is not even defined, its momentum p = ℏk as well as its energy E(k) = ℏω(k) make perfect sense. In order to distinguish quantized wave “particles” from genuine particles à la Newton, Frank Wilczek sometimes speaks of “wavicles”. Since there is confusion in some quantum mechanics textbooks about particle-wave duality, the distinction between particles and wavicles may be quite useful. Once we have understood the difference, we may return to common terminology and also call the wavicles “particles”.
Let us try to define the position of a photon or phonon. While one would not speak about the position of a classical electromagnetic wave or a sound wave (since they exist in whole regions of space simultaneously), we can, for example, detect single photons in the cells of our retina. This means that a “wavicle” has been localized in a certain region of space. This may imply that one could be able to define its position after all. Since we have defined momentum eigenstates |k⟩ for a phonon, quantum mechanics would suggest to define a corresponding position eigenstate as

|m⟩ = (a/2π) ∫_{−π/a}^{π/a} dk |k⟩ exp(ikma).   (14.4.1)

Defining the creation operator

a_m† = (a/2π) ∫_{−π/a}^{π/a} dk a(k)† exp(ikma),   (14.4.2)

one would then obtain

|m⟩ = a_m†|0⟩.   (14.4.3)
Also introducing the annihilation operator

a_m = (a/2π) ∫_{−π/a}^{π/a} dk a(k) exp(−ikma),   (14.4.4)
one finds the commutation relation

[a_m, a_n†] = (a/2π)² ∫_{−π/a}^{π/a} dk ∫_{−π/a}^{π/a} dk′ [a(k), a(k′)†] exp(i(k′n − km)a)
            = (a/2π) ∫_{−π/a}^{π/a} dk exp(ik(n − m)a) = δ_mn,   (14.4.5)
as well as

[a_m, a_n] = 0,   [a_m†, a_n†] = 0.   (14.4.6)
This seems to suggest that we have indeed found a useful definition of an operator that creates a phonon in the position eigenstate |n⟩. Hence, let us consider the operator a_m† in more detail. Using the definition of a(k)† we obtain

a_m† = (a/2π) ∫_{−π/a}^{π/a} dk (1/√2)(α(k) ỹ(k)† − i p̃(k)†/(ℏ α(k))) exp(ikma).   (14.4.7)
Since a_m† is the Fourier transform of the sum of two products, it can be expressed as a convolution

a_m† = Σ_{n∈Z} (f_{m−n} y_n − (i/ℏ) g_{m−n} p_n).   (14.4.8)

Since this expression involves a sum over all ion labels n, a_m† is not at all localized at a single position m in the crystal lattice. In particular, the quantities
f_m = (a/2π) ∫_{−π/a}^{π/a} dk (α(k)/√2) exp(ikma) = (a/2π) ∫_{−π/a}^{π/a} dk √(M ω(k)/(2ℏ)) exp(ikma),

g_m = (a/2π) ∫_{−π/a}^{π/a} dk (1/(√2 α(k))) exp(ikma) = (a/2π) ∫_{−π/a}^{π/a} dk √(ℏ/(2M ω(k))) exp(ikma),   (14.4.9)
are non-local, i.e. they decay only slowly as |m| → ∞. In particular, for large |m| one finds

f_m ∼ 1/|m|^{3/2},   g_m ∼ 1/|m|^{1/2}.   (14.4.10)

Some values for f_m and g_m (in units of α₀ = √(M ω₀/ℏ)) are listed in Table 14.1.
In contrast to particles, which are point-like objects (but do respect the position-momentum uncertainty relation), wavicles are inherently non-local objects. In particular, if we want to create a phonon at a position m in the crystal, we must excite ion oscillations everywhere in the crystal in a particularly coordinated manner. Hence, when we use the standard terminology and refer to phonons as “particles”, we should not forget that, a priori, they do not even have a position and are thus not localizable in the usual sense.
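The coefficients f_m and g_m can be evaluated numerically. The sketch below works in units α₀ = 1 and a = 1, substitutes k = u² to tame the square-root behavior of the integrands at k = 0, and reproduces the values listed in Table 14.1.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule (n even)."""
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if i % 2 else 2) * f(a + i * h) for i in range(1, n))
    return s * h / 3.0

def f_coeff(m):
    """f_m/α₀ = (1/π) ∫₀^π dk sqrt(sin(k/2)) cos(km), via the substitution k = u²."""
    g = lambda u: 2.0 * u * math.sqrt(math.sin(u * u / 2.0)) * math.cos(m * u * u)
    return simpson(g, 0.0, math.sqrt(math.pi)) / math.pi

def g_coeff(m):
    """g_m·α₀ = (1/π) ∫₀^π dk cos(km)/(2 sqrt(sin(k/2))), via k = u²; the
    substitution turns the integrable 1/sqrt(k) singularity into the finite
    limit sqrt(2) at u = 0."""
    def h(u):
        if u == 0.0:
            return math.sqrt(2.0)
        return u * math.cos(m * u * u) / math.sqrt(math.sin(u * u / 2.0))
    return simpson(h, 0.0, math.sqrt(math.pi)) / math.pi

# f_coeff(0) ≈ 0.762759 and g_coeff(0) ≈ 0.834627, as in Table 14.1
```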
m     f_m/α₀       g_m α₀
0     0.762759     0.834627
1    −0.152552     0.278209
2    −0.050850     0.198721
3    −0.027381     0.162590
4    −0.017717     0.140911
5    −0.012655     0.126078

Table 14.1: Some values for f_m and g_m in units of α₀ = √(M ω₀/ℏ).

We can take two complementary points of view on the dynamics of the vibrating solid. On the one hand, we can use a particle description, in which we work with a wave function that depends on the coordinates of the individual ions in the crystal. On the other hand, we can also use a wavicle description, in which we characterize a state of the vibrating solid by specifying the phonon occupation numbers n(k) for all modes k. It should be pointed out that this has nothing to do with the particle-wave duality discussed in the textbook literature, which sometimes gives rise to confusion. The two views on the vibrating solid are a manifestation of what one might call particle-wavicle complementarity. Remarkably, in this case, one can really describe the same physical phenomena using either a particle or a wavicle description.
Today, the most fundamental description of Nature is provided by the Standard Model of particle physics, which is a relativistic quantum field theory. In the Standard Model, photons, electrons, and other elementary “particles” arise as wavicles. Hence, our current understanding suggests that the most fundamental objects in Nature are indeed wavicles, which arise from the quantized oscillations of fields. In this sense, it would be more appropriate to speak about wavicle rather than particle physics. In the case of the vibrating solid, the phonon “field” is determined by the positions x_n of the ion particles, thus giving rise to particle-wavicle complementarity. Such complementarity does not exist in the Standard Model. However, one may wonder whether a particle (instead of a wavicle or field) description might again be possible, perhaps at an even more fundamental level. A currently more popular but very speculative way of thinking is that not particles but strings might be the most fundamental degrees of freedom in Nature.
14.5  Specific Heat of a 1-dimensional “Solid”
Let us now consider the canonical ensemble. Just as for photons, the phonon partition function factorizes into a product of mode partition functions

Z = Π_k Z(k),   (14.5.1)

with the single mode partition function given by

Z(k) = Σ_{n(k)=0}^∞ exp(−β n(k) E(k)) = 1/(1 − exp(−βE(k))).   (14.5.2)
The average energy stored in a mode is then given by

⟨n(k)⟩ E(k) = −∂ log Z(k)/∂β = E(k)/(exp(βE(k)) − 1).   (14.5.3)
In the infinite volume limit, the average total energy density then takes the form

ρ = ⟨E⟩/L = (1/L) Σ_k ⟨n(k)⟩ E(k) → (1/2π) ∫_{−π/a}^{π/a} dk ⟨n(k)⟩ E(k)
  = (1/2π) ∫_{−π/a}^{π/a} dk E(k)/(exp(βE(k)) − 1).   (14.5.4)
The expression for the specific heat is then given by

c_V = C_V/L = (1/L) ∂⟨E⟩/∂T = ∂ρ/∂T = (1/(2π k_B T²)) ∫_{−π/a}^{π/a} dk E(k)² exp(βE(k))/[exp(βE(k)) − 1]².   (14.5.5)
Unlike in the photon case, which yields the Stefan-Boltzmann law, in the phonon case the corresponding integral cannot be done analytically. However, we may consider very low temperatures. Then only a few phonons with low frequency can be excited and the energy can be approximated by E(k) = ℏ|k|c. In addition, the integral can safely be extended to k ∈ (−∞, ∞) because the integrand is exponentially suppressed for large values of |k|, such that

ρ = (1/π) ∫₀^∞ dk ℏkc/(exp(βℏkc) − 1) = π (k_B T)²/(6ℏc).   (14.5.6)
This is the analog of the Stefan-Boltzmann law for 1-dimensional phonons at very low temperatures. The corresponding specific heat is given by

c_V = ∂ρ/∂T = π k_B² T/(3ℏc).   (14.5.7)
Unlike photons, at higher temperatures phonons will show a different behavior, because their energy deviates from ℏ|k|c and the momentum integral is cut off at π/a. These effects become particularly noticeable when phonons of the highest frequency ω(π/a) = 2ω₀ become thermally excited. The energy of such phonons is E(π/a) = 2ℏω₀. Hence, when the temperature becomes of the order of the Debye temperature

T_D = E(π/a)/k_B = 2ℏω₀/k_B,   (14.5.8)

we should expect deviations from the low-temperature behavior. In the high-temperature limit T ≫ T_D one obtains exp(βE(k)) − 1 ≈ βE(k), and the specific heat then takes the form
c_V = (1/(2π k_B T²)) ∫_{−π/a}^{π/a} dk (1/β²) = k_B/a,   (14.5.9)
and thus reaches a constant in the high-temperature limit. In the following we will generalize these considerations to the realistic case of three dimensions.
14.6  Fluctuations in a 1-dimensional “Solid”
One may suspect that our simple model could also explain the thermal expansion of solids. As we will see now, this is not the case. Indeed, one must include anharmonic corrections to the harmonic oscillator potentials between the ions in order to describe the thermal expansion. Anharmonic forces give rise to phonon-phonon interactions, which implies that phonons then no longer form an ideal gas. Here we limit ourselves to the simple harmonic oscillator model in which phonons are free particles.
In order to investigate the thermal expansion of our model solid, let us compute the thermal average of the distance

x_n − x_m = y_n − y_m + (n − m)a,   (14.6.1)

between the two ions at positions n and m in the crystal. We now use
y_n − y_m = (a/2π) ∫_{−π/a}^{π/a} dk ỹ(k) [exp(ikna) − exp(ikma)]
          = (a/2π) ∫_{−π/a}^{π/a} dk (1/(√2 α(k))) [a(k) + a(−k)†] [exp(ikna) − exp(ikma)],   (14.6.2)
and consider the thermal average

⟨a(k)⟩ = (1/Z) Tr{a(k) exp(−βH)}
       = (1/Z(k)) Σ_{n(k)=0}^∞ ⟨n(k)|a(k)|n(k)⟩ exp(−βℏω(k) n(k)) = 0.   (14.6.3)
Here |n(k)⟩ denotes a state with n(k) phonons occupying the mode k. Since a(k) lowers and a(k)† raises the phonon number by 1, i.e.

a(k)|n(k)⟩ = √(n(k)) |n(k) − 1⟩,   a(k)†|n(k)⟩ = √(n(k) + 1) |n(k) + 1⟩,   (14.6.4)

one obtains ⟨n(k)|a(k)|n(k)⟩ = 0. Similarly, one finds ⟨n(k)|a(−k)†|n(k)⟩ = 0. Hence, also ⟨y_n − y_m⟩ = 0 and we thus obtain

⟨x_n − x_m⟩ = (n − m)a,   (14.6.5)

such that the average distance of two ions is temperature-independent and just corresponds to the equilibrium distance (n − m)a.
Let us also calculate the average distance squared

⟨(x_n − x_m)²⟩ = ⟨(y_n − y_m)²⟩ + (n − m)² a²,   (14.6.6)

where we have used ⟨y_n − y_m⟩ = 0. We now obtain
⟨(y_n − y_m)²⟩ = (a/2π)² ∫_{−π/a}^{π/a} dk ∫_{−π/a}^{π/a} dk′ ⟨ỹ(k)† ỹ(k′)⟩ [exp(−ikna) − exp(−ikma)] [exp(ik′na) − exp(ik′ma)]
  = (a/2π)² ∫_{−π/a}^{π/a} dk ∫_{−π/a}^{π/a} dk′ (⟨[a(k)† + a(−k)][a(k′) + a(−k′)†]⟩/(2 α(k) α(k′)))
    × [exp(−ikna) − exp(−ikma)] [exp(ik′na) − exp(ik′ma)].   (14.6.7)
The thermal average ⟨[a(k)† + a(−k)][a(k′) + a(−k′)†]⟩ vanishes unless k = k′. Hence, we only need to consider ⟨a(k)† a(k)⟩ as well as ⟨a(k) a(k)†⟩ (which is the same as ⟨a(−k) a(−k)†⟩). First, we obtain
⟨a(k)† a(k)⟩ = (1/Z) Tr{a(k)† a(k) exp(−βH)}
  = (1/Z(k)) Σ_{n(k)=0}^∞ ⟨n(k)|a(k)† a(k)|n(k)⟩ exp(−βℏω(k) n(k))
  = (1/Z(k)) Σ_{n(k)=0}^∞ n(k) exp(−βℏω(k) n(k))
  = exp(−βℏω(k))/(1 − exp(−βℏω(k))).   (14.6.8)
Similarly, we find

⟨a(k) a(k)†⟩ = (1/Z) Tr{a(k) a(k)† exp(−βH)}
  = (1/Z(k)) Σ_{n(k)=0}^∞ ⟨n(k)|a(k) a(k)†|n(k)⟩ exp(−βℏω(k) n(k))
  = (1/Z(k)) Σ_{n(k)=0}^∞ [n(k) + 1] exp(−βℏω(k) n(k))
  = 1/(1 − exp(−βℏω(k))).   (14.6.9)
This finally implies

⟨(y_n − y_m)²⟩ = (a/2π) ∫_{−π/a}^{π/a} dk (⟨[a(k)† + a(−k)][a(k) + a(−k)†]⟩/(2 α(k)²))
    × [exp(−ikna) − exp(−ikma)] [exp(ikna) − exp(ikma)]
  = (a/2π) ∫_{−π/a}^{π/a} dk (ℏ ⟨[a(k)† + a(−k)][a(k) + a(−k)†]⟩/(2M ω(k))) × 2[1 − cos(k(n − m)a)]
  = (a/2π) ∫_{−π/a}^{π/a} dk (2ℏ sin²(k(n − m)a/2)/(M ω(k))) × (1 + exp(−βℏω(k)))/(1 − exp(−βℏω(k))).   (14.6.10)
This integral cannot be solved in closed form. Still, one can consider the high-temperature limit, in which one obtains

⟨(y_n − y_m)²⟩ = (a/2π) ∫_{−π/a}^{π/a} dk 4 sin²(k(n − m)a/2)/(βM ω(k)²)
  = (a/2π) ∫_{−π/a}^{π/a} dk sin²(k(n − m)a/2)/(βM ω₀² sin²(ka/2)) = |n − m|/(βM ω₀²).   (14.6.11)
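The last step uses the Fejér-kernel identity (1/2π) ∫_{−π}^{π} dθ sin²(Nθ/2)/sin²(θ/2) = |N| for integer N = n − m. A quick numerical check of this identity (units a = 1):

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule (n even)."""
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if i % 2 else 2) * f(a + i * h) for i in range(1, n))
    return s * h / 3.0

def fejer_integral(N):
    """(1/2π) ∫_{-π}^{π} dk sin²(Nk/2)/sin²(k/2); the expected value is |N|."""
    def integrand(k):
        s = math.sin(k / 2.0)
        if abs(s) < 1e-9:
            return float(N * N)  # limit k → 0 of the integrand
        return (math.sin(N * k / 2.0) / s) ** 2
    return simpson(integrand, -math.pi, math.pi) / (2.0 * math.pi)
```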
In fact, at any nonzero temperature, for large separation |n − m| one finds ⟨(y_n − y_m)²⟩ ∼ |n − m|. The linear increase of this expectation value at large distances is due to the fact that ω(k) ≈ |k|c vanishes at zero momentum. Even in two dimensions the corresponding expectation value still increases, although only logarithmically, while in three and more dimensions it approaches a constant at large separation |n − m|. The linear and logarithmic increases in one and two dimensions, respectively, which arise as a consequence of the so-called Hohenberg-Mermin-Wagner theorem, imply that strict crystalline order exists only in three or more dimensions.
The Hohenberg-Mermin-Wagner theorem has incorrectly been invoked to argue that structures like graphene, a 2-dimensional sheet of carbon atoms arranged to form a honeycomb lattice, cannot even exist, unless they are attached to some substrate. However, free-standing graphene does indeed exist. In fact, the 2010 Nobel Prize was awarded to Andre Geim and Konstantin Novoselov from Manchester University for “producing, isolating, identifying and characterizing graphene”. Just producing a single sheet of graphite alone may actually not be that difficult. After all, when we draw or write with a pencil, graphite is cleaved into thin layers that end up on the paper. Some of these may contain only a few sheets or even just a single sheet of graphite, i.e. graphene. In fact, the Hohenberg-Mermin-Wagner theorem does not imply that a 2-dimensional crystal like graphene cannot exist. It only means that perfect crystalline order cannot persist over arbitrarily large distances. Since actual graphene sheets always have a finite size, the theorem is not in conflict with any observations.
14.7  A 3-dimensional Model for Ions in a Crystal
The atomic nuclei inside a solid form a regular crystal lattice. We will now extend
our simple model from one to three dimensions. We assume that the ions of mass
M with position vector xn form a cubic lattice. Here the index n ∈ Z3 enumerates
the ions according to their position in the crystal. In particular, the equilibrium
position of the ion with label n is na. Ions are again coupled only to their nearest
neighbors on the lattice via a harmonic oscillator potential. The Hamiltonian of
the lattice model is then given by
H = Σ_{n∈Z³} [−(ℏ²/(2M)) Δ_n + (M ω₀²/2) Σ_{i=1,2,3} (x_{n+î} − x_n − î a)²]
  = Σ_{n∈Z³} [−(ℏ²/(2M)) Δ_n + (M ω₀²/2) Σ_{i=1,2,3} (y_{n+î} − y_n)²].   (14.7.1)

Here î denotes the unit-vector in the spatial i-direction,

y_n = x_n − na,   (14.7.2)

and

Δ_n = ∂²/∂y_{n1}² + ∂²/∂y_{n2}² + ∂²/∂y_{n3}²   (14.7.3)

is the Laplacian with respect to the position y_n. The discrete Fourier transform
ỹ_j(k) = Σ_{n∈Z³} y_{nj} exp(−i k·n a),   j ∈ {1, 2, 3},   (14.7.4)

has the inverse

y_{nj} = (a/2π)³ ∫_B d³k ỹ_j(k) exp(i k·n a),   (14.7.5)
where B = ]−π/a, π/a]³ denotes the 3-dimensional Brillouin zone. In analogy to the 1-dimensional case, we now obtain the potential energy as

V = Σ_{n∈Z³} (M ω₀²/2) Σ_{i=1,2,3} (y_{n+î} − y_n)²
  = (M ω₀²/2) (a/2π)⁶ Σ_{i=1,2,3} ∫_B d³k ∫_B d³k′ Σ_{j=1,2,3} ỹ_j(k)† ỹ_j(k′)
    × Σ_{n∈Z³} [exp(i k·(n+î) a) − exp(i k·n a)] [exp(−i k′·(n+î) a) − exp(−i k′·n a)]
  = (a/2π)³ ∫_B d³k (M ω₀²/2) Σ_{i=1,2,3} k̂_i² a² Σ_{j=1,2,3} ỹ_j(k)† ỹ_j(k).   (14.7.6)
Here we have used

(a/2π)³ Σ_{n∈Z³} exp(i(k − k′)·n a) = δ(k − k′),   (14.7.7)
as well as

2 − exp(i k_i a) − exp(−i k_i a) = 2[1 − cos(k_i a)] = [2 sin(k_i a/2)]² = k̂_i² a²,   (14.7.8)

with

k̂_i = (2/a) sin(k_i a/2).   (14.7.9)

This gives rise to the phonon dispersion relation in three dimensions

E(k) = ℏω(k) = 2ℏω₀ √(Σ_{i=1,2,3} sin²(k_i a/2)).   (14.7.10)
By applying the Fourier transform also to the momenta, i.e.

p̃_j(k) = Σ_{n∈Z³} p_{nj} exp(−i k·n a),   p_{nj} = (a/2π)³ ∫_B d³k p̃_j(k) exp(i k·n a),   (14.7.11)
one writes the kinetic energy as

T = Σ_{n∈Z³} p_n²/(2M)
  = (a/2π)⁶ ∫_B d³k ∫_B d³k′ (1/2M) Σ_{j=1,2,3} p̃_j(k)† p̃_j(k′) Σ_{n∈Z³} exp(i(k − k′)·n a)
  = (a/2π)³ ∫_B d³k (1/2M) Σ_{j=1,2,3} p̃_j(k)† p̃_j(k).   (14.7.12)
Altogether, the Hamilton operator takes the form

H = (a/2π)³ ∫_B d³k Σ_{j=1,2,3} [p̃_j(k)† p̃_j(k)/(2M) + (M ω₀²/2) k̂² a² ỹ_j(k)† ỹ_j(k)]
  = (a/2π)³ ∫_B d³k ℏω(k) Σ_{j=1,2,3} (n_j(k) + N/2).   (14.7.13)

Here we have introduced the number operator

n_j(k) = a_j(k)† a_j(k),   (14.7.14)
for phonons with momentum p = ℏk and polarization j = 1, 2, 3. The phonon creation and annihilation operators take the form

a_j(k) = (1/√2)(α(k) ỹ_j(k) + i p̃_j(k)/(ℏ α(k))),   a_j(k)† = (1/√2)(α(k) ỹ_j(k)† − i p̃_j(k)†/(ℏ α(k))),   (14.7.15)

and they obey the commutation relations

[a_i(k), a_j(k′)†] = (2π/a)³ δ(k − k′) δ_ij,   [a_i(k), a_j(k′)] = 0,   [a_i(k)†, a_j(k′)†] = 0.   (14.7.16)

14.8  Specific Heat of 3-dimensional Solids
In real solids, which are somewhat more complicated than our simple model, longitudinal and transverse phonons in general have different dispersion relations. In the simple harmonic oscillator model, on the other hand, all three phonon polarization states have the same energy-momentum dispersion relation E(k) = ℏω(k). Again, for small momenta p = ℏk, we obtain E(k) = ℏ|k|c, with the velocity of sound given by c = ω₀ a. The canonical partition function now takes the form

Z = Π_{j=1,2,3} Π_k Z(k),   (14.8.1)

with the single mode partition function given by

Z(k) = Σ_{n(k)=0}^∞ exp(−β n(k) E(k)) = 1/(1 − exp(−βE(k))).   (14.8.2)

The average energy stored in a mode reads

⟨n(k)⟩ E(k) = −∂ log Z(k)/∂β = E(k)/(exp(βE(k)) − 1).   (14.8.3)
Consequently, in the infinite volume limit, the average total energy density results as

ρ = ⟨E⟩/L³ = (3/L³) Σ_k ⟨n(k)⟩ E(k) → (3/(2π)³) ∫_B d³k ⟨n(k)⟩ E(k)
  = (3/(2π)³) ∫_B d³k E(k)/(exp(βE(k)) − 1).   (14.8.4)

Here the factor 3 arises due to the three polarizations of phonons. Correspondingly, the expression for the specific heat now takes the form
c_V = C_V/L³ = (3/L³) ∂⟨E⟩/∂T = ∂ρ/∂T = (3/((2π)³ k_B T²)) ∫_B d³k E(k)² exp(βE(k))/[exp(βE(k)) − 1]².   (14.8.5)
At low temperatures, we can approximate E(k) ≈ ℏ|k|c. Extending the momentum integral from B to R³ we then obtain

ρ = (3/(2π²)) ∫₀^∞ dk k² ℏkc/(exp(βℏkc) − 1) = π² (k_B T)⁴/(10 ℏ³c³).   (14.8.6)
Up to a factor 3/2, which is due to the three polarizations of phonons versus two polarizations of photons, this is just the Stefan-Boltzmann law. The corresponding specific heat is given by

c_V = ∂ρ/∂T = 2π² k_B⁴ T³/(5 ℏ³c³).   (14.8.7)
The T³ dependence is characteristic of the phonon contribution to the specific heat of solids. As we will see in the next chapter, electrons, which we have ignored until now, contribute to c_V in proportion to T.
As in one dimension, at higher temperatures phonons display a different behavior, because their energy deviates from ℏ|k|c and the momentum integral extends over the finite Brillouin zone B only. In the high-temperature limit T ≫ T_D one again obtains exp(βE(k)) − 1 ≈ βE(k), such that

c_V = (3/((2π)³ k_B T²)) ∫_B d³k (1/β²) = 3k_B/a³.   (14.8.8)
The fact that, at large temperatures, the specific heat reaches this T-independent value is known as the Dulong-Petit law.
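Both limits can be checked numerically. The sketch below works in the hypothetical units ℏ = k_B = a = ω₀ = 1 (so c = 1): the low-temperature T³ law via the radial continuum integral, and the Dulong-Petit plateau via a brute-force midpoint sum over the full Brillouin zone with the lattice dispersion (14.7.10).

```python
import math

def simpson(f, a, b, n=8000):
    """Composite Simpson rule (n even)."""
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if i % 2 else 2) * f(a + i * h) for i in range(1, n))
    return s * h / 3.0

def cv_low_T(T):
    """c_V = (3/(2π²T²)) ∫₀^∞ dk k² (k/(2 sinh(k/2T)))² with E ≈ k; after
    rescaling x = k/T this reproduces the law c_V = (2π²/5) T³."""
    def f(x):
        if x < 1e-9:
            return x * x  # the bracket tends to 1 as x → 0
        return x ** 2 * (x / (2.0 * math.sinh(x / 2.0))) ** 2
    return 3.0 * T ** 3 * simpson(f, 0.0, 80.0) / (2.0 * math.pi ** 2)

def cv_grid(T, n=24):
    """Eq. (14.8.5) as a midpoint sum over B = ]-π, π]³ with the full
    dispersion E(k) = 2 sqrt(Σ_i sin²(k_i/2)); the grid never hits E = 0."""
    step = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        si = math.sin((-math.pi + (i + 0.5) * step) / 2.0) ** 2
        for j in range(n):
            sj = math.sin((-math.pi + (j + 0.5) * step) / 2.0) ** 2
            for l in range(n):
                sl = math.sin((-math.pi + (l + 0.5) * step) / 2.0) ** 2
                E = 2.0 * math.sqrt(si + sj + sl)
                total += (E / (2.0 * math.sinh(E / (2.0 * T)))) ** 2
    return 3.0 * total / (n ** 3 * T * T)

# cv_low_T(0.1) ≈ (2π²/5)·0.1³; cv_grid(100) ≈ 3 = 3k_B/a³ (Dulong-Petit)
```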
Chapter 15
Electrons in Solids
While we have neglected electrons in the previous chapter, in this chapter we will
neglect phonons and concentrate on the electrons. In fact, we will now impose a
rigid crystal lattice by hand and will discuss the motion of the electrons in the
background of the crystal lattice. In this chapter, the ions and thus the crystal
lattice are not allowed to vibrate and are thus considered to be static. The crystal
lattice of static ions gives rise to a periodic external potential for the electrons.
The minima of this periodic potential are centered at the static ions, which exert
an attractive force on the electrons. Instead of being localized near an individual
ion, electrons will tunnel from ion to ion, and may thus move through the entire
crystal. A periodic potential gives rise to quantum mechanically allowed energy
levels which form continuous energy bands. Diﬀerent bands may be separated
by gaps of forbidden values of the energy. In a crystal, the Coulomb repulsion
between electrons is screened by the positively charged ions. Hence, at least as a
ﬁrst approximation, it is reasonable to treat electrons in a solid as noninteracting.
In order to capture some important aspects of these dynamics, we will again
make a very simple (and in fact oversimpliﬁed) model which, however, can then
be solved analytically. In this model, which is a simple variant of the so-called single-band Hubbard model, electrons are not even considered in the continuous space between lattice points. Instead, they are restricted to hop between
discrete lattice points with a hopping amplitude that reﬂects the tunneling rate
of electrons between neighboring ions. Since electrons are fermions, they follow
the Pauli exclusion principle and thus behave drastically diﬀerently than bosons.
Since only one fermion can occupy a given mode, even at zero temperature some
modes of nonzero energy must be populated by fermions. The surface in momentum space up to which modes are occupied with fermions at zero temperature is
known as the Fermi surface. If there are unoccupied states arbitrarily close to
the Fermi surface, the system is a metal. On the other hand, if the Fermi surface
coincides with a band gap, the system is an insulator. The electron dynamics is strongly influenced by the lattice geometry. For example, massive electrons propagating on a 2-dimensional honeycomb lattice, such as in graphene, have peculiar properties reminiscent of (almost) massless fermions such as neutrinos.
It should be noted that interesting eﬀects are missed when one ignores phonons.
In particular, in metals at very low temperatures, phonons mediate an attractive
interaction between electrons that can overcome the screened Coulomb repulsion.
As a result, two electrons are bound into Cooper pairs, which are bosons. The
Cooper pair condensation of these bosons then leads to superconductivity, i.e. to the complete absence of electric resistance at temperatures below a few Kelvin. In 1972 John Bardeen, Leon Neil Cooper, and Robert Schrieffer received
the Nobel prize for the explanation of metallic superconductivity. In addition to
metallic superconductors, there are also the so-called high-temperature superconductors, which remain superconducting up to temperatures as high as 100 Kelvin. Johannes Bednorz and Karl Müller were awarded the Nobel prize of 1987 for the discovery of these materials. Remarkably, high-temperature superconductors are not metals but are related to so-called Mott insulators. In these systems, the electrons cannot be treated as noninteracting particles, but are instead
strongly correlated. The dynamical mechanism responsible for high-temperature superconductivity remains a subject of very intense research in condensed matter physics. Most experts agree that the exchange of phonons alone cannot be
responsible for this intriguing phenomenon. In any case, even understanding
metallic superconductivity in the framework of the BardeenCooperSchrieﬀer
(BCS) theory is beyond the scope of these lectures.
15.1 Electron Creation and Annihilation Operators
Just like phonons, electrons can be described by creation and annihilation operators. However, since electrons are fermions while phonons are bosons, the
creation and annihilation operators for electrons obey anticommutation rather
than commutation relations. In addition, while phonons have three polarization
directions, electrons have two spin orientations s = ±1/2 = ↑, ↓. Let us first consider creation and annihilation operators in the simplest context of just a single lattice site. Then there is no lattice site index and we only need to consider the spin s. The operator that annihilates an electron of spin s at the given lattice site is denoted by c_s, while the corresponding creation operator is c_s^\dagger. The Pauli
principle of electrons is encoded in the following anticommutation relations

\{c_s, c_{s'}^\dagger\} = \delta_{ss'}, \quad \{c_s, c_{s'}\} = 0, \quad \{c_s^\dagger, c_{s'}^\dagger\} = 0.   (15.1.1)
In general the anticommutator of two operators A and B is given by
{A, B } = AB + BA.
(15.1.2)
Interestingly, the anticommutation relations are sufficient to implicitly define how the operators c_s and c_s^\dagger act. In other words, in order to work with these operators, we need not know anything more than their anticommutation relations. As a consequence of the anticommutation relations, two electrons of
the same spin cannot occupy the same lattice point. In particular, if we try to
create two electrons of the same spin s at the same lattice site by acting with
(c_s^\dagger)^2, we obtain

(c_s^\dagger)^2 = \frac{1}{2}\{c_s^\dagger, c_s^\dagger\} = 0.   (15.1.3)
Similarly, one can never annihilate two electrons of the same spin at the same
lattice site because
c_s^2 = \frac{1}{2}\{c_s, c_s\} = 0.   (15.1.4)
Let us first consider the vacuum |0⟩, i.e. a state without electrons. This state is annihilated by both c_↑ and c_↓, i.e.

c_\uparrow |0\rangle = 0, \quad c_\downarrow |0\rangle = 0.   (15.1.5)
By acting with the individual creation operators, we can now create states with a single electron of either spin up or down

c_\uparrow^\dagger |0\rangle = |\uparrow\rangle, \quad c_\downarrow^\dagger |0\rangle = |\downarrow\rangle.   (15.1.6)
Assuming that the vacuum is normalized to 1, i.e. ⟨0|0⟩ = 1, we can convince ourselves that the 1-particle states are also correctly normalized

\langle\uparrow|\uparrow\rangle = \langle 0|c_\uparrow c_\uparrow^\dagger|0\rangle = \langle 0|1 - c_\uparrow^\dagger c_\uparrow|0\rangle = \langle 0|0\rangle = 1,
\langle\downarrow|\downarrow\rangle = \langle 0|c_\downarrow c_\downarrow^\dagger|0\rangle = \langle 0|1 - c_\downarrow^\dagger c_\downarrow|0\rangle = \langle 0|0\rangle = 1.   (15.1.7)
Furthermore, the two 1-particle states are orthogonal

\langle\uparrow|\downarrow\rangle = \langle 0|c_\uparrow c_\downarrow^\dagger|0\rangle = -\langle 0|c_\downarrow^\dagger c_\uparrow|0\rangle = 0.   (15.1.8)
The normalization and orthogonality relations can be summarized as

\langle s|s'\rangle = \delta_{ss'}, \quad s, s' = \uparrow, \downarrow.   (15.1.9)
Finally, we can also construct a 2-particle state

c_\uparrow^\dagger c_\downarrow^\dagger |0\rangle = c_\uparrow^\dagger |\downarrow\rangle = |\uparrow\downarrow\rangle,   (15.1.10)

which is again correctly normalized because

\langle\uparrow\downarrow|\uparrow\downarrow\rangle = \langle\downarrow|c_\uparrow c_\uparrow^\dagger|\downarrow\rangle = \langle\downarrow|1 - c_\uparrow^\dagger c_\uparrow|\downarrow\rangle = 1.   (15.1.11)
Similarly, we obtain

c_\downarrow^\dagger c_\uparrow^\dagger |0\rangle = c_\downarrow^\dagger |\uparrow\rangle = |\downarrow\uparrow\rangle.   (15.1.12)
However, this is not a new 2-particle state, since

|\downarrow\uparrow\rangle = c_\downarrow^\dagger c_\uparrow^\dagger |0\rangle = -c_\uparrow^\dagger c_\downarrow^\dagger |0\rangle = -|\uparrow\downarrow\rangle.   (15.1.13)
Due to the Pauli principle, which reflects the fact that electrons are indistinguishable particles, there is only one 2-particle state. In particular, the states

|\uparrow\uparrow\rangle = c_\uparrow^\dagger c_\uparrow^\dagger |0\rangle = 0, \quad |\downarrow\downarrow\rangle = c_\downarrow^\dagger c_\downarrow^\dagger |0\rangle = 0,   (15.1.14)
simply vanish. As a result, the fermionic Hilbert space for electrons of spin up and spin down at a single lattice site just consists of the four states |0⟩, |↑⟩, |↓⟩, and |↑↓⟩, which are normalized and mutually orthogonal. A Hilbert space that consists of sectors with different particle numbers is also known as a Fock space.
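The single-site Fock space is small enough to write down explicitly. The sketch below represents c_↑ and c_↓ as 4×4 matrices (one concrete matrix choice, an assumption not spelled out in the text) and verifies the anticommutation relations and the Pauli principle:

```python
import numpy as np

# 4-dim Fock space of one lattice site, basis |n_up n_dn> = |00>, |01>, |10>, |11>
I2 = np.eye(2)
a  = np.array([[0., 1.], [0., 0.]])   # annihilation operator of a single mode
Z  = np.diag([1., -1.])               # sign factor enforcing anticommutation

c_up = np.kron(a, I2)                 # assumed matrix representation of c_up
c_dn = np.kron(Z, a)                  # assumed matrix representation of c_dn

def anti(A, B):
    return A @ B + B @ A

# the anticommutation relations (15.1.1)
assert np.allclose(anti(c_up, c_up.T), np.eye(4))
assert np.allclose(anti(c_dn, c_dn.T), np.eye(4))
assert np.allclose(anti(c_up, c_dn.T), np.zeros((4, 4)))
assert np.allclose(anti(c_up, c_dn), np.zeros((4, 4)))

# Pauli principle, eqs. (15.1.3) and (15.1.4)
assert np.allclose(c_up.T @ c_up.T, 0) and np.allclose(c_up @ c_up, 0)

# number operator (15.1.15) counts 0, 1, 1, 2 electrons on the four Fock states
n = c_up.T @ c_up + c_dn.T @ c_dn
print(np.diag(n))   # -> [0. 1. 1. 2.]
```

The diagonal of the number operator reproduces the counting in (15.1.16).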
It is interesting to note that the operator

n = n_\uparrow + n_\downarrow = c_\uparrow^\dagger c_\uparrow + c_\downarrow^\dagger c_\downarrow   (15.1.15)
counts the number of electrons, i.e.

n|0\rangle = (c_\uparrow^\dagger c_\uparrow + c_\downarrow^\dagger c_\downarrow)|0\rangle = 0,
n|\uparrow\rangle = (c_\uparrow^\dagger c_\uparrow + c_\downarrow^\dagger c_\downarrow)|\uparrow\rangle = c_\uparrow^\dagger|0\rangle = |\uparrow\rangle,
n|\downarrow\rangle = (c_\uparrow^\dagger c_\uparrow + c_\downarrow^\dagger c_\downarrow)|\downarrow\rangle = c_\downarrow^\dagger|0\rangle = |\downarrow\rangle,
n|\uparrow\downarrow\rangle = (c_\uparrow^\dagger c_\uparrow + c_\downarrow^\dagger c_\downarrow)|\uparrow\downarrow\rangle = c_\uparrow^\dagger|\downarrow\rangle - c_\downarrow^\dagger|\uparrow\rangle = |\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle = 2|\uparrow\downarrow\rangle.   (15.1.16)
The various states may alternatively be labeled by their occupation numbers n_↑, n_↓ ∈ {0, 1}, such that

|n_\uparrow n_\downarrow\rangle = (c_\uparrow^\dagger)^{n_\uparrow} (c_\downarrow^\dagger)^{n_\downarrow} |0\rangle,   (15.1.17)

and hence

|00\rangle = |0\rangle, \quad |10\rangle = c_\uparrow^\dagger|0\rangle = |\uparrow\rangle, \quad |01\rangle = c_\downarrow^\dagger|0\rangle = |\downarrow\rangle, \quad |11\rangle = c_\uparrow^\dagger c_\downarrow^\dagger|0\rangle = |\uparrow\downarrow\rangle.   (15.1.18)
Finally, let us introduce the spin operator

\vec S = \sum_{ss'} c_s^\dagger \frac{\vec\sigma_{ss'}}{2} c_{s'},   (15.1.19)

where \vec\sigma is the vector of Pauli matrices

\vec\sigma = (\sigma^1, \sigma^2, \sigma^3) = \left( \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \right).   (15.1.20)
Note that we have omitted the multiplicative factor of ħ in our definition of the spin. Let us now investigate the commutation relations of the spin operators
[S^i, S^j] = \left[ \sum_{ss'} c_s^\dagger \frac{\sigma^i_{ss'}}{2} c_{s'}, \sum_{rr'} c_r^\dagger \frac{\sigma^j_{rr'}}{2} c_{r'} \right]
= \frac{1}{4} \sum_{ss'rr'} \sigma^i_{ss'} \sigma^j_{rr'} \left( c_s^\dagger c_{s'} c_r^\dagger c_{r'} - c_r^\dagger c_{r'} c_s^\dagger c_{s'} \right)
= \frac{1}{4} \sum_{ss'rr'} \sigma^i_{ss'} \sigma^j_{rr'} \left( c_s^\dagger [\delta_{s'r} - c_r^\dagger c_{s'}] c_{r'} - c_r^\dagger [\delta_{r's} - c_s^\dagger c_{r'}] c_{s'} \right)
= \frac{1}{4} \sum_{ss'r'} \sigma^i_{ss'} \sigma^j_{s'r'} c_s^\dagger c_{r'} - \frac{1}{4} \sum_{ss'r} \sigma^j_{rs} \sigma^i_{ss'} c_r^\dagger c_{s'}
= \frac{1}{4} \sum_{sr'} c_s^\dagger \left( \sigma^i \sigma^j - \sigma^j \sigma^i \right)_{sr'} c_{r'}
= \frac{i\varepsilon^{ijk}}{2} \sum_{sr'} c_s^\dagger \sigma^k_{sr'} c_{r'} = i\varepsilon^{ijk} S^k.   (15.1.21)
Here we have used the commutation relation for the Pauli matrices

[\sigma^i, \sigma^j] = 2i\varepsilon^{ijk}\sigma^k,   (15.1.22)

to verify the standard commutation relation for the electron spin operators

[S^i, S^j] = i\varepsilon^{ijk} S^k.   (15.1.23)
Let us now act with the 3-component of the spin operator

S^3 = \sum_{ss'} c_s^\dagger \frac{\sigma^3_{ss'}}{2} c_{s'} = \frac{1}{2}\left( c_\uparrow^\dagger c_\uparrow - c_\downarrow^\dagger c_\downarrow \right) = \frac{1}{2}(n_\uparrow - n_\downarrow),   (15.1.24)
on the four states in Fock space

S^3|0\rangle = 0, \quad S^3|\uparrow\rangle = \frac{1}{2}|\uparrow\rangle, \quad S^3|\downarrow\rangle = -\frac{1}{2}|\downarrow\rangle, \quad S^3|\uparrow\downarrow\rangle = 0.   (15.1.25)
Next, we will consider a rigid cubic lattice of sites x = na, where a is the lattice spacing and n = (n_1, n_2, n_3) ∈ Z³ is a vector pointing to a lattice site. We will make a simple model in which electrons can only exist at the discrete lattice sites x and not in the continuous space between lattice sites. In fact, we should think of an electron at site x as an electron in a state localized near the ion at position x. The operator that creates an electron of spin s at the lattice site x is denoted by c_{x,s}^\dagger, while the corresponding annihilation operator is given by c_{x,s}. The anticommutation relations then take the form

\{c_{x,s}, c_{x',s'}^\dagger\} = \delta_{xx'}\delta_{ss'}, \quad \{c_{x,s}, c_{x',s'}\} = 0, \quad \{c_{x,s}^\dagger, c_{x',s'}^\dagger\} = 0.   (15.1.26)
Correspondingly, the number operator for electrons at the site x is given by

n_x = \sum_s c_{x,s}^\dagger c_{x,s},   (15.1.27)

while the total number of electrons in the crystal is measured by the operator

N = \sum_x n_x.   (15.1.28)
Similarly, the spin operator for electrons at the site x is given by

\vec S_x = \sum_{ss'} c_{x,s}^\dagger \frac{\vec\sigma_{ss'}}{2} c_{x,s'},   (15.1.29)

while the spin of the entire crystal is measured by the operator

\vec S = \sum_x \vec S_x.   (15.1.30)
The vacuum state |0⟩ (which does not contain any electrons) is now characterized by

c_{x,s}|0\rangle = 0,   (15.1.31)

for all lattice sites x and both spins s = ↑, ↓. All states in the Fock space can then be obtained as linear combinations of the states

|\psi\rangle = \prod_x (c_{x,\uparrow}^\dagger)^{n_{x\uparrow}} (c_{x,\downarrow}^\dagger)^{n_{x\downarrow}} |0\rangle.   (15.1.32)
Here n_{x↑}, n_{x↓} ∈ {0, 1} are occupation numbers that characterize the state |ψ⟩. In order to make the product of creation operators unambiguously defined, one must order the lattice sites x in some arbitrary but fixed manner. Different orderings just lead to different overall signs of the state |ψ⟩.
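One concrete way to realize such an ordering convention is a Jordan-Wigner-type matrix representation (an assumption; the text leaves the representation implicit). The sketch below builds three fermion modes, checks the anticommutation relations between different modes, and shows that reordering two creation operators flips the sign of the state:

```python
import numpy as np

a = np.array([[0., 1.], [0., 0.]]); Z = np.diag([1., -1.]); I2 = np.eye(2)

def mode(j, n):
    """Annihilation operator for fermion mode j out of n (Jordan-Wigner style):
    a string of Z factors on all earlier modes supplies the fermionic signs."""
    out = np.array([[1.]])
    for o in [Z]*j + [a] + [I2]*(n - j - 1):
        out = np.kron(out, o)
    return out

n = 3
c = [mode(j, n) for j in range(n)]
for i in range(n):
    for j in range(n):
        acomm = c[i] @ c[j].T + c[j].T @ c[i]
        assert np.allclose(acomm, np.eye(2**n) if i == j else np.zeros((2**n, 2**n)))

# different orderings of the creation operators differ only by an overall sign
vac = np.zeros(2**n); vac[0] = 1.0
psi_12 = c[0].T @ (c[1].T @ vac)
psi_21 = c[1].T @ (c[0].T @ vac)
assert np.allclose(psi_12, -psi_21)
print("ordering sign verified")
```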
15.2 A Model for Electrons Hopping on a Lattice
When an electron tunnels from a state localized near an ion at the site x to
another state localized near a neighboring ion at the site x + ia, in the model
this manifests itself as a hopping of the electron from the discrete lattice site x
to x + ia. A simple Hamilton operator that describes hopping between nearest-neighbor lattice sites is given by

H = -t \sum_{x,i,s} \left( c_{x,s}^\dagger c_{x+ia,s} + c_{x+ia,s}^\dagger c_{x,s} \right).   (15.2.1)
The Hamiltonian ﬁrst annihilates an electron of spin s at the site x and then
recreates it at the neighboring lattice site x + ia with the same spin. Here the
hopping parameter t controls the tunneling amplitude between neighboring lattice
sites. Since electron hopping does not change the total number N of electrons,
one has [H, N] = 0. Similarly, since the hopping is spin-independent, one can
show that the total spin is also conserved, i.e. [H, S ] = 0.
Just like for phonons, in order to diagonalize the Hamiltonian for the electrons, we perform a Fourier transformation and obtain

\tilde c_s(k) = \sum_x c_{x,s} \exp(-i k \cdot x), \quad \tilde c_s(k)^\dagger = \sum_x c_{x,s}^\dagger \exp(i k \cdot x).   (15.2.2)
The corresponding anticommutation relations then take the form

\{\tilde c_s(k), \tilde c_{s'}(k')^\dagger\} = \sum_{x,x'} \{c_{x,s}, c_{x',s'}^\dagger\} \exp(-i k \cdot x + i k' \cdot x')
= \sum_{x,x'} \delta_{xx'} \delta_{ss'} \exp(-i k \cdot x + i k' \cdot x')
= \sum_x \exp(i(k' - k) \cdot x) \, \delta_{ss'} = \left(\frac{2\pi}{a}\right)^3 \delta(k - k') \, \delta_{ss'}.   (15.2.3)

Similarly, one obtains

\{\tilde c_s(k), \tilde c_{s'}(k')\} = 0, \quad \{\tilde c_s(k)^\dagger, \tilde c_{s'}(k')^\dagger\} = 0.   (15.2.4)
By an inverse Fourier transform we obtain

c_{x,s} = \left(\frac{a}{2\pi}\right)^3 \int_B d^3k \, \tilde c_s(k) \exp(i k \cdot x), \quad c_{x,s}^\dagger = \left(\frac{a}{2\pi}\right)^3 \int_B d^3k \, \tilde c_s(k)^\dagger \exp(-i k \cdot x).   (15.2.5)
Here B = ]-\pi/a, \pi/a]^3 again denotes the periodic Brillouin zone of the cubic lattice.
The Hamilton operator then takes the form

H = -t \sum_{x,i,s} \left( c_{x,s}^\dagger c_{x+ia,s} + c_{x+ia,s}^\dagger c_{x,s} \right)
= -t \left(\frac{a}{2\pi}\right)^6 \sum_s \int_B d^3k \int_B d^3k' \, \tilde c_s(k)^\dagger \tilde c_s(k')
\quad \times \sum_{x,i} \left[ \exp\left(-i k \cdot x + i k' \cdot (x + ia)\right) + \exp\left(-i k \cdot (x + ia) + i k' \cdot x\right) \right]
= -t \left(\frac{a}{2\pi}\right)^3 \sum_s \int_B d^3k \, \tilde c_s(k)^\dagger \tilde c_s(k) \sum_i \left[ \exp(i k_i a) + \exp(-i k_i a) \right]
= \left(\frac{a}{2\pi}\right)^3 \sum_s \int_B d^3k \, \hbar\omega(k) \, \tilde c_s(k)^\dagger \tilde c_s(k)
= \left(\frac{a}{2\pi}\right)^3 \sum_s \int_B d^3k \, \hbar\omega(k) \, n_s(k).   (15.2.6)
Here we have introduced the number operator

n_s(k) = \tilde c_s(k)^\dagger \tilde c_s(k).   (15.2.7)

The dispersion relation of electrons propagating in the cubic crystal is given by

E(k) = \hbar\omega(k) = -t \sum_i \left[ \exp(i k_i a) + \exp(-i k_i a) \right] = -2t \sum_i \cos(k_i a).   (15.2.8)
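A quick numerical check of the dispersion (15.2.8), in units t = a = 1: the band covers the interval [−6t, 6t], and for small k it reduces to the quadratic form −6t + ta²k² that appears in (15.3.8) below:

```python
import numpy as np

t, a = 1.0, 1.0

def E(k):
    """Tight-binding dispersion (15.2.8) for the cubic lattice."""
    return -2.0 * t * np.sum(np.cos(np.asarray(k) * a))

# band edges: bottom at k = 0, top at the Brillouin zone corner
print(E(np.zeros(3)), E(np.array([np.pi, np.pi, np.pi]) / a))   # -> -6.0 6.0

# small-k expansion: E(k) ~ -6t + t a^2 |k|^2
k = np.array([0.1, 0.0, 0.0])
print(E(k), -6.0 * t + t * a**2 * np.dot(k, k))
```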
Similarly, the total number of electrons is given by

N = \sum_{x,s} c_{x,s}^\dagger c_{x,s}
= \left(\frac{a}{2\pi}\right)^6 \sum_s \int_B d^3k \int_B d^3k' \, \tilde c_s(k)^\dagger \tilde c_s(k') \sum_x \exp(i(k' - k) \cdot x)
= \left(\frac{a}{2\pi}\right)^3 \sum_s \int_B d^3k \, \tilde c_s(k)^\dagger \tilde c_s(k) = \left(\frac{a}{2\pi}\right)^3 \sum_s \int_B d^3k \, n_s(k).   (15.2.9)

15.3 Grand Canonical Ensemble and Fermi Surface
Let us now consider the grand canonical partition function

Z = {\rm Tr} \exp(-\beta[H - \mu N]),   (15.3.1)
where µ is the chemical potential. Introducing a finite volume L³ with periodic boundary conditions, the partition function can be written as a product over independent modes with discrete wave number vectors k = 2πm/L (with m ∈ Z³)

Z = \prod_{s=\uparrow,\downarrow} \prod_k Z_s(k).   (15.3.2)

The single-mode partition function takes the form

Z_s(k) = \sum_{n_s(k)=0}^{1} \exp\left(-\beta[\hbar\omega(k) - \mu] \, n_s(k)\right) = 1 + \exp\left(-\beta[\hbar\omega(k) - \mu]\right).   (15.3.3)
The average occupation number of a given mode is then given by

\langle n_s(k)\rangle = \frac{\partial \log Z_s(k)}{\partial(\beta\mu)} = \frac{1}{\exp(\beta[\hbar\omega(k) - \mu]) + 1}.   (15.3.4)
Here the derivative is with respect to βµ only, which leaves the pure β-dependence unaffected. At zero temperature, i.e. for β → ∞, the chemical potential determines which modes are occupied by a fermion and which ones are not. In particular, if E(k) = ħω(k) < µ, the average occupation number is ⟨n_s(k)⟩ = 1, i.e. the mode is occupied with one fermion, while for modes with E(k) = ħω(k) > µ the mode is empty because then ⟨n_s(k)⟩ = 0. The 2-dimensional surface in the 3-dimensional Brillouin zone for which

E(k) = \hbar\omega(k) = \mu   (15.3.5)

is known as the Fermi surface. At zero temperature, the Fermi surface separates occupied from empty modes. The energy E_F = µ of states at the Fermi surface is known as the Fermi energy. At zero temperature, the volume enclosed by the Fermi surface determines the expectation value of the density of electrons

\frac{\langle N\rangle}{L^3} = 2 \left(\frac{a}{2\pi}\right)^3 \int_{B,\, E(k) < E_F} d^3k.   (15.3.6)
Here the prefactor 2 is due to spin. Denoting the momentum of an electron as p = ħk, the Fermi velocity is defined as the following gradient evaluated at the Fermi surface

\vec v_F = \nabla_p E(p) = \nabla_k \omega(k).   (15.3.7)
For small wave numbers k, the dispersion relation can be approximated as

E(k) = \hbar\omega(k) = -2t \sum_i \left( 1 - \frac{k_i^2 a^2}{2} \right) = -6t + \frac{\hbar^2 k^2}{2M},   (15.3.8)
where

M = \frac{\hbar^2}{2ta^2}   (15.3.9)
plays the role of an effective electron mass. The effective mass can be larger or smaller than the electron mass in vacuum by about a factor of 10. In some exotic so-called heavy fermion materials the effective mass can even be up to a factor 1000 larger than the electron mass in vacuum. Let us consider the chemical potential

\mu = -6t + \varepsilon   (15.3.10)

with a small value of ε, which then implies

E(k) - \mu = \frac{\hbar^2 k^2}{2M} - \varepsilon.   (15.3.11)

The Fermi surface then is a sphere of radius

k_F = \frac{\sqrt{2M\varepsilon}}{\hbar}.   (15.3.12)
Consequently, the average density of electrons is given by

\frac{\langle N\rangle}{L^3} = 2 \left(\frac{a}{2\pi}\right)^3 \frac{4\pi}{3} k_F^3 = \frac{k_F^3 a^3}{3\pi^2}.   (15.3.13)

In this case, the Fermi velocity is simply given by

\vec v_F = \frac{p_F}{M} = \frac{\hbar k_F}{M}.   (15.3.14)

Here p_F = \hbar k_F is known as the Fermi momentum.
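The relation between the electron density and the Fermi momentum can be checked by brute-force mode counting in a finite periodic box (a = 1; the box size L below is a hypothetical choice for illustration):

```python
import numpy as np

# Finite box of L^3 sites (a = 1) with periodic boundary conditions:
# allowed wave numbers are k = 2 pi m / L.  Count modes inside the Fermi sphere.
L = 100
kF = 1.0          # assumed well inside the Brillouin zone
m = np.arange(-L//2 + 1, L//2 + 1)
kx, ky, kz = np.meshgrid(2*np.pi*m/L, 2*np.pi*m/L, 2*np.pi*m/L, indexing="ij")
inside = kx**2 + ky**2 + kz**2 < kF**2

density = 2.0 * inside.sum() / L**3       # factor 2 for spin, cf. (15.3.6)
print(density, kF**3 / (3*np.pi**2))      # cf. (15.3.13) with a = 1
```

The discrete count converges to the continuum result k_F³a³/3π² as the box grows.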
The question whether a material is an insulator or a conductor depends on
the value of the chemical potential, which controls the number of electrons. The
number of mobile electrons in a given material depends on the type of atoms or
molecules which the material consists of. In our simple model of free electrons
the single-particle energies can vary between −6t and 6t, while lower or higher
energies are forbidden. In real solids, the allowed energy levels form several energy
bands, while our model has just a single band. The value of the chemical potential
determines whether an energy band is completely or only partially ﬁlled. If the
chemical potential is above the upper band edge 6t, the band is completely ﬁlled.
In our model, this is the case when there are two electrons (one with spin up and
one with spin down) per lattice site. In that case, there are no available states
of higher energy which electrons could occupy when an external electric ﬁeld
is applied. Hence, materials with completely ﬁlled energy bands are insulators.
Materials with only partially ﬁlled bands, on the other hand, are metals. In that
case, when one applies an external electric ﬁeld, electrons can be lifted to higher
previously unoccupied energy levels and can thus contribute to the conductivity.
15.4 Electrons and the Specific Heat of Metals
Let us now consider the contribution of electrons to the specific heat of a metal at small temperature and chemical potential. The average energy density of electrons is then given by

\frac{\langle E\rangle}{L^3} = -\frac{1}{L^3}\frac{\partial \log Z}{\partial\beta} = \left(\frac{a}{2\pi}\right)^3 \int_B d^3k \, \frac{\hbar\omega(k)}{\exp(\beta[\hbar\omega(k) - \mu]) + 1}.   (15.4.1)
Similarly, the average number density of electrons is given by

\frac{\langle N\rangle}{L^3} = \frac{1}{L^3}\frac{\partial \log Z}{\partial(\beta\mu)} = \left(\frac{a}{2\pi}\right)^3 \int_B d^3k \, \frac{1}{\exp(\beta[\hbar\omega(k) - \mu]) + 1}.   (15.4.2)
At small temperature and small µ we can use

\hbar\omega(k) - \mu = \frac{\hbar^2 k^2}{2M} - \varepsilon,   (15.4.3)

and we can extend the integration over all of R³ such that

\frac{\langle E - \mu N\rangle}{L^3} = 2\,\frac{a^3}{2\pi^2} \int_0^\infty dk \, k^2 \, \frac{\frac{\hbar^2 k^2}{2M} - \varepsilon}{\exp\left(\beta\left[\frac{\hbar^2 k^2}{2M} - \varepsilon\right]\right) + 1}.   (15.4.4)
In the small-temperature limit, this integral can be expanded in powers of T and one obtains

\frac{\langle E\rangle}{L^3} = \frac{\pi^2 k_B T^2}{4 T_F}, \quad k_B T_F = \frac{p_F^2}{2M},   (15.4.5)

where T_F is the so-called Fermi temperature. In typical metals the Fermi temperature is around T_F ≈ 40000 K. The specific heat then results as
c_V = \frac{1}{L^3}\frac{\partial \langle E\rangle}{\partial T} = \frac{\pi^2 k_B T}{2 T_F}.   (15.4.6)
In contrast to the T³ behavior of an ideal Bose gas of phonons, the specific heat
of the ideal Fermi gas of electrons is proportional to T . This implies that, at
low temperatures, the speciﬁc heat of metals is dominated by electrons, while at
higher temperatures it is dominated by phonons. In insulators the electrons are
in a completely ﬁlled band and can thus not be easily excited to higher energy
levels. Consequently, electrons do not contribute to the speciﬁc heat of insulators,
which is hence dominated by phonons even at low temperatures.
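The linear-in-T specific heat follows from the T² growth of the energy integral (15.4.4). The sketch below evaluates that integral in reduced units (ħ = k_B = 2M = k_F = 1, so ε_F = 1 and τ = T/T_F; the overall density-of-states normalization is stripped, so only the T² scaling, not the prefactor of (15.4.5), is being checked). In these units the ratio Δu/τ² tends to π²/12 ≈ 0.82, a value obtained here from a Sommerfeld-expansion estimate and offered only as a cross-check:

```python
import numpy as np

def du(tau, kmax=3.0, n=120001):
    """u(tau) - u(0) with u(tau) = ∫ dk k^2 (k^2 - 1) f, f = Fermi function,
    in reduced units where eps(k) = k^2 and eps_F = 1 (cf. 15.4.4)."""
    k = np.linspace(0.0, kmax, n)
    x = (k**2 - 1.0) / tau
    f = 0.5 * (1.0 - np.tanh(0.5 * x))          # overflow-safe 1/(e^x + 1)
    y = k**2 * (k**2 - 1.0) * f
    dk = k[1] - k[0]
    u = dk * (y.sum() - 0.5 * (y[0] + y[-1]))   # trapezoid rule
    return u - (-2.0 / 15.0)                    # subtract the exact T = 0 value

for tau in (0.04, 0.02, 0.01):
    print(tau, du(tau) / tau**2)   # tends to pi^2/12 ~ 0.822 as tau -> 0
```

The T² energy shift immediately gives a specific heat proportional to T, in contrast to the T³ phonon contribution.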
15.5 Repulsive Hubbard Model at Half-Filling
Let us extend the simple Hamiltonian that describes the hopping of electrons between neighboring lattice sites by an interaction term between the electrons. In a solid the Coulomb repulsion between two electrons is screened by the background of positively charged ions, and therefore effectively becomes short-ranged. In the Hubbard model, the repulsion is assumed to be nonzero only if two electrons (of opposite spin) occupy the same lattice point. The corresponding Hamiltonian takes the form

H = -t \sum_{x,i,s} \left( c_{x,s}^\dagger c_{x+ia,s} + c_{x+ia,s}^\dagger c_{x,s} \right) + U \sum_x n_{x,\uparrow} n_{x,\downarrow}.   (15.5.1)
Here

n_{x,s} = c_{x,s}^\dagger c_{x,s}   (15.5.2)
counts the number of electrons with spin s at the lattice site x. The interaction
term contributes U > 0 if a lattice site is occupied by both a spin up and a spin
down electron. It is instructive to convince oneself that the above Hamiltonian
still commutes with the total spin and with the total number of electrons, i.e.

[H, \vec S] = 0, \quad [H, N] = 0,   (15.5.3)

with

N = \sum_{x,s} n_{x,s}, \quad \vec S = \sum_x \sum_{ss'} c_{x,s}^\dagger \frac{\vec\sigma_{ss'}}{2} c_{x,s'}.   (15.5.4)
Let us consider the Hubbard model on a 2-dimensional square lattice. This model is expected to describe high-temperature superconductors, which remain superconducting up to temperatures as high as 100 K or even more. Ordinary metallic low-temperature superconductors superconduct only up to temperatures of a few Kelvin. While the mechanism responsible for low-temperature superconductivity is well understood (phonon exchange leads to the binding of two electrons in Cooper pairs, which then condense), the dynamical origin of high-temperature superconductivity remains one of the greatest puzzles in modern condensed matter physics. Since nobody has been able to solve the Hubbard model (either numerically or analytically), one cannot even be sure that it contains all the ingredients that are necessary to describe high-temperature superconductivity. For example, the Hubbard model does not contain phonons. While most experts think that phonons are not essential for high-temperature superconductivity, there is no general agreement on this issue. Although high-temperature superconductors are not yet understood, their undoped precursors,
which are quantum antiferromagnets, are among the quantitatively best understood strongly correlated condensed matter systems.
Let us consider the Hubbard model at half-filling, i.e. with one electron per lattice site. Note that the Pauli principle would allow maximally two electrons per lattice site, one with spin up and one with spin down. In the limit of very large Coulomb repulsion U ≫ t, doubly occupied lattice sites are forbidden and hence, at half-filling, there is exactly one electron on each lattice site. Since all neighboring sites are already occupied, the large Coulomb repulsion then prevents the electrons from hopping around. Since the electrons are not mobile, the system is an insulator — known as a Mott insulator. In contrast to band insulators, Mott insulators do not have a filled lowest band. In fact, band theory, which relies on the assumption of noninteracting electrons, is not applicable in the presence of a very strong Coulomb repulsion.
Since each electron may have spin up or spin down independent of the other
electrons, the ground state of a Mott insulator at U = ∞ is inﬁnitely degenerate.
Let us now make an expansion in t/U ≪ 1 using degenerate perturbation theory.
In other words, we now treat the hopping term as a small perturbation in addition
to the dominant Coulomb term. Using the rules of degenerate perturbation theory, we should diagonalize the perturbation, i.e. the hopping term, in the space
of degenerate states. To linear order in t, the hopping term leads to a state with
one empty and one doubly occupied site. This state is not among the degenerate
ground states (which have exactly one electron on each lattice site). As a result,
the hopping term has vanishing matrix elements between the degenerate ground
states. Hence, ﬁrst order perturbation theory does not remove the inﬁnite degeneracy, and we hence proceed to second order perturbation theory. In order
to obtain a nonvanishing contribution of order t2 in second order perturbation
theory, after two electron hops, we must return to one of the degenerate ground
states. This is possible only if an electron hops to a neighboring site occupied by
an electron of opposite spin, and then one of the electrons hops back and again
ﬁlls the empty site. The intermediate state reached after the ﬁrst hop contains
one empty and one doubly occupied site, and is suppressed by the large Coulomb
repulsion U . Hence, the contribution to the energy in second order perturbation
theory is proportional to −t2 /U . It is important to note that this contribution
arises only when two electrons of opposite spin occupy neighboring lattice sites.
Otherwise, the Pauli principle forbids that both electrons occupy the same lattice
site. As a result, the infinite degeneracy of ground states is lifted by a spin-spin interaction that favors antiparallel spins. A detailed calculation shows that for U ≫ t the Hubbard model at half-filling reduces to the antiferromagnetic Heisenberg model

H = J \sum_{x,i} \vec S_x \cdot \vec S_{x+ia},   (15.5.5)

where the exchange coupling (resulting from the possible exchange of two electrons via the two hops) is given by

J = 2\,\frac{t^2}{U} > 0.   (15.5.6)
Since J is positive, it is energetically favorable that neighboring spins are antiparallel.
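The reduction to a spin model can be illustrated on the smallest possible system, a two-site Hubbard model at half-filling, via exact diagonalization (using an assumed Jordan-Wigner matrix representation of the fermion operators). The ground state is a unique spin singlet, the three triplet states sit at zero energy, and the singlet-triplet splitting dies off like t²/U, as expected for a second-order process; the precise coefficient depends on convention and is not checked here:

```python
import numpy as np

a = np.array([[0., 1.], [0., 0.]]); Z = np.diag([1., -1.]); I2 = np.eye(2)

def mode(j, n=4):
    """Jordan-Wigner matrix for fermion mode j of n (an assumed representation)."""
    out = np.array([[1.]])
    for o in [Z]*j + [a] + [I2]*(n - j - 1):
        out = np.kron(out, o)
    return out

# modes: 0 = (site 1, up), 1 = (site 1, dn), 2 = (site 2, up), 3 = (site 2, dn)
c = [mode(j) for j in range(4)]
n_op = [ci.T @ ci for ci in c]
hop = sum(c[s].T @ c[s+2] + c[s+2].T @ c[s] for s in (0, 1))
half = [b for b in range(16) if bin(b).count("1") == 2]   # two-electron sector

def levels(U, t=1.0):
    H = -t*hop + U*(n_op[0] @ n_op[1] + n_op[2] @ n_op[3])
    return np.linalg.eigvalsh(H[np.ix_(half, half)])

E = levels(8.0)
print(E)   # unique negative-energy singlet, three triplet states at zero
g50, g100 = levels(50.0), levels(100.0)
print((g50[1] - g50[0]) / (g100[1] - g100[0]))   # ~ 2: splitting scales as t^2/U
```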
Chapter 16
Magnons in Ferro- and Antiferromagnets
In this chapter we will encounter another relevant excitation in condensed matter physics, the so-called magnons or spin waves. These particles are again "wavicles", in this case quantized fluctuations of the magnetization in a magnetic solid.
The most familiar magnetic systems are ferromagnets, like iron, nickel, or cobalt.
In these materials, the magnetic moments of two electrons are correlated such
that they tend to point in the same direction. At suﬃciently low temperatures,
the electron spins (and hence the magnetic moments) get correlated over large
distances and collectively point more or less in the same direction. On macroscopic scales, this manifests itself as a net magnetization of the entire crystal.
The microscopic origin of ferromagnetism is somewhat complicated. In particular, it involves two bands for the electrons, as well as spin couplings between the
electrons in both bands which arise due to Hund’s rules. Understanding these
microscopic mechanisms is beyond the scope of these lectures. Instead, we will
simply postulate a microscopic model for quantum spins — the so-called quantum
Heisenberg model — to describe ferromagnets.
Interestingly, besides ferromagnets there are also antiferromagnets in which
the spins of neighboring localized electrons have a tendency to point in antiparallel directions. At a sufficiently low temperature, the spins may then again get ordered. However, in this case a so-called staggered (rather than uniform) magnetization establishes itself. Quasi-two-dimensional antiferromagnets play an important role as the undoped precursors of high-temperature superconductors.
Indeed, most high-temperature superconductors result from electron or hole doping of an antiferromagnet. In condensed matter physics, a "hole" denotes
a missing electron. Mathematically, a hole has properties similar to those of an antiparticle in particle physics. For example, just like a positron — the antiparticle
of the electron — a hole has electric charge +1. As we have seen, in contrast to
ferromagnetism, the microscopic origin of antiferromagnetism is relatively easy
to understand and can be addressed in a singleband Hubbard model with strong
electron repulsion. The tendency of neighboring localized electrons to have antiparallel spin then results from the Pauli principle. At halfﬁlling the Hubbard
model reduces to the antiferromagnetic Heisenberg model.
16.1 Antiferromagnetic Heisenberg Model
As we have seen in the previous chapter, at half-filling and for U ≫ t the Hubbard model reduces to the antiferromagnetic quantum Heisenberg model with the Hamiltonian

H = J \sum_{x,i} \vec S_x \cdot \vec S_{x+ia}.   (16.1.1)
For J > 0 it is energetically favorable that neighboring spins are antiparallel. The Heisenberg model has an SU(2) spin symmetry that is generated by the total spin, i.e.

[H, \vec S] = 0, \quad \vec S = \sum_x \vec S_x.   (16.1.2)
At zero temperature, the 2-dimensional antiferromagnetic Heisenberg model develops a nonzero expectation value of the staggered magnetization

\vec M_s = \sum_x (-1)^{x_1/a + x_2/a} \, \vec S_x   (16.1.3)

in its ground state. This is a result of numerical simulations. It has not been possible to find an analytic expression for the ground state and to exactly derive its energy or other properties such as its staggered magnetization. The staggered magnetization is known as the order parameter of antiferromagnetism.
Fluctuations of the staggered magnetization vector manifest themselves as spin waves, also known as magnons. Just as phonons are quantized lattice vibrations, antiferromagnetic magnons are quantized fluctuations of the staggered magnetization. It turns out that there are two magnon excitations (analogous to the two possible polarizations of a photon), with the dispersion relation given by

E(k) = \hbar\omega(k) = \hbar|k|c,   (16.1.4)
for small values of |k|. Just as for phonons, and in contrast to photons, the magnon dispersion relation deviates from linearity in |k| for larger values of |k|,
in particular near the edge of the Brillouin zone. In this case, c is the socalled
spin wave velocity (which is smaller than the velocity of light). In complete
analogy to photons and phonons, at low temperatures the magnon contribution
to the speciﬁc heat of an antiferromagnet is determined by the linear dispersion
relation. Since antiferromagnetic magnons and phonons have the same dispersion
relation, and there are two magnons as well as two polarizations of photons, at
low temperatures the mathematical expressions for the magnon energy density
ρ and the speciﬁc heat cV of a 3dimensional antiferromagnet are identical with
those of the photon gas, except that the velocity of light is now replaced by the
spin wave velocity. In particular, one obtains
\rho = \frac{\pi^2 k_B^4 T^4}{15 \hbar^3 c^3}, \quad c_V = \frac{\partial\rho}{\partial T} = \frac{4\pi^2 k_B^4 T^3}{15 \hbar^3 c^3},   (16.1.5)
which again shows the T³ behavior characteristic of the specific heat of noninteracting massless bosons with a linear dispersion relation.
16.2 Ferromagnetic Heisenberg Model
The microscopic origin of ferromagnetism is rather nontrivial. It involves electrons in two bands coupled to one another according to Hund's rule. Although it oversimplifies the underlying dynamical mechanism, here we will model ferromagnetism by a simple quantum Heisenberg model with the Hamiltonian

H = -J \sum_{x,i} \vec S_x \cdot \vec S_{x+ia}.   (16.2.1)
When J > 0, this Hamiltonian favors parallel spins. Since their Hamiltonians differ just by a minus-sign, the spectra of the ferro- and antiferromagnetic Heisenberg models are related by a sign-change. In other words, the ground state of the ferromagnet corresponds to the highest excited state of the antiferromagnet, and vice versa. Again, the ferromagnetic Heisenberg model has an SU(2) spin symmetry, i.e.

[H, \vec S] = 0, \quad \vec S = \sum_x \vec S_x.   (16.2.2)
x
The total spin represents the uniform magnetization, which is the order parameter
of a ferromagnet. In contrast to an antiferromagnet, the order parameter of
a ferromagnet (the total spin) is a conserved quantity (it commutes with the
Hamiltonian).
Unlike the antiferromagnetic Heisenberg model, the ferromagnetic model can to some extent be solved analytically. Let us construct a ground state of the ferromagnet

|0\rangle = |\uparrow\uparrow\ldots\uparrow\rangle,   (16.2.3)
in which all spins are up. Introducing the sum of two neighboring spins

\vec J = \vec S_x + \vec S_{x+ia},   (16.2.4)

the product of the two spin vectors can be expressed as

\vec S_x \cdot \vec S_{x+ia} = \frac{1}{2}\left( \vec J^2 - \vec S_x^2 - \vec S_{x+ia}^2 \right).   (16.2.5)
Since the angular momentum operator \vec J^2 has the eigenvalue J(J+1), we obtain

\vec J^2 = (\vec S_x + \vec S_{x+ia})^2 = \vec S_x^2 + \vec S_{x+ia}^2 + 2\vec S_x \cdot \vec S_{x+ia} \quad\Rightarrow\quad
\vec S_x \cdot \vec S_{x+ia} = \frac{1}{2}\left( J(J+1) - 2\cdot\frac{3}{4} \right) = \frac{1}{2} J(J+1) - \frac{3}{4}.   (16.2.6)

Here we have used the fact that

\vec S_x^2 = \vec S_{x+ia}^2 = \frac{1}{2}\left( \frac{1}{2} + 1 \right) = \frac{3}{4}.   (16.2.7)

In the above ground state, all nearest-neighbor spin pairs are coupled to the total spin J = 1, such that

\vec S_x \cdot \vec S_{x+ia} = \frac{1}{2} J(J+1) - \frac{3}{4} = \frac{1}{2}\cdot 2 - \frac{3}{4} = \frac{1}{4}.   (16.2.8)

Consequently, we obtain

H|0\rangle = -J \sum_{x,i} \vec S_x \cdot \vec S_{x+ia} \, |\uparrow\uparrow\ldots\uparrow\rangle = -\frac{3JL^3}{4a^3}|0\rangle,   (16.2.9)
such that the energy of the ground state is given by

E_0 = -\frac{3JL^3}{4a^3}.   (16.2.10)

Here we have assumed a cubic lattice with periodic boundary conditions and L/a points in each direction. The factor 3 arises because there are 3(L/a)³ nearest-neighbor pairs on the lattice. The energy density of the ground state is thus given by

\rho_0 = \frac{E_0}{L^3} = -\frac{3J}{4a^3}.   (16.2.11)
It is important to note that the ground state has a total spin value

S = \frac{L^3}{2a^3}.   (16.2.12)

Hence it is 2S + 1 = L³/a³ + 1 fold degenerate. In other words, on a large lattice there is an enormous number of degenerate ground states.
16.3 Magnon Dispersion Relation
Let us consider the low-energy excitations above the ground state |0\rangle. These are
fluctuations of the magnetization known as ferromagnetic magnons. First, we
construct raising and lowering operators for the individual spins
S_x^+ = S_x^1 + iS_x^2, \qquad S_x^- = S_x^1 - iS_x^2, \qquad (16.3.1)
which obey the commutation relations
[S_x^+, S_{x'}^-] = -i[S_x^1, S_{x'}^2] + i[S_x^2, S_{x'}^1] = 2\delta_{xx'} S_x^3,
[S_x^+, S_{x'}^3] = [S_x^1, S_{x'}^3] + i[S_x^2, S_{x'}^3] = \delta_{xx'}\left(-iS_x^2 - S_x^1\right) = -\delta_{xx'} S_x^+,
[S_x^-, S_{x'}^3] = [S_x^1, S_{x'}^3] - i[S_x^2, S_{x'}^3] = \delta_{xx'}\left(-iS_x^2 + S_x^1\right) = \delta_{xx'} S_x^-. \qquad (16.3.2)
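The single-spin content of the relations (16.3.2) can be checked with the explicit 2x2 matrices S^i = sigma^i/2. This is a minimal sketch using only the standard library; the helper names mul, comm, and close are ad hoc.

```python
# Check of the commutation relations (16.3.2) for a single spin 1/2,
# using the explicit matrices S^i = sigma^i / 2 (no numpy needed).
def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comm(A, B):                       # commutator [A, B] = AB - BA
    AB, BA = mul(A, B), mul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def close(A, B, eps=1e-12):           # entrywise comparison
    return all(abs(A[i][j] - B[i][j]) < eps for i in range(2) for j in range(2))

S1 = [[0, 0.5], [0.5, 0]]
S2 = [[0, -0.5j], [0.5j, 0]]
S3 = [[0.5, 0], [0, -0.5]]
Sp = [[S1[i][j] + 1j * S2[i][j] for j in range(2)] for i in range(2)]  # S^+
Sm = [[S1[i][j] - 1j * S2[i][j] for j in range(2)] for i in range(2)]  # S^-

two_S3 = [[2 * S3[i][j] for j in range(2)] for i in range(2)]
neg_Sp = [[-Sp[i][j] for j in range(2)] for i in range(2)]

assert close(comm(Sp, Sm), two_S3)    # [S^+, S^-] = 2 S^3
assert close(comm(Sp, S3), neg_Sp)    # [S^+, S^3] = -S^+
assert close(comm(Sm, S3), Sm)        # [S^-, S^3] = +S^-
print("commutation relations (16.3.2) verified for a single spin")
```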
As a next step, we perform a Fourier transform
S^+(\vec{k}) = \sum_x S_x^+ \exp(-i\vec{k} \cdot \vec{x}), \qquad
S^-(\vec{k}) = \sum_x S_x^- \exp(i\vec{k} \cdot \vec{x}) = S^+(\vec{k})^\dagger, \qquad (16.3.3)
and we construct the one-magnon state

|k\rangle = S^-(\vec{k})|0\rangle. \qquad (16.3.4)
The Hamilton operator can be written in the form
H = -J \sum_{x,i} \vec{S}_x \cdot \vec{S}_{x+\hat{i}a}
= -J \sum_{x,i} \left[\frac{1}{2}\left(S_x^+ S_{x+\hat{i}a}^- + S_x^- S_{x+\hat{i}a}^+\right) + S_x^3 S_{x+\hat{i}a}^3\right], \qquad (16.3.5)
such that

[H, S^-(\vec{k})] = -J \sum_{x,x',i} \left[\frac{1}{2}\left(S_{x'}^+ S_{x'+\hat{i}a}^- + S_{x'}^- S_{x'+\hat{i}a}^+\right) + S_{x'}^3 S_{x'+\hat{i}a}^3, S_x^-\right] \exp(i\vec{k} \cdot \vec{x})
= -J \sum_{x,x',i} \left(\delta_{xx'} S_{x'}^3 S_{x'+\hat{i}a}^- + \delta_{x,x'+\hat{i}a} S_{x'}^- S_{x'+\hat{i}a}^3 - \delta_{xx'} S_x^- S_{x'+\hat{i}a}^3 - \delta_{x,x'+\hat{i}a} S_{x'}^3 S_x^-\right) \exp(i\vec{k} \cdot \vec{x})
= -J \sum_{x,i} \left(S_x^3 S_{x+\hat{i}a}^- + S_{x-\hat{i}a}^- S_x^3 - S_x^- S_{x+\hat{i}a}^3 - S_{x-\hat{i}a}^3 S_x^-\right) \exp(i\vec{k} \cdot \vec{x})
= -J \sum_{x,i} \left(S_{x-\hat{i}a}^3 S_x^- \exp(-ik_i a) + S_x^- S_{x+\hat{i}a}^3 \exp(ik_i a) - S_x^- S_{x+\hat{i}a}^3 - S_{x-\hat{i}a}^3 S_x^-\right) \exp(i\vec{k} \cdot \vec{x}). \qquad (16.3.6)
We now obtain

[H, S^-(\vec{k})]|0\rangle = -\frac{J}{2} \sum_x S_x^- \exp(i\vec{k} \cdot \vec{x}) \sum_i \left(\exp(-ik_i a) + \exp(ik_i a) - 2\right)|0\rangle
= -\frac{J}{2} S^-(\vec{k}) \sum_i 2\left(\cos(k_i a) - 1\right)|0\rangle = E(\vec{k})|k\rangle, \qquad (16.3.7)
which implies
H|k\rangle = HS^-(\vec{k})|0\rangle = [H, S^-(\vec{k})]|0\rangle + S^-(\vec{k})H|0\rangle = (E(\vec{k}) + E_0)|k\rangle, \qquad (16.3.8)
with the magnon dispersion relation given by
E(\vec{k}) = -J \sum_i \left(\cos(k_i a) - 1\right). \qquad (16.3.9)
For small values of the wave number k we hence obtain the quadratic energy-momentum
dispersion relation

E(\vec{k}) = \frac{Ja^2}{2} \vec{k}^2. \qquad (16.3.10)
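The approach of the lattice dispersion (16.3.9) to the quadratic form (16.3.10) can be illustrated numerically; the wave vectors below are arbitrary illustrative choices, and J = a = 1 is assumed.

```python
# Small-k check that E(k) = -J sum_i (cos(k_i a) - 1) approaches J a^2 k^2 / 2.
# (J = 1, a = 1 and the sample wave vectors are illustrative assumptions.)
import math

J, a = 1.0, 1.0

def E_lattice(k):                       # k is a 3-vector of wave numbers
    return -J * sum(math.cos(ki * a) - 1 for ki in k)

def E_quad(k):
    return J * a ** 2 * sum(ki ** 2 for ki in k) / 2

rels = []
for s in (0.5, 0.1, 0.02):
    k = (s, 0.7 * s, -0.3 * s)
    rels.append(abs(E_lattice(k) - E_quad(k)) / E_quad(k))
    print(f"scale {s}: relative deviation {rels[-1]:.2e}")
```

The relative deviation shrinks like k^2, as expected from the next term of the cosine expansion.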
It should be noted that the states S^-(\vec{k}) S^-(\vec{k}')|0\rangle are not exact two-magnon
energy eigenstates. This is because magnons interact with one another and
thus do not, in general, form an ideal gas. Still, the magnon-magnon interactions
vanish at low magnon energies. Consequently, at least at low temperature,
the magnons can still be treated as an ideal gas.
16.4  Specific Heat of a Ferromagnet
Just as photons or phonons, ferromagnetic magnons are bosons. However, they
have a quadratic rather than a linear energy-momentum dispersion relation. At
low temperatures, when the magnons can be treated as an ideal gas, the partition
function can be written as

Z = \prod_k Z(k), \qquad
Z(k) = \sum_{n(k)=0}^{\infty} \exp(-\beta n(\vec{k}) E(\vec{k})) = \frac{1}{1 - \exp(-\beta E(\vec{k}))}. \qquad (16.4.1)
The energy density is then given by
\rho = -\frac{1}{L^3} \frac{\partial \log Z}{\partial \beta}
= \frac{1}{L^3} \sum_k \frac{E(\vec{k})}{\exp(\beta E(\vec{k})) - 1}. \qquad (16.4.2)
In the infinite volume limit, the sum over discrete modes turns into an integral
and we obtain

\rho = \frac{1}{(2\pi)^3} \int d^3k \, \frac{E(\vec{k})}{\exp(\beta E(\vec{k})) - 1}
= \frac{Ja^2}{4\pi^2} \int_0^{\infty} dk \, \frac{k^4}{\exp(\beta J a^2 k^2/2) - 1}
= \frac{\sqrt{2}}{2\pi^2 (Ja^2)^{3/2}} (k_B T)^{5/2} \int_0^{\infty} dx \, \frac{x^{3/2}}{\exp(x) - 1}
= \frac{3}{2(2\pi J a^2)^{3/2}} \, \zeta\!\left(\frac{5}{2}\right) (k_B T)^{5/2}. \qquad (16.4.3)
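The dimensionless integral in the third line equals Gamma(5/2) zeta(5/2), which a crude quadrature confirms; the cutoff 60 for the integral and the 2*10^5 terms kept in the zeta sum are pragmatic choices of this sketch, not values from the text.

```python
# Numerical check of the integral used in Eq. (16.4.3):
# int_0^infty x^{3/2} / (e^x - 1) dx = Gamma(5/2) zeta(5/2).
import math

def integrand(x):
    return x ** 1.5 / math.expm1(x)

# midpoint rule on [0, 60]; the integrand vanishes like x^{1/2} at x = 0
# and decays exponentially, so the truncation at 60 is harmless
h = 1e-3
numeric = h * sum(integrand((j + 0.5) * h) for j in range(int(60 / h)))

zeta_52 = sum(n ** -2.5 for n in range(1, 200_000))   # truncated zeta(5/2)
exact = math.gamma(2.5) * zeta_52

print(f"quadrature = {numeric:.6f}, Gamma(5/2) zeta(5/2) = {exact:.6f}")
```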
Here we have extended the integration over the Brillouin zone to an integration
up to infinity, which is indeed justified in the low-temperature limit. The
specific heat then takes the form
c_V = \frac{\partial \rho}{\partial T} = \frac{15 k_B}{4(2\pi J a^2)^{3/2}} \, \zeta\!\left(\frac{5}{2}\right) (k_B T)^{3/2}. \qquad (16.4.4)
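A quick arithmetic check: differentiating the energy density (16.4.3) with respect to T should reproduce the 15/4 prefactor of (16.4.4). The sketch below works in units with J = a = k_B = 1 (an assumption) and approximates zeta(5/2) by a truncated sum.

```python
# Sanity check that d rho / dT reproduces the prefactor of Eq. (16.4.4).
# (Units with J = a = k_B = 1 assumed; zeta(5/2) from a truncated sum.)
import math

zeta_52 = sum(n ** -2.5 for n in range(1, 100_000))
C = 3 / (2 * (2 * math.pi) ** 1.5)          # prefactor of rho in these units

def rho(T):
    return C * zeta_52 * T ** 2.5           # Eq. (16.4.3)

T, dT = 0.1, 1e-6
cV_numeric = (rho(T + dT) - rho(T - dT)) / (2 * dT)   # central difference
cV_formula = 15 / (4 * (2 * math.pi) ** 1.5) * zeta_52 * T ** 1.5

print(f"numeric d(rho)/dT = {cV_numeric:.8f}, formula = {cV_formula:.8f}")
```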
Calculating the specific heat at higher temperatures is nontrivial because the
magnons can then no longer be treated as an ideal gas. Such calculations were
pioneered by Freeman Dyson but are still an area of current research.