These lecture notes for the second week of Thermal and Statistical Physics involve relating entropy and temperature in the microcanonical ensemble, using a paramagnet as an example. These notes include a few small group activities.
This is the second week of PH 441. Readings: (K&K 2, Schroeder 6)
This week we will be following Chapter 2 of Kittel and Kroemer, which uses a microcanonical approach (or Boltzmann entropy approach) to relate entropy to temperature. This is an alternative derivation to the Gibbs approach we used last week, and it can be helpful to have seen both. In a few ways the Boltzmann approach is conceptually simpler, while there are a number of other ways in which the Gibbs approach is simpler.
The difference between these two approaches is in what is considered the fundamental assumption. In the Gibbs entropy approach we assumed that the entropy was a “nice” function of the probabilities of microstates, which gave us the Gibbs formula. From there, we could maximize the entropy to find the probabilities under some set of constraints.
The Boltzmann approach makes what is perhaps a simpler assumption, which is that if only microstates with a given energy are permitted, then all of the microstates with that energy are equally probable. (This scenario with all microstates having the same energy is the microcanonical ensemble.) Thus the macrostate with the most corresponding microstates will be the most probable macrostate. The number of microstates corresponding to a given macrostate is called the multiplicity \(g(E,V)\). In this approach, multiplicity (which did not show up last week!) becomes a fundamentally important quantity, since the macrostate with the highest multiplicity is the most probable macrostate.
This quick version will tell you all the essential physics results for the week, without proof. The beauty of statistical mechanics (whether following the text or using the information-theory approach of last week) is that you don't need to accept the connection between the statistical theory and the empirical definitions used in thermodynamics on either faith or experiment.
The multiplicity sounds sort of like entropy (since it is maximized), but the multiplicity is not extensive (nor intensive), because the number of microstates for two identical systems taken together is the square of the number of microstates available to one of the single systems. This naturally leads to the Boltzmann definition of the entropy, which is \begin{align} S(E,V) = k_B\ln g(E,V). \end{align} The logarithm converts the multiplicity into an extensive quantity, in a way that is directly analogous to the logarithm that appears in the Gibbs entropy.
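As a quick check that the logarithm does the job (this is just spelling out the statement above): if we combine two non-interacting copies of a system, each with multiplicity \(g\) and entropy \(S\), then \begin{align} g_{\text{combined}} &= g\cdot g = g^2 & S_{\text{combined}} &= k_B\ln g^2 = 2k_B\ln g = 2S \end{align} so doubling the system doubles the entropy, as it should for an extensive quantity.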
For large systems (e.g. systems composed of \(\sim 10^{23}\) particles), the most probable configuration is essentially the same as any remotely probable configuration. This comes about for the same reason that if you flip \(10^{23}\) coins, you will get \(5\times 10^{22} \pm 10^{12}\) heads. On an absolute scale, that's a lot of uncertainty in the number of heads that would show up, but on a fractional scale, you're pretty accurate if you assume that 50% of the flips will be heads.
From Energy and Entropy (and last week), you will remember that \(dU = TdS - pdV\), which tells us that \(T = \left(\frac{\partial U}{\partial S}\right)_V\). If we assume that only states with one particular energy \(E\) have a non-zero probability of being occupied, then \(U=E\), i.e. the thermodynamic internal energy is the same as the energy of any allowed microstate. Then we can replace \(U\) with \(E\) and conclude that \begin{align} T &= \left(\frac{\partial E}{\partial S}\right)_V \\ \frac1T &= \left(\frac{\partial S}{\partial E}\right)_V \\ &= \left(\frac{\partial k_B\ln g(E,V)}{\partial E}\right)_V \\ &= k_B \frac1g \left(\frac{\partial g}{\partial E}\right)_V \end{align} From this perspective, it looks like our job is to learn to solve for \(g(E)\) and from that to find \(S(E)\), and once we have done those tasks we will know the temperature (and soon everything else).
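To make that workflow concrete, here is a minimal numerical sketch (my own illustration, not something from the text). It takes an entropy function \(S(E)\) and estimates the temperature with a centered finite difference. The toy entropy used here, \(S(E)=\frac32 N k_B\ln E + \text{const}\), has the energy dependence of a monatomic ideal gas, for which the exact answer is \(T=\frac{2E}{3Nk_B}\); the function names are just placeholders.

```python
from math import log

k_B = 1.380649e-23  # Boltzmann constant, J/K
N = 1e22            # number of particles (arbitrary choice for this sketch)

def entropy(E):
    """Toy entropy S(E) = (3/2) N k_B ln(E); only the E-dependence matters here."""
    return 1.5 * N * k_B * log(E)

def temperature(E, eps=1e-6):
    """Estimate T = 1 / (dS/dE) with a centered finite difference."""
    dS_dE = (entropy(E * (1 + eps)) - entropy(E * (1 - eps))) / (2 * E * eps)
    return 1.0 / dS_dE

E = 1.0  # joules
print(temperature(E))          # finite-difference estimate
print(2 * E / (3 * N * k_B))   # exact result for this toy S(E): T = 2E/(3 N k_B)
```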
The derivation above assumes that \(g(E)\) is a differentiable function, which means that the number of microstates must be a continuous function of energy! This highlights one of the distinctions between the microcanonical approach and our previous (canonical) Gibbs approach.
In reality, we know from quantum mechanics that any system of finite size has a finite number of eigenstates within any given energy range, and thus \(g(E)\) cannot be either continuous or differentiable. Boltzmann, of course, did not know this; he assumed that there were an infinite number of microstates possible within any energy range, and strictly speaking would have interpreted \(g(E)\) in terms of a volume of phase space.
The resolution to this conundrum is to invoke large numbers, and to assume that we are averaging \(g(E)\) over a range of energies in which there are many, many states. For real materials with \(N\approx 10^{23}\), this assumption is pretty valid. Much of this chapter will involve learning to work with this large \(N\) assumption, and to use it to extract physically meaningful results. In the Gibbs approach this large \(N\) assumption was not needed.
As Kittel discusses towards the end of the chapter, we only really need to know \(g(E)\) up to some constant factor, since a constant factor in \(g\) becomes a constant additive change in \(S\), which doesn't have any physical impact.
The “real” \(g(E)\) is a smoothed average over a range of energies. In practice, doing this can be confusing, and so we tend to focus on systems where the energy is always an integer multiple of some constant. Hence the focus on spins in a magnetic field, and on harmonic oscillators.
So now the question becomes how to find the number of microstates \(g(E)\) that correspond to a given energy. Once we have this in an analytically tractable form, we can compute everything else we might care about (with effort). This is essentially a counting problem, and much of what you need is introduced in Chapter 1. We will spend some class time going over one example of computing the multiplicity. Consider a paramagnetic system consisting of spin-\(\frac12\) particles that can be either up or down. Each spin has a magnetic moment in the \(\hat z\) direction of \(\pm m\), and we are interested in the total magnetic moment \(\mu_{tot}\), which is the sum of all the individual magnetic moments. Note that the magnetization \(M\) used in electromagnetism is just the total magnetic moment of the material divided by its volume. \begin{align} M &\equiv \frac{\mu_{tot}}{V} \end{align}
Work out how many ways a system of 4 spins can have each possible magnetization by enumerating all the microstates corresponding to each magnetization.
Now find a mathematical expression that will tell you the multiplicity of a system with an even number \(N\) of spins and just one \(\uparrow\) spin. Then find the multiplicity for two \(\uparrow\) spins, and for three \(\uparrow\) spins.
Now find a mathematical expression that will tell you the multiplicity of a system with an even number \(N\) of spins and total magnetic moment \(\mu_{tot}=2sm\), where \(s\) is an integer. We call \(s\) the spin excess, since \(N_\uparrow = \frac12N + s\). Alternatively, you could write your expression in terms of the number of up spins \(N_\uparrow\) and the number of down spins \(N_\downarrow\).
Enumerating all \(2^4=16\) microstates of four spins and grouping them by total magnetic moment gives

\(\uparrow\uparrow\uparrow\uparrow\): \(g=1\)

\(\downarrow\uparrow\uparrow\uparrow,\ \uparrow\downarrow\uparrow\uparrow,\ \uparrow\uparrow\downarrow\uparrow,\ \uparrow\uparrow\uparrow\downarrow\): \(g=4\)

\(\uparrow\uparrow\downarrow\downarrow,\ \uparrow\downarrow\uparrow\downarrow,\ \uparrow\downarrow\downarrow\uparrow,\ \downarrow\uparrow\uparrow\downarrow,\ \downarrow\uparrow\downarrow\uparrow,\ \downarrow\downarrow\uparrow\uparrow\): \(g=6\)

\(\uparrow\downarrow\downarrow\downarrow,\ \downarrow\uparrow\downarrow\downarrow,\ \downarrow\downarrow\uparrow\downarrow,\ \downarrow\downarrow\downarrow\uparrow\): \(g=4\)

\(\downarrow\downarrow\downarrow\downarrow\): \(g=1\)
To generalize this to \(g(N,s)\), we need to come up with a systematic way to count the states that have the same spin excess \(s\). Clearly if \(s=\pm N/2\), \(g=1\), since that means that all the spins are pointed the same way, and there is only one way to do that. \begin{align} g(N,s=\pm \frac12N) &= 1 \end{align} Now if we have just one spin going the other way, there are going to be \(N\) ways we could manage that: \begin{align} g\left(N,s=\pm \left(\frac12N-1\right)\right) &= N \end{align} Now when we go to flip it so we have two spins up, there will be \(N-1\) ways to flip the second spin. But then, when we do this we will end up counting every possibility twice, which means that we will need to divide by two. \begin{align} g\left(N,s=\pm \left(\frac12N-2\right)\right) &= N(N-1)/2 \end{align} When we get to adding the third \(\uparrow\) spin, we'll have \(N-2\) spins to flip. But now we have to be even more careful, since for the same three up-spins, we have several ways to reach that microstate. In fact, we will need to divide by \(6\), or \(3\times 2\) to get the correct answer (as we can check for our four-spin example). \begin{align} g\left(N,s=\pm \left(\frac12N-3\right)\right) &= \frac{N(N-1)(N-2)}{3!} \end{align} At this stage we can start to see the pattern, which comes out to \begin{align} g\left(N,s\right) &= \frac{N!}{\left(\frac12 N + s\right)!\left(\frac12N -s\right)!} \\ &= \frac{N!}{N_\uparrow!N_\downarrow!} \end{align}
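As a quick sanity check (a sketch of my own, not from the text), we can evaluate this formula numerically and reproduce the multiplicities \(1, 4, 6, 4, 1\) found by enumerating the sixteen microstates of four spins:

```python
from math import comb

def multiplicity(N, s):
    """g(N, s) = N! / (N_up! N_down!) with N_up = N/2 + s and N_down = N/2 - s."""
    return comb(N, N // 2 + s)

# Four spins: spin excess s runs from -2 to +2
print([multiplicity(4, s) for s in range(-2, 3)])      # [1, 4, 6, 4, 1]
print(sum(multiplicity(4, s) for s in range(-2, 3)))   # 16 = 2**4 microstates total
```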
As you can see, we now have a bunch of factorials. Once we compute the entropy, we will have a bunch of logarithms of factorials. \begin{align} N! &= \prod_{i=1}^N i \\ \ln N! &= \ln\left(\prod_{i=1}^N i\right) \\ &= \sum_{i=1}^N \ln i \end{align} So you can see that the log of a factorial is a sum of logs. When the number of things being summed is large, we can approximate this sum with an integral. This may feel like a funny business, particularly for those of you who took my computational class, where we frequently used sums to approximate integrals! But the approximation can go both ways. In this case, if we approximate the sum as an integral we can find an analytic expression for the factorial: \begin{align} \ln N! &= \sum_{i=1}^N \ln i \\ &\approx \int_1^{N} \ln x dx \\ &= \left.x \ln x - x\right|_{1}^{N} \\ &= N\ln N - N + 1 \end{align} At this point, we should recognize that the \(1\) that we see is much smaller than the other two terms, and is actually likely to be wrong. Importantly, there is a larger error being made here, which we can see if we zoom into the upper end of our integral. We are missing \(\frac12 \ln N\)! The reason is that our integral went precisely to \(N\), but if we imagine a midpoint rule picture (or trapezoidal rule) we are missing half of that last point. This gives us: \begin{align} \ln N! &\approx \left(N+\frac12\right)\ln N - N \end{align} We could find the constant term correctly (it is not 1), but that is more work, and even the \(\frac12\) above is usually omitted when using Stirling's approximation, since it is much smaller than the other terms when \(N\gg 1\).
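Here is a small numerical sketch (again my own check, not from the text) comparing the exact \(\ln N!\) with the two approximations above. Even for modest \(N\), keeping the \(\frac12\ln N\) term gets noticeably closer; what remains is essentially the constant term we chose not to compute.

```python
from math import lgamma, log

def ln_factorial(N):
    """Exact ln(N!) computed via the log-gamma function: ln N! = lgamma(N + 1)."""
    return lgamma(N + 1)

for N in (10, 100, 1000):
    exact = ln_factorial(N)
    stirling = N * log(N) - N              # N ln N - N
    with_half = (N + 0.5) * log(N) - N     # (N + 1/2) ln N - N
    print(N, exact, stirling, with_half)
```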
I'm going to use a different approach than the text to find the entropy
of this spin system when there are many spins and the spin excess is
relatively small. \begin{align}
S &= k\ln g\left(N,s\right) \\
&= k\ln\left(\frac{N!}{\left(\tfrac12 N + s\right)!\left(\tfrac12N
-s\right)!}\right)
\\
&= k\ln\left(\frac{N!}{N_\uparrow!N_\downarrow!}\right)
\\
&= k\ln\left(\frac{N!}{\left(h + s\right)!\left(h
-s\right)!}\right)
\end{align} At this point I'm going to define for convenience
\(h\equiv \tfrac12 N\), just to avoid writing so many \(\tfrac12\). I'm
also going to focus on the \(s\) dependence of the entropy.
\begin{align}
\frac{S}{k} &= \ln\left(N!\right)
- \ln\left(N_\uparrow!\right)
- \ln\left(N_\downarrow!\right)
\\
&= \ln N! - \ln(h+s)! - \ln(h-s)!
\\
&= \ln N! - \sum_{i=1}^{h+s} \ln i - \sum_{i=1}^{h-s} \ln i
\end{align} At the last step, I wrote the log of the factorial as a sum
of logs. This is still looking pretty hairy. So let's now consider the
difference between the entropy with \(s\) and the entropy when
\(s=0\) (which I will call here \(S_0\) for compactness and
convenience). \begin{align}
\frac{S(s)-S_0}{k_B}
&= - \sum_{i=1}^{h+s} \ln i - \sum_{i=1}^{h-s} \ln i + \sum_{i=1}^{h} \ln i + \sum_{i=1}^{h} \ln i
\\
&= -\sum_{i=h+1}^{h+s} \ln i + \sum_{j=h-s+1}^{h} \ln j
\end{align} where I have changed the sums to account for the difference
between the sums with \(s\) and those without. At this stage, our
indices are starting to feel a little inconvenient given the short range
we are summing over, so let's redefine our index of summation so the
sums will run up to \(s\). In preparation for this, at the last step, I
renamed one of my dummy indexes. \begin{align}
i &= h + k & j &= h + 1 - k
\end{align} With these indexes, each sum can go from \(k=1\) to \(k=s\),
which will enable us to combine our sums into one. \begin{align}
\frac{S-S_0}{k}
&= -\sum_{k=1}^{s} \ln(h+ k) + \sum_{k=1}^{s} \ln (h+1-k)
\\
&= \sum_{k=1}^{s} \left(\ln (h+1-k) - \ln(h+ k)\right)
\end{align} At this point, if you're anything like me, you're thinking
“I could turn that difference of logs into a log of a ratio!” Sadly,
this doesn't turn out to help us. Instead, we are going to start trying
to get the \(h\) out of the way in preparation for taking the limit as
\(s\ll h\). \begin{align}
\frac{S-S_0}{k}
&= \sum_{k=1}^{s} \ln h + \ln\left(1-\frac{k-1}{h}\right) -
\ln h - \ln\left(1+ \frac{k}{h}\right)
\\
&= \sum_{k=1}^{s} \left(\ln\left(1-\frac{k-1}{h}\right) -
\ln\left(1+ \frac{k}{h}\right)\right)
\end{align} It is now time to make our first approximation: we assume
\(s\ll N\), which means that \(s\ll h\). That enables us to simplify
these logarithms drastically! \(\ddot\smile\) \begin{align}
\frac{S-S_0}{k}
&\approx \sum_{k=1}^{s} \left(-\frac{k-1}{h} - \frac{k}{h}\right)
\\
&= -\frac2{h}\sum_{k=1}^{s} \left(k-\tfrac12\right)
\\
&= -\frac4{N}\sum_{k=1}^{s} \left(k-\tfrac12\right)
\end{align}
Now we have this sum to solve. You can find this sum either
geometrically or with calculus. The calculus involves turning the sum
into an integral. As you can see in the figure, the integral
\begin{align}
\int_0^s x dx = \tfrac12 s^2
\end{align} has the same value as the sum, since the area under the
orange curve (which is the sum) is equal to the area under the blue
curve (which is the integral).
The geometric way to solve this looks visually very much the same as the integral picture, but instead of computing the area from the straight line, we cut the stair-step area in “half” and fit the two pieces together such that they form a rectangle with width \(s/2\) and height \(s\).
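For completeness, we can also evaluate the sum exactly using the formula for the sum of the first \(s\) integers: \begin{align} \sum_{k=1}^{s}\left(k-\tfrac12\right) &= \frac{s(s+1)}{2} - \frac{s}{2} = \frac{s^2}{2} \end{align} in agreement with both the integral and the geometric picture.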
Taken together, this tells us that when \(s\ll N\) \begin{align} S(N,s) &\approx S(N,s=0) - k\frac{4}{N}\frac{s^2}{2} \\ &= S(N,s=0) - k\frac{2s^2}{N} \end{align} This means that the multiplicity is gaussian: \begin{align} S &= k \ln g \\ g(N,s) &= e^{\frac{S(N,s)}{k}} \\ &= e^{\frac{S(N,s=0)}{k} - \frac{2s^2}{N}} \\ &= g(N,s=0)e^{-\frac{2s^2}{N}} \end{align} Thus the multiplicity (and thus probability) is peaked at \(s=0\) as a gaussian with width \(\sim\sqrt{N}\). This tells us that the width of the peak increases as we increase \(N\). However, the excess spin per particle decreases as \(\sim\frac{1}{\sqrt{N}}\). So that means that our fractional polarization becomes far more sharply peaked as we increase \(N\).
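Here is a small numerical sketch (my own check, not from the text) comparing the exact multiplicity with the gaussian form \(g(N,0)e^{-\frac{2s^2}{N}}\); as long as \(s\ll N\), the ratio of the two stays very close to one:

```python
from math import comb, exp

def g_exact(N, s):
    """Exact multiplicity g(N, s) = N! / (N_up! N_down!) = C(N, N/2 + s)."""
    return comb(N, N // 2 + s)

N = 1000
g0 = g_exact(N, 0)
for s in (0, 5, 10, 20, 50):
    gaussian = g0 * exp(-2 * s**2 / N)
    print(s, g_exact(N, s) / gaussian)  # ratio stays near 1 while s << N
```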
Suppose we put two systems in contact with one another. This means that energy can flow from one system to the other. We assume, however, that the contact between the two systems is weak enough that their energy eigenstates are unaffected. This is a bit of a contradiction you'll need to get used to: we treat our systems as non-interacting, but assume there is some energy transfer between them. The reasoning is that the interaction between them is very small, so that we can treat each system separately, but energy can still flow.
We ask the question: “How much energy will each system end up with after we wait for things to settle down?” The answer to this question is that energy will settle down in the way that maximizes the number of microstates.
Let us consider two simple systems: system \(A\), a 3-spin paramagnet, and system \(B\), a 4-spin paramagnet.
What is the total number of microstates when you consider systems \(A\) and \(B\) together as a combined system?

Answer:
We need to multiply the numbers of microstates for each system separately, because for each microstate of \(A\), it is possible to have \(B\) be in any of its microstates. So the total is \(2^3\cdot 2^4 = 128\).
Since we have two separate systems here, it is meaningful to ask what the probability is for system \(A\) to have energy \(E_A\), given that the combined system has energy \(E_{AB}\).
Given that these two systems are able to exchange energy, they ought to have the same temperature. To find the most probable energy partition between the two systems, we need to find the partition that maximizes the multiplicity of the combined system: \begin{align} g_{AB}(E_A) &= g_A(E_A)g_B(E_{AB}-E_A) \\ 0 &= \frac{d g_{AB}}{d E_A} \\ &= g_A'g_B - g_B' g_A \\ \frac{g_A'}{g_A} &= \frac{g_B'}{g_B} \\ \frac{1}{g_A(E_A)} \frac{\partial g_A(E_A)}{\partial E_A} &= \frac{1}{g_B(E_B)} \frac{\partial g_B(E_B)}{\partial E_B} \end{align} This tells us that the “thing that becomes equal” when the two systems are in thermal contact is this strange ratio of the derivative of the multiplicity with respect to energy divided by the multiplicity itself. You may be able to recognize this as what is called a logarithmic derivative. \begin{align} \frac{\partial}{\partial E_A}\ln(g_A(E_A)) &= \frac{1}{g_A(E_A)} \frac{\partial g_A(E_A)}{\partial E_A} \end{align} thus we can conclude that when two systems are in thermal contact, the thing that equalizes is \begin{align} \beta &\equiv \left(\frac{\partial \ln g}{\partial E}\right)_V \end{align} At this stage, we haven't shown that \(\beta=\frac1{kT}\), but we have shown that it should be a function of \(T\), since \(T\) is also a thing that is equalized when two systems are in thermal contact.
By dimensional reasoning, you can recognize that this could be \(\frac1{kT}\), and we're just going to leave this at that.
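To see this thermal-contact argument in action numerically, here is a sketch of my own (not from the text). Instead of exchanging energy directly, I let two paramagnets share a fixed total number of up spins, which plays the role of the conserved total energy; the code finds the partition that maximizes \(g_A g_B\) and checks that the discrete logarithmic derivative of \(\ln g\) comes out (nearly) equal for the two systems at that partition:

```python
from math import comb, log

N_A, N_B = 400, 600   # sizes of the two paramagnets
n_total = 700         # total number of up spins, shared between A and B

def ln_g(N, n_up):
    """ln of the multiplicity of N spins with n_up of them pointing up."""
    return log(comb(N, n_up))

# Allowed values of n_A (up spins in A), keeping both systems' counts in range
lo = max(1, n_total - N_B + 1)
hi = min(N_A - 1, n_total - 1)

# The partition that maximizes the combined multiplicity g_A * g_B
n_A = max(range(lo, hi + 1),
          key=lambda n: ln_g(N_A, n) + ln_g(N_B, n_total - n))
n_B = n_total - n_A

# Centered-difference estimates of d(ln g)/d(n_up) for each system at that partition
slope_A = (ln_g(N_A, n_A + 1) - ln_g(N_A, n_A - 1)) / 2
slope_B = (ln_g(N_B, n_B + 1) - ln_g(N_B, n_B - 1)) / 2
print(n_A, n_B, slope_A, slope_B)  # the two slopes come out (nearly) equal
```

With the values chosen here, the most probable partition puts both systems at the same fraction of up spins, which is the spin-system analogue of coming to the same temperature.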