A Markov Picture of Heat

physicsstats

Aug 23, 2025

A derivation of the heat equation from a stochastic model of energy exchange

Motivation

Fourier’s law and the heat equation are phenomenological; they describe how temperature fields evolve at macroscopic scales. However, statistical mechanics is concerned with deriving macroscopic properties from microscopic interactions and behaviors. Thus, a natural question is whether we can derive these laws from assumptions that are (i) local, (ii) Markov (memoryless), and (iii) energy-conserving. In the below, we go through a derivation of the discrete Laplacian and how it converges to the heat equation in the hydrodynamic limit.

[EDIT] About halfway through writing this post, I found the KMP paper Heat Flow in an Exactly Solvable Model, which actually introduces this concept. What's derived here is essentially their model, and their work is a far more rigorous and extensive version of what's presented here.

Model

We will consider the 1 dimensional lattice of \(N\) cells with spacing \(a > 0\). Then, our total length is \(L \doteq Na\). Our system's state at a time \(t\) is the energy vector \begin{equation} \mathbf{e}(t) \doteq (e_1(t), e_2(t), \dots, e_N(t)) \in [0,\infty)^N. \end{equation}

We define energy exchange via a Markov jump process. Specifically, we assume that each bond between particle \(i\), \(i+1\) (a bond is defined only between neighbors) independently operates on a Poisson clock with rate \(\lambda\). When the clock "rings", the two energy sites are redistributed uniformly in a way that preserves their sum, i.e. \begin{equation} (e_i, e_{i+1}) \to \left(US, (1-U)S\right) \text{ for } U \sim \operatorname{Unif}[0,1], S \doteq e_i + e_{i+1}. \end{equation} This defines a continuous time Markov jump process on \([0, \infty)^N\) that preserves the total energy \(E \doteq \sum_i e_i\). It is clear that the process is memmoryless, but verifying the other conditions is a bit more involved:

Heat Exchange is a Markov Jump Process

Holding time: the next ring time is the minimum of \(N-1\) independent \(\operatorname{Exp}(\lambda)\) clocks; thus \(\Delta T \sim \operatorname{Exp}(\lambda(N-1))\)
Conservation of Total Energy: By construction, our interaction model preserves total energy between neighbors, summing over all bonds gives that total energy of the system is conserved.
Generator: the generator \(\mathcal{L}\) encodes the infinitesimal evolution of the process. It is defined as \((\mathcal{L}f)(\mathbf{e}) = \lim_{h \to 0} \frac{\mathbb{E}[f(\mathbf{e}_{t+h}) - f(\mathbf{e}_t) \mid \mathbf{e}_t = \mathbf{e}]}{h}.\) Intuitively, this is just the expectation of the change in \(f\) over an infinitesimal time period \(h\), conditioned on the current state (and normalized by \(h\)). In our case, the probability that bond \(i\) rings in a time period \([t, t+h]\) is approximately \(\lambda h + o(h)\). If that bond rings, the system jumps to a new state (as specified by the transition above) and the expectation of the difference is \(\mathbb{E}_U [f(\mathbf{e}^{(i, U)}) - f(\mathbf{e})]\), and summing over all \(N-1\) bonds gives the generator

\begin{equation} (\mathcal{L}f)(\mathbf{e}) = \lambda \sum_{i=1}^{N-1} \left(\mathbb{E}_U[f(\mathbf{e}^{(U,i)})] - f(\mathbf{e})\right). \end{equation}

We will work with Neumann boundary conditions, i.e. that energy exchange only happens between internal bonds and so there are no boundary terms in the generator.

Continuity Equation

Discrete Laplacian

For \(L\) the generator of our Markov Jump Process, we have that for any observable \(f\) (to be precise, \(f\) must map to the reals, i.e. \(f: [0, \infty)^N \to \mathbb{R}\)), \begin{equation} (\mathcal{L}f)(\mathbf{e}) = \lambda \sum_{i=1}^{N-1} \left(\mathbb{E}_U\left[f(\dots, US, (1-U)S, \dots)\right] - f(\mathbf{e})\right) + \text{(boundary terms)} \end{equation} (this is identical to what we derived above).

Let us consider only the coordinate observable \(f_i(\mathbf{e}) = e_i\). Only the adjacent bonds contribute, so we have that a single update on \((i-1, i)\) sends \begin{equation} e_i \mapsto (1-U)(e_{i-1} + e_i) \implies \mathbb{E}_U [e_i^\prime - e_i] = \frac{e_{i-1} - e_i}{2}. \end{equation} Similarly, an update on \((i, i+1)\) gives \begin{equation} e_i \mapsto U(e_i + e_{i+1}) \implies \mathbb{E}_U [e_i^\prime - e_i] = \frac{e_{i+1} - e_i}{2}. \end{equation}

Plugging this back into our generator, we have that \begin{equation} (\mathcal{L}f_i) (\mathbf{e}) = \frac{\lambda}{2} \left( e_{i-1} + e_{i+1} - 2e_i \right) \end{equation} and if we define \(m_i(t) \doteq \mathbb{E}[e_i(t)]\) and use the Kolmogorov forward equation, we arrive at the discrete diffusion equation: \begin{equation} \frac{d}{dt} m_i(t) = \frac{\lambda}{2} \left( m_{i-1}(t) + m_{i+1}(t) - 2m_i(t) \right). \end{equation} This is the discrete Laplacian acting on the mean energy.

Continuity Equation

The jump rule defines local energy conservation in every bond. Let us define the local microscopic current across the bond \((i, i+1)\) \begin{equation} j_{i, i+1}(t) \doteq \frac{\lambda}{2}(e_i(t) - e_{i+1}(t)). \end{equation}

Now, if we examine the change in \(e_i(t)\) over a small increment in time \(dt\), we can see that

energy flows into site \(i\) from \(i-1\) when bond \((i-1, i)\) rings, and
energy flows out of site \(i\) to \(i+1\) when bond \((i, i+1)\) rings

Putting these together, we can rewrite the discrete Laplacian: \begin{equation} \frac{d}{dt} e_i(t) = j_{i-1, i}(t) - j_{i, i+1}(t). \end{equation} The above equation is a pretty intuitive picture of the local energy change; the change in local energy is inflow minus outflow.

Continuum Limit

Let's go back to the discrete mean-energy evolution (the discrete Laplacian from above), but introduce a spatial scaling. Let us define the spatial coordinate of cell \(i\) be \(x_i \doteq ia\), recalling that \(a\) is the lattice spacing. Now, we may define a smooth profile \(e^a(x,t) \doteq m_i(t)\), so that as \(a\) gets small, the discrete vector \((m_i)\) becomes samples of a smooth field \(e(x,t)\).

Since \(e(x,t)\) is smooth, we may Taylor expand the neighbors: \begin{equation} m_{i-1}(t) = e^a(x-a, t) = e^a(x,t) - a \partial_x e^a(x,t) + \frac{a^2}{2} \partial_{xx} e^a(x,t) + O(a^3), \end{equation} \begin{equation} m_{i+1}(t) = e^a(x+a, t) = e^a(x,t) + a \partial_x e^a(x,t) + \frac{a^2}{2} \partial_{xx} e^a(x,t) + O(a^3). \end{equation} Substituting into the RHS of the discrete Laplacian and simplifying yields \begin{equation} \partial_t e^a(x,t) = \frac{\lambda}{2} (a^2 \partial_{xx}e^a(x,t) + O(a^3)). \end{equation}

Note that as \(a \to 0\), the RHS goes to 0 as well, and nothing really happens. Thus, we need to compensate in one of two ways: (i) either rescale time so we can see "slow" dynamics, or (ii) increasing the update rate so that microscopic effects add up to an observable macroscopic effect.

Diffusive Time Scaling

First, let's consider what happens if we keep the rate \(\lambda\) fixed, but define a macroscopic time \(t^\prime \doteq \frac{t}{a^2}\). Via the chain rule, we have \(a^2 \partial_t = \partial_{t^\prime}\). Plugging into the discrete Laplacian, we have \begin{equation} \partial_{t^\prime} e^a(x,t^\prime) = \frac{\lambda}{2} \partial_{xx} e^a(x,t^\prime) + O(a). \end{equation} In the limit \(a \to 0\), we recover \begin{equation} \partial_{t^\prime} e^a(x,t^\prime) = D \partial_{xx} e^a(x,t^\prime), \text{ for } D \doteq \frac{\lambda}{2}. \end{equation}

Jump Rate Scaling

Alternatively, we can also fix the time \(t\) fixed, but scale the update rate as \(\lambda = \frac{2D}{a^2}\), where \(D\) is defined the same way as above. Plugging into the discrete Laplacian, we get \begin{equation} \partial_{t} e^a(x,t) = \frac{2D}{a^2}\frac{a^2}{2} \partial_{xx} e^a(x,t) + O(a), \end{equation} which after simplifying and taking the continuum limit yields \begin{equation} \partial_{t} e^a(x,t) = D \partial_{xx} e^a(x,t), \text{ for } D \doteq \frac{\lambda}{2}, \end{equation} same as above.

This tracks pretty well with our intuition of random walks: each jump moves energy by distance \(a\), so to get macroscopic diffusion, we need \(O(1/a^2)\) jumps per unit time (since variance grows like \(\text{step}^2 \times \text{jumps}\)).

Whether we choose to scale the jump rate or the time, we arrive at the same continuum PDE, where the diffusion constant \(D\) is set by the microscopic parameters \(\lambda, a\).

Fourier's Law

If we write energy as proportional to temperature (\(e = cT\)) then \begin{equation} \partial_t T(x,t) = \frac{D}{c} \partial_{xx} T(x,t), \end{equation} and Fourier's law \(J = - \kappa \partial_x T\) appears with \(\kappa = cD\).

1-D diffusion gif — 1-D heat diffussion governed by the equation derived above.