A Markov Picture of Heat
Aug 23, 2025
A derivation of the heat equation from a stochastic model of energy exchange
Motivation
Fourier’s law and the heat equation are phenomenological; they describe how temperature fields evolve at macroscopic scales. However, statistical mechanics is concerned with deriving macroscopic properties from microscopic interactions and behaviors. Thus, a natural question is whether we can derive these laws from assumptions that are (i) local, (ii) Markov (memoryless), and (iii) energy-conserving. In the below, we go through a derivation of the discrete Laplacian and how it converges to the heat equation in the hydrodynamic limit.[EDIT] About halfway through writing this post, I found the KMP paper Heat Flow in an Exactly Solvable Model, which actually introduces this concept. What's derived here is essentially their model, and their work is a far more rigorous and extensive version of what's presented here.
Model
We will consider the 1 dimensional lattice of \(N\) cells with spacing \(a > 0\). Then, our total length is \(L \doteq Na\). Our system's state at a time \(t\) is the energy vector \begin{equation} \mathbf{e}(t) \doteq (e_1(t), e_2(t), \dots, e_N(t)) \in [0,\infty)^N. \end{equation}We define energy exchange via a Markov jump process. Specifically, we assume that each bond between particle \(i\), \(i+1\) (a bond is defined only between neighbors) independently operates on a Poisson clock with rate \(\lambda\). When the clock "rings", the two energy sites are redistributed uniformly in a way that preserves their sum, i.e. \begin{equation} (e_i, e_{i+1}) \to \left(US, (1-U)S\right) \text{ for } U \sim \operatorname{Unif}[0,1], S \doteq e_i + e_{i+1}. \end{equation} This defines a continuous time Markov jump process on \([0, \infty)^N\) that preserves the total energy \(E \doteq \sum_i e_i\). It is clear that the process is memmoryless, but verifying the other conditions is a bit more involved:
- Holding time: the next ring time is the minimum of \(N-1\) independent \(\operatorname{Exp}(\lambda)\) clocks; thus \(\Delta T \sim \operatorname{Exp}(\lambda(N-1))\)
- Conservation of Total Energy: By construction, our interaction model preserves total energy between neighbors, summing over all bonds gives that total energy of the system is conserved.
- Generator: the generator \(\mathcal{L}\) encodes the infinitesimal evolution of the process. It is defined as \((\mathcal{L}f)(\mathbf{e}) = \lim_{h \to 0} \frac{\mathbb{E}[f(\mathbf{e}_{t+h}) - f(\mathbf{e}_t) \mid \mathbf{e}_t = \mathbf{e}]}{h}.\) Intuitively, this is just the expectation of the change in \(f\) over an infinitesimal time period \(h\), conditioned on the current state (and normalized by \(h\)). In our case, the probability that bond \(i\) rings in a time period \([t, t+h]\) is approximately \(\lambda h + o(h)\). If that bond rings, the system jumps to a new state (as specified by the transition above) and the expectation of the difference is \(\mathbb{E}_U [f(\mathbf{e}^{(i, U)}) - f(\mathbf{e})]\), and summing over all \(N-1\) bonds gives the generator
We will work with Neumann boundary conditions, i.e. that energy exchange only happens between internal bonds and so there are no boundary terms in the generator.
Continuity Equation
Discrete Laplacian
For \(L\) the generator of our Markov Jump Process, we have that for any observable \(f\) (to be precise, \(f\) must map to the reals, i.e. \(f: [0, \infty)^N \to \mathbb{R}\)), \begin{equation} (\mathcal{L}f)(\mathbf{e}) = \lambda \sum_{i=1}^{N-1} \left(\mathbb{E}_U\left[f(\dots, US, (1-U)S, \dots)\right] - f(\mathbf{e})\right) + \text{(boundary terms)} \end{equation} (this is identical to what we derived above).
Let us consider only the coordinate observable \(f_i(\mathbf{e}) = e_i\). Only the adjacent bonds contribute, so we have that a single update on \((i-1, i)\) sends \begin{equation} e_i \mapsto (1-U)(e_{i-1} + e_i) \implies \mathbb{E}_U [e_i^\prime - e_i] = \frac{e_{i-1} - e_i}{2}. \end{equation} Similarly, an update on \((i, i+1)\) gives \begin{equation} e_i \mapsto U(e_i + e_{i+1}) \implies \mathbb{E}_U [e_i^\prime - e_i] = \frac{e_{i+1} - e_i}{2}. \end{equation}
Plugging this back into our generator, we have that \begin{equation} (\mathcal{L}f_i) (\mathbf{e}) = \frac{\lambda}{2} \left( e_{i-1} + e_{i+1} - 2e_i \right) \end{equation} and if we define \(m_i(t) \doteq \mathbb{E}[e_i(t)]\) and use the Kolmogorov forward equation, we arrive at the discrete diffusion equation: \begin{equation} \frac{d}{dt} m_i(t) = \frac{\lambda}{2} \left( m_{i-1}(t) + m_{i+1}(t) - 2m_i(t) \right). \end{equation} This is the discrete Laplacian acting on the mean energy.
Continuity Equation
The jump rule defines local energy conservation in every bond. Let us define the local microscopic current across the bond \((i, i+1)\) \begin{equation} j_{i, i+1}(t) \doteq \frac{\lambda}{2}(e_i(t) - e_{i+1}(t)). \end{equation}Now, if we examine the change in \(e_i(t)\) over a small increment in time \(dt\), we can see that
- energy flows into site \(i\) from \(i-1\) when bond \((i-1, i)\) rings, and
- energy flows out of site \(i\) to \(i+1\) when bond \((i, i+1)\) rings
Putting these together, we can rewrite the discrete Laplacian: \begin{equation} \frac{d}{dt} e_i(t) = j_{i-1, i}(t) - j_{i, i+1}(t). \end{equation} The above equation is a pretty intuitive picture of the local energy change; the change in local energy is inflow minus outflow.
Continuum Limit
Let's go back to the discrete mean-energy evolution (the discrete Laplacian from above), but introduce a spatial scaling. Let us define the spatial coordinate of cell \(i\) be \(x_i \doteq ia\), recalling that \(a\) is the lattice spacing. Now, we may define a smooth profile \(e^a(x,t) \doteq m_i(t)\), so that as \(a\) gets small, the discrete vector \((m_i)\) becomes samples of a smooth field \(e(x,t)\).Since \(e(x,t)\) is smooth, we may Taylor expand the neighbors: \begin{equation} m_{i-1}(t) = e^a(x-a, t) = e^a(x,t) - a \partial_x e^a(x,t) + \frac{a^2}{2} \partial_{xx} e^a(x,t) + O(a^3), \end{equation} \begin{equation} m_{i+1}(t) = e^a(x+a, t) = e^a(x,t) + a \partial_x e^a(x,t) + \frac{a^2}{2} \partial_{xx} e^a(x,t) + O(a^3). \end{equation} Substituting into the RHS of the discrete Laplacian and simplifying yields \begin{equation} \partial_t e^a(x,t) = \frac{\lambda}{2} (a^2 \partial_{xx}e^a(x,t) + O(a^3)). \end{equation}
Note that as \(a \to 0\), the RHS goes to 0 as well, and nothing really happens. Thus, we need to compensate in one of two ways: (i) either rescale time so we can see "slow" dynamics, or (ii) increasing the update rate so that microscopic effects add up to an observable macroscopic effect.
Diffusive Time Scaling
First, let's consider what happens if we keep the rate \(\lambda\) fixed, but define a macroscopic time \(t^\prime \doteq \frac{t}{a^2}\). Via the chain rule, we have \(a^2 \partial_t = \partial_{t^\prime}\). Plugging into the discrete Laplacian, we have \begin{equation} \partial_{t^\prime} e^a(x,t^\prime) = \frac{\lambda}{2} \partial_{xx} e^a(x,t^\prime) + O(a). \end{equation} In the limit \(a \to 0\), we recover \begin{equation} \partial_{t^\prime} e^a(x,t^\prime) = D \partial_{xx} e^a(x,t^\prime), \text{ for } D \doteq \frac{\lambda}{2}. \end{equation}Jump Rate Scaling
Alternatively, we can also fix the time \(t\) fixed, but scale the update rate as \(\lambda = \frac{2D}{a^2}\), where \(D\) is defined the same way as above. Plugging into the discrete Laplacian, we get \begin{equation} \partial_{t} e^a(x,t) = \frac{2D}{a^2}\frac{a^2}{2} \partial_{xx} e^a(x,t) + O(a), \end{equation} which after simplifying and taking the continuum limit yields \begin{equation} \partial_{t} e^a(x,t) = D \partial_{xx} e^a(x,t), \text{ for } D \doteq \frac{\lambda}{2}, \end{equation} same as above.This tracks pretty well with our intuition of random walks: each jump moves energy by distance \(a\), so to get macroscopic diffusion, we need \(O(1/a^2)\) jumps per unit time (since variance grows like \(\text{step}^2 \times \text{jumps}\)).
Whether we choose to scale the jump rate or the time, we arrive at the same continuum PDE, where the diffusion constant \(D\) is set by the microscopic parameters \(\lambda, a\).
Fourier's Law
If we write energy as proportional to temperature (\(e = cT\)) then \begin{equation} \partial_t T(x,t) = \frac{D}{c} \partial_{xx} T(x,t), \end{equation} and Fourier's law \(J = - \kappa \partial_x T\) appears with \(\kappa = cD\).