How Weather Predicts the Future
The forecast on your phone uses an algorithm more sophisticated than any AI product. Here’s how it works — and what AI should steal from it.
In the winter of 1961, Edward Lorenz, an MIT meteorologist, was running a weather simulation on a Royal McBee LGP-30 — a computer roughly the size of a desk that could execute about 60 operations per second. He wanted to re-examine a sequence he’d run earlier, so he typed in the initial conditions again and went to get coffee.
When he came back, the forecast had diverged completely.
The problem: he’d rounded one variable from 0.506127 to 0.506. That tiny difference, roughly one part in four thousand, had cascaded through the simulation until the two forecasts bore no resemblance to each other. The same equations, nearly the same starting point, utterly different outcomes.
Lorenz published this finding in 1963 as “Deterministic Nonperiodic Flow” in the Journal of the Atmospheric Sciences. It became the founding paper of chaos theory. The popular version, a butterfly flapping its wings in Brazil causing a tornado in Texas, came later. But the scientific insight was immediate and devastating: for chaotic systems, single-point predictions are inherently unreliable. You can run the same model twice, change one decimal place, and get a completely different future.
The meteorological response to this wasn’t despair. It was architecture.
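To make the rounding story concrete, here is a minimal sketch of the same effect. It is not Lorenz's 1961 weather model; as an assumption, it substitutes the three-variable system from his 1963 paper with the classic parameter values. The starting values for y and z, the step size, and the run length are arbitrary choices for illustration.

```python
import math

# Stand-in system: the three-variable Lorenz equations (assumed here for
# illustration; the 1961 simulation used a larger weather model).
def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(state, dt):
    """Advance the state by one fixed-size fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = lorenz_rhs(state)
    k2 = lorenz_rhs(shifted(state, k1, dt / 2))
    k3 = lorenz_rhs(shifted(state, k2, dt / 2))
    k4 = lorenz_rhs(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def separation_over_time(x0_full, x0_rounded, steps=4000, dt=0.01):
    """Run two copies that differ only in the first variable; track their distance."""
    a = (x0_full, 1.0, 20.0)      # y and z starting values are arbitrary
    b = (x0_rounded, 1.0, 20.0)
    distances = []
    for _ in range(steps):
        a, b = rk4_step(a, dt), rk4_step(b, dt)
        distances.append(math.dist(a, b))
    return distances

distances = separation_over_time(0.506127, 0.506)
print("separation after first step:", distances[0])
print("largest separation reached: ", max(distances))
```

The two runs start almost on top of each other and end up as far apart as two unrelated points on the attractor, which is the behaviour the coffee-break anecdote describes.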
If You Can’t Predict One Future, Predict Many
The question after Lorenz was: if a single forecast is unreliable because tiny measurement errors explode into wildly different outcomes, what do you do?
The answer that emerged over the next three decades was ensemble forecasting — running dozens of parallel simulations, each starting from slightly different initial conditions, and treating the spread as the signal. Where the simulations agree, you have high confidence. Where they diverge, you know uncertainty is real and structural, not just a rounding error.
The forecast on your phone is not a single prediction. It’s a summary of an ensemble: a probability distribution generated by running the same physics model many times with slightly perturbed starting points. The “70% chance of rain” isn’t a guess. To a first approximation, it’s a statement that 70% of the ensemble members produced rain at that location and time.
This is a fundamentally different kind of forecasting. Most prediction systems produce a number and bolt error bars onto it after the fact. Weather forecasting builds the uncertainty in from the start. The uncertainty isn’t added — it’s structural.
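The idea of reading a probability off the ensemble can be sketched in a few lines. This is a toy under stated assumptions: the chaotic logistic map stands in for the physics model, "rain" is an arbitrary threshold on the final value, and the function names, noise level, and member count are all hypothetical.

```python
import random

def logistic_forecast(x0, steps, r=3.9):
    """Toy stand-in for a forecast model: iterate the chaotic logistic map."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

def ensemble_probability(best_guess, steps, members=50, noise=1e-3,
                         threshold=0.5, seed=0):
    """Perturb the starting point, run every member, return the fraction above threshold."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(members):
        # Slightly different initial condition for each member, kept inside [0, 1].
        x0 = min(max(best_guess + rng.gauss(0.0, noise), 0.0), 1.0)
        outcomes.append(logistic_forecast(x0, steps))
    hits = sum(1 for x in outcomes if x > threshold)
    return hits / members, outcomes

p_short, _ = ensemble_probability(0.3, steps=1)    # short lead time
p_long, _ = ensemble_probability(0.3, steps=50)    # long lead time
print("short-range probability:", p_short)
print("long-range probability: ", p_long)
```

At short range the members agree, so the probability is all or nothing. At long range the members scatter and the fraction lands somewhere in between, which is the spread acting as the signal.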
The Norwegian Who Made It Scale
The ensemble concept was obvious in principle but computationally nightmarish in practice.
The classical approach, the Kalman filter introduced by Rudolf Kalman in 1960, requires maintaining and updating a covariance matrix that tracks how every variable in the system relates to every other variable. For a weather model with a billion grid points, that’s 10^18 entries, on the order of the number of grains of sand on Earth by common estimates. Not feasible. Not close to feasible.
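The arithmetic behind that number is short enough to spell out. The billion-variable state comes from the text above; the 8 bytes per entry (double precision) and the 100-member ensemble used for comparison are assumptions for illustration.

```python
n_state = 10**9                      # state variables, as in the text above
full_entries = n_state ** 2          # one covariance entry per pair of variables
bytes_per_entry = 8                  # assumption: double-precision floats
full_bytes = full_entries * bytes_per_entry

ensemble_members = 100               # assumption: upper end of a modest ensemble
ensemble_entries = n_state * ensemble_members   # just store the members themselves

print(f"full covariance entries: {full_entries:.1e}")
print(f"full covariance storage: {full_bytes / 1e18:.0f} exabytes")
print(f"ensemble entries:        {ensemble_entries:.1e}")
print(f"reduction factor:        {full_entries // ensemble_entries:.1e}")
```

Exploiting the symmetry of the matrix would roughly halve the storage, which does not change the conclusion.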
In 1994, a Norwegian mathematician named Geir Evensen was studying ocean currents when he had an insight that changed the field. Instead of maintaining the full covariance matrix analytically, you could approximate it by running an ensemble of parallel simulations and computing the statistics from the spread. Twenty to a hundred parallel runs, each slightly different, each evolving forward under the full nonlinear physics. No linearization. No shortcuts. Just brute-force parallel reality.
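A minimal sketch of that substitution: instead of carrying a covariance matrix forward analytically, estimate it from the members. The two-variable "state" and its true covariance of 0.8 are invented for the demonstration, and `draw_members` is a hypothetical helper standing in for a set of model runs.

```python
import random

def sample_covariance(members):
    """Estimate the covariance matrix from the ensemble spread around its mean."""
    n = len(members)
    dim = len(members[0])
    mean = [sum(m[i] for m in members) / n for i in range(dim)]
    anomalies = [[m[i] - mean[i] for i in range(dim)] for m in members]
    return [
        [sum(a[i] * a[j] for a in anomalies) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]

def draw_members(count, seed=0):
    """Hypothetical ensemble: two correlated variables with true covariance 0.8."""
    rng = random.Random(seed)
    members = []
    for _ in range(count):
        t = rng.gauss(0.0, 1.0)
        p = 0.8 * t + 0.6 * rng.gauss(0.0, 1.0)   # variance 1, covariance with t is 0.8
        members.append((t, p))
    return members

cov = sample_covariance(draw_members(100))
print("estimated covariance between the two variables:", cov[0][1])
```

With a hundred members the estimate is noisy but usable, and that trade of exactness for tractability is the one the ensemble approach makes.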
Evensen published this as “Sequential Data Assimilation with a Nonlinear Quasi-Geostrophic Model Using Monte Carlo Methods to Forecast Error Statistics” in the Journal of Geophysical Research. The Ensemble Kalman Filter (EnKF) has since accumulated thousands of citations, and the EnKF or ensemble-based hybrids of it now run operationally at major weather centers around the world.
His follow-up, “The Ensemble Kalman Filter: Theoretical Formulation and Practical Implementation” (2003), became the definitive reference, with over 3,000 citations and still climbing.
How It Actually Works
The cycle runs every six hours, twenty-four hours a day, 365 days a year. It has been running continuously for over thirty years. Here’s what happens:
Generate. Start with the best estimate of the current atmosphere. Perturb it slightly: add tiny, statistically calibrated noise to the initial conditions. Run the full atmospheric model forward for each perturbation. Each run produces one possible future. ECMWF runs 51 of them. Environment Canada runs 256 in the largest operational EnKF on Earth, subdivided into eight groups of 32 for cross-validation.
Observe. Satellites, weather stations, radiosondes (weather balloons), ships, buoys, aircraft, and ocean floats feed in. ECMWF receives 800 million observations per day, of which roughly 60 million quality-controlled observations are assimilated into each forecast cycle.
Update. Here’s where the Kalman filter earns its keep. For each observation — a temperature reading from a weather station, a wind measurement from a satellite — the filter computes a “Kalman gain” that determines how much to trust the new data versus the prior model estimate. High uncertainty in the model? Trust the observation more. High confidence in the model and noisy observation? Trust the model more. This update is applied across all ensemble members simultaneously.
Repeat. The updated ensemble becomes the starting point for the next cycle. Six hours later, the whole thing runs again.
The spread of the ensemble is the forecast’s uncertainty. Wide spread means “we don’t know.” Tight convergence means “we’re confident.” This isn’t a single forecast with confidence intervals tacked on. The uncertainty is computed from the physics, not from statistics bolted on after the fact.
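The four steps above can be sketched for a single number instead of a whole atmosphere. This is a toy under stated assumptions, not any center's operational scheme: the state is one scalar that is observed directly, `toy_model` is a hypothetical forecast model with random model error, and the update is the textbook stochastic (perturbed-observation) form of the EnKF.

```python
import random
import statistics

def toy_model(x, rng):
    """Hypothetical forecast model: slight decay plus random model error."""
    return 0.95 * x + rng.gauss(0.0, 0.3)

def enkf_update(ensemble, observation, obs_var, rng):
    """Stochastic EnKF analysis step for a directly observed scalar state."""
    prior_var = statistics.variance(ensemble)     # model uncertainty, read off the spread
    gain = prior_var / (prior_var + obs_var)      # Kalman gain: near 1 trusts the observation
    updated = []
    for x in ensemble:
        # Each member assimilates its own noisy copy of the observation.
        perturbed_obs = observation + rng.gauss(0.0, obs_var ** 0.5)
        updated.append(x + gain * (perturbed_obs - x))
    return updated, gain

def run_cycles(truth0=5.0, members=40, cycles=20, obs_var=0.25, seed=0):
    rng = random.Random(seed)
    truth = truth0
    ensemble = [rng.gauss(0.0, 2.0) for _ in range(members)]  # deliberately poor first guess
    history = []
    for _ in range(cycles):
        # Generate: step the (hidden) truth and every member forward.
        truth = toy_model(truth, rng)
        ensemble = [toy_model(x, rng) for x in ensemble]
        # Observe: a noisy measurement of the truth arrives.
        observation = truth + rng.gauss(0.0, obs_var ** 0.5)
        # Update: pull every member toward the observation by the gain.
        ensemble, gain = enkf_update(ensemble, observation, obs_var, rng)
        # Repeat: the updated ensemble starts the next cycle.
        history.append((truth, statistics.mean(ensemble),
                        statistics.stdev(ensemble), gain))
    return history

for truth, mean, spread, gain in run_cycles()[::5]:
    print(f"truth={truth:6.2f}  estimate={mean:6.2f}  spread={spread:5.2f}  gain={gain:4.2f}")
```

In the first cycle the spread is wide, so the gain is high and the ensemble leans on the observation. As the spread tightens, the gain drops and the model's own estimate carries more weight.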
The Institutions Running This
Four centers run the world’s most sophisticated weather prediction systems. All of them use some form of ensemble-based data assimilation:
ECMWF (European Centre for Medium-Range Weather Forecasts, Reading, UK) runs the gold standard. Their Integrated Forecasting System (IFS) uses a 4D-Var data assimilation scheme coupled with a 50-member Ensemble of Data Assimilations (EDA). They process data from around 90 satellite instruments operationally. In February 2025, ECMWF took their AI model AIFS into operations — the first fully operational machine-learned weather prediction model at a major forecasting centre. By July 2025, the ensemble version (AIFS ENS) went live alongside the physics-based model — 51 AI-generated forecasts running in parallel with 51 physics-based forecasts, side by side.
NOAA/NCEP (USA) runs an 80-member EnKF for data assimilation plus a 31-member Global Ensemble Forecast System (GEFS), using a hybrid approach that combines ensemble covariances with a static background.
Met Office (UK) runs 45 ensemble members and was the first centre to operationalise En-4DEnVar — a hybrid method combining ensemble covariances with four-dimensional variational assimilation.
Environment and Climate Change Canada runs the largest operational EnKF on Earth: 256 members, subdivided into eight subensembles of 32 for cross-validation.
AI Enters the Chat
In December 2024, Google DeepMind published GenCast in Nature. GenCast is an AI weather model trained on 40 years of historical data (1979–2018). It generates 15-day global forecasts in eight minutes — compared to the hours needed by physics-based systems — and outperformed ECMWF’s ensemble forecast on 97.2% of targets, including tropical cyclone tracks and extreme weather events.
That number, 97.2%, is stunning. But notice what GenCast actually did: it learned to replicate the ensemble approach. GenCast doesn’t produce a single prediction. It generates probabilistic forecasts, multiple possible futures with calibrated uncertainty, just like the physics-based ensembles it was measured against. The architecture it learned to mimic is the same generate-observe-update loop that meteorologists have been running since 1994.
The AI didn’t replace the algorithm. It learned the algorithm was right.
The Universal Pattern
Here’s the part that matters beyond weather.
The generate-observe-update cycle isn’t a meteorological technique. It’s the fundamental algorithm for reasoning under uncertainty. And it shows up everywhere:
Petroleum engineering. Oil companies run ensemble Kalman filters on reservoir models, using production data from wells to update their estimates of underground geology. The ensemble spread tells them where to drill next.
Central banking. The Federal Reserve and the European Central Bank use dynamic factor models with Kalman filtering for GDP nowcasting — ingesting monthly economic releases as they arrive and updating their estimate of current-quarter GDP in real time.
Robotics. Self-driving cars rely on some form of recursive state estimation, typically Kalman filters or particle filters. They keep multiple hypotheses about where the car is, what other objects are doing, and what’s about to happen, and update them continuously with sensor data.
Finance. Regime-switching models for markets are structurally identical: generate multiple scenarios (recession, expansion, stagflation), assign probabilities, update those probabilities as new data arrives (employment, CPI, earnings).
The pattern is always the same: generate possible futures, observe what actually happens, update your beliefs, repeat. This isn’t a niche method. It’s the best known algorithm for making decisions when you don’t know what’s going to happen — which is to say, it’s the best known algorithm for making decisions, full stop.
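The finance example is the easiest one to sketch, because the "ensemble" is just a handful of named scenarios. The regimes, their payroll-growth distributions, and the observations below are all invented for illustration. A real regime-switching model would also include transition probabilities between regimes, which this sketch leaves out.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a normal distribution, used as the likelihood of one data release."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# Hypothetical regimes: (mean, std) of monthly payroll growth, in thousands of jobs.
REGIMES = {
    "recession": (-100.0, 100.0),
    "expansion": (200.0, 100.0),
    "stagflation": (50.0, 100.0),
}

def update_beliefs(prior, observation, regimes=REGIMES):
    """Bayes' rule: reweight each scenario by how well it explains the new data."""
    unnormalized = {
        name: prior[name] * normal_pdf(observation, *regimes[name])
        for name in prior
    }
    total = sum(unnormalized.values())
    return {name: weight / total for name, weight in unnormalized.items()}

beliefs = {name: 1.0 / len(REGIMES) for name in REGIMES}   # generate: start undecided
for release in (180.0, 220.0, 150.0):                      # observe: made-up data releases
    beliefs = update_beliefs(beliefs, release)             # update, then repeat
    print({name: round(p, 3) for name, p in beliefs.items()})
```

Each release shifts probability toward the scenarios that predicted it, which is the same trust-weighted update as the Kalman gain, applied to discrete hypotheses.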
Why AI Doesn’t Do This Yet
Current AI products are, structurally, single-point predictors. You send a prompt, you get a response. One shot. Stateless.
No ensemble — a single model produces a single output, with no systematic exploration of alternative possibilities. No observation step — the model doesn’t go back and check what happened. No update step — the model doesn’t revise its beliefs based on outcomes. No persistence — the model forgets between sessions.
This is where AI was in 2024 when it came to reasoning about the real world. It’s roughly where weather forecasting was in 1960, before Lorenz demonstrated that single-point predictions of complex systems are unreliable by construction.
Weather forecasting solved this problem thirty years ago. The solution is in production. It processes 800 million observations a day. It runs on the most powerful civilian supercomputers on Earth. It is the most successful forecasting system ever built by humans.
The algorithm exists. It’s been running operationally, 24/7, since the mid-1990s.
The question isn’t whether it works. The question is why AI hasn’t adopted it yet.
Sources & further reading
Foundational papers
Lorenz, E.N. (1963), “Deterministic Nonperiodic Flow,” Journal of the Atmospheric Sciences, Vol 20, pp 130–141 — the founding paper of chaos theory
Evensen, G. (1994), “Sequential Data Assimilation with a Nonlinear Quasi-Geostrophic Model Using Monte Carlo Methods to Forecast Error Statistics,” Journal of Geophysical Research, Vol 99(C5), pp 10143–10162 — the Ensemble Kalman Filter
Evensen, G. (2003), “The Ensemble Kalman Filter: Theoretical Formulation and Practical Implementation,” Ocean Dynamics, 53, 343–367 — the definitive reference (3,000+ citations)
AI weather forecasting
Price, I. et al. (2024), “Probabilistic Weather Forecasting with Machine Learning,” Nature — Google DeepMind’s GenCast
DeepMind Blog, “GenCast Predicts Weather and the Risks of Extreme Conditions” — the 97.2% figure
ECMWF, “ECMWF’s AI Forecasts Become Operational,” February 2025
ECMWF, “ECMWF’s Ensemble AI Forecasts Become Operational,” July 2025
Operational systems
ECMWF, “Fact Sheet: Earth System Data Assimilation” — 800 million observations/day
Houtekamer, P.L. and Zhang, F. (2016), “Review of the Ensemble Kalman Filter for Atmospheric Data Assimilation,” Monthly Weather Review — comprehensive overview including Environment Canada’s 256-member EnKF
Accessible reading
MIT Technology Review, “When the Butterfly Effect Took Flight” — the Lorenz story
Kalnay, E. (2003), Atmospheric Modeling, Data Assimilation and Predictability, Cambridge University Press — the textbook



