As an operational forecaster, should I concern myself with typologies of uncertainty?
Contributed by Florian Pappenberger, Jan Verkade and Fredrik Wetterhall
You don’t need to know about uncertainty typology, but you do need to know what is in your uncertainty estimate and, more importantly, what is not.
Lots of talk!
The term “uncertainty” has become part of the standard vocabulary of many hydrologists, and the analysis of uncertainty in hydrology is a thriving sub-discipline. The more a discipline advances, the greater the need to use or invent new jargon that allows for more precise definitions (new jargon is, of course, also a way to signal that progress has been made). Part of the new jargon is often a classification system, and there are now a large number of such classifications for uncertainty. The most widely used terms, currently riding a tide of popularity in hydrology, are aleatoric uncertainty and epistemic uncertainty. In this post, we will give the basics of these definitions, argue that you don’t need to remember any of this blog post, and postulate what you do need to know.
Let us give you some background. Aleatoric uncertainty refers to the unknowns that differ each time we run the same experiment. As such, it is akin to the random elements affecting the outcomes of the experiment. Think of the throw of a die, of the chaotic nature of the atmosphere and of the measurement error in water level gauges. Epistemic uncertainty relates to things we could know but do not. Think of the sparseness of rain gauges, of the constantly changing river geometry which we only record periodically and of the processes purposely omitted from our forecasting models because those models need to run quickly.
By the way, the word aleatory stems from the Latin alea (a die, or a game of dice) and epistemic from the Greek episteme (knowledge). If you cannot remember which uncertainty is meant by which term, then you are not alone (a free chocolate bar will be offered to the person who comes up with the best memory hook in the comments below!).
Both aleatoric and epistemic uncertainties are present in all components of an operational hydrologic forecasting chain. Money and effort could reduce the part of the error that stems from epistemic uncertainty, but most of this effort would have to be made ‘offline’, i.e., in system and model design. For all intents and purposes, in the real-time operation of a forecasting system, we treat all uncertainties, regardless of their nature, as aleatoric, i.e., as random.
To put it differently, we expect that future hydrologic conditions will be within the range of our uncertainty estimates. In contrast, epistemic uncertainty could potentially lead to big surprises because of events that are not included in our uncertainty estimate. Think of the river that is suddenly blocked because of a landslide.
Should I care?
So much for the theory. The question is, how much should one care about these different classifications of uncertainty in an operational forecasting chain?
The answer depends on whether you’re designing or operating a forecasting system.
If you’re designing a probabilistic forecasting system, it’s probably helpful to know which uncertainties can be reduced and which cannot. You’d probably also want to have some idea of how much uncertainty can be reduced by investing in additional knowledge, additional measurements and so on: should I invest in a new satellite, in additional rain gauges, in data assimilation, or in additional research into stem leaf hydrology?
If, on the other hand, you’re operating a forecasting system, you need to know what is in the uncertainty estimates and what is not. How are these estimates produced? From atmospheric ensembles only? Are hydrologic uncertainties included too? As in hydrological modelling, you need to know what is in your predictive distribution (model) and what is not. This guides the interpretation of the estimate of the total range of uncertainty and its communication to forecast users.
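To make the point about “what is in the estimate” a little more tangible, here is a minimal sketch in Python. Everything in it is made up for illustration: the 50 ensemble members, the water levels, the assumed hydrologic error of 0.30 m and the helper `quantile_range` are hypothetical and do not come from any real forecasting system. It simply compares the width of a 90% range built from atmospheric ensemble spread alone with the width obtained once a hydrologic error term is added.

```python
import random

def quantile_range(values, lower=0.05, upper=0.95):
    """Return the (lower, upper) empirical quantiles of a list of values."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[int(lower * (n - 1))], ordered[int(upper * (n - 1))]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical water level forecasts (m), one per atmospheric ensemble member.
# The spread here reflects meteorological uncertainty only.
met_only = [rng.gauss(3.0, 0.20) for _ in range(50)]

# Assumed hydrologic model error (standard deviation in m), purely illustrative.
HYDRO_ERROR_SD = 0.30

# "Dress" each member with 20 random draws of the hydrologic error, so the
# resulting sample reflects both sources of uncertainty.
total = [
    level + rng.gauss(0.0, HYDRO_ERROR_SD)
    for level in met_only
    for _ in range(20)
]

met_lo, met_hi = quantile_range(met_only)
tot_lo, tot_hi = quantile_range(total)
met_width = met_hi - met_lo
total_width = tot_hi - tot_lo

print(f"90% range, atmospheric ensemble only: {met_width:.2f} m wide")
print(f"90% range, with hydrologic error:     {total_width:.2f} m wide")
```

If the estimate you are handed corresponds to the first number, a forecast user who reads it as the second will be overconfident. Neither number says anything about events that are in neither error model.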
Does the typology of uncertainties matter in decision making?
It does and it doesn’t. Often, the decision maker will simply be interested in the estimate of predictive uncertainty that is based on our current state of knowledge. For example, the blue light emergency services will want to know the range of future water levels. They are less interested in what would happen if a bridge collapsed and blocked the river altogether. Having said that, it does no harm to communicate the fact that some processes that could affect future conditions are not actually included in the estimate they’re given.
Does the knowledge about the typology of uncertainty really change current practice? Is it not just the same meat with different gravy?