On the economic value of hydrological ensemble forecasts
Contributed by Marie-Amélie Boucher, Maria-Helena Ramos and Ioanna Zalachori
It is often assumed that probabilistic forecasts should lead to better water and risk management through increased benefits (economic or not) to users in their decision-making processes.
Most often, this assumption arises from studies based on evaluations of forecast quality, which compare the performance of deterministic and probabilistic (e.g., ensemble) forecasts using metrics such as the CRPS and the MAE to support their conclusions.
But, really, does quality (a ‘good’ average statistic like the CRPS) automatically translate into value? From the water manager’s point of view, the question of value is arguably the most important one: after all, if no gain ($$$) is to be anticipated, why should one exchange a deterministic forecasting system for a probabilistic one?
A famous quote from Murphy’s 1993 paper, ‘What is a good forecast? An essay on the nature of goodness in weather forecasting’, reads: ‘… forecasts possess no intrinsic value. They acquire value through their ability to influence the decisions made by users of the forecasts’. The question, it seems, is then to measure ‘this ability’, which basically means that we first have to be able to ‘evaluate decisions’.
We once asked a decision-maker, responsible for deciding whether or not to open a control gate of a dam (and by how many metres, if it was to be opened), how he knew afterwards that his decision was the ‘best decision’. The answer we got was (not verbatim): ‘It is the best decision: we take the best decision given the forecasts we receive and other complementary information we have on the situation. If the result is not good, the problem is not in the decision, but in the forecasts, which were not good’.
Concretely, we understood that evaluating ‘good’ or ‘bad’ decisions is not as straightforward as it appears, even for decision-makers (although the consequences of ‘bad decisions’ may well come back somehow to decision-makers or forecasters if human or heavy economic losses are observed).
Still, investigating the relationship between forecast quality and forecast value can be useful, at least to shed light on the expected benefits of research and operational studies seeking to improve the quality of hydrometeorological forecasts: if improvements in quality translate into improvements in value or, generally speaking, in forecast utility, one can justify continuing to seek improved forecast quality.
Now, we come back to the questions: how to measure forecast value in hydrology? How to link it to forecast quality?
There have been some interesting experiments regarding the value of hydrological forecasts. Typically, they use competing forecasting systems successively as inputs to a decision-making model and evaluate the respective gains.
The most popular decision-making model involved in forecast value comparisons is the one based on the ‘cost-loss ratio’ (C/L): the user pays a cost C to protect against an adverse event, or incurs a loss L if the event occurs unprotected. This ratio is strongly user-specific and depends on the user’s risk tolerance. In hydrologic forecasting, for instance, Roulin (2007) used a cost-loss decision-making model to evaluate the value of 1- to 10-day streamflow forecasts, using various C/L ratios and probability thresholds for decision-making. The study demonstrated, for two Belgian catchments, that using ensemble rather than deterministic forecasts translates into monetary gains. McCollor and Stull (2008), Van den Bergh and Roulin (2010), Muluye (2011) and Verkade and Werner (2011) also used a cost-loss ratio to compare the value of ensemble and deterministic forecasts.
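To make the mechanics concrete, here is a minimal sketch of the classic cost-loss decision rule in Python (all numbers are invented for illustration; none of the cited studies is reproduced here): the user protects, at cost C, whenever the forecast probability of the adverse event exceeds C/L, and otherwise risks the loss L.

```python
import numpy as np

def cost_loss_expense(p_forecast, event_occurred, cost, loss):
    """Realised expense under the classic cost-loss rule: protect
    (and pay C) whenever the forecast probability of the adverse
    event exceeds C/L; otherwise pay L if the event occurs."""
    protect = p_forecast > cost / loss
    return np.where(protect, cost, np.where(event_occurred, loss, 0.0))

# Toy forecast/observation pairs (all numbers invented):
p_ens = np.array([0.05, 0.15, 0.40, 0.90])    # event probabilities
obs = np.array([False, True, False, True])    # did the event occur?
print(cost_loss_expense(p_ens, obs, cost=2.0, loss=10.0).mean())
```

Comparing this mean expense across forecasting systems, over many cases and many C/L ratios, is essentially how the studies above quantify forecast value.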
A number of studies also investigated the potential added value of probabilistic forecasts using numerical optimization techniques for reservoir management decisions. Kim et al. (2007) used stochastic dynamic programming to compare the value of ensemble and deterministic forecasts for a Korean catchment (the ensemble won the comparison). Weijs (2011), using a similar type of approach, showed that the gain is greater for short lead times and high reservoir levels. Boucher et al. (2011), in a case study on the Gatineau catchment in Quebec, found no significant added value in raw streamflow ensemble forecasts (compared to deterministic forecasts), although the situation was reversed with the use of statistical post-processing of ensemble forecasts (Boucher et al., 2012).
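For readers unfamiliar with the technique, here is a deliberately simplified, single-reservoir sketch of backward stochastic dynamic programming (the discretised storage grid, inflow distribution and constant energy price are all assumptions for illustration; the cited studies use far richer formulations):

```python
import numpy as np

# Minimal single-reservoir SDP sketch (illustration only):
# maximise expected revenue over a short decision horizon.
STORAGES = np.arange(0, 11)                 # discrete storage states
RELEASES = np.arange(0, 5)                  # feasible releases per stage
INFLOWS = np.array([0, 1, 2, 3])            # possible stage inflows
P_INFLOW = np.array([0.2, 0.4, 0.3, 0.1])   # inflow probabilities
PRICE = 1.0                                 # revenue per unit released
HORIZON = 5                                 # number of decision stages
S_MAX = STORAGES.max()

# value[s] = expected future revenue when current storage is s
value = np.zeros(len(STORAGES))
for stage in reversed(range(HORIZON)):
    new_value = np.full(len(STORAGES), -np.inf)
    policy = np.zeros(len(STORAGES), dtype=int)
    for s in STORAGES:
        for r in RELEASES:
            if r > s:            # cannot release more than stored
                continue
            # immediate revenue + expected future value,
            # averaging over the stochastic inflow
            nxt = np.clip(s - r + INFLOWS, 0, S_MAX)
            q = PRICE * r + P_INFLOW @ value[nxt]
            if q > new_value[s]:
                new_value[s], policy[s] = q, r
    value = new_value

print("Optimal first-stage release per storage state:", policy)
```

Even this toy version hints at the curse of dimensionality discussed below: the state space grows multiplicatively with each additional reservoir or inflow scenario.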
These studies and approaches share a common goal: to assess the value of hydrological forecasts obtained from various meteorological forecasts (multi-model or not). The end-users’ interests, however, are quite diverse: hydropower, urban water supply and flood prevention.
In general, what is particularly interesting in hydrologic forecast value studies is that, while ensemble forecasts are often considered useful for decision-making, they are not always the ones associated with the highest forecast value. After numerous demonstrations that the quality of ensemble forecasts is superior to that of deterministic forecasts, it can be shocking (and disappointing!) to realize that quality improvements do not always translate so straightforwardly into higher monetary value.
Certainly, conclusions on the economic value of hydrological ensemble forecasts depend strongly on several factors (e.g., the user’s decision model and objectives, the forecast lead time of interest), including the methodological approach used to evaluate value itself. So, even though ensemble forecasts prove to be better forecasts, sometimes the most important factor for them to be of higher value is the decision-making model itself! Its limitations can largely affect the assessment of the added value of one forecasting system over another.
Maybe to convince potential users of ensemble forecasts that changing their practices is worth the trouble, we must demonstrate that doing so can make them richer! … Or maybe ‘less poor’, if viewed from a flood protection point of view.
January 31, 2014 at 11:08
Very nice and interesting article. I fully agree that the link between forecast value and skill is extremely difficult to quantify and often cannot be quantified at all. Small skill improvements are often irrelevant in terms of ‘improved’ decisions, or are ‘overridden’ by other larger-scale events (e.g. a massive drought). However, this non-linearity can also work the other way round, meaning that small improvements can have large impacts. I also do not believe that scientific evidence is the main driving factor in the implementation or operation of probabilistic forecasting systems, as there are too many other variables (have a look at the science implementation plan chapter on communication and decision-making). You nicely highlight case studies in which the economic benefit of probabilistic forecasts has been assessed (maybe not always with the outcome I hoped for). Another, more qualitative, source of evidence of the value of probabilistic forecasts is that when you come to user meetings (I can only speak of my impression of ECMWF user meetings), the number of presentations from commercial providers showing some analysis of probabilistic forecasts keeps increasing.
I don’t know how much revenue is generated by the ECMWF ensemble prediction system alone. The ensemble is certainly a central product, as most ECMWF users request it (which also explains why the TIGGE (http://tigge.ecmwf.int/) portal has so many users). As said, none of this is ‘hard’ evidence – maybe a commercial provider of hydrological ensemble forecasts has the same impression?
February 2, 2014 at 13:47
Thanks for the comment. The value or utility of a forecast system can indeed be more straightforward to evaluate in qualitative terms. It may be that users are keen to implement probabilistic forecasts because they are state-of-the-art systems (‘we want the best for our service/company!’), but if we don’t explicitly show how a probabilistic system can also be ‘quantitatively’ interesting (whatever these quantities are: number of lives saved, extra hours or days of anticipation, better performance in hydroelectricity production, increased trust in forecasts that come with their associated uncertainties), for how long will users be willing to use these systems and, maybe more importantly, to actively contribute to developing advanced probabilistic systems? Complementary to your question: how can we also promote the dissemination of all this less ‘hard’ evidence of usefulness? I guess this is a challenge for the whole hydrologic forecasting community!
February 2, 2014 at 17:56
Thank you! Regarding forecast value, I would like to add that it is sometimes evaluated in a deterministic framework even though the information is probabilistic. The multiple-reservoir management problem can theoretically be solved using stochastic dynamic programming, but in order to avoid the curse of dimensionality, the problem has to be greatly simplified, which reduces the amount of information retained from the forecasts. I feel that the limits of most current operational ‘decision-making assistance tools’ can sometimes be an obstacle to the assessment of forecast value. It is also difficult to quantify the user’s level of risk aversion, which is an important factor in the decision-making process.
February 2, 2014 at 13:52
1. There is no, and cannot be any, straight relationship between statistical forecast quality and usefulness, partly because utility is subjective; what is useful for me may be useless for you. See the triangular cost/loss image in my July webinar http://www.youtube.com/channel/UCuu6EodABZyujnbeSHpCHYA (and the sketch after point 3 below). Those with very low or very high protection costs do not benefit even from obviously skillful forecasts. Also note, as I exemplify with the BBC forecast from December 2011, that in situations where the uncertainty is very high and the forecasts will most likely score very badly, the very knowledge of this sorry state of affairs has a positive value for the decision-maker.
2. By coincidence I am now, once again, reading Stuart Sutherland’s 1992 book “Irrationality: The Enemy Within”. At the beginning he defines “rationality” in almost exactly the same way as your man at the gate of the dam: to be “rational” is not to be “wise” or to make the “best” decision, but to make the best decision in light of one’s preferences and available information.
3. During my 20+ years with the ensemble system I have often heard people say that just “because” the ECMWF and other institutes have developed the advanced and expensive ensemble system we “have to” issue probability forecasts. It sounds as if probability forecasting stands and falls with the ensemble system. Probability forecasting has been around for more than 100 years and the ensembles are just a recent welcome support, nothing more.
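As a complement to point 1 above, the triangular shape of forecast value as a function of the cost/loss ratio can be reproduced with the standard relative economic value score. Below is a sketch under textbook assumptions (the hit rate, false alarm rate and base rate are invented numbers; this is not taken from the webinar):

```python
import numpy as np

def relative_value(alpha, hit_rate, false_alarm_rate, base_rate):
    """Relative economic value for a user with cost/loss ratio alpha,
    following the standard (Richardson-type) formulation:
    1 = perfect forecasts, 0 = climatology alone."""
    # Expected expense (per unit loss) when acting on the forecasts:
    # false alarms and hits cost alpha, misses cost the full loss.
    e_forecast = (false_alarm_rate * (1 - base_rate) * alpha
                  + hit_rate * base_rate * alpha
                  + (1 - hit_rate) * base_rate)
    e_climate = np.minimum(alpha, base_rate)  # best of always/never protect
    e_perfect = base_rate * alpha             # protect only when needed
    return (e_climate - e_forecast) / (e_climate - e_perfect)

alphas = np.linspace(0.05, 0.95, 10)
print(relative_value(alphas, hit_rate=0.8, false_alarm_rate=0.1,
                     base_rate=0.2))
# Value peaks near alpha = base_rate and drops to zero (or below) for
# users with very small or very large cost/loss ratios.
```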
February 2, 2014 at 14:21
Nice insights, Anders, thank you. I am not an operational forecaster, but I have the feeling that in hydrologic forecasting, probabilistic forecasts are still used below their capacity. Maybe hydrologists (and I include myself here) are too used to looking for the best model, the best answer to their problems. I would like to see more decision-makers explicitly say that they want probabilistic (or ensemble) streamflow forecasts, or that they have experienced (or had experienced on some specific occasions, if that is the case) their positive value, even if only qualitatively.
February 2, 2014 at 18:00
A lot has also been said about the need to inform users about how ensemble forecasts are produced and how they should be interpreted. I feel that this is still a very current topic. I also agree that ‘goodness’ can be defined differently for each user. Improved communication between the forecaster and the user could help to improve forecast value for this particular user. Maybe this user’s preoccupations could influence the way forecasts are produced, for instance which specific sources of uncertainty are taken into account or investigated further. Or maybe just the way forecasts are presented to the user. I agree that it would be highly interesting to have more forecast users share their positive or negative experiences with ensemble forecasts.
February 3, 2014 at 09:27
I come back to my general theme: the ensemble technique is subordinate to the general problem of producing, conveying, understanding and making use of uncertainty information, irrespective of its source(s). The problem is therefore, in my opinion, not so much to learn how different ensemble systems work, but how to combine their output with other types of information to provide useful uncertainty estimations. Again, let me compare with car driving: nobody would spend too much time learning what distinguishes the construction of a Renault, Volvo or Volkswagen, compared to what distinguishes driving in darkness, on slippery roads or in heavy rain.
February 3, 2014 at 11:06
To me this is a strong argument for including real-time control in the assessment of the value of improved predictions. In this field we are used to translating costs into an objective function that drives the decision-making process. This process uses forecasts in a structural manner, so ensemble forecasts, containing more information, will lead to a lower cost when evaluating the objective (cost) function in hindsight (given that our models are appropriate).
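A minimal sketch of that idea (all quantities invented; operational systems use full model predictive control): choose the release that minimises the expected objective over the ensemble of inflow scenarios, then evaluate the realised cost in hindsight against what actually happened.

```python
import numpy as np

def objective(release, inflow, demand=3.0, spill_level=8.0, storage=5.0):
    """Toy cost: quadratic penalty on unmet demand plus a penalty
    on spilling above a threshold storage level."""
    shortage = max(demand - release, 0.0)
    spill = max(storage + inflow - release - spill_level, 0.0)
    return shortage**2 + 10.0 * spill

ensemble = np.array([1.0, 2.0, 4.0, 6.0])  # inflow scenarios (invented)
candidates = np.linspace(0.0, 6.0, 13)     # feasible releases

# Pick the release minimising the expected cost across scenarios.
expected = [np.mean([objective(r, q) for q in ensemble]) for r in candidates]
best = candidates[int(np.argmin(expected))]

observed_inflow = 5.0                      # what actually happened
print(f"chosen release: {best}, "
      f"hindsight cost: {objective(best, observed_inflow):.2f}")
```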
February 3, 2014 at 12:20
I have read the above article and posts and thought you might be interested in the findings of a project we led for the Environment Agency, in which we developed a cost/loss-based decision-support framework that gives flood forecasting and warning staff clear guidance on how to decide on an appropriate action (be it closing barriers, raising demountable defences, mobilising staff, or issuing alerts and warnings) when provided with a probabilistic flood forecast. As well as the quite ‘hard’ cost/loss part of the framework, there is also an allowance for other ‘softer’ factors that can result in a cost/loss-based decision being reversed in certain circumstances. The framework considers fluvial situations as well as coastal and surface water flood situations. The reports for this study are available from the link below and, if you need to know more, do let me know.
http://evidence.environment-agency.gov.uk/FCERM/en/Default/HomeAndLeisure/Floods/WhatWereDoing/IntoTheFuture/ScienceProgramme/ResearchAndDevelopment/FCRM/Project.aspx?ProjectID=C0899B01-6FCD-4775-9AEB-A2DCC8EC4D39&PageId=a0fe6dfc-506a-452c-9bff-a7ec06b4e6b0
February 3, 2014 at 13:17
If I can step into this debate as a social scientist who has worked on how forecasters, civil protection and policymakers understand and use ensemble forecasts, I think that the question of the economic value of ensemble forecasts for the private sector confronts the same challenges experienced by other non-expert users. In other words, if there is an interest in having ensemble forecasts used more broadly, by a wider community of users than civil protection, then the socio-political dimensions of using ensembles need to be taken more seriously. For example, tensions related to trust in ensembles are not only cognitive, but most of the time political. In the context of large and aggressive state corporations such as Hydro-Quebec, there are institutional dimensions to the use of information, as well as political mandates framing users’ actions on questions of responsibility (in terms of crying wolf or missing events) that need to be clarified, and I am afraid that technical improvement has little to do with that.
There is also the bigger challenge of moving from a culture of risk, in which deterministic answers are required, to a culture of preparedness, for which a cost-benefit logic is perhaps less well suited.
All in all, technical developments are certainly key in making ensemble forecasts more valuable to users, but without a grip on the political and institutional dimensions framing how this knowledge is used, technical improvements might not deliver their full promise. Anyway, a lot has been said on this topic already, but I still think that technical development alone can’t overcome most of these challenges. There is a need for more interdisciplinary studies, and there is much that the social and physical sciences can bring to each other, especially on questions regarding what to do with uncertainties.
February 3, 2014 at 21:44
The cost/loss model has been mentioned above. It is a powerful model, but just a first approximation to how people make rational decisions. If any of you had to choose between getting £700 straight in your hand and an 80% chance of winning £1000, you would rather grab the money and NOT follow the cost/loss model, although the gamble’s expected gain (0.8 × £1000 = £800) is £100 more.
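A toy illustration of why grabbing the £700 can still be rational: with a concave (risk-averse) utility function, here an arbitrary square root chosen only for illustration, the certain amount yields the higher expected utility even though the gamble has the higher expected monetary value.

```python
import math

def utility(x):
    # Concave utility: an arbitrary stand-in for risk aversion.
    return math.sqrt(x)

sure = utility(700)                               # ~26.46
gamble = 0.8 * utility(1000) + 0.2 * utility(0)   # ~25.30

print(f"expected money:   sure = 700, gamble = {0.8 * 1000:.0f}")
print(f"expected utility: sure = {sure:.2f}, gamble = {gamble:.2f}")
```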