Multi-model approaches for river flow forecasting: blessing or burden?

Contributed by François Anctil, Maria-Helena Ramos and Florian Pappenberger

In ensemble prediction, using several models to estimate the total predictive distribution is one approach to providing operational users with reliable and skillful forecasts. Here, by multi-model, we mean broadly:


  • using multiple hydrological models (different structures, parameter sets, scales or boundary conditions, etc.),
  • using multiple meteorological ensemble prediction systems (from different meteorological models or meteorological centers),
  • using multiple methods of data assimilation, pre- and post-processing (where appropriate),
  • using more than one of the options above in a combined way.

Examples of applications of multi-model approaches in hydrologic simulation or forecasting are numerous in the literature. A recent study conducted with European flood forecasters showed that they ranked multi-model approaches among their preferred investments in research and development, and that practitioners regarded them as a high-priority action.

Studies conducted within a French-Canadian collaboration in ensemble prediction indicated that a combination of several hydrological model structures and meteorological ensemble predictions has higher skill and reliability than ensemble predictions given either by a single hydrological model fed by the weather ensemble predictions or by several hydrological models and a deterministic meteorological forecast. Additionally, it was shown that the selection of hydrological members is achievable without sacrificing the quality of a forecast, with diversity as a critical factor for selection.
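
To make the idea of combining several hydrological models with a meteorological ensemble concrete, here is a minimal sketch. It is not the method used in the studies mentioned above: the three toy "models", the rainfall values and the flood threshold are all hypothetical, and the pooled members are simply treated as equiprobable.

```python
from itertools import product

def pooled_ensemble(met_members, hydro_models):
    """Run every hydrological model on every meteorological member."""
    return [model(rain) for model, rain in product(hydro_models, met_members)]

def exceedance_probability(flows, threshold):
    """Fraction of pooled members above the threshold (members assumed equiprobable)."""
    return sum(f > threshold for f in flows) / len(flows)

# Hypothetical toy models: each maps a rainfall total (mm) to a peak flow (m3/s).
hydro_models = [
    lambda rain: 2.0 * rain,
    lambda rain: 1.5 * rain + 10.0,
    lambda rain: 0.05 * rain ** 2,
]

# Hypothetical meteorological ensemble: one rainfall total per member.
met_members = [10.0, 20.0, 30.0, 40.0]

flows = pooled_ensemble(met_members, hydro_models)       # 3 models x 4 members
p_flood = exceedance_probability(flows, threshold=50.0)  # hypothetical threshold
print(f"{len(flows)} pooled members, P(flow > 50) = {p_flood:.3f}")
```

Passing a single-element list for either argument reproduces the two simpler configurations discussed above: one hydrological model fed by the weather ensemble, or several hydrological models fed by one deterministic forecast.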

Using several NWP ensemble prediction systems as input to a hydrological model is the basis of the THEPS Experiment, using the THORPEX/TIGGE database, proposed within HEPEX (see the post on HEPEX testbeds and a specific post on the use of the TIGGE forecasts in hydrology coming soon…).

Contexts other than flood forecasting may also benefit from multi-model ensembles. A multi-model ensemble of eight large-scale hydrological models has proved useful for simulating runoff trends in Europe, with the variability among the trends simulated by the different models being considered a “strong reminder of the uncertainty of projected future changes in runoff if limited to only one such model”.

It has also been argued that, when flow simulation is used to evaluate climate change impacts, the added value likewise lies in the diversity brought by multi-model approaches. The aggregation of twenty lumped models was found to be more robust under contrasted climate conditions (temporal transposability) than any of its individual elements, which is promising for climate projection applications.

What are the advantages and limitations of multi-model approaches?

Here we raise 5 points to help in addressing the question:

  1. Operational forecasters may be curious to know if the model(s) they are using is(are) really “the best” they can have for their catchments at all times: what then would be simulated by another model, with a different structure or a different representation of the physical processes? Would the peak flow be better captured by the forecasts? Would it improve the forecasts?
  2. It is a fact that a single deterministic model cannot represent well all (atmospheric or hydrological) flow situations that we may encounter in the future: there may be a meteorological model that better captures convective rain or certain synoptic patterns of interest in a given region, or a hydrological model that simulates high flows or rising limbs of hydrographs well, although it is far from having a good score when it comes to low flows or recessions.
  3. Building a predictive distribution from scenario-based simulations usually supposes that all scenarios are equiprobable and come from the same population, i.e., can be described by the same probability distribution (or at least they are going to be considered so when estimating forecast probabilities and evaluating forecast performance). How does that affect multi-model approaches? How can we know if the models used are in fact “brothers and sisters” from the same population? If we consider some weighting based on the past performance of different models, how does one transform this weighted combination into the future predictive uncertainty?
  4. Multi-model approaches can easily grow towards infinity: 50 meteorological ensemble members combined with 20 hydrological models and 10 real-time data assimilation techniques will give us 10,000 simulations (and this for each lead time!). Computers are fast these days, but in such situations will they be fast enough to allow forecasters to deliver their forecast products before the event happens? Can system designers, developers and forecasters scientifically support a large number of different models, or do we need tools for member selection?
  5. How different are different approaches? For example, the TIGGE archive contains multiple NWP models, some of them of different design – however, the developers have often learnt from the same text books, models are sold and exchanged and, in this era of globalisation, successful approaches are quickly copied. The same is true for many hydrological models. This opens up the questions: What exactly is model diversity? How does it work at each time step and along a succession of flow events?
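
Points 3 and 4 can be illustrated with a short sketch. It is only one possible reading, under stated assumptions: inverse-MSE weighting is an illustrative choice rather than a recommended answer to the question raised in point 3, and the model names, past errors and flows are hypothetical.

```python
from math import prod

# Point 4: size of the full multi-model cascade, for each lead time.
n_met, n_hydro, n_da = 50, 20, 10
n_runs = prod([n_met, n_hydro, n_da])

# Point 3: turn past performance into weights (inverse-MSE is an assumption).
def inverse_mse_weights(past_mse):
    """Normalised weights proportional to 1 / past mean squared error."""
    inverse = {name: 1.0 / mse for name, mse in past_mse.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}

def weighted_exceedance(members_by_model, weights, threshold):
    """Mixture probability: each model's members share that model's weight equally."""
    probability = 0.0
    for name, flows in members_by_model.items():
        fraction_above = sum(f > threshold for f in flows) / len(flows)
        probability += weights[name] * fraction_above
    return probability

# Hypothetical past errors and forecast members (m3/s) for two models.
past_mse = {"model_a": 4.0, "model_b": 12.0}
members_by_model = {
    "model_a": [40.0, 60.0, 70.0, 45.0],
    "model_b": [55.0, 65.0],
}

weights = inverse_mse_weights(past_mse)
p_weighted = weighted_exceedance(members_by_model, weights, threshold=50.0)
print(n_runs, weights, round(p_weighted, 3))
```

With equal weights and equal member counts this reduces to the equiprobable case. The sketch does not address whether weights learned from past performance remain valid for future events, which is the open question in point 3.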

Finally, are multi-model approaches a blessing or a burden?

Do you have additional examples of multi-model prediction systems to share with us?


  1. Dear all,
    thank you very much for this post!
    One could extend this to the general mood in science, where an ever-growing burden is needed to get a paper accepted. It is getting impossible to be “state-of-the-art”!
    It began with the need to communicate results on calibration and verification each time. Then it became essential to communicate predictive uncertainty and parameter equifinality.
    Later on, moving to forecasts, some started presenting multi-model deterministic lagged ensembles, and this is where the infinite began to come closer and closer (your post is a good summary of this).
    Not to mention pre- and post-processing and verification metrics!
    My feeling today? I feel more and more obliged to communicate the Kling-Gupta efficiency as well, and I am happy every time that no reviewer asks me why I still don’t use multiple assimilation algorithms or at least the ensemble Kalman filter.
    That’s the fate of incremental science achievements, which in our case began in 1850 with the Mulvaney rational method.

  2. Really interesting article that points out a number of key issues for operational hydrologic forecasting. These issues are very real and we need to address them. In addition, we’re faced with the question of just how many hydrologic models an operational entity can afford to support (calibration, state maintenance, etc.).

  3. I was going to make Rob’s point. Technology is changing to make it easier to support multi-models, but traditionally operational forecasters committed to a model of choice and built up complementary goods around that (e.g. software, calibrations, training materials). The question is if there is more value in a forecaster “knowing” one model very well, or having a diversity of models running around. If the forecaster was hands-off with the model, of course they would want more models, but if all the care-and-feeding of the models fell back on the forecaster, then workload would be a limiting factor.

    So there are two perspectives to this …
    Modeler: “With 10 models, I get a 15% better answer. Need more models.” and
    Administrator: “With only 1 model, I can make it to 85% of the best answer (or 1 model + statistical post-processing gets to 95% of the best answer). I am being prudent and frugal with taxpayer resources”.

  4. As has been helpful in the past, analogies can be drawn for the multi-model discussion to the current meteorological forecast setting in which forecasters consider dozens of models in deriving local met. forecasts. With surprising rapidity, objective blending techniques (in the NWS, the CONSALL family, for instance) have gained some acceptance following verification studies that found near equivalent performance to human forecasters, who blend subjectively based on expert knowledge, in many situations. The traditional subjective approach is increasingly difficult to implement as the number and diversity of models expand, thus the science and methods for objective combinations are a new and pressing challenge.

    Hydrologists do not currently have such a range of options (except with seasonal forecasting, which can be done cheaply/statistically). Yet with the advent of supercomputing & super-connectivity, it’s technically feasible for hydrologists to have multiple forecasts & models available. For example, NCEP currently runs a number of land surface models via NLDAS in a monitoring mode, updating a few days from real-time. It would not be a huge stretch to run these in forecast mode.

    Key differences from met. forecasting exist, though: notably, NWP models are not ‘calibrated’ in the same way as hydrology models, and each center runs them for regional to global domains. Local met. forecasters don’t receive training in every individual model, are not responsible for maintaining or running them, and their NWP combinations will ideally be performance based. If multi-model forecasting in hydrology does emerge, it will likely be implemented not according to the traditional operational one-model paradigm, in which the forecasters are deeply integrated in the forecast-model-data loop (i.e., as in the NWS), but with more centralized resources and with local-to-regional forecasters performing interpretive roles more analogous to met. forecasters. The EFAS effort is an example of what one such centralized forecast source might look like. The resource question alluded to in earlier comments may not be an office-level or even sub-agency-level budgetary calculation, since the provision of multiple numerical hydrologic predictions (NHP) could develop in a multi-agency and even international context.

    The multi-model pathway certainly has potential, but many issues are unresolved. We don’t yet know the extent to which this paradigm would provide high quality operational forecasts (some of which are regulated) at local scales, satisfying local user needs. Institutional challenges are also daunting, e.g., defining the new role of the human forecaster (many of whom are unionized public employees) and the transition pathway from the current paradigm, a controversial question to say the least. It may bear noting too that we still struggle to use single models well (i.e., with verification, past/future forcing consistency, uncertainty accounting, and many other ‘ideal’ elements of a forecasting system), despite a healthy literature showing the advantages of such efforts. I am hopeful, at least, that we will develop and implement a framework in which to evaluate such opportunities objectively, and in contrast to our existing approaches, to support decisions on whether or how best to move forward.
