Pre-, post-processing or both?

by Marie-Amélie Boucher, a HEPEX 2015 Guest Columnist

Do you think it is better to pre-process the meteorological forecasts, to post-process the hydrological forecasts or to do both? Why?

Following this blog about future directions for post-processing research, this challenge was mentioned in a comment by James Brown:

« Putting aside the choice of technique, I think there are some more fundamental questions about how to use hydrologic post-processing operationally. For example, under what circumstances does it make sense to separate between the meteorological and hydrologic uncertainties and model them separately (pre- and post- processing) versus lump them together? »

Vintage processing machine (from www.torange.us, labeled with non-commercial reuse)

Existing comparative studies

I have only ever seen three papers aimed specifically at a « pre vs post » comparison. There is a case study on two Korean catchments by Kang et al. (2010). The authors demonstrate that, for the catchments under study, post-processing is much more effective than pre-processing. In fact, their results show that the influence of pre-processing alone is very small compared to post-processing alone. The conclusions of Roulin and Vannitsem (2015) are similar. In another comparative study for 10 French catchments, Zalachori et al. (2012) conclude that: « Statistical corrections made to precipitation forecasts can lose their effect when propagated through the hydrological model ». They also show that post-processing streamflow forecasts is quite effective in terms of improving the Ranked Probability Score and PIT histograms.
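For readers less familiar with the PIT histogram mentioned above, here is a minimal sketch of how it can be computed for ensemble forecasts. It assumes the ensemble's empirical distribution stands in for the forecast distribution, which is a simplification; the function names and the streamflow numbers are hypothetical and are not taken from any of the cited studies.

```python
from bisect import bisect_right


def pit_value(members, observation):
    """Empirical forecast CDF evaluated at the observation:
    the fraction of ensemble members less than or equal to it."""
    ordered = sorted(members)
    return bisect_right(ordered, observation) / len(ordered)


def pit_histogram(pit_values, n_bins=5):
    """Count PIT values falling in n_bins equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for p in pit_values:
        # A PIT value of exactly 1.0 goes into the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical streamflow ensembles (m3/s) and the matching observations.
forecasts = [
    [10.0, 12.0, 14.0, 16.0],
    [20.0, 22.0, 24.0, 26.0],
    [5.0, 6.0, 7.0, 8.0],
]
observations = [13.0, 27.0, 4.0]

pits = [pit_value(m, o) for m, o in zip(forecasts, observations)]
print(pits)                 # [0.5, 1.0, 0.0]
print(pit_histogram(pits))  # [1, 0, 1, 0, 1]
```

With many forecast-observation pairs, a roughly flat histogram suggests reliable forecasts, while a U shape (observations often falling outside the ensemble, as in the second and third cases here) suggests too little spread.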

Verkade et al. (2013) did not perform a comparative study but rather investigated the influence of pre-processing the meteorological forecasts for streamflow forecasting. Their conclusions point in the same direction: « the improvements in precipitation and temperature do not translate proportionally into the streamflow forecasts ».

Of course, there are additional uncertainties arising from the hydrological model, and pre-processing cannot account for those. Still, I find those results a bit surprising. I have often heard « garbage in, garbage out ». But now it seems that whether you pre-process meteorological forecasts before feeding them into the hydrological model is of very little consequence (in terms of final streamflow forecasts only, I mean).

Why?

Beyond this notorious uncertainty attributable to the hydrological model, it seems to me that the hydrological model might be robust, in the sense that it is indifferent to small variations in meteorological input data. I find it similar to the problem of differentiating forecasts according to their economic value: sometimes, although system B produces forecasts of better quality (in terms of agreement with the observations) than system A, the difference is not significant enough to change a person’s (or organization’s) course of action. So the quality improves but not the value (see Murphy 1993).

Also, none of Kang et al. (2010), Verkade et al. (2013) or Zalachori et al. (2012) combined pre- and post-processing with data assimilation. I suspect it could make a difference. As mentioned in a previous post, there are probably interactions between data assimilation and post-processing. The recent study by Roulin and Vannitsem (2015) also supports this idea. I am under the impression that if the hydrological state of the catchment and associated uncertainty could be estimated properly, then maybe those small improvements in meteorological forcings could be better translated further down the chain.

From www.flickr.com, labeled with non-commercial reuse.

In my opinion, HEPEX needs more case studies comparing pre- and post- processing, involving:

  • More catchments in different hydroclimatic regimes,
  • Various pre- and post-processing methods,
  • Different atmospheric/hydrologic models,
  • Pairing with data assimilation.

What is the answer?

I don’t know the answer to the question « pre-, post- or both? ». Do you?

I see benefits and drawbacks in both pre- and post-processing. Using only post-processing is simpler and certainly relevant in terms of research. However, hydrologists also need reliable precipitation and temperature forecasts as such, not only the final streamflow forecast. So pre-processing is also important, although more complicated and apparently sometimes inefficient in terms of improving streamflow forecasts.

Do we need both? Is there a “correct” answer to this question?

References

Kang T.-H., Kim Y.-O. and Hong I.-P. (2010) Comparison of pre- and post-processors for ensemble streamflow prediction, Atmospheric Science Letters, 11, 153-159.

Murphy A.H. (1993) What is a good forecast? An Essay on the Nature of Goodness in Weather Forecasting, Weather and Forecasting, 8, 281-293.

Roulin E. and Vannitsem S. (2015) Post-processing of medium-range probabilistic hydrological forecasting: impact of forcing, initial conditions and model errors, Hydrological Processes, 29, 1434-1449.

Verkade J.S., Brown J.D., Reggiani P. and Weerts A.H. (2013) Post-processing ECMWF precipitation and temperature ensemble reforecasts for operational hydrologic forecasting at various spatial scales, Journal of Hydrology, 501, 73-91.

Zalachori I., Ramos M.-H., Garçon R., Mathevet T. and Gailhard J. (2012) Statistical processing of forecasts for hydrological ensemble prediction: a comparative study of different bias correction strategies, Advances in Science and Research, 8, 135-141.

3 comments

  1. Pre-processing of ensemble meteorological forecasts can involve both tuning the distribution of each ensemble member and tuning the spread of that ensemble. Pre-processing in order to get the right distribution can generally be avoided, at least for short-term ensemble forecasting, if you calibrate your hydrological model with short-term forecasts rather than with observations. At least for mid-latitude watersheds of size O(1000 km²) or larger, my experience is that the quality of short-term ensemble forecasts is now sufficient for model calibration. The advantage is that then there is no need for pre-processing since you use in forecast mode the same product that the model has seen during calibration. The challenge is often to obtain archived forecasts for past dates with the same model version (a reforecast product) in order to perform the calibration.

  2. Nice post, thanks! I agree with Vincent, although I think it’s also necessary for the real-time forcings to have a somewhat consistent climatology with the forecasts (the calibration basis). The initial watershed states play such a role that they should not be generated by forcings that are oranges to apples.

    In general, I also agree that post-processing can cover many sins in hydrologic prediction, though there are several situations where pre-processing would be needed. If the forecasts are so climatologically biased that the hydrology model does not cross critical thresholds to generate runoff correctly, the post-processing won’t help. At a HEPEX meeting on post-processing in 2008, I recall CNRFC’s Rob Hartman giving the example of rainfall that is systematically 1/10th of what it should be — which would in most cases produce no runoff (all infiltration/evaporation), leaving nothing to post-process. Another example would be in snowy locations, where a temperature bias of a few degrees could either deposit snow when it should be rain or vice versa. Note, I’m not referring to forecast errors (which are hard to correct), but systematic forecast climatological biases, perhaps due to deficiencies in wx/climate model resolution or other dynamical issues.

  3. Thank you both for your comments! I agree with Andy regarding the problem of temperature biases. A colleague told me about a recent situation where both the Canadian and American atmospheric models were forecasting similar (and large) amounts of precipitation over a particular watershed in Quebec, but completely different temperatures. They were short-term forecasts (5 days ahead and shorter). All the members of the Canadian ensemble forecast were well below zero while all the members of the American forecast were above zero. So before running the hydrological model, the forecasters had to choose which atmospheric model to believe. Thus I think if you decide to calibrate a hydrological model using meteorological forecasts, you must be careful in choosing which meteorological forecasts. And a multi-model system would involve multiple calibrations.

    Also, I am not sure how calibrating the hydrological model with short-term forecasts would work out for longer lead times (several months). The meteorological inputs for long-term streamflow forecasting are quite different from those for short-term forecasting, but the hydrological model and parameters are often the same. Or maybe it wouldn’t make much difference? The uncertainty for long-range forecasts is already considerable, so maybe additional uncertainty related to the parameters of the hydrological model would go unnoticed?

    The idea of calibrating the hydrological model using short-term meteorological forecasts could be included in the above-mentioned future comparative studies. It would certainly be interesting.
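The threshold argument raised in the second comment (rainfall systematically one tenth of what it should be, leaving nothing to post-process) can be illustrated with a toy example. The sketch below is not a real hydrological model: the bucket structure, its capacity, the daily loss and the rainfall values are all hypothetical, chosen only to show the mechanism.

```python
def threshold_bucket_runoff(rainfall, capacity=50.0, loss=3.0):
    """Toy bucket model (all values in mm): each day the storage gains the
    rain, loses a fixed amount to evaporation/infiltration, and only the
    water above `capacity` spills as runoff."""
    storage = 0.0
    runoff = []
    for rain in rainfall:
        storage = max(storage + rain - loss, 0.0)
        spill = max(storage - capacity, 0.0)
        storage -= spill
        runoff.append(spill)
    return runoff


# Hypothetical "true" daily rainfall and a forecast biased to 1/10th of it.
true_rain = [20.0, 30.0, 40.0, 10.0, 0.0]
biased_rain = [r / 10.0 for r in true_rain]

# Post-processing only: run the model on biased rain, then rescale the runoff.
raw_runoff = threshold_bucket_runoff(biased_rain)
post_processed = [q * 10.0 for q in raw_runoff]

# Pre-processing: remove the bias from the rain first, then run the model.
pre_processed = threshold_bucket_runoff([r * 10.0 for r in biased_rain])

print(post_processed)  # [0.0, 0.0, 0.0, 0.0, 0.0]
print(pre_processed)   # [0.0, 0.0, 31.0, 7.0, 0.0]
```

In this toy setting the biased rain never fills the bucket, so the raw runoff is zero every day and a multiplicative correction of the output has nothing to work with, whereas correcting the input recovers the runoff. A real post-processor conditioned on observations could still add climatological information, so this shows the mechanism only, not a general result.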
