Representing model error in high resolution ensemble forecasts
Contributed by Laura Baker
Ensemble weather forecasts represent the uncertainty in a forecast, rather than giving just a single deterministic prediction. In a very predictable situation the ensemble members typically follow a similar path, while in an unpredictable situation the ensemble may show a large divergence, or spread, between members.
A simple way to create an ensemble is to perturb the initial conditions of the forecast. Since the atmosphere is a chaotic system, a small perturbation can potentially lead to a large difference in the forecast. However, perturbing the initial conditions alone is sometimes not enough: these ensembles are often underspread, meaning that they do not cover the full range of possible states that could occur, so the ensemble forecast can miss what is actually observed. One way to increase the spread further is to add some representation of model error, or model uncertainty, to the forecast. Model uncertainty becomes relatively more important at smaller scales, so including these effects is particularly important in a high-resolution ensemble.
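The principle behind an initial-condition ensemble can be illustrated with a toy chaotic system rather than a full weather model. The sketch below (Python with NumPy; the Lorenz-63 equations, time step and perturbation size are purely illustrative assumptions, not part of the study) builds a small ensemble by perturbing the starting state and tracks how the spread between members grows with time.

```python
# A toy illustration (not the Met Office Unified Model): an initial-condition
# ensemble for the chaotic Lorenz-63 system. Tiny perturbations to the starting
# state grow with time, and the ensemble spread measures that divergence.
import numpy as np

def lorenz63_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz-63 equations by one step with a simple Euler scheme."""
    x, y, z = state
    dxdt = sigma * (y - x)
    dydt = x * (rho - z) - y
    dzdt = x * y - beta * z
    return state + dt * np.array([dxdt, dydt, dzdt])

rng = np.random.default_rng(seed=1)
n_members = 24                          # one control plus 23 perturbed members
control_ic = np.array([1.0, 1.0, 1.0])  # unperturbed initial condition

# Perturb the initial conditions: small random offsets for every member but the control.
members = np.tile(control_ic, (n_members, 1))
members[1:] += 1e-3 * rng.standard_normal((n_members - 1, 3))

# Integrate all members forward and record the ensemble spread at each step.
spread = []
for _ in range(3000):
    members = np.array([lorenz63_step(m) for m in members])
    spread.append(members.std(axis=0).mean())  # mean standard deviation across members

print(f"initial spread ~0.001, spread after 3000 steps: {spread[-1]:.3f}")
```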
A recent study as part of the DIAMET project aimed to investigate the effects of randomly perturbing individual parameters in the forecast model as a way of representing model error. We used a configuration of the Met Office Unified Model with a resolution of 1.5 km and a domain covering the southern part of the UK. We generated an ensemble with one control member and 23 perturbed members. The initial conditions for each ensemble member came from a lower-resolution (60 km) global ensemble forecast. Since our domain is a sub-domain of the global model, the lateral boundary conditions are also derived from the global model forecast, and each ensemble member has perturbed boundary conditions corresponding to its initial condition perturbations.
We focussed on a single case study which occurred during one of the DIAMET field campaign periods. This case was particularly interesting from an ensemble perspective because it involved the passage of a frontal rain band with a banded structure that was not well represented in the operational forecast. None of the individual ensemble members captured the two separate rain bands, but some of them had rain in the location of the second band.
We perturbed parameters in the boundary layer and microphysics parameterisation schemes. We chose 16 parameters to perturb, which experts had identified as having some uncertainty in their values. Each parameter was perturbed randomly within a specified range, and each ensemble member had different random perturbations applied to its parameters. We focussed our analysis on near-surface variables (wind speed, temperature and relative humidity), which could be compared with observations from surface stations, and on rainfall rate and accumulation, which could be compared with radar observations. We found that for the near-surface variables, representing model error using this method improved the forecast skill and increased the spread of the ensemble. In contrast, for rainfall, the forecast skill and ensemble spread were degraded by this method after the first couple of hours of the forecast.
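As a rough illustration of the random-parameter approach, the sketch below (Python with NumPy) draws an independent random value for each uncertain parameter and each perturbed member, uniformly within a prescribed range. The parameter names and ranges here are hypothetical placeholders, not the 16 boundary-layer and microphysics parameters or ranges used in the study.

```python
# A minimal sketch of the random-parameter approach: each perturbed ensemble
# member gets its own random draw of every uncertain parameter, uniform within
# a prescribed range. Names and ranges below are hypothetical.
import numpy as np

param_ranges = {
    "ice_fall_speed_factor": (0.5, 1.5),
    "rain_accretion_factor": (0.7, 1.3),
    "bl_mixing_length_factor": (0.8, 1.2),
}

def draw_member_params(rng, ranges):
    """Draw one random value per parameter, uniformly within its range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}

rng = np.random.default_rng(seed=42)
n_perturbed = 23  # the control member keeps the default (unperturbed) values

member_params = [draw_member_params(rng, param_ranges) for _ in range(n_perturbed)]
for i, params in enumerate(member_params[:3], start=1):
    print(f"member {i}: {params}")
```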
This study is a useful first step towards developing a high-resolution ensemble system with a representation of model error. This work was recently published in Nonlinear Processes in Geophysics and can be accessed here.
April 14, 2014 at 23:07
Whatever tools we use, it is important to know their strengths and weaknesses. I do not know if what I am going to tell you about the EPS is a weakness; some consider it to be, others, like me, do not. It concerns the quality of the perturbed analyses. To a first approximation they are considered to be as good as the unperturbed analysis, and consequently the forecasts are on average supposed to be as good as the unperturbed Control forecast, just more or less different.
But a closer inspection reveals that the perturbed analyses are on average 41% worse than the unperturbed Control analysis, and consequently the individual forecasts are also worse, although not by as much. The 4DVAR analysis is mathematically the optimal solution, and anybody who fiddles with it will be “punished”. Understanding why is tied to answering a catch question correctly: which planet is on average closest to Neptune? (Pluto is no longer considered a planet, and it would be the wrong answer anyway.)
Take a look at my hastily drawn image (at the bottom) of the “Ensemble Solar System”. The Sun is the superb ECMWF 4DVAR analysis, the best in the world. It is, however, not perfect; assume its error is one unit (1 U). So the distance between the analysis (Sun) and the Truth (red planet) is 1 U. But all the perturbed analyses also have, by construction, a 1 U difference from the 4DVAR analysis. So what’s the problem?
The problem is highlighted by the dashed red line. Half of the members are closer to the Truth than the other half. Among those that are closest, some are VERY close, but among those that are further away, some are VERY far away, up to 2 U. When we calculate the average (root mean square) distance, it turns out to be not 1 U but 1.41 U, the square root of 2. That is the distance from the intersection of the red line with the dashed circle to the Truth. Thus the average member is 41% worse than the 4DVAR analysis.
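The 41% follows from simple geometry: if the perturbation has magnitude 1 U and a direction unrelated to the analysis error, its component towards or away from the Truth averages to zero, so the expected squared distance of a member from the Truth is 1² + 1² = 2 U², giving an RMS distance of √2 ≈ 1.41 U. A quick Monte Carlo check (Python with NumPy, purely illustrative) reproduces the figure:

```python
# Monte Carlo check of the "Ensemble Solar System" argument: the perturbed
# analyses lie on a circle of radius 1 U around the 4DVAR analysis, and the
# Truth is 1 U away from that analysis. The root-mean-square distance of the
# members from the Truth comes out as sqrt(2) ~ 1.41 U, i.e. 41% larger than
# the analysis error itself.
import numpy as np

rng = np.random.default_rng(seed=0)
n = 1_000_000

analysis = np.array([0.0, 0.0])  # the unperturbed 4DVAR analysis
truth = np.array([1.0, 0.0])     # the Truth, 1 U away from the analysis

# Perturbed analyses: random directions, each exactly 1 U from the analysis.
angles = rng.uniform(0.0, 2.0 * np.pi, size=n)
members = analysis + np.column_stack([np.cos(angles), np.sin(angles)])

rms_distance = np.sqrt(np.mean(np.sum((members - truth) ** 2, axis=1)))
print(f"RMS distance of members from the Truth: {rms_distance:.3f} U")  # ~1.414
```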
If you are not statistically minded you might get worried, and a lot of my meteorological colleagues are worried. But I take consolation in the fact that what the ensemble members may lack in individual skill, they compensate for by being many (50). The proof of the pudding is in the eating, and the EPS is superior to the deterministic system.
Now you may also realize that the planet that most of the time is closest to Neptune is Mercury.