Quarterly GDP Forecasting with Latent Components and Flexible Observation Laws
A recent GDP nowcasting article from the French macroeconomic lab OFCE motivated the following exercise. The aim is deliberately narrow: assess how far a quarterly Bayesian forecast can go with smooth latent components and alternative observation laws, without adding external macroeconomic predictors.
Data
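The analysis uses the long quarterly GDP series published by Insee, the French national statistics institute, under the title "Produit interieur brut total - Volume aux prix de l’annee precedente chaines - Serie CVS-CJO" (total gross domestic product, chain-linked volumes at previous-year prices, seasonally and working-day adjusted).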
The empirical sample is built from the official CSV export of series 011794860, last updated on 27 February 2026 at 08:45, which already includes 2025-Q4. Although the official series starts earlier, the analysis keeps only quarters from 1 January 1990 onward. Each observation is dated at the end of its quarter. The series therefore contains 144 quarters, from 1990-Q1 to 2025-Q4.
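As a quick check on the sample window described above, the quarter-end dating and the count of 144 quarters can be reproduced with the standard library alone. This is a minimal sketch: the helper `quarter_end` is a hypothetical name, and the code does not read the Insee CSV export itself.

```python
import calendar
from datetime import date


def quarter_end(year: int, quarter: int) -> date:
    """Return the last calendar day of the given quarter (1 to 4)."""
    month = 3 * quarter
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, last_day)


# Sample window stated in the text: 1990-Q1 through 2025-Q4.
quarter_ends = [
    quarter_end(year, quarter)
    for year in range(1990, 2026)
    for quarter in range(1, 5)
]

print(len(quarter_ends), quarter_ends[0], quarter_ends[-1])
```

The count is simply 36 years times 4 quarters, which matches the 144 observations quoted above.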
The dominant feature is the sharp disruption and subsequent plateau over 2020-2022 associated with the COVID-19 crisis. Rather than impose a heavier state-space specification, the model below uses a direct latent-component decomposition of the quarterly log level.
Model
Observation and Latent Structure
Let \(G_q\) denote the observed quarterly GDP level for quarter \(q\). The latent mean of the quarterly log level is
\[\eta_q = \alpha + s_{\operatorname{quarter}(q)} + t_q + d_q + c_q.\]
Here \(\alpha\) is an intercept, \(s_{\operatorname{quarter}(q)}\) is a quarter-of-year seasonal effect, \(t_q\) is a smooth long-run trend, \(d_q\) is a short-run deviation term, and \(c_q\) is a local COVID shock correction active only over the 2019-2020 block.
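To make the additive structure concrete, the sketch below assembles \(\eta_q\) from toy components. Every number in it (intercept, seasonal effects, trend slope, AR(1) coefficient, standard deviations) is an illustrative assumption, not an estimate from the fitted model, and the linear trend and AR(1) deviation are stand-ins for whatever smooth and persistent processes the actual specification uses.

```python
import random

random.seed(1)

n = 144  # quarters 1990-Q1 .. 2025-Q4
quarters = [(1990 + i // 4, i % 4 + 1) for i in range(n)]

alpha = 13.0  # illustrative intercept on the log scale
seasonal = {1: 0.001, 2: -0.001, 3: 0.0005, 4: -0.0005}  # toy quarter effects
trend = [0.004 * i for i in range(n)]  # toy stand-in for the smooth trend

# Toy AR(1) short-run deviation d_q.
rho, sd_dev = 0.3, 0.003
dev = [0.0] * n
for i in range(1, n):
    dev[i] = rho * dev[i - 1] + random.gauss(0.0, sd_dev)

# Local COVID correction c_q: non-zero only inside the 2019-2020 block.
covid = [
    random.gauss(0.0, 0.03) if year in (2019, 2020) else 0.0
    for (year, _) in quarters
]

# Additive latent mean of the quarterly log level.
eta = [
    alpha + seasonal[q] + trend[i] + dev[i] + covid[i]
    for i, (_, q) in enumerate(quarters)
]

print(eta[0], eta[-1])
```

Outside the eight quarters of the 2019-2020 block, `covid` contributes exactly zero, so the remaining four terms carry the level.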
For the observation model, three alternatives were compared. The baseline was a symmetric Gaussian observation law. The second specification allowed skewness through a skew-normal likelihood,
\[\log(G_q) \mid \eta_q, \tau_y, \gamma \sim \operatorname{SN}(\eta_q, \tau_y^{-1}, \gamma),\]
where \(\tau_y\) is the observation precision and \(\gamma\) a skewness parameter, with \(\gamma = 0\) corresponding to the symmetric Gaussian case. The likelihood is parameterised by its mean, variance, and shape parameter.
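The sketch below shows one way a skew-normal density can be parameterised by its mean and variance. It rests on an assumption that should be stated loudly: the third argument `shape` is Azzalini's shape parameter, which is not necessarily the same quantity as the article's \(\gamma\) (some software uses a standardised skewness instead). Both conventions agree that a value of zero recovers the Gaussian density.

```python
import math


def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sn_pdf(x: float, mean: float, var: float, shape: float) -> float:
    """Skew-normal density with the given mean and variance.

    `shape` is Azzalini's shape parameter (an assumption; it may differ
    from the skewness parameter used in the article).
    """
    delta = shape / math.sqrt(1.0 + shape * shape)
    # Scale and location chosen so the density has the requested moments.
    omega = math.sqrt(var / (1.0 - 2.0 * delta * delta / math.pi))
    xi = mean - omega * delta * math.sqrt(2.0 / math.pi)
    z = (x - xi) / omega
    return 2.0 / omega * norm_pdf(z) * norm_cdf(shape * z)


# With shape = 0 the two densities coincide.
print(sn_pdf(0.3, 0.0, 1.0, 0.0), norm_pdf(0.3))
```

Fixing the mean and variance rather than the location and scale keeps \(\eta_q\) interpretable as the expected log level whatever the amount of asymmetry.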
The third specification kept the same latent mean but allowed the observation precision to change during the COVID-crisis block through
\[\log(\tau_q) = \beta_0 + \beta_1 \mathbb{1}\{q \in 2019\text{-}2020\}.\]
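The log-precision equation translates directly into an observation standard deviation through \(\sigma_q = \tau_q^{-1/2}\). The following sketch uses made-up values of \(\beta_0\) and \(\beta_1\), chosen only so that the arithmetic is easy to follow; they are not posterior estimates.

```python
import math


def obs_sd(year: int, beta0: float, beta1: float) -> float:
    """Observation standard deviation implied by the log-precision model."""
    in_covid_block = 1.0 if year in (2019, 2020) else 0.0
    log_tau = beta0 + beta1 * in_covid_block
    return math.exp(-0.5 * log_tau)  # sigma = tau ** (-1/2)


# Illustrative values: baseline sd of 0.005 on the log scale, and a
# precision divided by 4 (so an sd doubled) inside the 2019-2020 block.
beta0 = math.log(1.0 / 0.005**2)
beta1 = -math.log(4.0)

print(obs_sd(2018, beta0, beta1), obs_sd(2020, beta0, beta1))
```

A negative \(\beta_1\) lowers the precision and widens the observation noise during the block, while \(\beta_1 = 0\) returns the homoskedastic Gaussian model.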
This comparison separates a symmetric reference model, skewness, and a temporary change in measurement variance around the COVID disruption.
Even though the published GDP series is already seasonally adjusted, the quarter-of-year effect can still absorb small recurrent residual differences across quarters. The smooth trend and persistent deviations carry the level dynamics directly, while the local COVID shock lets the model bend around the 2019-2020 disruption without forcing either smooth component to absorb the whole break. Outside that window, the forecast path is driven by the intercept, the seasonal effect, the trend, and the persistent deviations.
For the homoskedastic observation laws, the observation standard deviation was regularised with a penalised-complexity prior,
\[\Pr(\sigma_y > 0.02) = 0.05, \qquad \sigma_y = \tau_y^{-1/2},\]
with an additional penalised-complexity prior on the skewness parameter in the skew-normal model,
\[\gamma \sim \pi_{\mathrm{PC\text{-}SN}}(\lambda_{\mathrm{skew}}), \qquad \lambda_{\mathrm{skew}} = 10.\]
This prior shrinks \(\gamma\) toward zero, so asymmetry is estimated only when supported by the data. In the heteroskedastic Gaussian model, the prior on the COVID precision effect \(\beta_1\) is centred on zero, so the homoskedastic case remains the reference model.
The latent components also receive explicit priors. For the smooth trend, a penalised-complexity prior is placed on the marginal standard deviation,
\[ \Pr(\sigma_{\mathrm{trend}} > 0.25) = 0.01,\]
which regularises the overall size of the smooth trend fluctuations. For the short-run deviation term, the corresponding prior is
\[\Pr(\sigma_{\mathrm{dev}} > 0.05) = 0.05,\]
with a prior on the autoregressive correlation centred on weak persistence, so the deviation process is pulled toward limited dependence unless the data support stronger serial correlation. For the local COVID shock component,
\[\Pr(\sigma_{\mathrm{covid}} > 0.10) = 0.05,\]
which allows a local disturbance in 2019-2020 without letting that block dominate the whole latent signal.
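Each tail statement of the form \(\Pr(\sigma > U) = a\) pins down a single rate parameter. Assuming the standard penalised-complexity prior for a Gaussian standard deviation, which is an exponential distribution on \(\sigma\), the rate is \(\lambda = -\ln(a)/U\). The sketch below applies that formula to the four priors quoted above; the dictionary keys are labels chosen here for illustration.

```python
import math


def pc_rate(upper: float, tail_prob: float) -> float:
    """Rate of the exponential prior on sigma with Pr(sigma > upper) = tail_prob."""
    return -math.log(tail_prob) / upper


# (U, a) pairs taken from the prior statements in the text.
priors = {
    "sigma_y": (0.02, 0.05),
    "sigma_trend": (0.25, 0.01),
    "sigma_dev": (0.05, 0.05),
    "sigma_covid": (0.10, 0.05),
}

for name, (upper, tail_prob) in priors.items():
    rate = pc_rate(upper, tail_prob)
    prior_median = math.log(2.0) / rate  # median of an exponential distribution
    print(f"{name}: rate = {rate:.2f}, prior median = {prior_median:.4f}")
```

Under this assumption the prior mode is always at \(\sigma = 0\), so each component is shrunk toward being switched off unless the data support it.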
Posterior predictive checks use replicated observations from the selected observation law,
\[G_q^{\mathrm{rep}} = \exp\left(Z_q^{\mathrm{rep}}\right), \qquad Z_q^{\mathrm{rep}} \sim F_{\widehat{m}}(\eta_q),\]
where \(F_{\widehat{m}}\) denotes the observation distribution selected by cross-validation. Direct forecasts for 2026 extend the latent process through the four future quarters, while only observed quarters contribute to the likelihood.
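For a Gaussian observation law, the replication step amounts to drawing \(Z_q^{\mathrm{rep}}\) around each posterior draw of \(\eta_q\) and exponentiating. The sketch below does this for a single quarter; the posterior draws are simulated placeholders, since the fitted model's draws are not reproduced here.

```python
import math
import random
import statistics

random.seed(0)

# Placeholder posterior draws (eta_q, sigma_y) for one quarter; in practice
# these would come from the joint posterior of the fitted model.
posterior_draws = [
    (13.40 + random.gauss(0.0, 0.002), abs(random.gauss(0.005, 0.001)))
    for _ in range(2000)
]

# One replicated GDP level per posterior draw: Z_rep ~ N(eta, sigma^2),
# then back-transform to the level scale.
g_rep = [math.exp(random.gauss(eta, sigma)) for eta, sigma in posterior_draws]

print(statistics.median(g_rep))
```

Drawing one replicate per posterior draw propagates both parameter uncertainty and observation noise into the replicated levels.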
Fit
The candidate observation models were estimated on the observed quarterly series. The latent process was extended through the 2026 forecast horizon so that posterior predictions for future quarters could be obtained directly from the selected model.
| Observation law | Status | Post-COVID quarters scored | Invalid GCPO values | Negative mean log-GCPO |
|---|---|---|---|---|
| Gaussian | Converged | 12 | 0 | -4.0371 |
| Skew Gaussian | Converged | 12 | 0 | -3.9887 |
| Heteroskedastic Gaussian | Converged | 12 | 0 | -3.8562 |
Each observation model used its own GCPO deletion groups. For every target quarter and every candidate law, the validation set consisted of the 32 observed quarters with the strongest model-specific posterior linear-predictor correlations. The comparison therefore scores each candidate under its own most adverse 32-point deletion rule. The negative mean log-GCPO was then computed only over post-COVID observations from 1 January 2023 onward, that is, the 12 quarters from 2023-Q1 to 2025-Q4; lower values indicate better predictive performance. In this specification, candidates without a finite validation score are not eligible for selection. Among the eligible candidates, the Gaussian model has the lowest score (-4.0371) and is therefore selected; it is used for the posterior predictive checks and forecasts.
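The selection rule itself is simple enough to sketch: drop candidates without a finite score, then take the smallest negative mean log-GCPO. The scores below are copied from the table; the function name `select_model` is hypothetical, and the code does not compute GCPO values or deletion groups.

```python
import math

# Negative mean log-GCPO per observation law, as reported in the table.
scores = {
    "Gaussian": -4.0371,
    "Skew Gaussian": -3.9887,
    "Heteroskedastic Gaussian": -3.8562,
}


def select_model(neg_mean_log_gcpo: dict) -> str:
    """Return the eligible candidate with the lowest negative mean log score."""
    eligible = {
        name: score
        for name, score in neg_mean_log_gcpo.items()
        if math.isfinite(score)  # NaN or infinite scores are not eligible
    }
    if not eligible:
        raise ValueError("no candidate has a finite validation score")
    return min(eligible, key=eligible.get)


print(select_model(scores))
```

If a candidate had returned an invalid score, it would simply be excluded and the next-best finite score would win.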
Posterior predictive check
The selected component model is now evaluated through posterior predictive checks and out-of-sample forecasts.
In-sample predictive check
Posterior predictive checks were based on draws from the joint posterior distribution. The prediction step used the same additive structure as the observation equation.
The in-sample posterior predictive distribution tracks the observed quarterly series closely. The 95% equal-tailed interval covers 100% of the observed quarterly GDP levels.
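The coverage figure can be computed by checking, quarter by quarter, whether the observed level falls between the 2.5% and 97.5% quantiles of its replicated draws. The sketch below uses toy observations and simulated replicates; `coverage` is a hypothetical helper name.

```python
import random
import statistics


def coverage(observed, rep_draws):
    """Share of observations inside their 95% equal-tailed predictive interval."""
    hits = 0
    for y, draws in zip(observed, rep_draws):
        # n=40 yields cut points at 2.5%, 5%, ..., 97.5%.
        cuts = statistics.quantiles(draws, n=40)
        lower, upper = cuts[0], cuts[-1]
        hits += lower <= y <= upper
    return hits / len(observed)


random.seed(3)
observed = [100.0, 101.0, 102.0]  # toy levels, not Insee data
rep_draws = [[random.gauss(y, 1.0) for _ in range(1000)] for y in observed]

print(coverage(observed, rep_draws))
```

A coverage of 100% at a nominal 95% level suggests that the predictive intervals are, if anything, slightly conservative in-sample.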
Out-of-sample predictive check
Out-of-sample performance is assessed with two 2025 exercises. The first is a one-shot forecast of the four quarters of 2025. The second is an iterative forecast in which the model is updated after each newly observed quarter before predicting the next one.
The one-shot exercise hid the full 2025 block and refit the model using only data available through 2024-Q4. The iterative exercise used the same principle, but refit the model quarter by quarter so that each forecast only used information available before the target quarter.
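The difference between the two exercises lies only in which quarters are available at each fit. The sketch below enumerates the training windows; it builds quarter labels rather than fitting any model, and the variable names are illustrative.

```python
# Quarter labels for the full sample, 1990-Q1 .. 2025-Q4.
quarters = [f"{year}-Q{q}" for year in range(1990, 2026) for q in range(1, 5)]
targets = [label for label in quarters if label.startswith("2025")]

# One-shot: a single fit through 2024-Q4 forecasts all four 2025 quarters.
one_shot_train = [label for label in quarters if label < "2025-Q1"]

# Iterative: refit before each target, using every quarter strictly before it.
iterative_windows = []
for target in targets:
    cutoff = quarters.index(target)
    iterative_windows.append((quarters[:cutoff], target))

print(one_shot_train[-1], len(one_shot_train))
for train, target in iterative_windows:
    print(f"target {target}: train through {train[-1]} ({len(train)} quarters)")
```

The first iterative window coincides with the one-shot window, so the two exercises only diverge from 2025-Q2 onward.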
In the one-shot exercise, the 95% equal-tailed interval covers 100% of the observed 2025 quarterly GDP levels, with uncertainty increasing across the forecast horizon.
In the iterative exercise, the 95% equal-tailed interval also covers 100% of the observed 2025 quarterly GDP levels.
Quarterly and Annual 2026 GDP growth
Finally, the selected model fitted through 2025-Q4 is used to derive the 2026 forecast path. For each posterior draw \(s\), the additive predictor gives the latent quarter level through
\[\widehat{G}_q^{(s)} = \exp\left(\eta_q^{(s)}\right), \qquad \eta_q^{(s)} = \alpha^{(s)} + s_{\operatorname{quarter}(q)}^{(s)} + t_q^{(s)} + d_q^{(s)} + c_q^{(s)}.\]
Quarter-on-quarter growth uses the observed 2025-Q4 level as the reference for 2026-Q1, then each draw’s preceding forecasted quarter level as the reference for later 2026 quarters.
The predicted quarter levels imply both quarter-on-quarter growth and the full-year 2026 growth rate relative to the observed 2025 annual total (for \(q = \text{2026-Q1}\), \(\widehat{G}_{q-1}^{(s)}\) is the observed 2025-Q4 level):
\[g_{\mathrm{QoQ},q}^{(s)} = 100\left(\frac{\widehat{G}_q^{(s)}}{\widehat{G}_{q-1}^{(s)}} - 1\right), \qquad g_{\mathrm{2026/2025}}^{(s)} = 100\left( \frac{\sum_{q \in 2026} \widehat{G}_q^{(s)}}{\sum_{q \in 2025} G_q} - 1 \right).\]
| Year | Date | Reference date | Observed 2025 annual GDP | Median annual growth (%) | 2.5% quantile (%) | 97.5% quantile (%) |
|---|---|---|---|---|---|---|
| 2026 | 2026-12-31 | 2025-12-31 | 2638020 | 0.83 | -1.43 | 3.19 |
The posterior median QoQ growth stays between -0.05% and 0.16% over 2026. The posterior median full-year 2026 growth relative to the observed 2025 annual sum is 0.83%, with a 95% equal-tailed interval of [-1.43%, 3.19%].
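The growth formulas can be applied draw by draw and then summarised. The sketch below follows the stated convention: the observed 2025-Q4 level is the reference for 2026-Q1, and each draw's preceding forecast is the reference afterwards. All levels are toy numbers on an arbitrary scale, and `growth_from_draw` is a hypothetical helper.

```python
import random
import statistics


def growth_from_draw(levels_2026, observed_2025):
    """Return (QoQ growth per quarter, annual growth), both in percent."""
    previous = observed_2025[-1]  # observed 2025-Q4 anchors 2026-Q1
    qoq = []
    for level in levels_2026:
        qoq.append(100.0 * (level / previous - 1.0))
        previous = level  # later quarters use the draw's own forecast
    annual = 100.0 * (sum(levels_2026) / sum(observed_2025) - 1.0)
    return qoq, annual


random.seed(42)
observed_2025 = [100.0, 100.3, 100.6, 100.9]  # toy levels, not Insee data

# Toy posterior draws of the four 2026 quarter levels.
draws = [
    [101.0 + 0.1 * k + random.gauss(0.0, 0.8) for k in range(4)]
    for _ in range(4000)
]

annual_growth = [growth_from_draw(d, observed_2025)[1] for d in draws]
cuts = statistics.quantiles(annual_growth, n=40)  # 2.5%, ..., 97.5%
print(statistics.median(annual_growth), cuts[0], cuts[-1])
```

Computing growth within each draw before taking quantiles keeps the dependence between quarters intact, which matters because the annual rate is a ratio of sums rather than an average of quarterly rates.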
For comparison, the Banque de France forecast is 0.9% and the OFCE forecast is 0.8%. The posterior median from the selected latent-component model is close to these external forecasts, while the posterior interval remains wide. This suggests that the univariate specification captures the central 2026 trajectory reasonably well, but leaves substantial uncertainty that could be reduced by adding external macroeconomic predictors or richer multivariate structure.