k <- 80L
B_aug <- seq(-5, max(x) + 5, by = 1)
B_basis <- bs(
x,
df = k,
degree = 3,
intercept = FALSE,
Boundary.knots = range(B_aug)
)
B <- unclass(B_basis)
A_ps <- Matrix(B, sparse = TRUE)Univariate P-splines in R-INLA and inlabru
P-splines combine a flexible B-spline basis with a roughness penalty on the coefficients. Their main practical objective is to make model specification more robust to the number and locations of the knots used to define the basis. In a Bayesian setting, that penalty is naturally represented by an n-th-order random walk prior on the spline weights. This short blog post shows how to fit a smoothing P-spline in R-INLA.
Useful academic work on P-splines include:
Paul Eilers and Brian Marx’s original article introducing P-splines, Flexible Smoothing with B-splines and Penalties, https://projecteuclid.org/journals/statistical-science/volume-11/issue-2/Flexible-smoothing-with-B-splines-and-penalties/10.1214/ss/1038425655.pdf.
Paul Eilers and Brian Marx’s masterpiece Practical Smoothing: The Joys of P-splines, https://psplines.bitbucket.io/.
Stefan Lang and Andreas Brezger’s Bayesian P-splines paper, https://www.tandfonline.com/doi/abs/10.1198/1061860043010.
Aritz Adin Urtasun’s Ph.D. thesis Hierarchical and spline-based models in space-time disease mapping. I could not find the original link, so I uploaded the file to my personal Google Drive here.
Dae-Jin Lee’s Ph.D. thesis Smoothing mixed models for spatial and spatio-temporal data, https://e-archivo.uc3m.es/entities/publication/72bffb09-ff83-48eb-acba-8aaed8de45f4.
Related blog posts:
Tristan Mahr’s blog post Random effects and penalized splines are the same thing, https://www.tjmahr.com/random-effects-penalized-splines-same-thing/.
Gavin Simpson’s blog post Extrapolating with B splines and GAMs, https://fromthebottomoftheheap.net/2020/06/03/extrapolating-with-gams/.
B-spline basis
For a scalar covariate \(x\), a P-spline smooth approximates the unknown function \(f\) through a finite-dimensional decomposition:
\[ f(x)= \beta_0 + \sum_{j=1}^{K} B_j(x)\alpha_j, \]
where \(B_j(x)\) are spline basis functions defined over the full domain of \(x\). The basis itself is determined by two ingredients: a polynomial degree and a knot sequence. The knots split the covariate range into intervals, and on each interval every \(B_j\) is an ordinary polynomial. In practice one often uses cubic B-splines, that is, splines of degree 3, because they are flexible while remaining smooth enough for most regression problems.
A central property of B-splines is local support. Each \(B_j(x)\) is zero outside a limited portion of the covariate range, so at a fixed location \(x\) only a few basis functions are active. For splines of degree \(d\), at most \(d + 1\) basis functions are nonzero at the same \(x\). This is what makes the basis local: changing one coefficient mainly perturbs the fitted curve near the associated knot region instead of changing the function everywhere.
Another useful property is that the basis functions form a partition of unity, meaning
\[ \sum_{j=1}^{K} B_j(x) = 1. \]
This makes the spline behave like a local weighted average of the coefficients. It also clarifies why one has to think about identifiability: if the basis can already represent constant functions, then the intercept and the spline part are not automatically separated unless one imposes a constraint or centers the smooth.
The coefficients \((\alpha_1,\ldots,\alpha_K)\) control the contribution of each basis function. For observations \(x_1,\ldots,x_n\), the basis enters the model through the design matrix
\[ \mathbf{B} = \begin{pmatrix} B_1(x_1) & \cdots & B_K(x_1) \\ \vdots & \ddots & \vdots \\ B_1(x_n) & \cdots & B_K(x_n) \end{pmatrix}. \]
Row \(i\) shows how the basis combines to represent the smooth at \(x_i\), and column \(j\) shows where coefficient \(\alpha_j\) contributes. Because of local support, this matrix is typically sparse or at least strongly banded in structure, which is one of the reasons B-splines are computationally attractive.
However, the fit quality of B-splines has been shown to be highly sensitive to the number and locations of the knots used to define the basis, leading to poor representation when too few knots are used and to overfitting when too many are used.
P-splines provide a fallback solution to this issue: instead of leaving the parameters \(\alpha\) unconstrained, they penalise their finite differences so that neighbouring coefficients are encouraged to vary smoothly. With the standard second-order penalty, the second differences satisfy
\[ \alpha_j - 2\alpha_{j-1} + \alpha_{j-2} \sim N(0, \tau^{-1}), \qquad j = 3, \ldots, K. \]
Large values of \(\tau\) shrink the second differences more strongly and therefore produce smoother functions, while smaller values allow more local curvature. The role of the P-spline is therefore split into two parts: the B-spline basis provides a rich approximation space, and the RW2 penalty regularises that space so that the fitted curve remains smooth.
Although it is possible to construct P-splines in several ways, I keep the rest of this post close to plain R-INLA so the basis construction and the RW2 penalty remain explicit.
Data import
To keep the discussion concrete, I will work with daily closing values of the CAC 40 index. On Yahoo Finance, the CAC 40 index is available under the ticker ^FCHI.
For the purpose of this blog post, I will use
Basis construction
For illustrative purposes, I will model the closing series y as a smooth function of the trading-day index x. With \(x = 1, \ldots, n\) (n = 1615), the B-spline basis is built as follows. To reduce boundary effects, I extend the covariate range slightly and construct the basis on the augmented grid from -5 to n + 5, then evaluate it back on the observed index.
Here B is the B-spline design matrix evaluated at the observed covariate values. The argument k controls the basis dimension and therefore the richness of the spline representation. The extended boundary knots help stabilise the fitted spline near the endpoints. With the chosen degree, this induces the following internal knot locations:
print(attr(B_basis, "knots")) [1] 21.69231 42.38462 63.07692 83.76923 104.46154 125.15385
[7] 145.84615 166.53846 187.23077 207.92308 228.61538 249.30769
[13] 270.00000 290.69231 311.38462 332.07692 352.76923 373.46154
[19] 394.15385 414.84615 435.53846 456.23077 476.92308 497.61538
[25] 518.30769 539.00000 559.69231 580.38462 601.07692 621.76923
[31] 642.46154 663.15385 683.84615 704.53846 725.23077 745.92308
[37] 766.61538 787.30769 808.00000 828.69231 849.38462 870.07692
[43] 890.76923 911.46154 932.15385 952.84615 973.53846 994.23077
[49] 1014.92308 1035.61538 1056.30769 1077.00000 1097.69231 1118.38462
[55] 1139.07692 1159.76923 1180.46154 1201.15385 1221.84615 1242.53846
[61] 1263.23077 1283.92308 1304.61538 1325.30769 1346.00000 1366.69231
[67] 1387.38462 1408.07692 1428.76923 1449.46154 1470.15385 1490.84615
[73] 1511.53846 1532.23077 1552.92308 1573.61538 1594.30769
The resulting basis functions along the covariate index are shown below.
basis_df <- data.frame(
x = rep(x, times = ncol(B)),
basis = factor(rep(seq_len(ncol(B)), each = nrow(B))),
value = c(B)
)
knot_df <- data.frame(knot = unname(attr(B_basis, "knots")))
p_basis <- ggplot(basis_df, aes(x = x, y = value, group = basis, color = basis)) +
geom_line(linewidth = 0.35, show.legend = FALSE) +
geom_vline(
data = knot_df,
aes(xintercept = knot),
inherit.aes = FALSE,
colour = "grey45",
linetype = 3
) +
scale_color_manual(values = grDevices::hcl.colors(ncol(B), palette = "Dynamic")) +
labs(
x = "Trading-day index",
y = "Basis value",
title = "B-spline basis along x"
) +
theme_minimal()
interactive_plot(p_basis, height = 700)