Univariate P-splines in R-INLA and inlabru

Methods
Author

Olivier Supplisson

Published

April 12, 2026

P-splines combine a flexible B-spline basis with a roughness penalty on the coefficients. Their main practical objective is to make model specification more robust to the number and locations of the knots used to define the basis. In a Bayesian setting, that penalty is naturally represented by an n-th-order random walk prior on the spline weights. This short blog post shows how to fit a smoothing P-spline in R-INLA.

Useful academic work on P-splines include:

Related blog posts:

B-spline basis

For a scalar covariate \(x\), a P-spline smooth approximates the unknown function \(f\) through a finite-dimensional decomposition:

\[ f(x)= \beta_0 + \sum_{j=1}^{K} B_j(x)\alpha_j, \]

where \(B_j(x)\) are spline basis functions defined over the full domain of \(x\). The basis itself is determined by two ingredients: a polynomial degree and a knot sequence. The knots split the covariate range into intervals, and on each interval every \(B_j\) is an ordinary polynomial. In practice one often uses cubic B-splines, that is, splines of degree 3, because they are flexible while remaining smooth enough for most regression problems.

A central property of B-splines is local support. Each \(B_j(x)\) is zero outside a limited portion of the covariate range, so at a fixed location \(x\) only a few basis functions are active. For splines of degree \(d\), at most \(d + 1\) basis functions are nonzero at the same \(x\). This is what makes the basis local: changing one coefficient mainly perturbs the fitted curve near the associated knot region instead of changing the function everywhere.

Another useful property is that the basis functions form a partition of unity, meaning

\[ \sum_{j=1}^{K} B_j(x) = 1. \]

This makes the spline behave like a local weighted average of the coefficients. It also clarifies why one has to think about identifiability: if the basis can already represent constant functions, then the intercept and the spline part are not automatically separated unless one imposes a constraint or centers the smooth.

The coefficients \((\alpha_1,\ldots,\alpha_K)\) control the contribution of each basis function. For observations \(x_1,\ldots,x_n\), the basis enters the model through the design matrix

\[ \mathbf{B} = \begin{pmatrix} B_1(x_1) & \cdots & B_K(x_1) \\ \vdots & \ddots & \vdots \\ B_1(x_n) & \cdots & B_K(x_n) \end{pmatrix}. \]

Row \(i\) shows how the basis combines to represent the smooth at \(x_i\), and column \(j\) shows where coefficient \(\alpha_j\) contributes. Because of local support, this matrix is typically sparse or at least strongly banded in structure, which is one of the reasons B-splines are computationally attractive.

However, the fit quality of B-splines has been shown to be highly sensitive to the number and locations of the knots used to define the basis, leading to poor representation when too few knots are used and to overfitting when too many are used.

P-splines provide a fallback solution to this issue: instead of leaving the parameters \(\alpha\) unconstrained, they penalise their finite differences so that neighbouring coefficients are encouraged to vary smoothly. With the standard second-order penalty, the second differences satisfy

\[ \alpha_j - 2\alpha_{j-1} + \alpha_{j-2} \sim N(0, \tau^{-1}), \qquad j = 3, \ldots, K. \]

Large values of \(\tau\) shrink the second differences more strongly and therefore produce smoother functions, while smaller values allow more local curvature. The role of the P-spline is therefore split into two parts: the B-spline basis provides a rich approximation space, and the RW2 penalty regularises that space so that the fitted curve remains smooth.

Although it is possible to construct P-splines in several ways, I keep the rest of this post close to plain R-INLA so the basis construction and the RW2 penalty remain explicit.

Data import

To keep the discussion concrete, I will work with daily closing values of the CAC 40 index. On Yahoo Finance, the CAC 40 index is available under the ticker ^FCHI.

For the purpose of this blog post, I will use

Basis construction

For illustrative purposes, I will model the closing series y as a smooth function of the trading-day index x. With \(x = 1, \ldots, n\) (n = 1615), the B-spline basis is built as follows. To reduce boundary effects, I extend the covariate range slightly and construct the basis on the augmented grid from -5 to n + 5, then evaluate it back on the observed index.

k <- 80L
B_aug <- seq(-5, max(x) + 5, by = 1)
B_basis <- bs(
  x,
  df = k,
  degree = 3,
  intercept = FALSE,
  Boundary.knots = range(B_aug)
)
B <- unclass(B_basis)
A_ps <- Matrix(B, sparse = TRUE)

Here B is the B-spline design matrix evaluated at the observed covariate values. The argument k controls the basis dimension and therefore the richness of the spline representation. The extended boundary knots help stabilise the fitted spline near the endpoints. With the chosen degree, this induces the following internal knot locations:

print(attr(B_basis, "knots"))
 [1]   21.69231   42.38462   63.07692   83.76923  104.46154  125.15385
 [7]  145.84615  166.53846  187.23077  207.92308  228.61538  249.30769
[13]  270.00000  290.69231  311.38462  332.07692  352.76923  373.46154
[19]  394.15385  414.84615  435.53846  456.23077  476.92308  497.61538
[25]  518.30769  539.00000  559.69231  580.38462  601.07692  621.76923
[31]  642.46154  663.15385  683.84615  704.53846  725.23077  745.92308
[37]  766.61538  787.30769  808.00000  828.69231  849.38462  870.07692
[43]  890.76923  911.46154  932.15385  952.84615  973.53846  994.23077
[49] 1014.92308 1035.61538 1056.30769 1077.00000 1097.69231 1118.38462
[55] 1139.07692 1159.76923 1180.46154 1201.15385 1221.84615 1242.53846
[61] 1263.23077 1283.92308 1304.61538 1325.30769 1346.00000 1366.69231
[67] 1387.38462 1408.07692 1428.76923 1449.46154 1470.15385 1490.84615
[73] 1511.53846 1532.23077 1552.92308 1573.61538 1594.30769

The resulting basis functions along the covariate index are shown below.

basis_df <- data.frame(
  x = rep(x, times = ncol(B)),
  basis = factor(rep(seq_len(ncol(B)), each = nrow(B))),
  value = c(B)
)

knot_df <- data.frame(knot = unname(attr(B_basis, "knots")))

p_basis <- ggplot(basis_df, aes(x = x, y = value, group = basis, color = basis)) +
  geom_line(linewidth = 0.35, show.legend = FALSE) +
  geom_vline(
    data = knot_df,
    aes(xintercept = knot),
    inherit.aes = FALSE,
    colour = "grey45",
    linetype = 3
  ) +
  scale_color_manual(values = grDevices::hcl.colors(ncol(B), palette = "Dynamic")) +
  labs(
    x = "Trading-day index",
    y = "Basis value",
    title = "B-spline basis along x"
  ) +
  theme_minimal()

interactive_plot(p_basis, height = 700)