Currently, in both R-release and R-devel, terms.formula incorrectly treats some terms as duplicates. This seems to be due to a problem with termsform from stats/src/model.c, but so far I couldn't track down the precise cause. Specifically, only the order of the arguments in function calls seems to be checked, not their names. Therefore, the terms f(x, a = z) and f(x, b = z) are deemed to be duplicates, and one of them is dropped.

R> attr(terms(y ~ f(x, a = z) + f(x, b = z)), "term.labels")
[1] "f(x, a = z)"

However, changing the arguments or the order of the arguments keeps both terms:

R> attr(terms(y ~ f(x, a = z) + f(x, b = zz)), "term.labels")
[1] "f(x, a = z)"  "f(x, b = zz)"
R> attr(terms(y ~ f(x, a = z) + f(b = z, x)), "term.labels")
[1] "f(x, a = z)" "f(b = z, x)"

We (= Nikolaus Umlauf and myself) came across this problem when setting up certain smooth regressors with different kinds of patterns. As a trivial, simplified example, we can generate the same kind of problem with rep(). Consider the two dummy variables rep(x = 0:1, each = 4) and rep(x = 0:1, times = 4). With the response y = 1:8 I get:

R> lm((1:8) ~ rep(x = 0:1, each = 4) + rep(x = 0:1, times = 4))

Call:
lm(formula = (1:8) ~ rep(x = 0:1, each = 4) + rep(x = 0:1, times = 4))

Coefficients:
           (Intercept)  rep(x = 0:1, each = 4)
                   2.5                     4.0

So while the model is identified because the two regressors are not the same, terms.formula does not recognize this and drops the second regressor. What I would have wanted can be obtained by switching the arguments:

R> lm((1:8) ~ rep(each = 4, x = 0:1) + rep(x = 0:1, times = 4))

Call:
lm(formula = (1:8) ~ rep(each = 4, x = 0:1) + rep(x = 0:1, times = 4))

Coefficients:
            (Intercept)  rep(each = 4, x = 0:1)  rep(x = 0:1, times = 4)
                      2                       4                        1

Of course, here I could avoid the problem by setting up proper factors etc. But I hope this makes the general problem clear.
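For the rep() example, the workaround of setting up the regressors beforehand might look like the minimal sketch below. The data frame d and the column names x_each and x_times are made up for illustration and do not appear in the report. Because the formula then contains plain variable names rather than function calls, the argument-name comparison in terms.formula should not come into play.

```r
# Hypothetical workaround: compute the two dummy variables up front,
# so that the formula contains only plain variable names.
d <- data.frame(
  y       = 1:8,
  x_each  = rep(x = 0:1, each = 4),   # 0 0 0 0 1 1 1 1
  x_times = rep(x = 0:1, times = 4)   # 0 1 0 1 0 1 0 1
)

# Both regressors are distinct symbols, so neither is dropped as a duplicate.
fit <- lm(y ~ x_each + x_times, data = d)
coef(fit)
```

This should give the same three coefficients as the call with the switched argument order.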
Fixed in R-devel and R 3.4.0.