Data preparation

All data manipulations and later analyses were conducted using the free software R 3.6.2 (R Core Team 2019) and RStudio 1.1.442 (RStudio Team 2016). To approximate normal distributions, all raw data (EFraw) were Box-Cox-transformed (EFboxcox) using two lambda values (λ and λ2) estimated with the package “geoR” (Ribeiro Jr 2020):
\begin{equation} \text{EF}_{\text{boxcox}}=\text{boxcoxtransformed}\left(\text{EF}\right)=\frac{\left(\text{EF}_{\text{raw}}+\lambda_{2}\right)^{\lambda}-1}{\lambda}\nonumber \\ \end{equation}
To scale all EFs to a comparable range of 0 to 1, the EFboxcox were minmax-transformed (EFminmax):
\begin{equation} \text{EF}_{\text{minmax}}=\text{minmaxtransformed}\left(\text{EF}\right)=\frac{\text{EF}_{\text{boxcox}}-\min(\text{EF}_{\text{boxcox}})}{\max(\text{EF}_{\text{boxcox}})-\min(\text{EF}_{\text{boxcox}})}\nonumber \\ \end{equation}
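The two transformation steps can be sketched as follows. This is a minimal Python illustration, not the R code used for the analyses: the function names, the example values, and the fixed lambda values are hypothetical (in the study, λ and λ2 were estimated from the data), and the logarithmic case for λ = 0 is the standard Box-Cox limit rather than something stated in the text.

```python
import math


def boxcox_two_param(values, lam, lam2):
    """Two-parameter Box-Cox: ((x + lam2)**lam - 1) / lam.

    For lam == 0 the standard limit log(x + lam2) is used (assumption,
    not stated in the text).
    """
    shifted = [x + lam2 for x in values]
    if any(s <= 0 for s in shifted):
        raise ValueError("x + lam2 must be positive for every value")
    if lam == 0:
        return [math.log(s) for s in shifted]
    return [(s ** lam - 1.0) / lam for s in shifted]


def minmax(values):
    """Rescale values linearly so that the minimum is 0 and the maximum is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for constant data")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw measurements of one EF and hypothetical lambda values.
ef_raw = [0.0, 1.5, 4.0, 9.0]
ef_boxcox = boxcox_two_param(ef_raw, lam=0.5, lam2=1.0)
ef_minmax = minmax(ef_boxcox)
```

Because both steps are monotonic, the rank order of the raw values is preserved; only the shape and range of the distribution change.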

Variation in individual EFs

The variation of individual EFs was quantified as the standard deviation over all data points (individual measures on plots). The individual EFs were often measured at the same time (Supporting information A, Table S2). Thus, the variation of individual EFs is expected to be comparable and not biased by the identity of the years and seasons in which measurements were taken. However, we tested whether the variation of individual EFs depended on the number of repeated measures, meaning how often in time an EF was measured (number of years * number of seasons). To this end, we fitted a model with the standard deviation per individual EF as the response and the “number of repeated measures” as the explanatory variable.
The drivers of the variation in individual EFs (EFminmax) were tested in a linear model with the explanatory terms “block” (factor with four levels), “SR” (initial number of species planted, log-transformed continuous variable), “plotID” (factor with 80 levels), “season” (factor with three levels, as no measurements were taken in winter), “year” (continuous variable), and their interactions. The plot identity (plotID) effect mainly accounts for differences among the initially planted communities. This set of terms is referred to as “drivers” in the following. The same model was fitted for EFs measured only once per year, excluding “season” and the respective interaction terms.
To analyse whether different classes of EFs were affected differently by drivers, the variance in individual EFs explained by each driver was calculated by dividing the sum of squares explained by that driver by the total sum of squares in the respective model of the individual EF described above. In a subsequent model, the explained variation per EF and per driver was used as meta-data. The explained variation was tested against the classes of EFs (with the different classes of EFs as levels) and the drivers (with the levels “block”, “SR”, “plotID”, “season”, “year”, and their interactions) as independent variables.
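The share of variance attributed to each driver can be illustrated with a short Python sketch. The sums of squares below are invented numbers, the function name is hypothetical, and the total is assumed to be the sum over all model terms plus the residual, which holds for a sequential (type I) ANOVA table.

```python
def explained_fractions(ss_by_term):
    """Divide each term's sum of squares by the total sum of squares.

    Assumption: the total SS equals the sum over all terms including the
    residual, as in a sequential (type I) ANOVA decomposition.
    """
    total = sum(ss_by_term.values())
    if total <= 0:
        raise ValueError("total sum of squares must be positive")
    return {term: ss / total for term, ss in ss_by_term.items()}


# Hypothetical ANOVA sums of squares for one EF (interactions omitted for brevity).
ss_example = {
    "block": 2.0,
    "SR": 6.0,
    "plotID": 12.0,
    "season": 10.0,
    "year": 4.0,
    "Residuals": 6.0,
}
fractions = explained_fractions(ss_example)
```

The resulting fractions per EF and per driver are the kind of values that would then enter the subsequent model as meta-data.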

Relationships between pairs of EFs

Relationships between EF pairs were statistically investigated using covariances and correlations. In correlations, the relationship between two EFs was standardised by the variation of the individual EFs (product of their standard deviations), enabling us to compare relationships between different EF pairs. To calculate correlation coefficients, we used the R-package Hmisc 4.4-2 (Harrell Jr 2020). We used the non-standardised relationships (covariances) to analyse the influence of drivers on relationships among EFs.
Variation in EF correlations
To quantify the general strength and variation of EF correlations, we calculated the mean and the standard deviation of Fisher’s Z-transformed correlation coefficients for each EF pair. Correlation coefficients were calculated among measurements on all plots at a particular time point and then averaged across time points. Hence, we refer to this correlation as the mean correlation. It includes the effects of species richness and plot identity. To plot the EF relationships as correlation coefficients on a scale of –1 (perfect negative correlation) to 1 (perfect positive correlation), the mean correlation coefficients were back-transformed from the Z-scale.
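The averaging on the Z-scale can be sketched in Python as follows. The function name and the correlation values are hypothetical, and the sketch assumes that the standard deviation is reported on the Z-scale while only the mean is back-transformed; the text does not state this explicitly.

```python
import math
import statistics


def summarise_correlations(r_by_timepoint):
    """Return (mean correlation back-transformed to r, SD on the Z-scale).

    Fisher's Z is atanh(r) and the back-transformation is tanh(z).
    Requires at least two time points and |r| < 1 for every coefficient.
    """
    z_values = [math.atanh(r) for r in r_by_timepoint]
    mean_z = statistics.fmean(z_values)
    sd_z = statistics.stdev(z_values)  # assumption: SD is kept on the Z-scale
    return math.tanh(mean_z), sd_z


# Hypothetical correlation coefficients of one EF pair at five time points.
r_values = [0.42, 0.55, 0.31, 0.60, 0.48]
mean_r, sd_z = summarise_correlations(r_values)
```

Averaging on the Z-scale rather than on the raw coefficients avoids the compression of r near –1 and 1, which would otherwise bias the mean towards zero.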
The standard deviation of the individual correlations at the different time points quantifies the temporal variation (among seasons and years) of correlations among EFs. However, temporal variation calculated from all time points might be influenced by the number of time points and by the identity of the time points (deviating years or seasons). Therefore, we first checked whether this temporal variation, based on all time points, depended on the number of time points. We analysed the temporal variation of the correlations per EF pair as a function of the number of time points at which that EF pair was measured (number of years * number of seasons). The number of repeated measures for pairwise EFs, meaning the number of times two EFs were measured at the same time (same year and same season), ranged from 0 to 36 (Supporting information B, Table S2 contains an overview of the individual EFs and the years and seasons in which they were measured). Second, we checked whether the variation of correlations per EF pair depended on the identity of the time points at which that EF pair was measured. For each EF pair, we randomly chose four time points and calculated the standard deviation of the respective correlation coefficients. This was done 20 times per EF pair. The range of these 20 standard deviations per EF pair was used to check whether the standard deviation for that EF pair was stable (small range indicating no identity effect of years and seasons) or not (large range indicating strong identity effects of years or seasons).
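The subsampling check can be sketched in Python as below. The function name, the seed argument, and the example values are hypothetical; the sketch assumes that the four time points are drawn without replacement within each draw and that the standard deviation is taken over whatever coefficients are passed in (raw or Z-transformed).

```python
import random
import statistics


def sd_range_from_subsamples(correlations, n_points=4, n_draws=20, seed=0):
    """Range (max - min) of the SDs obtained from repeated random subsets.

    Each draw picks n_points time points without replacement (assumption)
    and computes the sample standard deviation of their correlations.
    """
    if len(correlations) < n_points:
        raise ValueError("not enough time points for the requested subset size")
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    sds = [
        statistics.stdev(rng.sample(correlations, n_points))
        for _ in range(n_draws)
    ]
    return max(sds) - min(sds)


# Hypothetical correlations of one EF pair at eight time points.
pair_correlations = [0.42, 0.55, 0.31, 0.60, 0.48, 0.10, 0.52, 0.45]
sd_range = sd_range_from_subsamples(pair_correlations)
```

A small range would suggest that the standard deviation does not hinge on which years and seasons happen to be included; a large range would point to individual deviating time points.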
Drivers of the covariance between EF pairs
To analyse whether years, seasons, species richness, and plot identity affect EF relationships by driving individual EFs in similar or opposing ways, we partitioned overall covariances into contributions of the different explanatory terms. Here, plot identity was further decomposed into the effects of functional group richness and the presence of the functional groups legumes, herbs (tall and short herbs combined), or grasses. This decomposition of covariances was based on an additive partitioning of sums of products (SPs), in the same way as additive partitioning of sums of squares (SS) is used to decompose variances in an analysis of variance (ANOVA). This type of covariance analysis has previously been used to investigate, for example, the influence of explanatory terms on trait-trait relationships (He, Wang et al. 2009) and is frequently used in quantitative genetic and phylogenetic approaches (Kempthorne 1957, Bell 1989).
The sums of products, which are equivalent to covariances, were obtained per EF pair using the following formula:
\begin{equation} SP(X,Y)=\frac{SS(X+Y)\ -\ SS(X)\ -\ SS(Y)}{2}\nonumber \\ \end{equation}
where X and Y are the EFs of interest, and X+Y is the sum of the two EFs. The SS were obtained from general linear models (implemented with the lm() function in R (Mangiafico 2015)) with the explanatory terms “block”, “log(SR)”, “plotID”, “season”, “year”, and the interactions “season:year”, “log(SR):(season + year + season:year)”, and “plotID:(season + year + season:year)” (note that here, following the conventions of R, we use the colon instead of a multiplication sign as the interaction operator). For each EF pair, three linear models were run: one for each of the individual EFs (X and Y) and one for the sum of the two EFs (X + Y), based on the measurements of EFminmax from the different time points. As in ANOVA, SPs are divided by their degrees of freedom to obtain mean SPs (MSPs), which are divided by the residual MSP to calculate F-ratios and significance levels. Because there are nested effects, not all terms could be tested against “Residuals”. “Block” and “log(SR)” had to be tested at the level of variation between plots with different species compositions (plotID). For the same reason, the interaction terms “log(SR):(season + year + season:year)” had to be tested against “plotID:(season + year + season:year)”. All other terms were tested against “Residuals” (Supporting Information B, Table S3). It has been shown that for balanced experimental designs such as the Jena Experiment, this method is comparable to linear mixed-model analysis using restricted maximum likelihood methods (Schmid, Baruffol et al. 2017).
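The identity behind the sum-of-products formula can be checked numerically. The Python sketch below does this for the total sum of squares only (deviations from the overall mean), with invented data and hypothetical function names; applying the same identity term by term to the SS from fitted linear models, as done in the analysis, is not shown.

```python
def sum_of_squares(values):
    """Total sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def sum_of_products_direct(x, y):
    """Sum of cross-products of deviations from the two means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))


def sum_of_products_via_ss(x, y):
    """SP(X, Y) = (SS(X + Y) - SS(X) - SS(Y)) / 2."""
    summed = [a + b for a, b in zip(x, y)]
    return (sum_of_squares(summed) - sum_of_squares(x) - sum_of_squares(y)) / 2


# Hypothetical measurements of two EFs on the same four plots.
ef_x = [1.0, 2.0, 3.0, 4.0]
ef_y = [2.0, 1.0, 4.0, 3.0]
sp_direct = sum_of_products_direct(ef_x, ef_y)
sp_via_ss = sum_of_products_via_ss(ef_x, ef_y)
```

Both routes give the same value because SS(X + Y) = SS(X) + SS(Y) + 2 SP(X, Y); this is what allows SPs to be read off ANOVA tables of X, Y, and X + Y.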
Because SPs are additive, we can express the influence of each driver on EF covariation (i.e., the relationship between the EFs) by calculating the percentage of the (absolute) total sum of products explained, similar to a percentage of variance explained (He, Wang et al. 2009). However, unlike variances, covariances are either positive, indicating a positive relationship between two variables, or negative, indicating an inverse (i.e., trade-off) relationship between two variables. The sign of the SP for each explanatory term tells us whether the covariance is positive or negative. This means that we could deduce whether the individual drivers affected the EFs in a pair in a trade-off (negative covariance) or synergistic (positive covariance) way. Therefore, we show “signed percentages” of covariance in the results by multiplying the absolute percentages by the sign of the respective covariance.
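A signed percentage per driver could be computed along the following lines. This Python sketch rests on an assumption about the denominator: it takes the “(absolute) total sum of products” to be the sum of the absolute SPs over all terms, so that the absolute percentages add up to 100. Another reading, the absolute value of the summed SPs, would give different numbers. The function name and the SP values are hypothetical.

```python
def signed_percentages(sp_by_term):
    """Percentage of the summed absolute SPs per term, keeping each SP's sign.

    Assumption: the denominator is sum(|SP|) over all terms; the text could
    also be read as |sum(SP)|.
    """
    total_abs = sum(abs(sp) for sp in sp_by_term.values())
    if total_abs == 0:
        raise ValueError("all sums of products are zero")
    # Dividing the signed SP by a positive total is the same as multiplying
    # the absolute percentage by the sign of the SP.
    return {term: 100.0 * sp / total_abs for term, sp in sp_by_term.items()}


# Hypothetical sums of products for one EF pair.
sp_example = {"SR": 2.0, "season": -6.0, "year": 1.0, "Residuals": 1.0}
percentages = signed_percentages(sp_example)
```

In this invented example, season would account for 60% of the covariation with a negative sign (a trade-off), while species richness would account for 20% with a positive sign (a synergy).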

RESULTS

Variation in individual EFs

First, we compared the variation of individual EFs and EF classes. The average standard deviation, calculated by averaging the standard deviations of all EFs, was 0.17. While some EFs varied strongly in time among replicated measures, other EFs showed low variation (Table 2; the minimum standard deviation was 0.07 for plant carbon, the maximum standard deviation was 0.38 for plant sodium). The variation of individual EFs did not depend on the number of times (number of years * number of seasons) they were measured (F1,29 = 0.753, p = 0.393; Supporting information C, Fig. S3). Classes of EFs did not differ significantly in their variation (F7,23 = 0.76, p = 0.63; Table 2).