Study design
As in any study, the sampling or experimental design of microbiome studies should include sufficient independent replicates, avoid confounding effects as far as possible, and use samples that represent ecological scales appropriate to the processes investigated. Microbiome sampling design must also be well planned and appropriate to the specific hypothesis being tested. When testing hypotheses about the impact of outlier external drivers (e.g., fire, pollution events, natural disasters), studies would ideally feature samples collected both before and after the event that are not confounded by habitat type, geography or physicochemistry. Before embarking on microbiome studies in the wild, particularly opportunistic ones (i.e., with samples originally collected for other purposes), researchers should carefully consider whether autocorrelation of factors beyond their control could impede the interpretation of results. In other words, researchers must be realistic about what can be accomplished with limited sample sets, since rigorous hypothesis testing requires equally rigorous sampling protocols and study design.
In addition, the sampling of microbial communities should take into account their high heterogeneity at small spatial scales, which arises from the micro/mesoscale heterogeneity of their environment (Vos et al., 2013; Zhang et al., 2014) or from neutral assembly dynamics (Woodcock et al., 2007). For example, individual samples can be pooled into a composite sample prior to homogenisation and sub-sampling, in order to reduce local, micro-scale heterogeneity if it is irrelevant to the questions being studied (George et al., 2019). Here, knowledge of how, and at what scale, the target community responds to external drivers will inform adequate sampling design. For example, a composite 0.2 mg sediment sample is likely to be representative of the bacterial, archaeal, and microbial eukaryotic biospheres, but will not sample microscopic invertebrate taxa effectively, due to issues of scale (Nascimento et al., 2018). Small samples will contain some microscopic taxa and trace environmental DNA, but they are inadequate for representing the underlying meio- and macro-faunal communities. As the size of the target organisms increases, the sample volume and spatial extent of the studied area should be correspondingly expanded.
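The effect of pooling on micro-scale heterogeneity can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not a method taken from the cited studies: the abundance of a single hypothetical taxon is assumed to vary between cores following a normal distribution, homogenisation is assumed to be perfect (so that a composite behaves as the mean of its cores), and the function name `simulate_site_values` and all parameter values are illustrative only.

```python
import random
import statistics


def simulate_site_values(n_sites, cores_per_composite, patch_sd, seed=0):
    """Return one measured value per site, each the mean of its pooled cores.

    Assumption: every site shares the same true mean abundance (10.0), so any
    spread between sites comes only from micro-scale patchiness between cores.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_sites):
        # Draw the individual cores for this site (hypothetical abundances).
        cores = [rng.gauss(10.0, patch_sd) for _ in range(cores_per_composite)]
        # Perfect homogenisation is modelled as the arithmetic mean of the cores.
        values.append(statistics.fmean(cores))
    return values


# One core per site versus a composite of nine cores per site.
single = simulate_site_values(n_sites=200, cores_per_composite=1, patch_sd=3.0)
pooled = simulate_site_values(n_sites=200, cores_per_composite=9, patch_sd=3.0)

sd_single = statistics.stdev(single)
sd_pooled = statistics.stdev(pooled)
print(f"between-site SD, single cores: {sd_single:.2f}")
print(f"between-site SD, composites:   {sd_pooled:.2f}")
```

Under these assumptions the between-site spread of composites shrinks roughly with the square root of the number of pooled cores. The same averaging would also erase any micro-scale signal, which is why pooling is only appropriate when that heterogeneity is irrelevant to the question being studied.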