Customer: Innovation Centre for Organic Farming, Tove Mariegaard Pedersen
Customer ID: DA00204-22
Project: The microbial community of the field, 2nd year (2022)
Sample type: Soil
Number of samples: 48
Type of data: Shotgun metagenomics

Introduction to the biostatistical analysis

The Project

This report describes the microbiome profiles of 48 samples collected in 2022, the second year of the project, across 48 fields. One sample was collected per field to represent that field, corresponding to the ‘main’ samples collected for each field in 2021. Each sample is based on 16 subsamples taken in a W-pattern across the field.

In 2023, the scope will be expanded to approximately 100 fields in total (both conventional and organic). In this report we analyse the samples from 2022 alone. Later we will perform a joint analysis, both to identify robust patterns across the years and to evaluate whether any patterns appear year-specific, indicating varying conditions such as weather.
The aim is to evaluate how the microbiome of the fields associates with other field parameters, covering both agricultural practices and soil indicators of nutrients, type and structure.
A final analysis can be conducted only when the full data set for 2021-2023 is ready.

We separate the evaluations into two R3 reports:

Analysis

In “Report 3”, biostatistical analyses are performed and the results presented, building on the data generated and evaluated in the two prior reports (Report 1: Sequencing and data processing report; Report 2: Microbiome profiling report).
Through biostatistical analysis we relate the microbiome profiles to the key variables selected for year 2022. The focus is to evaluate how, and to what extent, the variables shape and relate to the soil microbiome. We therefore concentrate on the overall structure of the microbiome (also called the microbiome composition) and on its diversity.

The key variables assessed in this report are summarized across the 48 samples in the table below.

Summary Statistics
Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 75 | Max
JB_value | 48
… 1 | 19 | 39.6%
… 2 | 5 | 10.4%
… 5 | 1 | 2.1%
… 6 | 20 | 41.7%
… 7 | 3 | 6.2%
Earthworm_status | 48
… 0 | 11 | 22.9%
… 1 | 37 | 77.1%
Cold_soil | 48
… 0 | 33 | 68.8%
… 1 | 15 | 31.2%
Compact_soil | 48
… 0 | 40 | 83.3%
… 1 | 8 | 16.7%
field_well_drained | 48
… 0 | 7 | 14.6%
… 1 | 41 | 85.4%
Mulching_of_straw | 48
… 0 | 25 | 52.1%
… 1 | 23 | 47.9%
Clovergrass_within_3_years | 48
… 0 | 36 | 75%
… 1 | 12 | 25%
No_plough | 48
… 0 | 36 | 75%
… 1 | 12 | 25%
ConservationAgriculture | 48
… 0 | 45 | 93.8%
… 1 | 3 | 6.2%
Years_since_plowing | 48 | 3.417 | 2.988 | 1 | 1 | 4.25 | 11
Rt | 48 | 6.487 | 0.489 | 5.7 | 6.2 | 6.75 | 7.6
Phosphorus | 48 | 3.081 | 1.301 | 0.7 | 2.175 | 3.925 | 6
Potassium | 48 | 10.473 | 6.247 | 1.5 | 7.325 | 14 | 41
Magnesium | 48 | 6.89 | 2.577 | 1.9 | 5.5 | 8.15 | 16
Cobber | 48 | 2.481 | 0.885 | 1 | 1.8 | 3.1 | 5.1
Organic_material_perc | 48 | 3.025 | 1.363 | 1.16 | 2.173 | 3.372 | 8.74
Clay_perc | 48 | 9.258 | 4.727 | 2.4 | 4.675 | 12.85 | 20
Nitrogen_perc | 48 | 0.144 | 0.052 | 0.07 | 0.1 | 0.17 | 0.28
Organic_farm | 48
… 0 | 24 | 50%
… 1 | 24 | 50%
Years_since_turning_organic | 48 | 4.521 | 4.758 | 1 | 1 | 7.25 | 15
Livestock | 48
… 0 | 20 | 41.7%
… 1 | 28 | 58.3%
Livestock_manure | 48
… 0 | 16 | 33.3%
… 1 | 32 | 66.7%
Commercial.fertilizer | 48
… 0 | 24 | 50%
… 1 | 24 | 50%
Vinasse | 48
… 0 | 46 | 95.8%
… 1 | 2 | 4.2%
Cast | 48
… 0 | 46 | 95.8%
… 1 | 2 | 4.2%
Degassed.fertilizer | 48
… 0 | 35 | 72.9%
… 1 | 13 | 27.1%
Chalked | 47
… 0 | 38 | 80.9%
… 1 | 9 | 19.1%

Table 1: Summary statistics of the key variables selected for evaluation in relation to the fields' microbiome profiles in year 2022.
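As a minimal sketch of how the statistics in Table 1 can be derived for a single variable, the Python example below computes N, mean, standard deviation, minimum, 25th and 75th percentiles and maximum for a continuous variable, and counts and percentages for a grouping variable. The report itself was produced in R; Python is used here only for illustration. The function names `summarise` and `level_shares` and the example values are hypothetical and are not taken from the project data. The percentiles use linear interpolation between data points (the default of R's `quantile()`); whether the report relied on that default is an assumption.

```python
import statistics
from collections import Counter


def summarise(values):
    """Summary statistics for one continuous variable (hypothetical helper)."""
    # method="inclusive" interpolates linearly between the observed values.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "N": len(values),
        "Mean": statistics.mean(values),
        "Std. Dev.": statistics.stdev(values),  # sample standard deviation
        "Min": min(values),
        "Pctl. 25": q1,
        "Pctl. 75": q3,
        "Max": max(values),
    }


def level_shares(values):
    """Count and percentage of each level of a grouping variable."""
    counts = Counter(values)
    return {level: (n, 100 * n / len(values)) for level, n in counts.items()}


# Illustrative values only, not project data.
print(summarise([5.7, 6.2, 6.4, 6.8, 7.6]))
print(level_shares([0, 0, 0, 1]))
```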

Evaluation of overall microbiome profiles

We begin the evaluation of the 48 samples (1 per field) with a stacked barplot of the microbiome profiles in each sample. This allows us to make a first evaluation of the extent of difference in the taxonomic profiles between the fields.

Note that, to show the organisms with an interpretable color scheme, it is necessary to filter the profiles and include only a subset of the most abundant clades in the plots. The filtering used is specified in the axis labels of each plot (e.g. >2% in the relative abundance plots means that a clade must have a relative abundance across samples of more than 2% to be included in the plot).
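The filtering step can be sketched as follows. This is an illustrative Python example, not the code used for the report. It assumes that "relative abundance across samples" refers to the mean relative abundance over all samples; the report does not state whether the mean, the maximum or another summary is used. The function names and the toy counts are hypothetical.

```python
def relative_abundance(counts):
    """Convert one sample's clade counts to percentages that sum to 100."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counts")
    return {clade: 100.0 * n / total for clade, n in counts.items()}


def abundant_clades(samples, threshold_pct=2.0):
    """Clades whose mean relative abundance across samples exceeds the threshold.

    Assumption: the threshold is applied to the mean over samples.
    """
    profiles = [relative_abundance(s) for s in samples]
    clades = {clade for profile in profiles for clade in profile}
    mean_pct = {
        clade: sum(p.get(clade, 0.0) for p in profiles) / len(profiles)
        for clade in clades
    }
    return sorted(clade for clade, pct in mean_pct.items() if pct > threshold_pct)


# Toy counts for two samples (hypothetical clade names A, B, C).
samples = [{"A": 90, "B": 9, "C": 1}, {"A": 80, "B": 18, "C": 2}]
print(abundant_clades(samples, threshold_pct=2.0))
```

In this toy example clade C has a mean relative abundance of 1.5% and would therefore be left out of a plot using the >2% filter.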

Stacked barplots

The stacked barplots allow us to visually assess the stability of the taxonomic profile across the fields, and to get a feeling for whether individual clades are found across all fields or occur more sporadically.

Phylum


Figure 1: Visualization of the microbial community in the samples. Stacked barplots of taxonomic clades in each of the evaluated samples. A) Clade abundance was transformed to relative abundance to sum to 100% in each sample. B) Taxa abundance was transformed to absolute abundance. Taxa shown in the absolute abundance plot (B) are selected to match the taxa included in the corresponding relative abundance plot (A).


Class

Figure 2: Visualization of the microbial community in the samples. Stacked barplots of taxonomic clades in each of the evaluated samples. A) Clade abundance was transformed to relative abundance to sum to 100% in each sample. B) Taxa abundance was transformed to absolute abundance. Taxa shown in the absolute abundance plot (B) are selected to match the taxa included in the corresponding relative abundance plot (A).


Order

Figure 3: Visualization of the microbial community in the samples. Stacked barplots of taxonomic clades in each of the evaluated samples. A) Clade abundance was transformed to relative abundance to sum to 100% in each sample. B) Taxa abundance was transformed to absolute abundance. Taxa shown in the absolute abundance plot (B) are selected to match the taxa included in the corresponding relative abundance plot (A).


Differences in alpha-diversity

As described in Report 2, alpha diversity is a measure of the diversity (or complexity) within a single microbiome community (or sample). We here evaluate two measures of alpha diversity: Shannon and observed species (a measure of richness). Both measures are introduced in Report 2.
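For orientation, the two measures can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used for the report: observed species is taken as the number of taxa with a non-zero abundance, and the Shannon index is computed as H = -Σ p_i ln(p_i) using the natural logarithm, which is a common convention but is not confirmed by this report. The function names and example counts are hypothetical.

```python
import math


def observed_richness(counts):
    """Number of taxa detected in a sample (abundance above zero)."""
    return sum(1 for n in counts if n > 0)


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over the detected taxa.

    Assumption: natural logarithm; other bases only rescale the index.
    """
    total = sum(counts)
    proportions = [n / total for n in counts if n > 0]
    return -sum(p * math.log(p) for p in proportions)


# Two toy samples with the same richness but different evenness.
even_sample = [25, 25, 25, 25]
uneven_sample = [97, 1, 1, 1]
print(observed_richness(even_sample), shannon_index(even_sample))
print(observed_richness(uneven_sample), shannon_index(uneven_sample))
```

Both toy samples contain four taxa, but the evenly distributed sample has the higher Shannon index, which is why the two measures are evaluated side by side.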


Alpha-diversity in relation to each environmental variable

In the following plots, the measures of alpha diversity and the percentage of fungi are evaluated in relation to each key environmental variable. The plots are followed by tables with the statistical analysis of the relations.

Grouping variables

JB value


Figure 25: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Earthworm_status


Figure 26: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Cold soil


Figure 27: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Compact soil


Figure 28: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Field well drained


Figure 29: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Mulching_of_straw


Figure 30: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Clovergrass within 3 years


Figure 31: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


No plough


Figure 32: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Conservation Agriculture


Figure 33: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Years since plowing


Figure 34: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Organic farm


Figure 35: Illustration of the alpha diversity levels across the levels or values of the environmental variable. Note that we have coded the non-organic fields as 0.


Years since turning organic


Figure 36: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Livestock


Figure 37: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Livestock manure


Figure 38: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Commercial fertilizer


Figure 39: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Vinasse


Figure 40: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Cast


Figure 41: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Degassed fertilizer


Figure 42: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Chalked


Figure 43: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Continuous variables

Rt


Figure 44: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Phosphorus


Figure 45: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Potassium


Figure 46: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Magnesium


Figure 47: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Cobber


Figure 48: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Organic material (perc)


Figure 49: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Clay (perc)


Figure 50: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Nitrogen (perc)


Figure 51: Illustration of the alpha diversity levels across the levels or values of the environmental variable.


Statistical assessment

An analysis of variance (ANOVA) model was used to evaluate whether the mean diversity differed significantly between the levels of the grouping variables, and a robust linear regression was used to assess the relationship for the continuous variables.
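To make the quantities reported for the grouping variables concrete, the sketch below computes the degrees of freedom, the between-group sum of squares and the F value of a one-way ANOVA in plain Python. It is an illustration only: the report's analyses were run in R, the function name `one_way_anova` and the toy values are hypothetical, and the p-value is left out because it requires the F distribution, which is not part of the Python standard library.

```python
def one_way_anova(groups):
    """One-way ANOVA for a response split by the levels of a grouping variable.

    groups: one list of response values per level.
    Returns the between-group Df and Sum.Sq and the F value.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return {"Df": df_between, "Sum.Sq": ss_between, "F.value": f_value}


# Toy diversity values for three levels of a grouping variable.
print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

A grouping variable with two levels gives Df = 1, and one with five levels gives Df = 4.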

Variable || Observed: Df | Sum.Sq | F.value | P || Shannon: Df | Sum.Sq | F.value | P || Perc. fungi: Df | Sum.Sq | F.value | P
JB_value || 4 | 321689.317 | 4.327 | 4.97e-03 || 4 | 3.643 | 2.685 | 4.38e-02 || 4 | 9.641 | 4.858 | 2.54e-03
Earthworm_status || 1 | 6682.713 | 0.276 | 6.02e-01 || 1 | 0.039 | 0.099 | 7.54e-01 || 1 | 0.16 | 0.239 | 6.27e-01
Cold_soil || 1 | 5684.802 | 0.235 | 6.31e-01 || 1 | 0.633 | 1.655 | 2.05e-01 || 1 | 1.202 | 1.857 | 1.80e-01
Compact_soil || 1 | 9475.267 | 0.392 | 5.34e-01 || 1 | 0.029 | 0.073 | 7.88e-01 || 1 | 0.956 | 1.465 | 2.32e-01
field_well_drained || 1 | 36755.227 | 1.56 | 2.18e-01 || 1 | 0.35 | 0.899 | 3.48e-01 || 1 | 0.146 | 0.217 | 6.43e-01
Mulching_of_straw || 1 | 25386.264 | 1.066 | 3.07e-01 || 1 | 0.294 | 0.755 | 3.89e-01 || 1 | 0.177 | 0.264 | 6.10e-01
Clovergrass_within_3_years || 1 | 1167.361 | 0.048 | 8.28e-01 || 1 | 0.291 | 0.747 | 3.92e-01 || 1 | 0.186 | 0.279 | 6.00e-01
No_plough || 1 | 12656.25 | 0.525 | 4.72e-01 || 1 | 0.107 | 0.271 | 6.05e-01 || 1 | 0.526 | 0.795 | 3.77e-01
ConservationAgriculture || 1 | 38866.806 | 1.652 | 2.05e-01 || 1 | 0.097 | 0.245 | 6.23e-01 || 1 | 0.958 | 1.468 | 2.32e-01
Years_since_plowing || 1 | 13450.161 | 0.559 | 4.59e-01 || 1 | 0.451 | 1.168 | 2.85e-01 || 1 | 1.717 | 2.7 | 1.07e-01
Organic_farm || 1 | 5896.333 | 0.243 | 6.24e-01 || 1 | 0 | 0.001 | 9.81e-01 || 1 | 0.147 | 0.219 | 6.42e-01
Years_since_turning_organic || 1 | 3402.007 | 0.14 | 7.10e-01 || 1 | 0.028 | 0.071 | 7.91e-01 || 1 | 0.369 | 0.554 | 4.61e-01
Livestock || 1 | 422.002 | 0.017 | 8.96e-01 || 1 | 0.561 | 1.461 | 2.33e-01 || 1 | 1.773 | 2.793 | 1.01e-01
Livestock_manure || 1 | 26004.167 | 1.093 | 3.01e-01 || 1 | 1.343 | 3.659 | 6.20e-02 || 1 | 0.175 | 0.262 | 6.11e-01
Commercial.fertilizer || 1 | 5896.333 | 0.243 | 6.24e-01 || 1 | 0 | 0.001 | 9.81e-01 || 1 | 0.147 | 0.219 | 6.42e-01
Vinasse || 1 | 4900.612 | 0.202 | 6.55e-01 || 1 | 0.415 | 1.072 | 3.06e-01 || 1 | 0.056 | 0.084 | 7.74e-01
Cast || 1 | 25271.308 | 1.061 | 3.08e-01 || 1 | 0.427 | 1.103 | 2.99e-01 || 1 | 0.727 | 1.105 | 2.99e-01
Degassed.fertilizer || 1 | 21748.097 | 0.91 | 3.45e-01 || 1 | 0.902 | 2.396 | 1.29e-01 || 1 | 0.069 | 0.103 | 7.50e-01
Chalked || 1 | 8113.5 | 0.334 | 5.66e-01 || 1 | 0.103 | 0.26 | 6.12e-01 || 1 | 1.693 | 2.63 | 1.12e-01

Table 23: Results from the ANOVA analyses including samples from all fields. The table shows the statistical values obtained for each of the environmental variables (rows) and the three microbiome features (columns).

Variable || Observed: Estimate | SE | t.value | P || Shannon: Estimate | SE | t.value | P || Perc. fungi: Estimate | SE | t.value | P
Rt || 99.809 | 63.566 | 1.57 | 1.23e-01 || -0.175 | 0.239 | -0.733 | 4.67e-01 || -0.309 | 0.241 | -1.285 | 2.05e-01
Phosphorus || -4.484 | 29.902 | -0.15 | 8.81e-01 || -0.07 | 0.075 | -0.931 | 3.57e-01 || 0.046 | 0.092 | 0.503 | 6.17e-01
Potassium || 5.011 | 4.534 | 1.105 | 2.75e-01 || 0.026 | 0.025 | 1.051 | 2.99e-01 || -0.039 | 0.018 | -2.138 | 3.78e-02
Magnesium || -3.991 | 13.992 | -0.285 | 7.77e-01 || 0.082 | 0.039 | 2.107 | 4.06e-02 || -0.053 | 0.046 | -1.155 | 2.54e-01
Cobber || -54.171 | 40.2 | -1.348 | 1.84e-01 || 0.129 | 0.191 | 0.675 | 5.03e-01 || -0.143 | 0.134 | -1.071 | 2.90e-01
Organic_material_perc || -2.121 | 26.788 | -0.079 | 9.37e-01 || 0.066 | 0.082 | 0.805 | 4.25e-01 || -0.131 | 0.086 | -1.527 | 1.34e-01
Clay_perc || -13.916 | 6.383 | -2.18 | 3.44e-02 || 0.026 | 0.024 | 1.091 | 2.81e-01 || -0.032 | 0.025 | -1.296 | 2.01e-01
Nitrogen_perc || -284.116 | 679.028 | -0.418 | 6.78e-01 || 2.28 | 1.703 | 1.339 | 1.87e-01 || -4.77 | 2.191 | -2.177 | 3.46e-02

Table 24: Results from the regression analyses including samples from all fields. The table shows the statistical values obtained for each of the environmental variables (rows) and the three microbiome features (columns). Note that a robust linear regression was used for the alpha diversity measures, while a linear regression was used for the fungi percentage. Also note that one sample was removed as an outlier in the analysis of potassium.
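The idea behind a robust linear regression can be illustrated with a small Python sketch. The report does not state which robust estimator was used, so the example below is an assumption: it uses Huber M-estimation fitted by iteratively reweighted least squares, with the conventional tuning constant 1.345 and a scale estimate based on the median absolute residual. The function names `weighted_line` and `huber_line` and the toy data are hypothetical.

```python
import statistics


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ym - slope * xm, slope


def huber_line(x, y, k=1.345, tol=1e-10, max_iter=100):
    """Huber M-estimate of a straight line (assumed estimator, see lead-in)."""
    # Start from the ordinary least-squares fit (all weights equal to 1).
    intercept, slope = weighted_line(x, y, [1.0] * len(x))
    for _ in range(max_iter):
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Robust residual scale from the median absolute residual.
        scale = statistics.median(abs(r) for r in resid) / 0.6745
        if scale == 0:
            break
        # Observations with large residuals are down-weighted.
        w = [1.0 if abs(r) <= k * scale else k * scale / abs(r) for r in resid]
        new_intercept, new_slope = weighted_line(x, y, w)
        change = max(abs(new_intercept - intercept), abs(new_slope - slope))
        intercept, slope = new_intercept, new_slope
        if change < tol:
            break
    return intercept, slope


# Toy data on the line y = 1 + 2x with one gross outlier at x = 9.
x = list(range(10))
y = [1 + 2 * xi for xi in x]
y[9] = 100
print("ordinary:", weighted_line(x, y, [1.0] * len(x)))
print("robust:  ", huber_line(x, y))
```

In the toy data a single extreme value pulls the ordinary least-squares slope far from 2, while the down-weighted fit stays close to the line followed by the other nine points.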

Version information

Table 25: List of the software used, including the R packages in the programming environment.

Package Version Package Version
OS Ubuntu 20.04.4 LTS mvtnorm 1.1-3
R 4.2.0 hms 1.1.2
utf8 1.2.2 evaluate 0.15
tidyselect 1.2.0 xtable 1.8-4
Rtsne 0.16 jpeg 0.1-9
munsell 0.5.0 readxl 1.4.2
codetools 0.2-18 compiler 4.2.0
withr 2.5.0 V8 4.3.0
colorspace 2.0-3 crayon 1.5.1
highr 0.9 minqa 1.2.4
rstudioapi 0.14 htmltools 0.5.2
robustbase 0.95-0 mgcv 1.8-40
ggsignif 0.6.3 pcaPP 2.0-1
labeling 0.4.2 tzdb 0.3.0
GenomeInfoDbData 1.2.8 rrcov 1.7-0
mnormt 2.0.2 RcppParallel 5.1.5
hwriter 1.3.2.1 lubridate 1.9.2
farver 2.1.0 DBI 1.1.2
rhdf5 2.40.0 sjlabelled 1.2.0
coda 0.19-4 dbplyr 2.1.1
vctrs 0.5.2 MASS 7.3-57
generics 0.1.2 boot 1.3-28
TH.data 1.1-1 ade4 1.7-19
xfun 0.31 car 3.0-13
timechange 0.2.0 cli 3.6.0
R6 2.5.1 parallel 4.2.0
isoband 0.2.5 insight 0.19.0
bitops 1.0-7 igraph 1.3.1
rhdf5filters 1.8.0 pkgconfig 2.0.3
DelayedArray 0.22.0 xml2 1.3.3
assertthat 0.2.1 foreach 1.5.2
multcomp 1.4-19 svglite 2.1.0
gtable 0.3.0 bslib 0.3.1
sandwich 3.0-1 multtest 2.52.0
rlang 1.0.6 webshot 0.5.3
systemfonts 1.0.4 estimability 1.3
splines 4.2.0 rvest 1.0.2
rstatix 0.7.0 digest 0.6.29
broom 0.8.0 rmarkdown 2.14
yaml 2.3.5 cellranger 1.1.0
reshape2 1.4.4 curl 4.3.2
abind 1.4-5 nloptr 2.0.2
modelr 0.1.8 lifecycle 1.0.3
backports 1.4.1 nlme 3.1-157
tools 4.2.0 Rhdf5lib 1.18.2
ellipsis 0.3.2 carData 3.0-5
jquerylib 0.1.4 viridisLite 0.4.0
biomformat 1.24.0 fansi 1.0.3
plyr 1.8.7 pillar 1.8.1
zlibbioc 1.42.0 fastmap 1.1.0
RCurl 1.98-1.6 httr 1.4.5
ggpubr 0.4.0 DEoptimR 1.0-11
cowplot 1.1.1 survival 3.3-1
zoo 1.8-10 glue 1.6.2
haven 2.5.0 zip 2.2.0
cluster 2.1.3 png 0.1-7
fs 1.5.2 iterators 1.0.14
magrittr 2.0.3 stringi 1.7.6
reprex 2.0.2 sass 0.4.1
tmvnsim 1.0-2 latticeExtra 0.6-29