Statistical Analysis Report
Customer | Innovation Centre for Organic Farming, Tove Mariegaard Pedersen |
Customer ID | DA00204-22 |
Project | The microbial community of the field, 2. year (2022). |
Sample Type | Soil |
Number of samples | 48 samples |
Type of data | ITS2 region |
The Project
The current report describes the microbiome profiles of 48 samples collected in the second year of the project (2022) across 48 fields. One sample was collected to represent each field, corresponding to the ‘main’ samples collected for each field in 2021. Each sample was based on 16 subsamples taken in a W-pattern throughout the field.
In 2023, the scope will be expanded to include a total of approx. 100
fields (both conventional and organic fields). In this report we analyse
the samples from 2022 alone. Later we will perform a joint analysis to
both identify robust patterns across the years and evaluate whether any
patterns appear year-specific, indicating varying conditions such as
weather.
The aim is to evaluate how the microbiome of the fields associates with
other field parameters, covering both agricultural practices and soil
indicators of nutrients, type and structure.
A final analysis can only be conducted once the full data set for
2021-2023 is ready.
We separate the evaluations into two R3 reports:
Analysis
In “Report 3”, biostatistical analyses are performed and the results
presented, building on the data generated and evaluated in the 2 prior
reports (Report 1: Sequencing and data processing report, Report
2: Microbiome profiling report).
Through biostatistical analysis we relate the microbiome profiles to the
key variables selected for year 2022. The focus here is to evaluate how,
and to what extent, the variables shape and relate to the soil microbiome
composition and diversity. We therefore focus on the overall structure
of the microbiome (also called the microbiome composition) and on its
diversity.
The key variables assessed in this report are listed in the table below, with summary statistics across the 48 samples.
Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 75 | Max |
---|---|---|---|---|---|---|---|
JB_value | 48 | ||||||
… 1 | 19 | 39.6% | |||||
… 2 | 5 | 10.4% | |||||
… 5 | 1 | 2.1% | |||||
… 6 | 20 | 41.7% | |||||
… 7 | 3 | 6.2% | |||||
Earthworm_status | 48 | ||||||
… 0 | 11 | 22.9% | |||||
… 1 | 37 | 77.1% | |||||
Cold_soil | 48 | ||||||
… 0 | 33 | 68.8% | |||||
… 1 | 15 | 31.2% | |||||
Compact_soil | 48 | ||||||
… 0 | 40 | 83.3% | |||||
… 1 | 8 | 16.7% | |||||
field_well_drained | 48 | ||||||
… 0 | 7 | 14.6% | |||||
… 1 | 41 | 85.4% | |||||
Mulching_of_straw | 48 | ||||||
… 0 | 25 | 52.1% | |||||
… 1 | 23 | 47.9% | |||||
Clovergrass_within_3_years | 48 | ||||||
… 0 | 36 | 75% | |||||
… 1 | 12 | 25% | |||||
No_plough | 48 | ||||||
… 0 | 36 | 75% | |||||
… 1 | 12 | 25% | |||||
ConservationAgriculture | 48 | ||||||
… 0 | 45 | 93.8% | |||||
… 1 | 3 | 6.2% | |||||
Years_since_plowing | 48 | 3.417 | 2.988 | 1 | 1 | 4.25 | 11 |
Rt | 48 | 6.487 | 0.489 | 5.7 | 6.2 | 6.75 | 7.6 |
Phosphorus | 48 | 3.081 | 1.301 | 0.7 | 2.175 | 3.925 | 6 |
Potassium | 48 | 10.473 | 6.247 | 1.5 | 7.325 | 14 | 41 |
Magnesium | 48 | 6.89 | 2.577 | 1.9 | 5.5 | 8.15 | 16 |
Cobber | 48 | 2.481 | 0.885 | 1 | 1.8 | 3.1 | 5.1 |
Organic_material_perc | 48 | 3.025 | 1.363 | 1.16 | 2.173 | 3.372 | 8.74 |
Clay_perc | 48 | 9.258 | 4.727 | 2.4 | 4.675 | 12.85 | 20 |
Nitrogen_perc | 48 | 0.144 | 0.052 | 0.07 | 0.1 | 0.17 | 0.28 |
Organic_farm | 48 | ||||||
… 0 | 24 | 50% | |||||
… 1 | 24 | 50% | |||||
Years_since_turning_organic | 48 | 4.521 | 4.758 | 1 | 1 | 7.25 | 15 |
Livestock | 48 | ||||||
… 0 | 20 | 41.7% | |||||
… 1 | 28 | 58.3% | |||||
Livestock_manure | 48 | ||||||
… 0 | 16 | 33.3% | |||||
… 1 | 32 | 66.7% | |||||
Commercial.fertilizer | 48 | ||||||
… 0 | 24 | 50% | |||||
… 1 | 24 | 50% | |||||
Vinasse | 48 | ||||||
… 0 | 46 | 95.8% | |||||
… 1 | 2 | 4.2% | |||||
Cast | 48 | ||||||
… 0 | 46 | 95.8% | |||||
… 1 | 2 | 4.2% | |||||
Degassed.fertilizer | 48 | ||||||
… 0 | 35 | 72.9% | |||||
… 1 | 13 | 27.1% | |||||
Chalked | 47 | ||||||
… 0 | 38 | 80.9% | |||||
… 1 | 9 | 19.1% | |||||
Table 1: Summary statistics of the key variables selected for evaluation in relation to the fields' microbiome profiles in year 2022.
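The summary statistics in Table 1 can be reproduced with a few lines of code. The sketch below is illustrative only and is not the code used to generate the report: the function names are hypothetical, the level counts are copied from Table 1, and the numeric example uses made-up toy values. The quartiles are computed with the "inclusive" method, which is assumed (not confirmed by the report) to match the percentile definition used in the table.

```python
from statistics import mean, stdev, quantiles

def level_percentages(counts):
    """Percentage of samples at each level of a categorical variable."""
    total = sum(counts.values())
    return {level: round(100 * n / total, 1) for level, n in counts.items()}

def numeric_summary(values):
    """N, mean, standard deviation, min, quartiles and max of a numeric variable."""
    # method="inclusive" is an assumption about the percentile definition.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "N": len(values),
        "Mean": mean(values),
        "Std. Dev.": stdev(values),
        "Min": min(values),
        "Pctl. 25": q1,
        "Pctl. 75": q3,
        "Max": max(values),
    }

# Level counts for Earthworm_status, copied from Table 1 (11 + 37 = 48 samples).
earthworm = level_percentages({0: 11, 1: 37})
print(earthworm)

# Hypothetical toy values, not measurements from the report.
toy = numeric_summary([1, 2, 3, 4, 5])
print(toy)
```

For categorical variables the percentage appears in the "Mean" column of Table 1, which is why those rows carry only a count and a percentage.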
We initiate the evaluation of the 48 samples (1 per field) with a stacked barplot of the microbiome profiles in each sample. This allows us to make a first evaluation of how much the taxonomic profiles differ between the fields.
Note that in order to show the organisms with a color scheme that is interpretable, it is necessary to filter the profiles and select a subset of the most abundant clades to be included in the plots. The filtering used is specified in the axis labels of each plot (e.g. >2% in the relative abundance plots means that a clade must have a relative abundance across samples of more than 2% in order to be included in the plot).
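The transformation and filtering described above can be sketched as follows. This is a minimal illustration, not the pipeline used for the report. The clade names and counts are hypothetical, and "relative abundance across samples" is assumed here to mean the mean relative abundance over all samples; the report does not state which summary was used.

```python
def to_relative(counts):
    """Convert raw clade counts in one sample to percentages summing to 100."""
    total = sum(counts.values())
    return {clade: 100.0 * n / total for clade, n in counts.items()}

def abundant_clades(samples, threshold=2.0):
    """Clades whose mean relative abundance (%) across samples exceeds the threshold."""
    rel = [to_relative(s) for s in samples]
    clades = {c for r in rel for c in r}
    # A clade missing from a sample contributes 0% for that sample.
    mean_rel = {c: sum(r.get(c, 0.0) for r in rel) / len(rel) for c in clades}
    return sorted(c for c, m in mean_rel.items() if m > threshold)

# Hypothetical toy counts for two samples.
samples = [
    {"clade_A": 90, "clade_B": 9, "clade_C": 1},
    {"clade_A": 50, "clade_B": 49, "clade_C": 1},
]
kept = abundant_clades(samples, threshold=2.0)
print(kept)
```

In this toy example `clade_C` averages 1% and is dropped, while the other two clades are kept for plotting.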
The stacked barplots allow us to visually assess the stability of the taxonomic profile across the fields, and to get a sense of whether individual clades are found across all fields or occur more sporadically. Compared to the bacterial part of the microbiome, the fungi differ considerably between fields: some fields show a varied profile, while others are dominated by a few clades. The dominating clade also differs between many fields.
Figure 1: Visualization of the fungal community in the samples. Stacked barplots of taxonomic clades in each of the evaluated samples. Clade abundance was transformed to relative abundance to sum to 100% in each sample.
Figure 2: Visualization of the fungal community in the samples. Stacked barplots of taxonomic clades in each of the evaluated samples. Clade abundance was transformed to relative abundance to sum to 100% in each sample.
Figure 3: Visualization of the fungal community in the samples. Stacked barplots of taxonomic clades in each of the evaluated samples. Clade abundance was transformed to relative abundance to sum to 100% in each sample.
As described in Report 2, alpha diversity is a measure of the diversity (or complexity) within one microbiome community (or sample). Here we evaluate two measures of alpha diversity: Shannon diversity and observed species (a measure of richness). Both measures are introduced in Report 2.
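As a reminder of what the two measures capture, the sketch below computes them for a single sample. It is illustrative only: the function names and counts are hypothetical, and the natural logarithm is assumed for the Shannon index, since the report does not state the base used.

```python
import math

def observed_richness(counts):
    """Number of clades (species) detected in the sample, i.e. with count > 0."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over clades with non-zero proportion."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical toy counts: an even community and one dominated by a single clade.
even = [25, 25, 25, 25]
dominated = [97, 1, 1, 1]

print(observed_richness(even), shannon(even))
print(observed_richness(dominated), shannon(dominated))
```

Both toy samples have the same observed richness (four clades), but the dominated sample has a much lower Shannon index, which is why the two measures are evaluated side by side.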
In the following plots, the measures of alpha diversity and the percentage of fungi are evaluated in relation to each key environmental variable. The plots are followed by tables with the statistical analysis of these relations.
Figure 25: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 26: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 27: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 28: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 29: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 30: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 31: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 32: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 33: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 34: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 35: Illustration of the alpha diversity levels across the levels or values of the environmental variable. Note that we have coded the non-organic fields as 0.
Figure 36: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 37: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 38: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 39: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 40: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 41: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 42: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 43: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 44: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 45: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 46: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 47: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 48: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 49: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 50: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
Figure 51: Illustration of the alpha diversity levels across the levels or values of the environmental variable.
An analysis of variance (ANOVA) model was used to evaluate whether the mean diversity differed significantly between the levels of each grouping variable, and a robust linear regression was used to assess the relationship with each continuous variable.
Variable | Observed: Df | Observed: Sum.Sq | Observed: F.value | Observed: P | Shannon: Df | Shannon: Sum.Sq | Shannon: F.value | Shannon: P |
---|---|---|---|---|---|---|---|---|
JB_value | 4 | 57758.078 | 0.412 | 7.99e-01 | 4 | 1.63 | 0.512 | 7.27e-01 |
Earthworm_status | 1 | 29101.983 | 0.872 | 3.55e-01 | 1 | 0.4 | 0.519 | 4.75e-01 |
Cold_soil | 1 | 140510.455 | 4.539 | 3.85e-02 | 1 | 2.388 | 3.286 | 7.64e-02 |
Compact_soil | 1 | 2856.6 | 0.084 | 7.73e-01 | 1 | 0.112 | 0.145 | 7.06e-01 |
field_well_drained | 1 | 25.098 | 0.001 | 9.78e-01 | 1 | 0 | 0 | 9.90e-01 |
Mulching_of_straw | 1 | 77053.821 | 2.383 | 1.30e-01 | 1 | 0.05 | 0.064 | 8.02e-01 |
Clovergrass_within_3_years | 1 | 71645.444 | 2.208 | 1.44e-01 | 1 | 0.556 | 0.725 | 3.99e-01 |
No_plough | 1 | 14161 | 0.42 | 5.20e-01 | 1 | 2.658 | 3.685 | 6.11e-02 |
ConservationAgriculture | 1 | 158064.2 | 5.169 | 2.77e-02 | 1 | 3.8 | 5.458 | 2.39e-02 |
Years_since_plowing | 1 | 1567.246 | 0.046 | 8.31e-01 | 1 | 1.134 | 1.504 | 2.26e-01 |
Organic_farm | 1 | 184.083 | 0.005 | 9.42e-01 | 1 | 1.642 | 2.209 | 1.44e-01 |
Years_since_turning_organic | 1 | 3359.968 | 0.099 | 7.54e-01 | 1 | 1.646 | 2.215 | 1.43e-01 |
Livestock | 1 | 12837.343 | 0.381 | 5.40e-01 | 1 | 0.039 | 0.05 | 8.24e-01 |
Livestock_manure | 1 | 56891.344 | 1.736 | 1.94e-01 | 1 | 0.964 | 1.272 | 2.65e-01 |
Commercial.fertilizer | 1 | 184.083 | 0.005 | 9.42e-01 | 1 | 1.642 | 2.209 | 1.44e-01 |
Vinasse | 1 | 1785.522 | 0.053 | 8.20e-01 | 1 | 0.106 | 0.137 | 7.13e-01 |
Cast | 1 | 2170.565 | 0.064 | 8.02e-01 | 1 | 0.015 | 0.02 | 8.89e-01 |
Degassed.fertilizer | 1 | 163716.191 | 5.376 | 2.49e-02 | 1 | 0.824 | 1.083 | 3.03e-01 |
Chalked | 1 | 6006.612 | 0.174 | 6.79e-01 | 1 | 0.696 | 0.893 | 3.50e-01 |
Table 23: Results from the ANOVA analyses including samples from all fields. The table shows the statistical values obtained for each of the environmental variables (rows) and for the two alpha diversity measures shown, Observed and Shannon (column groups).
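To make the columns of Table 23 easier to read, the sketch below shows how the degrees of freedom, sum of squares and F value of a one-way ANOVA are obtained. It is a minimal illustration on hypothetical toy data, not the code behind the table, and it assumes that "Df" and "Sum.Sq" refer to the between-group term. The P value is then read from the F distribution with (between, within) degrees of freedom, which is omitted here.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (df_between, ss_between, df_within, ss_within, f_value)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group (residual) sum of squares: spread of values around their group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return df_between, ss_between, df_within, ss_within, f_value

# Hypothetical toy diversity values for the two levels (0 and 1) of a grouping variable.
level_0 = [1.0, 2.0, 3.0]
level_1 = [2.0, 3.0, 4.0]
df_b, ss_b, df_w, ss_w, f_value = one_way_anova([level_0, level_1])
print(df_b, ss_b, df_w, ss_w, f_value)
```

A binary variable gives one between-group degree of freedom, as in most rows of the table, while JB_value with five observed levels gives four.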
Variable | Observed: Estimate | Observed: SE | Observed: t.value | Observed: P | Shannon: Estimate | Shannon: SE | Shannon: t.value | Shannon: P |
---|---|---|---|---|---|---|---|---|
Rt | 29.07 | 104.607 | 0.278 | 7.82e-01 | 0.233 | 0.366 | 0.635 | 5.29e-01 |
Phosphorus | -0.246 | 29.54 | -0.008 | 9.93e-01 | -0.075 | 0.141 | -0.532 | 5.97e-01 |
Potassium | 2.907 | 6.405 | 0.454 | 6.52e-01 | -0.084 | 0.029 | -2.902 | 5.68e-03 |
Magnesium | 5.774 | 21.124 | 0.273 | 7.86e-01 | -0.037 | 0.082 | -0.455 | 6.51e-01 |
Cobber | 31.064 | 64.357 | 0.483 | 6.32e-01 | 0.042 | 0.214 | 0.195 | 8.46e-01 |
Organic_material_perc | -5.842 | 32.518 | -0.18 | 8.58e-01 | -0.039 | 0.147 | -0.269 | 7.89e-01 |
Clay_perc | 6.762 | 14.223 | 0.475 | 6.37e-01 | 0.024 | 0.036 | 0.653 | 5.17e-01 |
Nitrogen_perc | -18.336 | 779.792 | -0.024 | 9.81e-01 | -0.046 | 4.172 | -0.011 | 9.91e-01 |
Table 24: Results from the robust linear regression analyses including samples from all fields. The table shows the statistical values obtained for each of the environmental variables (rows) and for the two alpha diversity measures shown, Observed and Shannon (column groups).
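The report does not state which robust estimator was used for Table 24. As an illustration of the general idea only, the sketch below fits a straight line with a Huber-type M-estimator using iteratively reweighted least squares. This is one common form of robust regression and is an assumption, not necessarily the method behind the table. The data are hypothetical toy values, and only the slope and intercept are computed, not the SE, t value or P value.

```python
from statistics import median

def weighted_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def huber_line(x, y, k=1.345, n_iter=50):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    a, b = weighted_line(x, y, [1.0] * len(x))  # start from ordinary least squares
    for _ in range(n_iter):
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust residual scale: median absolute residual, scaled for normal data.
        scale = median(abs(r) for r in resid) / 0.6745
        if scale == 0:
            break
        # Huber weights: full weight for small residuals, down-weighted beyond k*scale.
        w = [1.0 if abs(r) <= k * scale else k * scale / abs(r) for r in resid]
        a, b = weighted_line(x, y, w)
    return a, b

# Hypothetical toy data: y is roughly 1 + 2*x, with one gross outlier at the end.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
noise = [0.1, -0.2, 0.15, -0.1, 0.05, -0.15, 0.2, -0.05, 0.1, 0.0]
y = [1 + 2 * xi + e for xi, e in zip(x, noise)]
y[-1] = 60.0

ols_intercept, ols_slope = weighted_line(x, y, [1.0] * len(x))
rob_intercept, rob_slope = huber_line(x, y)
print("OLS slope:", ols_slope)
print("Robust slope:", rob_slope)
```

The single outlier pulls the ordinary least-squares slope well away from 2, while the robust fit stays closer to the trend of the remaining points. This is the motivation for using a robust method with only 48 fields and a few extreme values, such as the maximum of 41 for Potassium in Table 1.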
Table 25: List of used software including the used R-programming environment packages.
Package | Version | Package | Version |
---|---|---|---|
OS | Ubuntu 20.04.4 LTS | mvtnorm | 1.1-3 |
R | 4.2.0 | hms | 1.1.2 |
utf8 | 1.2.2 | evaluate | 0.15 |
tidyselect | 1.2.0 | xtable | 1.8-4 |
Rtsne | 0.16 | jpeg | 0.1-9 |
munsell | 0.5.0 | readxl | 1.4.2 |
codetools | 0.2-18 | compiler | 4.2.0 |
withr | 2.5.0 | V8 | 4.3.0 |
colorspace | 2.0-3 | crayon | 1.5.1 |
highr | 0.9 | minqa | 1.2.4 |
rstudioapi | 0.14 | htmltools | 0.5.2 |
robustbase | 0.95-0 | mgcv | 1.8-40 |
ggsignif | 0.6.3 | pcaPP | 2.0-1 |
labeling | 0.4.2 | tzdb | 0.3.0 |
GenomeInfoDbData | 1.2.8 | rrcov | 1.7-0 |
mnormt | 2.0.2 | RcppParallel | 5.1.5 |
hwriter | 1.3.2.1 | lubridate | 1.9.2 |
farver | 2.1.0 | DBI | 1.1.2 |
rhdf5 | 2.40.0 | sjlabelled | 1.2.0 |
coda | 0.19-4 | dbplyr | 2.1.1 |
vctrs | 0.5.2 | MASS | 7.3-57 |
generics | 0.1.2 | boot | 1.3-28 |
TH.data | 1.1-1 | ade4 | 1.7-19 |
xfun | 0.31 | car | 3.0-13 |
timechange | 0.2.0 | cli | 3.6.0 |
R6 | 2.5.1 | parallel | 4.2.0 |
isoband | 0.2.5 | insight | 0.19.0 |
bitops | 1.0-7 | igraph | 1.3.1 |
rhdf5filters | 1.8.0 | pkgconfig | 2.0.3 |
DelayedArray | 0.22.0 | xml2 | 1.3.3 |
assertthat | 0.2.1 | foreach | 1.5.2 |
multcomp | 1.4-19 | svglite | 2.1.0 |
gtable | 0.3.0 | bslib | 0.3.1 |
sandwich | 3.0-1 | multtest | 2.52.0 |
rlang | 1.0.6 | webshot | 0.5.3 |
systemfonts | 1.0.4 | estimability | 1.3 |
splines | 4.2.0 | rvest | 1.0.2 |
rstatix | 0.7.0 | digest | 0.6.29 |
broom | 0.8.0 | rmarkdown | 2.14 |
yaml | 2.3.5 | cellranger | 1.1.0 |
reshape2 | 1.4.4 | curl | 4.3.2 |
abind | 1.4-5 | nloptr | 2.0.2 |
modelr | 0.1.8 | lifecycle | 1.0.3 |
backports | 1.4.1 | nlme | 3.1-157 |
tools | 4.2.0 | Rhdf5lib | 1.18.2 |
ellipsis | 0.3.2 | carData | 3.0-5 |
jquerylib | 0.1.4 | viridisLite | 0.4.0 |
biomformat | 1.24.0 | fansi | 1.0.3 |
plyr | 1.8.7 | pillar | 1.8.1 |
zlibbioc | 1.42.0 | fastmap | 1.1.0 |
RCurl | 1.98-1.6 | httr | 1.4.5 |
ggpubr | 0.4.0 | DEoptimR | 1.0-11 |
cowplot | 1.1.1 | survival | 3.3-1 |
zoo | 1.8-10 | glue | 1.6.2 |
haven | 2.5.0 | zip | 2.2.0 |
cluster | 2.1.3 | png | 0.1-7 |
fs | 1.5.2 | iterators | 1.0.14 |
magrittr | 2.0.3 | stringi | 1.7.6 |
reprex | 2.0.2 | sass | 0.4.1 |
tmvnsim | 1.0-2 | latticeExtra | 0.6-29 |