Customer: Innovation Centre for Organic Farming, Tove Mariegaard Pedersen
Customer ID: DA00204-21
Project: The microbial community of the field
Sample type: soil
Number of samples: 31
Type of data: Shotgun metagenomic sequencing

Quality control and microbiome profiling of the shotgun sequencing data from the 31 samples resulted in the detection of microorganisms (see Report 2 on taxonomy) and a profile of the functional capacity of the microbes. The software HUMAnN3 was used to profile the functional capacity in terms of MetaCyc pathways and UniRef50 gene families. The latter were regrouped to Gene Ontology (GO) and KEGG Orthology (KO) terms. The functional profiling is described in detail in “Report 1: Sequencing and Data Processing Report”, and the databases used for grouping the functions are described below.

These databases all have different strengths and limitations. Therefore, to be as exhaustive as possible, we use all three (MetaCyc, GO and KO) to group the UniRef50 gene families.

Functional capacity versus functional activity

Two types of data can be used to profile the functions of the microbes: shotgun metagenomic sequencing, which sequences DNA, and metatranscriptomics, which sequences RNA. The two data types provide different information. Sequencing of DNA yields sequences from the whole genome (both coding and non-coding regions), whereas sequencing of RNA yields sequences only from the transcribed part of the genome. Furthermore, DNA reveals the functional capacity, i.e. which genes are present, but gives no information on the expression level of the genes. In contrast, RNA sequencing allows analysis of the functional activity, i.e. the expression level of the genes. For example, a gene can have a low relative abundance at the DNA level but a high relative abundance at the RNA level because it is highly expressed. These considerations are important to keep in mind when interpreting the data shown in this report. As we have performed DNA sequencing, this report covers the functional capacity.

Most abundant functional terms/pathways

This section lists the most abundant functional terms/pathways detected in your data for each of the described databases. It is meant to give you a first impression of the format of the functional data, the types of terms and pathways detected, and how they are named. To learn more about their interpretation and usage, you can search for the terms online or look them up in the individual databases using the links given above.

The tables show the most abundant functional terms/pathways. These will often be housekeeping functions, which are rarely the most interesting ones to study and evaluate further. Therefore, based on your question of interest, we also highlight some functional terms/pathways that may be of special interest in your project. These can be seen in the tables in the section “Functional terms of interest”.

The values in all of the following tables are relative abundances in percent and thus range from 0 to 100. Within a table, the values can sum to more than 100 due to the structure of the databases (i.e. one gene can be found in more than one term). Due to the high fraction of unmapped reads and the high number of functional terms/pathways, the values for individual terms/pathways are quite small.
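The effect of one gene being counted in more than one term can be illustrated with a minimal sketch. It assumes that regrouping simply adds the abundance of a gene family to every term it maps to. The gene names, term names, values and the function `regroup` are hypothetical and are not part of the pipeline used for this report.

```python
from collections import defaultdict

def regroup(gene_abundance, gene_to_terms):
    """Sum gene-family abundances into functional terms.

    A gene family that maps to several terms is added to each of them;
    gene families without any term are collected in UNGROUPED.
    """
    term_abundance = defaultdict(float)
    for gene, value in gene_abundance.items():
        if gene == "UNMAPPED":
            # Reads without a gene assignment are passed through unchanged.
            term_abundance["UNMAPPED"] += value
            continue
        terms = gene_to_terms.get(gene, [])
        if not terms:
            term_abundance["UNGROUPED"] += value
        for term in terms:
            term_abundance[term] += value
    return dict(term_abundance)

# Hypothetical relative abundances in percent (they sum to 100).
genes = {"UNMAPPED": 80.0, "geneA": 10.0, "geneB": 6.0, "geneC": 4.0}
# Hypothetical mapping: geneA belongs to two terms, geneC to none.
mapping = {"geneA": ["term1", "term2"], "geneB": ["term2"]}

terms = regroup(genes, mapping)
total = sum(terms.values())
print(terms)
print(total)
```

In this toy example the input sums to 100, but the regrouped values sum to 110 because geneA is counted in both term1 and term2.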

Most abundant MetaCyc pathways

Min. 1st Qu. Median Mean 3rd Qu. Max.
UNMAPPED 7.86E+01 8.08E+01 8.14E+01 8.14E+01 8.22E+01 8.40E+01
UNINTEGRATED 1.49E+01 1.66E+01 1.73E+01 1.73E+01 1.78E+01 1.98E+01
VALSYN-PWY: L-valine biosynthesis 1.53E-02 1.69E-02 1.73E-02 1.73E-02 1.79E-02 1.92E-02
PWY-7111: pyruvate fermentation to isobutanol (engineered) 1.53E-02 1.69E-02 1.73E-02 1.73E-02 1.79E-02 1.92E-02
ILEUSYN-PWY: L-isoleucine biosynthesis I (from threonine) 1.53E-02 1.69E-02 1.73E-02 1.73E-02 1.79E-02 1.92E-02
PWY-7219: adenosine ribonucleotides de novo biosynthesis 1.35E-02 1.46E-02 1.50E-02 1.50E-02 1.54E-02 1.77E-02
BRANCHED-CHAIN-AA-SYN-PWY: superpathway of branched chain amino acid biosynthesis 1.28E-02 1.44E-02 1.48E-02 1.49E-02 1.53E-02 1.65E-02
PWY-5103: L-isoleucine biosynthesis III 1.28E-02 1.43E-02 1.48E-02 1.48E-02 1.52E-02 1.65E-02
TCA: TCA cycle I (prokaryotic) 9.44E-03 1.13E-02 1.20E-02 1.19E-02 1.24E-02 1.40E-02
NONOXIPENT-PWY: pentose phosphate pathway (non-oxidative branch) I 9.00E-03 1.09E-02 1.18E-02 1.16E-02 1.22E-02 1.34E-02
PWY-1042: glycolysis IV 8.86E-03 1.06E-02 1.17E-02 1.15E-02 1.23E-02 1.36E-02
PWY-3001: superpathway of L-isoleucine biosynthesis I 8.18E-03 1.03E-02 1.08E-02 1.08E-02 1.14E-02 1.33E-02
TRNA-CHARGING-PWY: tRNA charging 8.38E-03 9.89E-03 1.07E-02 1.05E-02 1.11E-02 1.25E-02
PWY-7221: guanosine ribonucleotides de novo biosynthesis 8.78E-03 9.85E-03 1.02E-02 1.04E-02 1.09E-02 1.33E-02
PWY-5690: TCA cycle II (plants and fungi) 7.74E-03 9.67E-03 1.02E-02 1.03E-02 1.07E-02 1.25E-02
PWY-6700: queuosine biosynthesis I (de novo) 8.61E-03 9.39E-03 9.84E-03 9.91E-03 1.03E-02 1.28E-02
PWY-7229: superpathway of adenosine nucleotides de novo biosynthesis I 7.41E-03 9.07E-03 9.70E-03 9.69E-03 1.05E-02 1.17E-02
PWY-5913: partial TCA cycle (obligate autotrophs) 6.57E-03 8.64E-03 9.57E-03 9.55E-03 1.05E-02 1.16E-02
PWY-6277: superpathway of 5-aminoimidazole ribonucleotide biosynthesis 7.99E-03 8.80E-03 9.53E-03 9.61E-03 1.02E-02 1.16E-02
PWY-6122: 5-aminoimidazole ribonucleotide biosynthesis II 7.99E-03 8.80E-03 9.53E-03 9.61E-03 1.02E-02 1.16E-02
PWY-6969: TCA cycle V (2-oxoglutarate synthase) 7.73E-03 9.11E-03 9.46E-03 9.55E-03 9.94E-03 1.10E-02
PWY-7228: superpathway of guanosine nucleotides de novo biosynthesis I 8.27E-03 8.72E-03 9.32E-03 9.30E-03 9.83E-03 1.10E-02

Table 1: Summary statistics for top 20 most abundant MetaCyc pathways. Summary statistics were computed for all pathways and the 20 most abundant were selected based on the median relative abundance across the samples.
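The selection described in the caption can be sketched as follows. This is a minimal illustration that assumes quartiles are computed by linear interpolation between data points; the input values and the helper names `summarize` and `top_by_median` are hypothetical.

```python
import statistics

def summarize(values):
    """Return Min., 1st Qu., Median, Mean, 3rd Qu. and Max. for one feature."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "Min.": min(values),
        "1st Qu.": q1,
        "Median": median,
        "Mean": statistics.fmean(values),
        "3rd Qu.": q3,
        "Max.": max(values),
    }

def top_by_median(table, n=20):
    """Rank features by their median across samples and keep the n highest."""
    stats = {name: summarize(vals) for name, vals in table.items()}
    ranked = sorted(stats, key=lambda name: stats[name]["Median"], reverse=True)
    return [(name, stats[name]) for name in ranked[:n]]

# Hypothetical per-sample relative abundances (percent) for three features.
table = {
    "pathway_x": [0.010, 0.012, 0.011, 0.013, 0.014],
    "pathway_y": [0.020, 0.018, 0.019, 0.021, 0.022],
    "pathway_z": [0.001, 0.002, 0.003, 0.002, 0.001],
}
top2 = top_by_median(table, n=2)
for name, row in top2:
    print(name, row)
```

Ranking by the median rather than the mean keeps a single sample with an unusually high value from moving a feature into the table.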

The data set contains 523 different pathways, not considering the UNMAPPED and UNINTEGRATED.

  • The UNMAPPED row contains information on the fraction of reads that did not map to a gene in the reference database

  • The UNINTEGRATED row contains information on the fraction of reads that did map to a gene in the reference database, but where the gene is not contributing to a pathway in the used pathway database.

The UNMAPPED and UNINTEGRATED rows allow for evaluation of the fraction of data that does not contribute to the identified functions and is therefore not used in further analysis. We do expect a fairly high number of UNINTEGRATED reads, as many known genes are not assigned to known pathways. Furthermore, for soil samples, a high percentage (>80%) of UNMAPPED reads is expected, as the soil microbiome is very complex and not well described in the reference databases.
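As a rough worked example of why the individual pathway values are small, the mean values of the two rows can be subtracted from 100. The sketch assumes that the rows of the pathway table sum to approximately 100 in each sample, which is not stated explicitly in this report.

```python
# Mean values from Table 1, in percent.
mean_unmapped = 81.4
mean_unintegrated = 17.3
n_pathways = 523  # number of pathways reported for the data set

# Assumption: the rows of the pathway table sum to about 100 per sample.
mean_in_pathways = 100.0 - mean_unmapped - mean_unintegrated
mean_per_pathway = mean_in_pathways / n_pathways

print(f"share left for pathways: {mean_in_pathways:.1f}%")
print(f"average per pathway: {mean_per_pathway:.4f}%")
```

Under this assumption, about 1.3% of the data is distributed over the 523 pathways, i.e. roughly 0.0025% per pathway on average. The most abundant pathways in Table 1 have values several times higher than this average.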

Most abundant KO terms

Min. 1st Qu. Median Mean 3rd Qu. Max.
UNMAPPED 7.86E+01 8.08E+01 8.14E+01 8.14E+01 8.22E+01 8.40E+01
UNGROUPED 1.48E+01 1.64E+01 1.72E+01 1.72E+01 1.78E+01 1.99E+01
K03088: RNA polymerase sigma-70 factor, ECF subfamily 6.23E-03 6.99E-03 7.33E-03 7.64E-03 8.47E-03 1.00E-02
K01990: ABC-2 type transport system ATP-binding protein 5.98E-03 6.67E-03 7.17E-03 7.15E-03 7.61E-03 8.87E-03
K03704: cold shock protein (beta-ribbon, CspA family) 4.78E-03 6.20E-03 6.68E-03 6.63E-03 7.15E-03 8.16E-03
K04078: chaperonin GroES 4.88E-03 5.36E-03 5.73E-03 5.82E-03 6.30E-03 6.77E-03
K02518: translation initiation factor IF-1 4.25E-03 4.67E-03 5.02E-03 4.96E-03 5.19E-03 5.54E-03
K02049: sulfonate/nitrate/taurine transport system ATP-binding protein 2.76E-03 3.42E-03 4.28E-03 4.00E-03 4.46E-03 4.91E-03
K01996: branched-chain amino acid transport system ATP-binding protein 2.93E-03 3.71E-03 3.91E-03 3.91E-03 4.25E-03 4.55E-03
K02051: sulfonate/nitrate/taurine transport system substrate-binding protein 2.29E-03 3.13E-03 3.56E-03 3.56E-03 4.11E-03 4.58E-03
K01130: arylsulfatase [EC:3.1.6.1] 1.69E-03 2.91E-03 3.45E-03 3.33E-03 3.92E-03 4.07E-03
K02078: acyl carrier protein 2.71E-03 3.06E-03 3.43E-03 3.43E-03 3.68E-03 4.44E-03
K01999: branched-chain amino acid transport system substrate-binding protein 2.19E-03 2.94E-03 3.43E-03 3.37E-03 3.83E-03 4.21E-03
K07304: peptide-methionine (S)-S-oxide reductase 2.87E-03 3.27E-03 3.41E-03 3.42E-03 3.57E-03 3.96E-03
K03111: single-strand DNA-binding protein 2.91E-03 3.17E-03 3.34E-03 3.34E-03 3.48E-03 3.93E-03
K01995: branched-chain amino acid transport system ATP-binding protein 2.73E-03 3.05E-03 3.30E-03 3.25E-03 3.44E-03 3.79E-03
K07497: putative transposase 2.17E-03 3.01E-03 3.23E-03 3.44E-03 3.94E-03 5.79E-03
K07305: peptide-methionine (R)-S-oxide reductase 2.78E-03 3.00E-03 3.23E-03 3.23E-03 3.47E-03 3.84E-03
K02003: NO_NAME 2.46E-03 2.92E-03 3.16E-03 3.19E-03 3.42E-03 3.97E-03
K03798: cell division protease FtsH 2.38E-03 2.86E-03 3.03E-03 3.04E-03 3.16E-03 3.84E-03
K01358: ATP-dependent Clp protease, protease subunit [EC:3.4.21.92] 2.62E-03 2.76E-03 3.01E-03 2.98E-03 3.12E-03 3.48E-03
K04751: nitrogen regulatory protein P-II 1 2.71E-03 2.91E-03 2.99E-03 3.04E-03 3.16E-03 3.50E-03

Table 2: Summary statistics for top 20 most abundant KO terms. Summary statistics were computed for all KO terms and the 20 most abundant were selected based on the median relative abundance across the samples (KO terms assigned as “subunit ribosomal protein” were not included in the table as these are highly abundant and not that informative).
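The exclusion of ribosomal protein terms mentioned in the caption can be sketched as a simple filter on the term description. The function `drop_by_name` is hypothetical, and the ribosomal entry in the example is a placeholder, not a value from the data set.

```python
def drop_by_name(medians, excluded_phrase):
    """Remove terms whose description contains the given phrase."""
    phrase = excluded_phrase.lower()
    return {
        name: value
        for name, value in medians.items()
        if phrase not in name.lower()
    }

# Median relative abundances: two values from Table 2 and one placeholder
# ribosomal entry (its label and value are illustrative only).
medians = {
    "K03088: RNA polymerase sigma-70 factor, ECF subfamily": 7.33e-03,
    "K01990: ABC-2 type transport system ATP-binding protein": 7.17e-03,
    "KXXXXX: large subunit ribosomal protein (placeholder)": 9.00e-03,
}
kept = drop_by_name(medians, "subunit ribosomal protein")
print(sorted(kept))
```

Only the two non-ribosomal terms remain and can then be ranked by their median.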

The data set contains 6056 different KO terms, not considering the UNMAPPED and UNGROUPED.

  • The UNMAPPED row contains information on the fraction of reads that did not map to a gene in the reference database

  • The UNGROUPED row contains information on the fraction of reads that did map to a gene in the reference database, but where the gene is not assigned to a KO term.

The UNMAPPED and UNGROUPED rows allow for evaluation of the fraction of data that does not contribute to the identified functions and is therefore not used in further analysis. We do expect a fairly high number of UNGROUPED reads, as many known genes are not part of a known KO term. Furthermore, for soil samples, a high percentage (>80%) of UNMAPPED reads is expected, as the soil microbiome is very complex and not well described in the reference databases.

Most abundant GO terms

The GO terms are split into three main categories:

  • Biological process [BP]
  • Molecular function [MF]
  • Cellular component [CC]

Of these three categories, we find biological process the most informative, and we therefore selected the top 20 most abundant BP terms from the GO data set. For more information about the individual GO terms, we refer to the QuickGO homepage: https://www.ebi.ac.uk/QuickGO/.

Min. 1st Qu. Median Mean 3rd Qu. Max.
UNMAPPED 7.86E+01 8.08E+01 8.14E+01 8.14E+01 8.22E+01 8.40E+01
UNGROUPED 6.53E+00 7.28E+00 7.63E+00 7.64E+00 7.88E+00 9.11E+00
GO:0006412: [BP] translation 2.53E-01 2.70E-01 2.74E-01 2.74E-01 2.79E-01 3.10E-01
GO:0055085: [BP] transmembrane transport 2.03E-01 2.53E-01 2.65E-01 2.64E-01 2.83E-01 3.14E-01
GO:0006355: [BP] regulation of transcription, DNA-templated 2.26E-01 2.44E-01 2.55E-01 2.56E-01 2.65E-01 2.83E-01
GO:0005975: [BP] carbohydrate metabolic process 1.59E-01 1.83E-01 1.97E-01 1.96E-01 2.07E-01 2.33E-01
GO:0000160: [BP] phosphorelay signal transduction system 1.80E-01 1.86E-01 1.92E-01 1.92E-01 1.97E-01 2.02E-01
GO:0006313: [BP] transposition, DNA-mediated 6.12E-02 9.51E-02 1.20E-01 1.15E-01 1.31E-01 1.45E-01
GO:0015074: [BP] DNA integration 6.39E-02 9.47E-02 1.20E-01 1.13E-01 1.27E-01 1.46E-01
GO:0006281: [BP] DNA repair 9.47E-02 1.06E-01 1.11E-01 1.11E-01 1.16E-01 1.24E-01
GO:0006310: [BP] DNA recombination 8.89E-02 1.00E-01 1.11E-01 1.09E-01 1.17E-01 1.27E-01
GO:0009058: [BP] biosynthetic process 7.85E-02 8.64E-02 9.04E-02 9.02E-02 9.43E-02 1.02E-01
GO:0006260: [BP] DNA replication 5.58E-02 6.48E-02 6.87E-02 6.93E-02 7.40E-02 7.98E-02
GO:0006352: [BP] DNA-templated transcription, initiation 6.42E-02 6.72E-02 6.85E-02 6.92E-02 7.04E-02 8.44E-02
GO:0051301: [BP] cell division 5.13E-02 6.14E-02 6.50E-02 6.51E-02 6.88E-02 7.48E-02
GO:0006099: [BP] tricarboxylic acid cycle 4.59E-02 5.69E-02 5.98E-02 5.98E-02 6.29E-02 6.87E-02
GO:0045454: [BP] cell redox homeostasis 5.31E-02 5.73E-02 5.93E-02 5.96E-02 6.23E-02 7.02E-02
GO:0006457: [BP] protein folding 5.26E-02 5.70E-02 5.93E-02 5.87E-02 6.03E-02 6.41E-02
GO:0006351: [BP] transcription, DNA-templated 5.08E-02 5.72E-02 5.92E-02 5.96E-02 6.20E-02 6.96E-02
GO:0055114: [BP] oxidation-reduction process 4.67E-02 5.07E-02 5.40E-02 5.34E-02 5.56E-02 6.10E-02
GO:0006265: [BP] DNA topological change 4.31E-02 4.95E-02 5.27E-02 5.24E-02 5.53E-02 5.93E-02
GO:0006807: [BP] nitrogen compound metabolic process 4.00E-02 4.25E-02 4.42E-02 4.39E-02 4.54E-02 4.83E-02

Table 3: Summary statistics for top 20 most abundant GO terms in the BP category. Summary statistics were computed for all GO terms in the BP category and the 20 most abundant were selected based on the median relative abundance across the samples.

The data set contains 2377 different GO terms in the BP category, not considering the UNMAPPED and UNGROUPED.

  • The UNMAPPED row contains information on the fraction of reads that did not map to a gene in the reference database

  • The UNGROUPED row contains information on the fraction of reads that did map to a gene in the reference database, but where the gene is not assigned to a GO term.

The UNMAPPED and UNGROUPED rows allow for evaluation of the fraction of data that does not contribute to the identified functions and is therefore not used in further analysis. For soil samples, a high percentage (>80%) of UNMAPPED reads is expected, as the soil microbiome is very complex and not well described in the reference databases.

Functional terms of interest

To obtain an overview of functions of special interest in your project, we selected GO (BP) and KO terms related to the following functions: nitrogen metabolism, sulfur metabolism, methane metabolism, carbon fixation, symbiosis and photosynthesis. The selection is not exhaustive; however, it provides a good overview of the aspects of the functional capacity that might be interesting to examine more closely.

Selected BP terms

To extract BP terms of special interest, we searched the BP term names using the following keywords: “nitrogen”, “nitrate”, “nitrite”, “nitrification”, “ammonium”, “ammonia”, “ammonification”, “urea”, “carbon”, “phosphorus”, “phosphate”, “methane”, “sulfate”, “sulfur”, “potassium”, “symbiotic”, “photosynthesis”, “respiration” and “starvation”. For more information about the individual GO terms, we refer to the QuickGO homepage: https://www.ebi.ac.uk/QuickGO/.

Nitrogen
Min. 1st Qu. Median Mean 3rd Qu. Max.
GO:0006807: [BP] nitrogen compound metabolic process 4.00E-02 4.25E-02 4.42E-02 4.39E-02 4.54E-02 4.83E-02
GO:0042128: [BP] nitrate assimilation 1.80E-02 1.97E-02 2.03E-02 2.03E-02 2.11E-02 2.19E-02
GO:0006808: [BP] regulation of nitrogen utilization 1.17E-02 1.30E-02 1.33E-02 1.34E-02 1.38E-02 1.50E-02
GO:0071705: [BP] nitrogen compound transport 3.74E-03 7.17E-03 9.21E-03 9.12E-03 1.07E-02 1.36E-02
GO:0009399: [BP] nitrogen fixation 6.15E-03 6.76E-03 7.32E-03 7.32E-03 7.70E-03 9.03E-03
GO:0042126: [BP] nitrate metabolic process 4.03E-04 7.48E-04 9.08E-04 9.56E-04 1.16E-03 1.50E-03
GO:0019740: [BP] nitrogen utilization 3.14E-04 5.54E-04 7.20E-04 7.33E-04 8.85E-04 1.33E-03
GO:0019333: [BP] denitrification pathway 1.60E-04 3.99E-04 5.57E-04 5.59E-04 6.67E-04 1.26E-03
GO:0019676: [BP] ammonia assimilation cycle 2.03E-04 2.35E-04 2.53E-04 2.51E-04 2.65E-04 2.90E-04
GO:0015707: [BP] nitrite transport 0.00E+00 7.01E-05 1.22E-04 1.29E-04 1.70E-04 3.14E-04
GO:0006995: [BP] cellular response to nitrogen starvation 0.00E+00 7.24E-06 4.52E-05 4.37E-05 6.58E-05 1.37E-04
GO:0010243: [BP] response to organonitrogen compound 0.00E+00 1.23E-05 3.18E-05 3.69E-05 5.26E-05 1.14E-04
GO:0015706: [BP] nitrate transport 0.00E+00 7.36E-06 2.33E-05 2.11E-05 3.20E-05 6.27E-05
GO:0071250: [BP] cellular response to nitrite 0.00E+00 0.00E+00 1.40E-05 1.75E-05 3.18E-05 6.91E-05
GO:0071249: [BP] cellular response to nitrate 0.00E+00 0.00E+00 1.40E-05 1.75E-05 3.18E-05 6.91E-05
GO:1901565: [BP] organonitrogen compound catabolic process 0.00E+00 0.00E+00 0.00E+00 4.50E-06 0.00E+00 2.62E-05
GO:0090294: [BP] nitrogen catabolite activation of transcription 0.00E+00 0.00E+00 0.00E+00 1.20E-05 8.98E-06 1.43E-04
GO:0071417: [BP] cellular response to organonitrogen compound 0.00E+00 0.00E+00 0.00E+00 1.15E-07 0.00E+00 3.55E-06
GO:0043562: [BP] cellular response to nitrogen levels 0.00E+00 0.00E+00 0.00E+00 1.20E-05 8.98E-06 1.43E-04
Urea
Min. 1st Qu. Median Mean 3rd Qu. Max.
GO:0043419: [BP] urea catabolic process 2.75E-03 3.19E-03 3.35E-03 3.32E-03 3.47E-03 3.78E-03
GO:0019627: [BP] urea metabolic process 8.65E-04 1.16E-03 1.30E-03 1.27E-03 1.35E-03 1.74E-03
GO:0000050: [BP] urea cycle 2.04E-04 2.93E-04 3.43E-04 3.49E-04 4.04E-04 5.12E-04
Carbon
Min. 1st Qu. Median Mean 3rd Qu. Max.
GO:0006730: [BP] one-carbon metabolic process 9.14E-03 1.00E-02 1.03E-02 1.04E-02 1.09E-02 1.18E-02
GO:0015977: [BP] carbon fixation 2.83E-03 3.98E-03 4.46E-03 4.42E-03 4.87E-03 5.69E-03
GO:0015976: [BP] carbon utilization 2.00E-03 2.32E-03 2.46E-03 2.54E-03 2.73E-03 3.58E-03
GO:0043427: [BP] carbon fixation by 3-hydroxypropionate cycle 2.96E-05 9.34E-05 1.21E-04 1.17E-04 1.40E-04 2.01E-04
GO:0042206: [BP] halogenated hydrocarbon catabolic process 0.00E+00 6.94E-05 8.95E-05 9.60E-05 1.31E-04 1.95E-04
GO:0045013: [BP] carbon catabolite repression of transcription 1.12E-05 5.23E-05 7.77E-05 8.87E-05 1.04E-04 3.20E-04
Phosphorus
Min. 1st Qu. Median Mean 3rd Qu. Max.
GO:0006817: [BP] phosphate ion transport 6.76E-03 7.23E-03 7.69E-03 7.62E-03 7.92E-03 8.80E-03
GO:0035435: [BP] phosphate ion transmembrane transport 5.01E-03 5.46E-03 5.70E-03 5.72E-03 6.05E-03 6.44E-03
GO:0045936: [BP] negative regulation of phosphate metabolic process 3.49E-03 3.83E-03 4.00E-03 4.01E-03 4.16E-03 4.97E-03
GO:0030643: [BP] cellular phosphate ion homeostasis 3.49E-03 3.83E-03 4.00E-03 4.01E-03 4.16E-03 4.97E-03
GO:0042823: [BP] pyridoxal phosphate biosynthetic process 1.68E-03 2.03E-03 2.21E-03 2.25E-03 2.37E-03 3.33E-03
GO:0046855: [BP] inositol phosphate dephosphorylation 1.10E-03 1.58E-03 1.70E-03 1.66E-03 1.79E-03 1.88E-03
GO:0016036: [BP] cellular response to phosphate starvation 9.38E-05 2.88E-04 6.03E-04 5.88E-04 7.29E-04 1.23E-03
GO:0035975: [BP] carbamoyl phosphate catabolic process 1.12E-04 3.82E-04 4.48E-04 4.53E-04 5.29E-04 6.81E-04
GO:0006793: [BP] phosphorus metabolic process 2.56E-04 4.02E-04 4.28E-04 4.32E-04 4.74E-04 5.88E-04
GO:0046386: [BP] deoxyribose phosphate catabolic process 7.98E-05 2.02E-04 2.77E-04 2.87E-04 3.41E-04 6.13E-04
GO:0070409: [BP] carbamoyl phosphate biosynthetic process 1.26E-04 1.94E-04 2.24E-04 2.23E-04 2.56E-04 3.24E-04
GO:2000186: [BP] negative regulation of phosphate transmembrane transport 0.00E+00 1.13E-04 1.42E-04 1.62E-04 1.98E-04 3.57E-04
GO:0036108: [BP] 4-amino-4-deoxy-alpha-L-arabinopyranosyl undecaprenyl phosphate biosynthetic process 0.00E+00 7.05E-05 1.20E-04 1.19E-04 1.59E-04 2.66E-04
GO:0080040: [BP] positive regulation of cellular response to phosphate starvation 0.00E+00 4.67E-05 6.69E-05 7.97E-05 1.08E-04 2.20E-04
GO:0061720: [BP] 6-sulfoquinovose(1-) catabolic process to glycerone phosphate and 3-sulfolactaldehyde 0.00E+00 1.19E-05 6.38E-05 1.21E-04 1.55E-04 6.86E-04
GO:0019693: [BP] ribose phosphate metabolic process 0.00E+00 2.92E-05 4.31E-05 4.91E-05 6.16E-05 1.32E-04
GO:0006753: [BP] nucleoside phosphate metabolic process 0.00E+00 2.92E-05 4.31E-05 4.91E-05 6.16E-05 1.32E-04
GO:0061610: [BP] glycerol to glycerone phosphate metabolic process 0.00E+00 2.10E-05 4.21E-05 4.47E-05 6.82E-05 1.31E-04
Potassium
Min. 1st Qu. Median Mean 3rd Qu. Max.
GO:0006813: [BP] potassium ion transport 7.43E-03 1.02E-02 1.17E-02 1.18E-02 1.35E-02 1.66E-02
GO:0071805: [BP] potassium ion transmembrane transport 2.01E-04 3.18E-04 3.97E-04 4.04E-04 4.79E-04 6.08E-04
GO:1901381: [BP] positive regulation of potassium ion transmembrane transport 0.00E+00 2.03E-04 3.17E-04 3.48E-04 4.82E-04 7.65E-04
GO:0030007: [BP] cellular potassium ion homeostasis 0.00E+00 2.38E-05 4.34E-05 4.59E-05 6.26E-05 1.14E-04
Sulfate
Min. 1st Qu. Median Mean 3rd Qu. Max.
GO:0016226: [BP] iron-sulfur cluster assembly 1.57E-02 1.71E-02 1.78E-02 1.82E-02 1.95E-02 2.16E-02
GO:0097428: [BP] protein maturation by iron-sulfur cluster transfer 7.21E-03 8.16E-03 8.72E-03 8.81E-03 9.28E-03 1.08E-02
GO:0000103: [BP] sulfate assimilation 4.04E-03 4.68E-03 4.94E-03 4.97E-03 5.32E-03 6.02E-03
GO:0019379: [BP] sulfate assimilation, phosphoadenylyl sulfate reduction by phosphoadenylyl-sulfate reductase (thioredoxin) 2.64E-03 3.02E-03 3.14E-03 3.18E-03 3.32E-03 3.84E-03
GO:0019419: [BP] sulfate reduction 2.15E-03 2.47E-03 2.62E-03 2.62E-03 2.74E-03 3.02E-03
GO:0006790: [BP] sulfur compound metabolic process 9.69E-04 1.33E-03 1.46E-03 1.44E-03 1.59E-03 2.00E-03
GO:0019417: [BP] sulfur oxidation 4.30E-04 6.00E-04 7.14E-04 7.13E-04 8.14E-04 9.95E-04
GO:0019346: [BP] transsulfuration 1.41E-04 2.87E-04 3.71E-04 3.78E-04 4.52E-04 6.79E-04
GO:0008272: [BP] sulfate transport 4.56E-05 1.48E-04 2.11E-04 2.55E-04 3.48E-04 6.31E-04
GO:0010438: [BP] cellular response to sulfur starvation 7.88E-05 1.27E-04 1.77E-04 1.72E-04 2.14E-04 2.62E-04
GO:0044273: [BP] sulfur compound catabolic process 8.60E-05 1.38E-04 1.62E-04 1.58E-04 1.77E-04 2.27E-04
GO:0018909: [BP] dodecyl sulfate metabolic process 4.22E-05 7.61E-05 9.43E-05 9.33E-05 1.13E-04 1.49E-04
GO:0015012: [BP] heparan sulfate proteoglycan biosynthetic process 0.00E+00 4.71E-05 7.26E-05 9.26E-05 1.32E-04 2.90E-04
GO:0006791: [BP] sulfur utilization 0.00E+00 0.00E+00 5.76E-05 5.54E-05 9.00E-05 1.43E-04
GO:0000101: [BP] sulfur amino acid transport 0.00E+00 0.00E+00 5.76E-05 5.54E-05 9.00E-05 1.43E-04
GO:0015709: [BP] thiosulfate transport 0.00E+00 1.86E-05 4.85E-05 5.43E-05 6.84E-05 1.61E-04
GO:0000096: [BP] sulfur amino acid metabolic process 0.00E+00 2.44E-05 4.01E-05 4.44E-05 5.86E-05 1.17E-04
GO:0072348: [BP] sulfur compound transport 0.00E+00 0.00E+00 0.00E+00 4.18E-07 0.00E+00 1.30E-05
GO:0030206: [BP] chondroitin sulfate biosynthetic process 0.00E+00 0.00E+00 0.00E+00 1.13E-05 2.28E-05 4.88E-05
GO:0030200: [BP] heparan sulfate proteoglycan catabolic process 0.00E+00 0.00E+00 0.00E+00 6.15E-07 0.00E+00 1.90E-05
GO:0010134: [BP] sulfate assimilation via adenylyl sulfate reduction 0.00E+00 0.00E+00 0.00E+00 1.10E-05 2.27E-05 5.24E-05
GO:0009970: [BP] cellular response to sulfate starvation 0.00E+00 0.00E+00 0.00E+00 4.44E-06 5.58E-06 2.97E-05
GO:0000098: [BP] sulfur amino acid catabolic process 0.00E+00 0.00E+00 0.00E+00 1.57E-06 0.00E+00 2.69E-05
Starvation
Min. 1st Qu. Median Mean 3rd Qu. Max.
GO:0009267: [BP] cellular response to starvation 1.74E-03 2.36E-03 2.68E-03 2.72E-03 3.02E-03 4.08E-03
GO:0016036: [BP] cellular response to phosphate starvation 9.38E-05 2.88E-04 6.03E-04 5.88E-04 7.29E-04 1.23E-03
GO:0010438: [BP] cellular response to sulfur starvation 7.88E-05 1.27E-04 1.77E-04 1.72E-04 2.14E-04 2.62E-04
GO:0034198: [BP] cellular response to amino acid starvation 0.00E+00 5.85E-05 1.35E-04 1.66E-04 2.09E-04 5.71E-04
GO:0010106: [BP] cellular response to iron ion starvation 2.98E-05 5.05E-05 7.54E-05 6.85E-05 8.30E-05 1.06E-04
GO:0080040: [BP] positive regulation of cellular response to phosphate starvation 0.00E+00 4.67E-05 6.69E-05 7.97E-05 1.08E-04 2.20E-04
GO:0006995: [BP] cellular response to nitrogen starvation 0.00E+00 7.24E-06 4.52E-05 4.37E-05 6.58E-05 1.37E-04
GO:0010350: [BP] cellular response to magnesium starvation 0.00E+00 1.69E-05 2.55E-05 3.06E-05 4.53E-05 7.57E-05
GO:0042594: [BP] response to starvation 0.00E+00 0.00E+00 0.00E+00 1.34E-05 2.73E-05 5.77E-05
GO:0034224: [BP] cellular response to zinc ion starvation 0.00E+00 0.00E+00 0.00E+00 4.79E-06 0.00E+00 5.79E-05
GO:0009970: [BP] cellular response to sulfate starvation 0.00E+00 0.00E+00 0.00E+00 4.44E-06 5.58E-06 2.97E-05
Symbiosis
Min. 1st Qu. Median Mean 3rd Qu. Max.
GO:0052143: [BP] chemotaxis on or near host involved in symbiotic interaction 0.00E+00 3.19E-05 4.62E-05 6.89E-05 6.92E-05 5.21E-04
Photosynthesis
Min. 1st Qu. Median Mean 3rd Qu. Max.
GO:0019684: [BP] photosynthesis, light reaction 1.46E-03 1.66E-03 1.82E-03 1.83E-03 1.95E-03 2.52E-03
GO:0015979: [BP] photosynthesis 3.67E-04 5.23E-04 5.94E-04 6.53E-04 7.58E-04 1.21E-03
GO:0019685: [BP] photosynthesis, dark reaction 0.00E+00 0.00E+00 1.91E-05 2.43E-05 4.12E-05 1.29E-04

Table 4: Summary statistics for GO terms of interest. Summary statistics were computed for the selected GO terms. The selection of GO terms was performed using the keywords found above the table.
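The keyword-based selection can be sketched as a case-insensitive substring search in the term names. This is an assumption about how the search was performed, and the function `select_by_keywords` is hypothetical. The term names in the example are taken from Table 4.

```python
def select_by_keywords(term_names, keywords):
    """Group term names under every keyword that occurs in the name."""
    selected = {}
    for keyword in keywords:
        hits = [name for name in term_names if keyword.lower() in name.lower()]
        if hits:
            selected[keyword] = hits
    return selected

# A few BP term names taken from Table 4.
bp_terms = [
    "GO:0015977: [BP] carbon fixation",
    "GO:0042206: [BP] halogenated hydrocarbon catabolic process",
    "GO:0016036: [BP] cellular response to phosphate starvation",
    "GO:0009399: [BP] nitrogen fixation",
]
result = select_by_keywords(
    bp_terms, ["carbon", "phosphate", "starvation", "methane"]
)
for keyword, hits in result.items():
    print(keyword, hits)
```

A substring search of this kind would explain why a term such as “halogenated hydrocarbon catabolic process” is listed under Carbon, and why the same term can appear under more than one keyword, as GO:0016036 does under both Phosphorus and Starvation. Keywords without any match produce no group.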

Selected KO terms involved in Nitrogen metabolism

The GO terms in the above table are quite broad. Thus, in order to investigate specific genes involved in nitrogen metabolism, we selected KOs in the KEGG modules found in “Nitrogen metabolism” at https://www.genome.jp/brite/ko00002. For some of the broad modules, we selected specific parts of the module and only included the relevant KOs. Thus, not all of the module names seen in the boxplot below correspond to a KEGG module name (instead, they are part of a larger KEGG module).

Figure 1: Boxplot of selected KO terms found in Nitrogen metabolism. The colored bar indicates which KEGG module each KO belongs to. Each row in the boxplot is one KO group (some groups have multiple names as seen in the figure). Furthermore, some KOs are found in several modules as indicated in the legend of the colored bar.

Selected KO terms involved in Sulfur metabolism

The GO terms are quite broad. Thus, in order to investigate specific genes involved in sulfur metabolism, we selected KOs in the KEGG modules found in “Sulfur metabolism” at https://www.genome.jp/brite/ko00002. For some of the broad modules, we selected specific parts of the module and only included the relevant KOs. Thus, not all of the module names seen in the boxplot below correspond to a KEGG module name (instead, they are part of a larger KEGG module).

Figure 2: Boxplot of selected KO terms found in Sulfur metabolism. The colored bar indicates which KEGG module each KO belongs to. Each row in the boxplot is one KO group (some groups have multiple names as seen in the figure). Furthermore, some KOs are found in several modules as indicated in the legend of the colored bar.