Genomics, transcriptomics, and laboratory experiments link bioconvection to nitrogen fixation

Abstract

Introduction:

Lake Cadagno is a meromictic lake characterized by a stable euxinic chemocline that hosts a diverse community of anoxygenic phototrophic sulfur bacteria, among the earliest photosynthetic organisms on Earth. These microorganisms are key to understanding the evolution of photosynthesis; however, due to the rarity of permanently anoxic environments, their genetic and ecophysiological traits remain poorly characterized.

Methods:

We generated four high-quality genomes (>93% completeness, < 2% contamination), including two purple sulfur bacteria (PSB; Chromatium okenii LaCa and Thiodictyon syntrophicum Cad16T) and two green sulfur bacteria (GSB; Chlorobium phaeobacteroides 1VII D7 and Chlorobium clathratiforme Cad4DE). Using an improved C. okenii genome, we analyzed chemocline transcriptomes under conditions with and without bioconvection. Nitrogen fixation potential was assessed through comparative genomic analyses of nif gene content and organization, complemented by laboratory growth experiments under nitrogen-limited conditions.

Results:

Nitrogen fixation (nif) genes were significantly upregulated in the chemocline, particularly in September, indicating a potential link between nitrogen fixation and bioconvection. Comparative genomic analyses revealed a higher abundance and diversity of nif genes in PSBs than in GSBs. Laboratory experiments demonstrated that PSBs (C. okenii, T. syntrophicum) and the GSB C. phaeobacteroides can grow using atmospheric nitrogen as the sole nitrogen source. Light intensity had minimal effects on overall biomass yield but influenced growth rates, while GSBs exhibited reduced performance relative to PSBs under nitrogen limitation.

Discussion:

Collectively, genomic, transcriptomic, and experimental evidence confirms active nitrogen fixation in dominant phototrophic sulfur bacteria of Lake Cadagno. The upregulation of nif genes and their association with bioconvection suggest a functional coupling between nitrogen cycling and physical mixing processes, potentially mediated by C. okenii. These findings provide new insights into the ecological role of anoxygenic phototrophs in stratified anoxic systems and their contribution to biogeochemical cycling.

Introduction

Life first emerged on Earth approximately 3.5–4.0 billion years ago, during the Archean eon, marking a crucial turning point in our planet's history (Van Wieren, 2021). Although today's biosphere is largely oxic, early life forms thrived in completely anoxic conditions. These primitive organisms initiated modern biogeochemical cycles through photosynthetic processes that ultimately triggered the “Great Oxidation Event,” a transformative shift that enabled the evolution of aerobic metabolic processes (Fischer et al., 2016; Demoulin et al., 2019; Uveges et al., 2023). The transition to aerobic metabolism conferred a substantial energetic advantage, as well as modifying the atmosphere by creating a protective ozone layer, facilitating the evolutionary diversification of life even beyond aquatic environments. Today, anoxic environments reminiscent of primordial oceans are extremely rare and typically restricted to “extreme environments,” making them difficult to study.

Meromictic lakes provide valuable natural laboratories for studying ancient ecosystems, as their permanent stratification maintains oxygen-depleted layers that support anoxic life forms, effectively representing “modern analogs” of early Earth environments (Gulati et al., 2017). The euxinic (i.e., anoxic and sulfidic) or ferruginous (i.e., anoxic and iron-rich) conditions within these lakes provide valuable insights into early microbial metabolic pathways and their potential analogs in extraterrestrial environments, where similar redox gradients may exist (Canfield, 1998; Poulton et al., 2004; Xiong et al., 2019). Microbial life in these ecosystems is often represented by sulfate-reducing bacteria (SRB) and methanogenic archaea typically found in the deepest dark anoxic layers (monimolimnion) and sediments, where they mediate the reduction of sulfur and iron, or methane production. In shallow meromictic lakes, where light penetrates into the oxic-anoxic redox transition zone called chemocline, dense communities of anoxygenic phototrophic purple (PSB) and green (GSB) sulfur bacteria thrive (Overmann and Van Gemerden, 2000). PSB and GSB are considered among the earliest phototrophic lineages to have evolved, retaining metabolic traits that predate oxygenic photosynthesis and thus providing key insights into the evolution of early energy metabolisms under anoxic conditions (Martin et al., 2018).

Lake Cadagno, located in the Swiss Alps, is a well-documented example of crenogenic meromixis, a particular limnological phenomenon driven by the inflow of mineral-rich groundwater (Otz et al., 2003). The euxinic layer, located at approximately 12 meters depth, supports a dense phototrophic bacterial layer (BL) consisting of at least six PSB and two GSB species (Tonolla et al., 2005; Decristophoris et al., 2009; Danza et al., 2018; Di Nezio et al., 2023). These microorganisms are integral to major biogeochemical cycles, particularly the carbon and sulfur cycles, where they contribute to organic carbon fixation and sulfur transformation, influencing the lake's overall ecosystem dynamics (Storelli, 2014; Posth et al., 2017; Luedin et al., 2019b). Recent studies suggest that these bacteria also play a significant role in the nitrogen cycle, possibly through their ability to fix nitrogen, an essential process for sustaining microbial life in nutrient-limited environments (Philippi et al., 2021).

This great biodiversity of anoxygenic phototrophic sulfur bacteria observed in the BL is also reflected in the diverse evolutionary strategies employed to dominate their ecological niche. One such strategy is bioconvection, i.e., the collective motion of microorganisms that can mix water columns up to a meter deep, a phenomenon rarely documented in nature (Sommer et al., 2017). In fact, the motile PSB Chromatium okenii swims phototactically thanks to a tuft of flagella, but upon sensing oxygen, it suddenly stops movement, increasing local water density. This denser water then sinks due to gravity, dragging the bacteria down with it. This collective mixing enhances access to light and nutrients, redistributes metabolic by-products, and helps maintain cells within optimal redox and irradiance conditions, thereby conferring a clear ecological advantage in a steeply stratified environment (Sepúlveda Steiner et al., 2021; Di Nezio et al., 2023). However, studying and cultivating these microorganisms under controlled laboratory conditions remains challenging due to the difficulty of reproducing key environmental parameters such as redox gradients, light availability, and sulfide concentrations. Laboratory-grown C. okenii exhibits marked phenotypic differences compared to its natural counterparts, underscoring the limitations of traditional cultivation methods and the importance of in situ studies (Di Nezio et al., 2024).

Recent advancements in sequencing technologies, such as single-cell DNA/RNA sequencing, metagenomics and transcriptomics, have provided unprecedented insights into the composition and functional potential of microbial communities without the need for cultivation (Emerson et al., 2008; Rinke et al., 2013). These approaches have facilitated the discovery of novel metabolic pathways and microbial interactions, revealing the adaptive strategies employed by anaerobic microorganisms in response to environmental stressors (Baker et al., 2013; Mackelprang et al., 2016; Barua et al., 2017). However, transcriptomic analyses remain challenging, particularly due to the absence of comprehensive reference databases, which complicate accurate annotation and differentiation between known and potentially novel genes (Emerson et al., 2008; Kopcakova et al., 2014; Choi et al., 2016). While de novo assembly methods offer a viable solution, they also introduce additional complexities and potential errors in gene prediction and functional annotation (Baker, 2012; Liao et al., 2019).

In this study, we sequenced the genomes of the four dominant phototrophic species in the BL, which represent more than 80% of the phototrophic sulfur bacteria cells (Di Nezio et al., 2023; Storelli et al., 2025). These high-quality genomes expand the genetic repertoire of anoxygenic phototrophic sulfur bacteria and enable a detailed investigation of the bioconvection process mediated by the fully sequenced purple sulfur bacterium Chromatium okenii through transcriptomic analyses. Gene expression in C. okenii was examined directly in the lake environment to minimize experimental artifacts by comparing BL transcriptomes collected during periods with active bioconvection in summer (July) and periods without bioconvection in autumn (September). Finally, the capacity for growth under nitrogen-free conditions was assessed experimentally in the laboratory, providing evidence consistent with the potential for PSB, alongside GSB, to contribute to nitrogen fixation.

Material and methods

Study site and sampling

Lake Cadagno is located in the Piora Valley at 1921 m above sea level, in the southern Swiss Alps (46°33′ N, 8°43′ E; depth approximately 21 m). In addition to surface water tributaries, the lake receives inflows from sublacustrine springs, which supply high-density water that flows through gypsum-rich (CaSO4) dolomite rock (CaMg(CO3)2). The interplay of high salinity and low temperature maintains a dense, anoxic monimolimnion that remains stably stratified beneath the clear and oxygenated mixolimnion originating from the granitic zone. The chemocline at approximately 12 m depth harbors the dense phototrophic bacterial layer (BL), which was the main source of samples for genome and transcriptome analyses. The conductivity in the lower layer (monimolimnion) ranges between 0.20 and 0.25 mS cm−1 (Supplementary Figure S3, orange line), mainly due to the presence of carbonates (up to 50 mg L−1) and sulfates (up to 200 mg L−1) originating from dolomite (Del Don et al., 2001).

Physicochemical parameters of the water column were determined using a multiparameter probe (CTD115M, Sea & Sun Technology, Trappenkamp, Germany) equipped with pressure (bar), temperature (°C), conductivity (mS cm−1), dissolved oxygen (mg L−1), and turbidity (Formazine Turbidity Unit, FTU) sensors. The CTD was also equipped with a photosynthetically active radiation (PAR) sensor (LI-COR Biosciences, Lincoln, NE, USA), which detects the 400–700 nm wave band of solar radiation used by photosynthetic organisms, and a phycocyanin fluorescence (BGAPC) sensor (Turner Designs, San José, CA, USA). Water samples were taken at the appropriate depths and analyzed chemically (50 mL and 12 mL with 5% zinc acetate) and biologically (1.5 mL) as described in Di Nezio et al. (2021). The water column profiles measured during the sampling campaigns for transcriptomic analyses on 16 July 2020 and 17 September 2020 are presented in Supplementary Figure S3.

Isolation and growth conditions of anoxygenic phototrophic sulfur bacteria

The strains of anoxygenic phototrophic sulfur bacteria used here were isolated from Lake Cadagno and have been monitored and cultivated in the laboratory over the past 20 years (see Table 1). From this culture collection, representative strains were selected for genome sequencing and physiological testing, including nitrogen fixation experiments. Phototrophic sulfur bacteria were grown in Pfennig's medium (Trüper, 1970), type I for PSB and type II for GSB, both of which contained 0.25 g L−1 of KH2PO4, 0.34 g L−1 of NH4Cl, 0.5 g L−1 of MgSO4·7H2O, 0.25 g L−1 of CaCl2·2H2O, 0.34 g L−1 of KCl, and 1.5 g L−1 of NaHCO3, in addition to different concentrations of carbonate, sulfide, and solutions of vitamins and trace elements, as detailed in the previous studies referenced in Table 1.

List of anoxic phototrophic sulfur bacteria fully sequenced in this study, isolated in the past and maintained in pure cultures in our laboratory (References).

PSB, purple sulfur bacteria; GSB, green sulfur bacteria.

Nitrogen fixation growth assays

To assess growth under nitrogen-replete (Standard) and nitrogen-depleted (No NH4Cl) conditions, cultures of the PSB C. okenii LaCa and T. syntrophicum Cad16T were incubated in Pfennig medium I (with 0.34 g L−1 of NH4Cl) and in a modified medium without NH4Cl (with 0.34 g L−1 of NaCl instead). The GSB C. phaeobacteroides 1VII D7 was used as a positive control. Ammonium concentrations were measured in all media prior to inoculation and at the end of the incubation period by photometric measurement using the Spectroquant Merck Ammonium Test Kit (1.00683; measuring range 2.0–150 mg L−1 NH4-N, 2.6–193 mg L−1 NH4+). Concentrations below the detection limit of the assay were reported as zero.
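As a rough consistency check on the nitrogen-replete medium, the ammonium supplied by 0.34 g L−1 of NH4Cl can be converted to mg L−1 of NH4+ and of NH4-N, the two units in which the test kit range is expressed. The sketch below is illustrative arithmetic only: the molar masses are rounded standard values and the function name is hypothetical, not part of the original protocol.

```python
# Molar masses in g/mol (rounded standard atomic weights; an assumption of this sketch).
M_N = 14.007
M_H = 1.008
M_CL = 35.45
M_NH4 = M_N + 4 * M_H      # ammonium ion, NH4+
M_NH4CL = M_NH4 + M_CL     # ammonium chloride, NH4Cl


def ammonium_from_nh4cl(grams_per_litre):
    """Return (mg/L as NH4+, mg/L as NH4-N) supplied by a given NH4Cl concentration."""
    mmol_per_litre = grams_per_litre / M_NH4CL * 1000.0
    return mmol_per_litre * M_NH4, mmol_per_litre * M_N


# 0.34 g/L is the NH4Cl concentration stated for the standard Pfennig medium.
nh4, nh4_n = ammonium_from_nh4cl(0.34)
print(f"NH4+: {nh4:.1f} mg/L, NH4-N: {nh4_n:.1f} mg/L")
```

Under these assumptions, the nominal ammonium content of the standard medium falls inside the stated measuring range of the kit in either unit.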

Cultures were incubated under anoxic conditions at two light intensities: 4.0 μE m−2 s−1, simulating the conditions of the lake, and 40.0 μE m−2 s−1, representing the laboratory light regime. Both incubation settings followed a 16/8-h light-dark photoperiod. All cultures were grown in triplicate, and growth was monitored over a 12-day incubation period. Growth trajectories were analyzed using linear mixed-effects models implemented in R (version 4.5.0), with time, medium (N+ vs. N−), and light intensity included as fixed effects and biological replicate as a random effect. Time was modeled using natural splines to accommodate non-linear growth dynamics, as implemented in the splines package (Wang and Yan, 2021). Mixed-effects models were fitted using the lme4 package (Bates et al., 2015), and statistical significance of fixed effects was assessed using Satterthwaite's approximation as implemented in lmerTest (Kuznetsova et al., 2017). Analyses were conducted separately for each strain. Final biomass was analyzed using linear mixed-effects models with medium and light as fixed effects and replicate as a random effect.

Flow cytometer

Flow cytometry (FCM) was used to monitor the growth and purity of the cultures. The analysis was conducted with a BD Accuri C6 flow cytometer equipped with two lasers (488 nm and 640 nm) and with scatter and fluorescence detectors. Two scatter parameters were measured: forward scatter (FSC, related to particle size) and side scatter (SSC, related to internal granularity). To identify photosynthetic bacteria, an FSC-H threshold of 2,000 was applied to exclude debris and abiotic particles, followed by an FL3-A threshold > 1,100 to select cells with autofluorescence from chlorophyll or bacteriochlorophyll. The analysis was limited to 50 μL per sample, and samples were diluted where necessary so as not to exceed 1,000 events mL−1, as previously shown (Danza et al., 2017; Di Nezio et al., 2021).
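The two-step gating described above can be sketched in a few lines. This is a minimal illustration, not the instrument software: the event records, the dilution factor, and the function names are hypothetical, and treating the FSC-H threshold as inclusive is an assumption.

```python
FSC_H_MIN = 2000          # FSC-H threshold from the text (excludes debris and abiotic particles)
FL3_A_MIN = 1100          # FL3-A threshold from the text (autofluorescent cells)
SAMPLE_VOLUME_UL = 50.0   # analysed volume per sample, from the text


def count_phototrophs(events):
    """Count events that pass both gates; each event is a dict of channel values."""
    return sum(
        1 for e in events
        if e["FSC-H"] >= FSC_H_MIN and e["FL3-A"] > FL3_A_MIN
    )


def cells_per_ml(gated_count, dilution_factor=1.0, volume_ul=SAMPLE_VOLUME_UL):
    """Convert a gated event count into cells per mL of the undiluted culture."""
    return gated_count / volume_ul * 1000.0 * dilution_factor


# Hypothetical events, for illustration only.
events = [
    {"FSC-H": 5200, "FL3-A": 8400},   # passes both gates
    {"FSC-H": 1500, "FL3-A": 9000},   # too small: treated as debris
    {"FSC-H": 7300, "FL3-A": 600},    # no autofluorescence
    {"FSC-H": 2600, "FL3-A": 1500},   # passes both gates
]
gated = count_phototrophs(events)
concentration = cells_per_ml(gated, dilution_factor=10.0)
print(gated, concentration)
```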

DNA extraction and sequencing

After cultivation in the laboratory (see section “Isolation and growth conditions of anoxygenic phototrophic sulfur bacteria”), all samples were filtered through a polycarbonate filter (Isopore 0.2 μm PC membrane, 25 mm diameter) using a vacuum pump (Vacuubrand GmbH Co. KG, Wertheim, Germany) connected to a filtration ramp (Pall Corporation, New York, NY, USA) until the filter was completely clogged (approx. 5–10 mL). Genomic DNA was extracted following the standard phenol-chloroform extraction protocol provided by Thermo Fisher Scientific. Genomes were sequenced by Fasteris (GeneSupport SA) using PacBio SMRT HiFi sequencing on a Sequel IIe system. FASTQ files were quality-checked using FastQC (v0.11.9) and deemed of good quality (Andrews, 2010).

Genome assembly and annotation

De novo assembly was performed using Flye (v2.9.4) and polished using Circlator (v1.5.5) to remove repetitive regions, attempt chromosome circularization, and set the start coordinate at the dnaA gene (Hunt et al., 2015; Kolmogorov et al., 2019). All individual assemblies were manually reviewed to assess the quality of the identified contigs and to remove artifacts from the sequencing and assembly process, including contigs smaller than 10 kbp. All de novo assemblies were checked for contamination and completeness using CheckM (v1.2.2) and BUSCO (v5.8.2_cv1; Parks et al., 2015; Tegenfeldt et al., 2025). Genes were annotated using the NCBI Prokaryotic Genome Annotation Pipeline (v2025-05-06.build7983; Tatusova et al., 2016). Functional annotation of genomes, including Clusters of Orthologous Groups (COG) category assignment, was performed using eggNOG-mapper (v2.1.12; Cantalapiedra et al., 2021). All assembled genomes and raw sequences were submitted to NCBI and are available under the BioProject PRJNA1122537 (provisional).
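The contig size filter and the N50 contiguity metric reported for the assemblies can be illustrated as follows. This is a sketch under stated assumptions: the contig lengths are invented for the example, the function names are hypothetical, and contigs of exactly 10 kbp are assumed to be kept.

```python
def filter_contigs(lengths, min_len=10_000):
    """Keep contigs of at least min_len bp (the text removes contigs smaller than 10 kbp)."""
    return [n for n in lengths if n >= min_len]


def n50(lengths):
    """Length of the shortest contig in the set of longest contigs covering >= 50% of the assembly."""
    total = sum(lengths)
    running = 0
    for n in sorted(lengths, reverse=True):
        running += n
        if running * 2 >= total:
            return n
    raise ValueError("n50 is undefined for an empty assembly")


# Hypothetical contig lengths in bp, for illustration only.
raw_lengths = [3_100_000, 12_000, 4_000, 900]
kept = filter_contigs(raw_lengths)
print(kept, n50(kept))
```

For an assembly dominated by a single chromosome-scale contig, N50 is simply the length of that contig, which is why N50 tracks genome size in near-complete assemblies.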

Phylogenetic analysis

Average Nucleotide Identity (ANI) was calculated with FastANI (v1.34; Jain et al., 2018). Representative full-length 16S rRNA sequences were retrieved from the Bacterial 16S rRNA RefSeq Targeted loci project (PRJNA33175) or extracted from the de novo assemblies using R Biostrings (v2.76.0). Complete genome assemblies for our bacteria of interest were retrieved from NCBI Datasets, comprising 11 GSB and 22 PSB genomes. All sequence analyses were performed separately for the two bacterial lineages (GSB and PSB).

Maximum likelihood (ML) phylogenetic trees were constructed using two distinct approaches: one based on bacterial 16S rRNA sequences, and another based on the amino acid sequences of 100 randomly selected, concatenated single-copy orthologs. Single-copy orthologous proteins were identified using OrthoFinder (Emms and Kelly, 2019). Both datasets were aligned using MUSCLE (v3.8.31), and poorly aligned regions were trimmed using trimAl with the automated1 option (Edgar, 2004; Capella-Gutiérrez et al., 2009). ML tree inference was conducted using IQ-TREE 3 with 1000 ultrafast bootstrap replicates and 1000 aLRT tests to assess branch support (Hoang et al., 2018). ModelFinder was employed to automatically select the best-fit substitution model for nucleotide sequences (Kalyaanamoorthy et al., 2017). For the protein dataset, a partitioned model approach was applied to determine the best-fit amino acid substitution model for each gene partition (Chernomor et al., 2016). Tree visualizations were prepared using iTOL v7 (Letunic and Bork, 2024).
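The selection and concatenation of single-copy orthologs can be sketched as below. This is a toy illustration of the general idea rather than the pipeline used in the study: the data structures, the random seed, and all function names are assumptions, and real input would come from the OrthoFinder results and the trimmed alignments.

```python
import random


def single_copy_orthogroups(orthogroups, genomes):
    """Return orthogroup IDs with exactly one gene in every genome."""
    return sorted(
        og for og, members in orthogroups.items()
        if all(len(members.get(g, [])) == 1 for g in genomes)
    )


def sample_orthogroups(og_ids, k=100, seed=42):
    """Randomly pick up to k orthogroups; a fixed seed keeps the choice reproducible."""
    rng = random.Random(seed)
    return sorted(rng.sample(og_ids, min(k, len(og_ids))))


def concatenate(alignments, genomes):
    """Join per-orthogroup aligned sequences into one supermatrix row per genome."""
    return {g: "".join(aln[g] for aln in alignments) for g in genomes}


# Toy data, for illustration only.
genomes = ["genomeA", "genomeB"]
orthogroups = {
    "OG1": {"genomeA": ["a1"], "genomeB": ["b1"]},
    "OG2": {"genomeA": ["a2", "a3"], "genomeB": ["b2"]},   # duplicated in genomeA
    "OG3": {"genomeA": ["a4"], "genomeB": ["b3"]},
    "OG4": {"genomeA": ["a5"]},                            # missing from genomeB
}
aligned = {
    "OG1": {"genomeA": "MK-L", "genomeB": "MKAL"},
    "OG3": {"genomeA": "GT", "genomeB": "GS"},
}
selected = sample_orthogroups(single_copy_orthogroups(orthogroups, genomes), k=100)
supermatrix = concatenate([aligned[og] for og in selected], genomes)
print(selected, supermatrix)
```

Keeping the orthogroup boundaries alongside the supermatrix is what makes a partitioned model possible, since each gene can then receive its own substitution model.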

Transcriptomics: RNA extraction and analysis

To ensure a sufficient RNA yield, we first cultivated C. okenii LaCa in the laboratory and subsequently placed the cultures in the BL using 50-cm-long vertical dialysis bags (inflated diameter of 62.8 mm; Carl Roth GmbH Co. KG, Karlsruhe, Germany). These bags allow small molecules (< 20 kDa) to pass through but isolate C. okenii LaCa from other microorganisms in the BL, as already shown in previous studies (Storelli et al., 2013; Di Nezio et al., 2021). The samples enclosed in the dialysis bags were incubated for 1 month prior to RNA analysis, so that they had time to adapt to the environmental conditions.

For transcriptomic analysis, C. okenii LaCa cells were collected on filters from three different dialysis bags, first on July 16, 2020 (with bioconvection; Supplementary Figure S3A) and again on September 17, 2020 (no bioconvection; Supplementary Figure S3B). Immediately after filtration, the filters were soaked in RNAlater (Qiagen, Hilden, Germany) for 5 min and then frozen at −20 °C. RNA was extracted using the RNeasy Plus Universal Mini Kit (Qiagen) following the protocol “Purification of total RNA Using the RNeasy Plus Universal mini kit” for the TissueLyser II, using the complete filter as starting material and a mixture of glass beads of different sizes (0.1 mm, 0.5 mm, and 1.0 mm). DNase treatment was performed using the Ambion TURBO DNA-free kit (Thermo Fisher Scientific, Waltham, MA, USA) following the manufacturer's instructions. Quantification of RNA was carried out with the Qubit RNA HS Assay kit (Thermo Fisher Scientific) using a sample volume of 1.0 μL. NanoDrop absorbance ratios at 260/280 nm and 260/230 nm were measured to check for impurities.

Complementary DNA (cDNA) for sequencing was prepared using the PCR-cDNA Barcoding Kit (SQK-PCB109; Oxford Nanopore Technologies, Oxford, UK) following the manufacturer's instructions. For Oxford Nanopore Technologies (ONT) library preparation, 50–100 fmol of reverse-transcribed DNA in 11.0 μL were used, and sequencing was performed with an ONT R9.4 flow cell. Quality control (QC) metrics of the MinION RNA sequencing of the lake dialysis bag samples from July and September are reported in Supplementary Table S3.

Basecalling was performed on raw FAST5 files using Guppy (v4.5.2), adapter removal was performed using pychopper (v2.5.0), followed by one step of poly-A removal using cutadapt (v4.6). Ribosomal reads were removed using RiboDetector (v0.3.1) prior to transcript quantification with oarfish (v0.6.5) using the de novo C. okenii assembly presented in this study (Deng et al., 2022; Zare Jousheghani et al., 2025). Reads mapping to RNA genes were excluded from the analysis. All subsequent analyses were run in R (v4.5.0) using packages in the tidyverse (v2.0.0) for data manipulation and visualization. Differential expression analysis was run on the resulting count files using DESeq2 (v1.48.0) with apeglm (v1.30.0) for log fold change (LFC) shrinkage (Love et al., 2014; Zhu et al., 2019). Genes with an absolute log2 fold change (|log2FC|) > 1, baseMean > 5.7 counts, and adjusted p-value < 0.05 were defined as differentially expressed.
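The three cut-offs used to call differentially expressed genes can be expressed as a simple filter. The sketch below is illustrative only: the gene records are invented, the function name is hypothetical, and treating a missing adjusted p-value as non-significant is an assumption. The sign convention (positive values meaning higher expression relative to the reference condition) must match the contrast used in the actual analysis.

```python
LFC_MIN = 1.0         # |log2 fold change| must exceed this
BASE_MEAN_MIN = 5.7   # mean normalized count must exceed this
PADJ_MAX = 0.05       # adjusted p-value must be below this


def classify_gene(log2fc, base_mean, padj):
    """Return 'up', 'down', or 'ns' (not significant) relative to the reference condition."""
    if padj is None or padj >= PADJ_MAX:
        return "ns"
    if base_mean <= BASE_MEAN_MIN or abs(log2fc) <= LFC_MIN:
        return "ns"
    return "up" if log2fc > 0 else "down"


# Hypothetical records: (gene, log2 fold change, baseMean, adjusted p-value).
toy = [
    ("geneA", 2.4, 120.0, 1e-6),
    ("geneB", -1.8, 40.0, 0.01),
    ("geneC", 3.0, 4.0, 0.001),    # fails the baseMean cut-off
    ("geneD", 0.5, 300.0, 1e-4),   # fails the fold-change cut-off
    ("geneE", 2.0, 50.0, None),    # no adjusted p-value available
]
calls = {name: classify_gene(lfc, bm, p) for name, lfc, bm, p in toy}
print(calls)
```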

Gene Set Enrichment Analysis (GSEA) was performed using clusterProfiler (v4.16.0) and fgsea (v1.34.0; Yu et al., 2012; Korotkevich et al., 2021). The Gene Ontology (GO) database was created using AnnotationForge (v1.50.0) from Bioconductor.

Results

High-quality genome assemblies

We successfully assembled and annotated four bacterial genomes isolated from Lake Cadagno, two purple sulfur bacteria (PSB, family Chromatiaceae), and two green sulfur bacteria (GSB, family Chlorobiaceae). All genomes were sequenced using PacBio HiFi and assembled with Flye, resulting in high quality assemblies with high completeness (>93%) and low contamination (< 2%), as assessed by CheckM and BUSCO (Table 2; further details in Supplementary Table S1). The PSB strains sequenced include Chromatium okenii LaCa and Thiodictyon syntrophicum Cad16T. The GSB are Chlorobium phaeobacteroides 1VII D7 and Chlorobium clathratiforme Cad4DE. Genome sizes ranged from approximately 3.0–7.7 Mbp, with contig numbers between 1 and 4. N50 values were consistently high, reflecting the contiguity of the assemblies (Table 2).

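| Organism | Type | Genome size (bp) | Contig number | GC content (%) | N50 (Mbp) | Cov. | Complet. (%) | Contam. (%) | GenBank provisional accession |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Chromatium okenii LaCa | PSB | 3′137′866 | 2 | 49.96 | 3.127 | 194x | 93.51 | 0.28 | CP158465–CP158466 |
| Thiodictyon syntrophicum Cad16T | PSB | 7′736′645 | 3 | 66.22 | 6.835 | 24x | 99.25 | 0.32 | CP158467–CP158469 |
| Chlorobium phaeobacteroides 1VII D7 | GSB | 3′104′529 | 1 | 48.24 | 3.105 | 218x | 93.37 | 2.03 | CP158472 |
| Chlorobium clathratiforme Cad4DE | GSB | 3′007′024 | 1 | 48.05 | 3.007 | 42x | 97.64 | 0.46 | CP158483 |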

Genome assembly characteristics.

All strains were sequenced with PacBio HiFi long reads and assembled with Flye. Completeness and contamination values were determined by CheckM.

Annotation revealed between 2,849 and 6,743 protein-coding genes per genome, with varying numbers of rRNA operons, tRNAs, and CRISPR arrays (Table 3). Notably, T. syntrophicum presented the largest genome and highest gene count, while the other genomes were approximately half its size. The Clusters of Orthologous Groups (COG) functional classification of genes from the complete dataset is provided in Supplementary Table S2.

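| Organism | Type | Protein-coding genes | rRNA genes (5S, 16S, and 23S) | tRNAs | ncRNAs | Pseudogenes | CRISPR arrays |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Chromatium okenii LaCa | PSB | 2,849 | 9 (3, 3, 3) | 49 | 4 | 36 | 3 |
| Thiodictyon syntrophicum Cad16T | PSB | 6,743 | 6 (2, 2, 2) | 50 | 4 | 128 | 5 |
| Chlorobium phaeobacteroides 1VII D7 | GSB | 2,841 | 6 (2, 2, 2) | 47 | 3 | 62 | 4 |
| Chlorobium clathratiforme Cad4DE | GSB | 2,933 | 6 (2, 2, 2) | 47 | 3 | 83 | 3 |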

Genome annotation summary.

As shown in Figure 1, the assembly of the PSB C. okenii LaCa strain represents a major improvement over the previously published assembly (GCF_002958735.1): it now consists of a single circular chromosome and a secondary circular contig (approx. 10 kbp). This contig did not map to the main chromosome and contained a DNA polymerase, a recombinase, and a transcription regulator, in addition to hypothetical proteins.

Figure with three panels comparing genome assemblies and gene category counts for Chromatium okenii LaCa. Panel A shows a contig length vs. cumulative length plot, with the new assembly in red having longer contigs than the previous study in blue. Panel B displays BUSCO completeness, fragmentation, and missingness as stacked horizontal bars, with the new assembly in green and more complete than the previous one. Panel C is a grouped bar chart comparing counts of genes in COG categories, where red and blue bars show data from this study and from Luedin 2019, respectively, across functional gene categories.

Comparison of the Chromatium okenii LaCa genome assembly generated in this study with the previously published assembly (GCF_002958735.1; Luedin et al., 2019a). (A) Contig length (kbp) plotted against the cumulative percentage of total assembly length. (B) BUSCO completeness metrics. (C) Number of genes assigned to each COG functional category. In all panels, the assembly from this study is shown in red and the previously published assembly in blue.

Genome completeness assessment using BUSCO revealed a slightly reduced completeness for the C. okenii genome compared to the Chromatiaceae lineage, with 7.1% of expected single-copy orthologs missing (Supplementary Table S1). Among these, the canonical replication initiator gene dnaA was notably absent from all assemblies existing for this species. To assess whether this absence reflects an artifact or a lineage-specific feature, we performed an orthology-based comparative analysis across 23 complete Chromatiaceae genomes using OrthoFinder (see material and methods section). The orthogroup containing dnaA was absent in all C. okenii assemblies available and in only one other Chromatiaceae genome, whereas orthogroups encoding other core components of the replication machinery, such as dnaB, dnaN and the gyrases gyrA and gyrB, were conserved across all genomes analyzed.

The genome of Thiodictyon syntrophicum Cad16T closely matched the previously published assembly (GCF_002813775.1), confirming its identity and genomic stability nearly 10 years later. Chlorobium phaeobacteroides 1VII D7 was assembled into a single contig with 2,851 genes and showed 98.789% ANI with the DSM 266 strain (GCF_000015125.1), previously isolated from meromictic Lake Blankvann in Norway. The high ANI and nearly identical 16S rRNA sequences (99.87% BLAST identity) confirm that both strains belong to the same species. Chlorobium clathratiforme Cad4DE was assembled into a complete genome for the first time. This strain, synonymous with Pelodictyon clathratiforme and Pelodictyon phaeoclathratiforme, shares 99.996% ANI with the BU-1/DSM 5477 strain, originally isolated from the monimolimnion of Lake Buchensee, Germany. This represents the first complete genome under the name Chlorobium clathratiforme. The similarities between the old and new genomes are evident both at the orthologous protein sequence level (Figure 2) and at the 16S rRNA level, as illustrated by the phylogenetic tree (Supplementary Figure S1).

Phylogenetic trees labeled A and B display evolutionary relationships among bacterial strains based on genome sequences. Branches are annotated with strain names, accession numbers, and support values. Key strains from the current study are highlighted in bold. Scale bars denote evolutionary distance.

Phylogenetic relationships of the four bacteria among all publicly available complete genomes of closely related species. The maximum likelihood consensus tree was constructed from 100 randomly selected single-copy orthologs. Bootstrap support values are shown for nodes with support higher than 70%. (A) Phylogenetic tree for orthologous sequences of Chromatiales genomes. (B) Phylogenetic tree for orthologous sequences of Chlorobiales genomes. Ca., Candidatus; st., strain.

Bioconvection: seasonal changes in gene expression

To explore the effect of bioconvection on C. okenii physiology, we performed exploratory RNA sequencing in situ on pure cultures using dialysis bags at two different times: in July (bioconvection active) and in September (inactive). Differential expression analysis between July and September samples identified a total of 91 differentially expressed genes (DEGs, 28 downregulated and 68 upregulated), from a total of 770 protein-coding genes detected in the experiment (Figure 3A and Supplementary Table S4). Gene Set Enrichment Analysis (GSEA) was performed to investigate the biological significance of the 91 DEGs (Supplementary Figure S4).

Genes associated with nitrogen fixation (nifHDK, nifENB, nifT, nifV) were strongly downregulated in July, a period characterized by active bioconvection (Figure 3B). We also observed enrichment of Gene Ontology (GO) terms that indicate cell proliferation, such as gene expression, biosynthetic process, and primary metabolic process (as well as translation initiation and macromolecule biosynthesis; Supplementary Table S5), coinciding with the seasonal occurrence of bioconvection. Among the genes enriched in July, we found the chaperonin groEL, nuoF and nuoG (involved in oxidoreduction), and the light-harvesting antenna LH1. We also compared gene expression between day and night in July to investigate why bioconvection persists even without light. This comparison showed no significant changes in gene expression (data not shown).

Panel A shows a volcano plot of gene expression with log2 fold change on the x-axis and negative log10 false discovery rate on the y-axis, highlighting downregulated genes in blue, upregulated genes in orange, and non-significant genes in gray. Panel B features ridge plots of enrichment distribution for functional categories, with blue indicating downregulated and orange indicating upregulated gene sets across biosynthetic process, gene expression, primary metabolic process, nitrogen cycle metabolic process, and nitrogen fixation.

Differential analysis of gene expression of C. okenii cultures in dialysis bags between July and September. (A) Volcano plot of differentially expressed genes between July and September, with July taken as a reference point in terms of gene expression level. FDR values are capped at 10^−10 and displayed as triangles for visualization purposes. (B) Gene set enrichment analysis (GSEA) of Biological Process (BP) terms. The five highest-scoring gene categories are shown. All enriched categories have adjusted p-values < 0.0001.

Nitrogen fixation

Nitrogen pathway annotation

Nitrogenase (nif) genes are usually found in highly conserved operons and generally have very similar phylogenetic histories. We found the three core nitrogenase components nifHDK and the cofactor assembly proteins nifENB in all the assembled genomes (Figure 4).

Panel A shows a presence-absence heatmap of nif genes across four bacterial species, with orange squares indicating presence and white squares indicating absence for each gene. Panel B presents gene cluster maps for each species, displaying the organization and classification of nif genes by color: green for catalytic, blue for biosynthetic, orange for other nif, and pink for nif-associated genes, with genomic position given along the x-axis.

Nitrogen pathway in two PSB and two GSB genomes. (A) Presence of the key nif genes in the four genomes sequenced and annotated in this study. (B) Operon arrangement in the genomes of the two PSB and two GSB considered. Catalytic nif subunits are shown in green while biosynthetic subunits are shown in pink. Genes are grouped in clusters of at least 4 features separated by less than 30 kb.

Nitrogen fixation in laboratory

The nitrogen-fixing capacity implied by the presence of the necessary genes in the genomes was tested experimentally in the laboratory. We monitored the growth capacity of the PSB C. okenii LaCa and T. syntrophicum Cad16T, as well as the GSB C. phaeobacteroides 1VII D7, in standard Pfennig medium with a nitrogen source (standard: with NH4Cl) and without one (no NH4Cl: with NaCl). The growth of all phototrophs was monitored at two different light intensities, one similar to environmental conditions (4 μE m−2 s−1) and the other with a higher intensity similar to laboratory conditions (40 μE m−2 s−1).

The ammonium (NH4+) concentration in the medium was measured before and after bacterial growth to assess (1) its actual utilization and (2) the potential production of excess ammonium. Before the growth experiment, a value of 92.0 ± 2 mg L−1 was measured in the standard Pfennig medium, while in the modified (No NH4Cl) medium the value was 0.0 mg L−1. After 12 days of incubation, we measured the ammonium concentrations in all cultures again. In the standard medium, we observed a reduction, with final values of 3.7, 1.8, and 5.6 mg L−1 of ammonium for C. okenii LaCa, T. syntrophicum Cad16T, and C. phaeobacteroides 1VII D7, respectively. In the Pfennig medium modified with NaCl instead of NH4Cl, we found no trace of ammonium, which therefore remained at 0.0 mg L−1 for all cultures even after the experiment.
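To put the ammonium measurements in the standard medium into perspective, the fraction of the initial ammonium consumed by each strain can be derived directly from the reported values. The sketch below is plain arithmetic on those numbers; it ignores the ± 2 mg L−1 uncertainty on the initial value, and the function name is hypothetical.

```python
INITIAL_MG_L = 92.0   # ammonium measured in the standard medium before inoculation

# Final ammonium concentrations (mg/L) after 12 days, as reported in the text.
FINAL_MG_L = {
    "C. okenii LaCa": 3.7,
    "T. syntrophicum Cad16T": 1.8,
    "C. phaeobacteroides 1VII D7": 5.6,
}


def percent_consumed(initial, final):
    """Share of the initial ammonium that is no longer present at the end of incubation."""
    return (initial - final) / initial * 100.0


consumed = {s: percent_consumed(INITIAL_MG_L, f) for s, f in FINAL_MG_L.items()}
for strain, pct in consumed.items():
    print(f"{strain}: {pct:.1f}% of the initial ammonium consumed")
```

By this estimate, all three strains depleted well over 90% of the ammonium supplied in the standard medium within the incubation period.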

Figure 5 shows that all microorganisms can survive and reproduce under all growth conditions. The two PSB strains showed similar growth in both standard Pfennig medium (with NH4Cl) and modified medium (without NH4Cl) with only nitrogen in gaseous form (Figure 5, green line). Mixed-effects modeling of growth curves indicated that, in C. okenii and T. syntrophicum, nitrogen availability and light intensity modulated growth dynamics in a time-dependent manner; however, these effects did not translate into consistent differences in mean growth levels (Supplementary Table S7). In contrast,
