Lifestyle and Horizontal Gene Transfer-Mediated Evolution of Mucispirillum schaedleri, a Core Member of the Murine Gut Microbiota

Shifts in gut microbiota composition have been associated with intestinal inflammation, but it remains unclear whether inflammation-associated bacteria are commensal or detrimental to their host. Here, we studied the lifestyle of the gut bacterium Mucispirillum schaedleri, which is associated with inflammation in widely used mouse models. We found that M. schaedleri has specialized systems to handle oxidative stress during inflammation. Additionally, it expresses secretion systems and effector proteins and can modify the mucosal gene expression of its host. This suggests that M. schaedleri undergoes intimate interactions with its host and may play a role in inflammation. The insights presented here aid our understanding of how commensal gut bacteria may be involved in altering susceptibility to disease.

archaea, with BLASTP (version 2.2.31+) (Camacho et al., 2009) with a custom log e-value of -3. IS families of genes producing significant alignments were extracted from the BLAST results.
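The e-value cutoff above (a log10 e-value of -3, i.e. e ≤ 1e-3) could be applied to tabular BLASTP output along the following lines. This is a minimal sketch that assumes the standard 12-column tabular format (`-outfmt 6`), in which the e-value is the 11th column. The function name `significant_hits` and the sample rows are hypothetical, and the way IS families are encoded in the subject identifiers is not specified in the text.

```python
def significant_hits(lines, max_log10_evalue=-3.0):
    """Return (query, subject, evalue) tuples for hits at or below the cutoff.

    `lines` are rows of BLAST tabular output (assumed -outfmt 6 column order).
    """
    cutoff = 10.0 ** max_log10_evalue  # log10 e-value of -3 -> 1e-3
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue <= cutoff:
            hits.append((query, subject, evalue))
    return hits


# Hypothetical rows: qseqid, sseqid, pident, length, mismatch, gapopen,
# qstart, qend, sstart, send, evalue, bitscore
example_rows = [
    "geneA\tIS_hit1\t45.0\t200\t110\t0\t1\t200\t1\t200\t1e-50\t180",
    "geneB\tIS_hit2\t30.0\t80\t56\t0\t1\t80\t1\t80\t0.5\t25",
    "geneC\tIS_hit3\t35.0\t120\t78\t0\t1\t120\t1\t120\t1e-5\t50",
]
kept = significant_hits(example_rows)
```

Comparing against `10 ** cutoff` rather than taking the logarithm of each e-value avoids special handling of hits reported with an e-value of 0.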
PHAST (Zhou et al., 2011) was used for the similarity-based analysis of putative prophages and phage-like proteins, whereby genomic regions that are enriched in protein-coding genes with known phage homologs are detected.

Identification of non-identical genes
Non-identical genes between the two strains were identified using MicroScope's Gene Phyloprofile interface by subtracting almost identical genes (homology constraints: minLrap ≥ 0.9, maxLrap ≥ 0.9, identity ≥ 100%) from all genes (homology constraints: minLrap ≥ 0.9, maxLrap ≥ 0.9, identity ≥ 30%). Here, minLrap is the length of the match divided by the length of the shorter protein, and maxLrap is the length of the match divided by the length of the longer protein.
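The two coverage ratios and the set subtraction can be illustrated with a short sketch. This is not MicroScope's implementation: the hit records, their field names and the function names below are hypothetical, and only the thresholds (0.9 for both ratios, 30% and 100% identity) are taken from the text.

```python
def lrap(match_len, len_a, len_b):
    """Return (minLrap, maxLrap) for a match between two proteins."""
    min_lrap = match_len / min(len_a, len_b)  # match length / shorter protein
    max_lrap = match_len / max(len_a, len_b)  # match length / longer protein
    return min_lrap, max_lrap


def passes(hit, min_identity):
    """Apply the coverage constraints plus an identity threshold (percent)."""
    min_lrap, max_lrap = lrap(hit["match_len"], hit["len_a"], hit["len_b"])
    return min_lrap >= 0.9 and max_lrap >= 0.9 and hit["identity"] >= min_identity


def non_identical_genes(hits):
    """Genes passing the 30% identity filter minus those passing the 100% filter."""
    homologous = {h["gene"] for h in hits if passes(h, 30.0)}
    identical = {h["gene"] for h in hits if passes(h, 100.0)}
    return homologous - identical


# Hypothetical best hits of strain A genes against strain B
example_hits = [
    {"gene": "g1", "match_len": 300, "len_a": 300, "len_b": 300, "identity": 100.0},
    {"gene": "g2", "match_len": 290, "len_a": 300, "len_b": 310, "identity": 97.5},
    {"gene": "g3", "match_len": 150, "len_a": 300, "len_b": 300, "identity": 100.0},
]
result = non_identical_genes(example_hits)
```

In this toy input, `g1` is removed as identical, `g3` fails the coverage constraints in both filters, and only `g2` remains as a non-identical homolog.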

Putative horizontally-transferred genes
For horizontal gene transfer (HGT) analysis, phylogenetic trees for each gene were calculated using the software PhyloGenie (Frickey & Lupas, 2004). Trees were constructed using RAxML version 8.2.4 with the GAMMA model of rate heterogeneity and rapid bootstrapping (100 bootstraps) (Stamatakis, 2014). To identify putative HGT events, we collected trees containing a node that connects Mucispirillum exclusively with a specified phylogenetic group and with no other groups. The trees that met this criterion were selected using PHAT (part of the PhyloGenie software package). The phylogenetically closest species in each tree were identified using R (version: 3.2.1) (Team, 2013) and the R packages phytools (version: 0.4-56) (Revell, 2012) and ape (Paradis et al., 2004) by extracting the species with the minimum phylogenetic distance to M. schaedleri within the node containing it.
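The last step, picking the species at minimum phylogenetic distance from M. schaedleri, can be sketched as follows. The analysis itself was done in R with phytools and ape; the Python below is only an illustration that assumes "phylogenetic distance" means the patristic distance (sum of branch lengths along the path between two tips). The tree encoding, the function names and the toy branch lengths are all hypothetical.

```python
def path_to_root(tree, node):
    """Map each ancestor of `node` (and the node itself) to its distance from `node`.

    `tree` maps every node to (parent, branch_length); the root has parent None.
    """
    distances, travelled = {}, 0.0
    while node is not None:
        distances[node] = travelled
        parent, length = tree[node]
        travelled += length
        node = parent
    return distances


def patristic_distance(tree, a, b):
    """Sum of branch lengths on the path between tips a and b."""
    up_a, up_b = path_to_root(tree, a), path_to_root(tree, b)
    # Among shared ancestors, the most recent one minimises the summed distance
    # (assuming non-negative branch lengths).
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


def closest_taxa(tree, focal, candidates):
    """Return the candidate tips at minimum patristic distance from `focal`."""
    dists = {t: patristic_distance(tree, focal, t) for t in candidates}
    best = min(dists.values())
    return sorted(t for t, d in dists.items() if d == best)


# Hypothetical gene tree: node -> (parent, branch length to parent)
toy_tree = {
    "root": (None, 0.0),
    "n1": ("root", 0.1),
    "Clostridium": ("root", 0.5),
    "Mucispirillum": ("n1", 0.2),
    "n2": ("n1", 0.05),
    "Helicobacter": ("n2", 0.1),
    "Campylobacter": ("n2", 0.3),
}
nearest = closest_taxa(toy_tree, "Mucispirillum", ["Helicobacter", "Campylobacter"])
```

Ties are returned together, which mirrors cases in which two genera are equally close to the Mucispirillum sequence.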

Genome reconstruction and comparison
Recently, the genome of M. schaedleri ASF 457 (genome AYGZ) from a culture maintained in an American collection was announced (Wannemuehler et al., 2014). We had meanwhile sequenced the genome of M. schaedleri ASF 457 (genome MCS) using a culture maintained in a strain collection in Germany. While we believe that the two cultures are originally derived from the same stock from the Charles River laboratories, it is unclear how long the two cultures have been separated and how many bacterial generations may have occurred subsequently. Neither genome is closed, but estimates based on detection of tRNAs and conserved housekeeping genes indicate that the genomes are largely complete (Table S1). Both genomes have a GC content of 31% and a coding density of 88%. The number of detected coding sequences (CDS) without artifacts is 2,227 for the AYGZ and 2,218 for the MCS genome. The content of genomic objects in the two genomes is highly similar, with shared CDSs accounting for 92% of all CDSs in both genomes (shared CDSs are defined as having ≥ 80% amino acid similarity and ≥ 80% sequence length). As the genomes are not closed, it is not possible to determine whether the differences in CDSs are due to absence from a genome or to technical artifacts (incomplete sequencing and/or assembly).
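For readers unfamiliar with the two summary statistics, the sketch below shows one common way to compute GC content and coding density. It assumes that coding density is the fraction of genome positions covered by at least one CDS, which may differ from the exact definition used by the annotation platform. The function names, the toy sequence and the coordinates are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else 0.0


def coding_density(genome_length, cds_intervals):
    """Fraction of the genome covered by CDS.

    Intervals are 1-based, inclusive (start, end); overlapping CDS are
    counted only once.
    """
    covered, last_end = 0, 0
    for start, end in sorted(cds_intervals):
        start = max(start, last_end + 1)  # skip positions already counted
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered / genome_length


# Hypothetical 100 bp replicon with three CDS, two of which overlap
toy_density = coding_density(100, [(1, 40), (30, 60), (71, 98)])
toy_gc = gc_content("AATTGCATAT")
```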

Putative electron donors and carbon sources
The genome encodes 15 proteases, of which 4 (AYGZ) or 5 (MCS) are predicted to be secreted, and 3 aminopeptidases. Catabolic pathways for glutamine, asparagine, and cysteine are present. The genome encodes multiple ABC transporters for amino acids in general, and in particular for leucine/isoleucine/valine, methionine and toluene. Transporters for peptides (ABC-type) and oligopeptides (appBCD) as well as a permease for oligopeptides are also present.
An ABC-type transporter for polyamines was also detected. The genome features an extremely reduced repertoire of polysaccharide degradation machinery, with just 3 glycoside hydrolases belonging to family 57 (α-amylases).
M. schaedleri has genes for degradation of glycerophosphodiester and glycerol utilization,

Cofactors and vitamins
M. schaedleri can produce coenzyme A (CoA), which plays a role in the oxidation and biosynthesis of fatty acids and in the TCA cycle. Biotin, a water-soluble B-vitamin that, as a coenzyme, is involved in gluconeogenesis and in the synthesis of isoleucine, valine and fatty acids, can also be produced. It can be synthesized from 7-keto-8-aminopelargonate, from riboflavin and flavin adenine dinucleotide (FAD), an essential flavin cofactor that is involved in a variety of redox reactions, and from thiamin diphosphate, which plays an essential role in energy metabolism as a cofactor of a variety of enzymes like pyruvate dehydrogenase or transketolase. The pathway for de novo biosynthesis of coenzyme B12 (cobalamin coenzyme) is incomplete, but M. schaedleri can synthesize coenzyme B12 from cobalamin. Although no transporter for cobalamin was detected, it can probably be synthesized de novo from cobinamide via an uncharacterized route.

SecD-SecG and SecY translocases from the Sec translocase-mediated pathway, and TatA and TatC translocases from the twin-arginine translocation (Tat) system for protein translocation across and insertion into membranes were detected. M. schaedleri has transporters for molybdate (ABC-type), peptide/nickel (ABC-type), nickel (ABC-type), iron (ABC-type), magnesium, cobalt (ABC-type), cadmium and zinc. A sodium:proton antiporter was detected and the genome putatively also encodes a biotin transporter, a lipopolysaccharide transporter and a sulfate transporter. A drug resistance MFS transport protein (drug:H+ antiporter-2 family), a putative multidrug-efflux transporter MexB and the multidrug efflux system protein SugE for exporting antibiotics and other cytotoxic substances are also present.

Storage compounds
M. schaedleri appears to be able to produce glycogen as a storage compound, as a glycogen synthase (E.C. 2.4.1.21) and a glycogen phosphorylase (E.C. 2.4.1.1) were detected in the genome. Adjacent to the glycogen synthase there are three glycosyl hydrolases (family 57) that may be involved in glycogen processing. A polyphosphate kinase (Ppk) was detected, which enables the organism to synthesize polyphosphate (poly P) from inorganic phosphorus or from the terminal phosphate of ATP. Besides serving as a potential energy source, poly P might also play a role in both stress response and pathogenicity. A poly P-AMP-phosphotransferase (PAP), which phosphorylates AMP to ADP using poly P as a substrate, could not be detected.

Motility
The genome encodes more than 80 proteins classified in the COG group Cell Motility

Virus defense (CRISPR)
M. schaedleri has a CRISPR/Cas system with a length of 694 nucleotides, which is identical between the two genomes. The cas1, cas2 and cas9 genes were detected, indicating that it is a type II CRISPR/Cas system (Makarova et al., 2011). There are 10 spacers, all of them identical between both genomes, and a repeat consensus sequence with a length of 36 nucleotides. We searched for targets of the crRNA spacers, but only one of the 10 spacers had a match, a single hit to Bacillus thuringiensis MC28 plasmid pMC189 (NC_018687).
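To make the repeat/spacer layout concrete, the sketch below extracts spacers from an array sequence by splitting on the repeat. It is a simplification that assumes every repeat copy matches the consensus exactly and that the array begins and ends with a repeat; real arrays often carry degenerate terminal repeats, which dedicated CRISPR finders handle. The function name and the toy sequences are hypothetical and are not the M. schaedleri array.

```python
def split_crispr_array(array_seq, repeat):
    """Return the spacers of an array of the form repeat-spacer-...-repeat."""
    parts = array_seq.split(repeat)
    # A well-formed array leaves empty strings before the first and after
    # the last repeat; anything else means the array has flanking sequence.
    if len(parts) < 3 or parts[0] != "" or parts[-1] != "":
        raise ValueError("array must start and end with the repeat")
    return parts[1:-1]


# Hypothetical toy array with an 8-nt repeat and two spacers
toy_repeat = "GTTTTAGA"
toy_spacers = ["ACGTAC", "TTGGCCAA"]
toy_array = toy_repeat + toy_repeat.join(toy_spacers) + toy_repeat
recovered = split_crispr_array(toy_array, toy_repeat)
```

Each recovered spacer could then be used as a query in a similarity search against plasmid and phage sequences, as in the target search described above.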

Mobile genetic elements
One intact prophage, including putative head and tail proteins, was detected in the AYGZ genome. All of the predicted phage-like proteins are also present within a region in the MCS genome, but this region was not predicted to be an intact prophage. The prophage region has a size of 134.6 kb and 32% of CDS (43 total) in this region encode phage-like proteins. Consistent with known features of integration sites, multiple tRNAs and transposases were detected in this region, although an integrase could not be identified. The putative head and tail proteins are located in close vicinity to the CRISPR/Cas system region. No plasmid was identified.

Putative horizontally transferred genes (HGT)
M. schaedleri putatively acquired several genes involved in virulence, resistance and defense, and mobile genetic elements from other bacteria. The gene of the HlyD family secretion protein, which is involved in the transport of hemolysin A, shares a node with Helicobacter.
Parts of the CRISPR/Cas system were putatively acquired from Bacilli and Epsilonproteobacteria, with the CRISPR-associated Csn1 (Cas9) family protein coming from Staphylococcus (Bacilli) and the CRISPR-associated endonuclease Cas1 protein coming from either Campylobacter or Helicobacter (Epsilonproteobacteria), which also appear to be the origin of the vapD gene. The Tra conjugal transfer proteins and the VirB complex from a putative type IV secretion system are related to genes from Proteobacteria. Genes involved in resistance appear to have their origin in a wider range of phylogenetic groups, with a drug resistance MFS transporter (drug:H+ antiporter-2 family) coming from Bifidobacterium, a drug/metabolite transporter (DMT family) coming from Staphylococcus, a putative multidrug resistance protein MexB coming from Desulfovibrionaceae, and a putative β-lactamase coming from Campylobacter. The putative methyl-accepting chemotaxis protein is related to genes from Campylobacter. Clostridia appear to be the source of several proteins involved in transport, cofactor biosynthesis, respiration and oxygen stress response. Transport proteins for cobalt, as well as the nitroreductase and rubrerythrin, are related to those from Clostridium. A cobalamin synthase putatively comes from Eubacterium, the hydrogenase 2 from Geobacter and the catalase from Sphaerochaeta. Most of the putative HGT genes are classified in the COG classification scheme as replication, recombination and repair (COG category L), with a large fraction coming from Firmicutes. Firmicutes are also by far the largest group putatively contributing to coenzyme transport and metabolism (COG category H) and inorganic ion transport and metabolism (COG category P), whereas Proteobacteria appear to be an important source of laterally transferred genes in most of the other COG categories.