Metabolic Fingerprints from the Human Oral Microbiome Reveal a Vast Knowledge Gap of Secreted Small Peptidic Molecules

Metabolomics is the ultimate tool for studies of microbial functions under any specific set of environmental conditions (D. S. Wishart, Nat Rev Drug Discov 45:473–484, 2016, https://doi.org/10.1038/nrd.2016.32). This is a great advance over studying genes alone, which only inform about metabolic potential. Approximately 25,000 compounds have been chemically characterized thus far; however, the richness of metabolites such as SMs has been estimated to be as high as 1 × 10^30 in the biosphere (K. Garber, Nat Biotechnol 33:228–231, 2015, https://doi.org/10.1038/nbt.3161). Our classical, one-at-a-time activity-guided approach to compound identification continues to find the same known compounds and is also incredibly tedious, which represents a major bottleneck for global SM identification. These challenges have prompted new developments of databases and analysis tools that provide putative classifications of SMs by mass spectral alignments to already characterized tandem mass spectrometry spectra and databases containing structural information (e.g., PubChem and AntiMarin). In this study, we assessed secreted peptidic SMs (PSMs) from 27 oral bacterial isolates and a complex oral in vitro biofilm community of >100 species by using the Global Natural Products Social Molecular Networking and the DEREPLICATOR infrastructures, which are methodologies that allow automated and putative annotation of PSMs. These approaches enabled the identification of an untapped resource of PSMs from oral bacteria showing species-unique patterns of secretion with putative matches to known bioactive compounds.

ABSTRACT Recent research indicates that the human microbiota play key roles in maintaining health by providing essential nutrients, providing immune education, and preventing pathogen expansion. Processes underlying the transition from a healthy human microbiome to a disease-associated microbiome are poorly understood, partially because of the potential influences from a wide diversity of bacterium-derived compounds that are ill defined. Here, we present the analysis of peptidic small molecules (SMs) secreted from bacteria and viewed from a temporal perspective. Through comparative analysis of mass spectral profiles from a collection of cultured oral isolates and an established in vitro multispecies oral community, we found that the production of SMs both delineates a temporal expression pattern and allows discrimination between bacterial isolates at the species level. Importantly, the majority of the identified molecules were of unknown identity, and only ~2.2% could be annotated and classified. The catalogue of bacterially produced SMs we obtained in this study reveals an undiscovered molecular world for which compound isolation and ecosystem testing will facilitate a better understanding of their roles in human health and disease.

KEYWORDS Lactobacillus, Streptococcus, Veillonella, biofilms, oral microbiology, peptidic small molecules

Host-microbiome interfaces are known to be of critical importance in structural, immunological, and metabolic functions (1,64). These communication hot spots are key in human health and can be found in the oral cavity, which also provides the perfect portal of entry for microbes. Most of the approximately 1,000 oral bacterial species that have been identified so far are considered commensal. They have coevolved with their host and carry out critical functions in training the immune system, protecting against epithelial cell injury, and suppressing pathogenic microbial growth (2). The last decade's advances in sequencing technologies have revealed a heterogeneous distribution of bacterial taxa not only within the oral cavity (3,4) but also between individuals within populations and between geographically distinct populations (e.g., Western versus Chinese) (5,6).
Each human mouth harbors a unique bacterial diversity consisting of on average approximately 150 bacterial taxa (5), and in addition to this complexity within each of these exclusive communities, all primary types of organism interaction can exist: i.e., consumer-resource interactions, competition, and mutualism (7,8). Most of our knowledge of the human microbiome derives from associative large-scale DNA sequencing studies, which suggest that health-associated microbiomes are highly heterogeneous (65,66) and that specific pathogens are associated with disease (9). Learning from previous studies, we now understand that microbiome research needs to move beyond correlation-based analysis, and we need to gain a deeper knowledge of the molecular mechanisms supporting its complex network of functions. The social language (i.e., primary and secondary metabolites) of host-microbiome interactions needs to be identified, quantified, and ultimately functionally characterized. Numerous human microbiome-associated biosynthetic genes have been identified, which encode the major metabolic classes of SMs (10). These have a variety of biological functions, including antibacterial and immune modulation activities (10,11). Antagonistic activities between bacterial species associated with the human microbiome are mediated by SMs such as lantibiotics, bacteriocins, and microcins, which help both commensals and pathogens compete and establish resilient colonization (10). The latter SMs are usually active against a narrow spectrum of Gram-positive bacteria that are closely related to the producing strain (10). Other SMs are host targets, such as Escherichia coli-produced enterotoxins (12) and modified amino acids (e.g., the neurotransmitter tryptamine) synthesized by gut bacteria (13).
A few peptidic SMs (PSMs) that mediate antagonistic interactions between bacteria in the oral cavity have been isolated and structurally identified, such as mutanobactins (14), salivaricins (15), and proteases (17). However, the mechanisms that trigger the biosynthesis of these SMs in complex host-microbiome interactions in vivo are yet unknown. A few recent studies of human skin and gut reveal the potential of thousands of chemical species (10,18). Here, we report on the temporal production and overall chemodiversity of the PSM secretome of human oral cavity-associated bacterial isolates belonging to the Actinomyces, Fusobacterium, Lactobacillus, Porphyromonas, Streptococcus, and Veillonella genera, as well as an in vitro oral biofilm model system containing more than 100 bacterial species (19,20). Our previous comparative metagenomics and metatranscriptomics studies of the same biofilm model system confirmed its taxonomic and metabolic similarities to natural oral biofilms (19,20). We also conducted parallel metatranscriptomics and global metabolomics (extracellular and internal) using gas chromatography-mass spectrometry (GC-MS) to study the interplay between gene expression and core metabolites. Here we expand on our knowledge of the in vitro biofilm community to include PSMs, which are well-known bioactive compounds that play key roles in cell-to-cell signaling and interactions with a host (21).
By employing liquid chromatography-tandem mass spectrometry (LC-MS/MS) on cell-free bacterial growth medium extracts, using the Global Natural Products Social Molecular Networking (GNPS) (22) and the DEREPLICATOR (23) infrastructures, we putatively identified a large number of PSMs secreted from the in vitro biofilm community as well as from individual bacterial isolates. We studied PSM secretion under different incubation conditions and incubation time points, including a rich SHI medium and a minimal chemically defined medium (cdm). The latter incubation medium supported an overall high metabolic activity and metabolite production but no growth in our previous study of the in vitro biofilm community (20). This condition is favorable for secondary metabolite production as most are produced during the stationary phase and not during exponential growth.
Together these results show a complex expression of SMs, whose role in human health and disease is yet unknown. The dynamic and species-specific production observed here implicates their potential roles in host-microbiome interactions and bacterial interspecies interactions. This work provides the first comprehensive landscape of unexplored PSMs produced by representatives of known commensal and pathogenic oral bacteria. The study lays the groundwork for further identification, classification, and investigation of novel groups of clinically and ecologically important PSMs.

RESULTS AND DISCUSSION
LC-MS/MS is a key analytical technology for detecting SMs of low molecular weight that cannot easily be structurally identified by genome sequencing (24,25). Here we applied an ultrahigh-resolution quadrupole time of flight mass spectrometry (UHR-qTOF MS) approach that, compared to standard methods, is more sensitive in resolving closely spaced spectral peaks and in detecting low-intensity peaks. The approach allowed us to apply collision energy stepping coupled with TOF transfer stepping, which facilitates thorough fragmentation of a diversity of molecules in a single LC-MS run. Given the relatively few studies comprehensively targeting identification and classification of secreted molecules of human oral bacterial species grown alone or existing as complex multispecies biofilms, our goal was to capture unique and previously unknown PSM signatures by applying the workflow presented in Fig. 1. We hypothesized that bacteria with distinct taxonomic backgrounds produce a rich diversity of PSMs and that some PSMs are shared among bacteria, while others are uniquely produced. This information can be used to target ecologically and clinically important PSMs and also to potentially taxonomically annotate SMs in the natural environment. We proposed that PSMs obtained from monocultures and mixed cultures of bacteria vary over time due to changes in metabolic activity across different stages of growth. Such a phenomenon is well acknowledged for all living organisms but largely underexplored for the human microbiome. In the first part of this study, we grew 27 bacterial isolates representing taxonomically broad groups of oral bacteria (e.g., Streptococcus, Veillonella, Lactobacillus, Porphyromonas, Actinomyces, and Fusobacterium) (see Fig. S1 in the supplemental material) and an already established in vitro oral biofilm model system (19,20).
This model system was previously shown to be both taxonomically and transcriptionally stable over time (19,20), which allowed us to study PSM secretion on an hourly basis here. The study was conducted under anaerobic conditions in carbohydrate (i.e., sucrose, glucose, or lactate)-amended growth medium, which we previously showed maintained the highest taxa diversity representative of an environment in the anterior regions of the maxilla, where food particles may be stuck for longer periods of time and where saliva velocity is low (26,27). This is a highly understudied environment that is of significant interest due to its association with caries disease. Bacteria were incubated in 1-ml cultures in a 24-well plate setup, and growth could be observed at the bottom of each growth well after an initial incubation period in a blood-based SHI medium (Fig. S1). (For more details on growth conditions, see Materials and Methods.) Spent growth media were collected from each growth well at different time points for LC-MS/MS analysis (Fig. S1). The obtained MS/MS spectra were analyzed by using the GNPS network (22) and DEREPLICATOR (23) infrastructures, which were previously developed to analyze large MS/MS data sets where one sample may contain several thousand MS/MS spectra (Fig. 1). Currently, GNPS (http://gnps.ucsd.edu/) (22) has approximately 100 million MS/MS spectra available for analysis, of which 7.7 million have matches (dereplicated) to 15,477 known compounds (28). Additional features in GNPS include the annotations of theoretical masses for the most common adducts M+H, M+H2, M+K, and M+Na (i.e., variants of the same compound but with direct additions of new molecules). DEREPLICATOR enables high-throughput PSM identification that is compatible with large-scale mass spectrometry-based screening platforms.
The tool constructs theoretical spectra for all peptides in chemical structure databases (e.g., PubChem and AntiMarin), which enables PSMs to be dereplicated without reference spectra. To demonstrate the power of DEREPLICATOR, Mohimani and colleagues (23) analyzed the approximately 100 million spectra in GNPS and identified twice as many PSMs. Based on these findings, we employed both infrastructures to improve putative annotation of mass spectra in this study. Despite the eventual influence of adducts, the high number of PSMs (400 to 900 parent masses per bacterial isolate and time point of growth) that we observed here is noteworthy (see discussion below).

Fig. 1 caption. To identify parent masses and their corresponding ion fragment profiles in each sample, growth extracts were analyzed with an UltiMate 3000 UHPLC system and a Maxis qTOF mass spectrometer equipped with an electrospray ionization (ESI) source. (Steps 2A and B) Parent masses obtained from replicate samples were sorted into bucket tables and compared between time points (step 2A) and between bacterial isolates (step 2B) by using Venn diagrams and cluster analyses. (Steps 3A and B) The Global Natural Products Social Molecular Networking infrastructure (22) was employed to putatively annotate MS/MS spectra by spectral alignments of query spectra with ~20,000 benchmarked MS/MS spectra in the GNPS library (step 3A). GNPS networks revealed associations between query spectra and benchmark spectra, which contributed to level 2 annotations of ~50 PSMs (step 3B). (Steps 4A and B) The DEREPLICATOR tool (23) was used to annotate MS/MS spectra and predict the probability of each annotation by calculating false discovery rate (FDR) scores (step 4A). These annotations were based on structural homologies with PSMs in databases such as PubChem and correspond to level 2 annotation standards (step 4B).
Discussions of annotations of ion fragmentation spectra (MS/MS) from mass spectrometry experiments in the following sections follow the "level 2" annotation standard described by the Metabolomics Standards Initiative (29). This standard is based on a substantial body of literature that supports putative identification of SMs from their ion fragment spectra (30–36). Also, most of the annotations that were identified here matched benchmarked compounds that were obtained from isolated or commercial standards in house; therefore, we feel more confident assigning their putative annotations. In addition, to minimize effects of false positives, such as adducts, in our comparative data analysis, we only included SMs that could be identified in replicate samples.
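The adduct relationships mentioned above (M+H, M+H2, M+K, and M+Na) can be illustrated with a short calculation. The following Python sketch is not part of the GNPS or DEREPLICATOR code; the function names are hypothetical, "M+H2" is assumed here to denote the doubly protonated ion, and the 0.01-Da tolerance is an arbitrary illustrative value. It computes the theoretical m/z of each adduct from a neutral monoisotopic mass and checks whether two observed ions could be adducts of the same compound.

```python
# Monoisotopic masses (Da) of the charge carriers, electron mass already removed.
PROTON = 1.007276
SODIUM_ION = 22.989218
POTASSIUM_ION = 38.963158

# Adduct name -> (mass added to the neutral molecule M, charge state).
# Assumption: "M+H2" is interpreted as the doubly protonated ion [M+2H]2+.
ADDUCTS = {
    "M+H": (PROTON, 1),
    "M+H2": (2 * PROTON, 2),
    "M+Na": (SODIUM_ION, 1),
    "M+K": (POTASSIUM_ION, 1),
}


def adduct_mz(neutral_mass):
    """Return the theoretical m/z of each adduct for a neutral monoisotopic mass."""
    return {
        name: (neutral_mass + added) / charge
        for name, (added, charge) in ADDUCTS.items()
    }


def neutral_mass_from_mz(mz, adduct):
    """Back-calculate the neutral mass from an observed m/z and an assumed adduct."""
    added, charge = ADDUCTS[adduct]
    return mz * charge - added


def same_neutral_mass(mz_a, adduct_a, mz_b, adduct_b, tol_da=0.01):
    """True if two ions are consistent with one compound under the given adducts."""
    mass_a = neutral_mass_from_mz(mz_a, adduct_a)
    mass_b = neutral_mass_from_mz(mz_b, adduct_b)
    return abs(mass_a - mass_b) <= tol_da


if __name__ == "__main__":
    # Hypothetical neutral mass of 500.0 Da, used only for illustration.
    for name, mz in adduct_mz(500.0).items():
        print(f"{name}: {mz:.4f}")
```

A check of this kind shows why adducts can inflate feature counts: a single compound may appear at several parent masses unless such relationships are collapsed or, as done here, controlled for by requiring detection in replicate samples.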
PSM production over time during sugar fermentation. In a previous global genome mining study of thousands of genomes from bacterial isolates of the human microbiota and 752 metagenomes from five human body sites, 3,118 SM biosynthetic gene clusters (BGCs) were identified (37). The oral cavity was by far the richest environment (together with the gut) and was found to harbor 1,061 BGCs (37). Products of these BGCs are for the most part unknown, and the elucidation of these remains a daunting challenge for the understanding of key ecological functions of the human microbiome. Larger BGCs, such as those encoding polyketides (PKs) and nonribosomal peptides (NRPs), are often horizontally transferred between closely related strains and species, and therefore, bacterial phylogenetic signatures have been associated with these in previous studies (38,39). In this study, we applied hierarchical cluster analyses (using Pearson correlation) and multidimensional scaling (MDS) ordinations to capture relationships between MS/MS spectra at two growth stages (at 24 and 72 h of growth, respectively). We chose to incubate cultures for longer periods of time since secondary metabolites are known to be produced during the stationary phase (later stages of growth). Our previous findings from the in vitro biofilm community also showed that no significant cell division activity (i.e., growth) occurred during incubation under similar growth conditions (20). However, overall metabolic activity (gene transcription and metabolic output [i.e., primary and secondary metabolites]) remained high, indicating that most bacteria in the biofilms had entered the stationary phase (20). The cluster analyses revealed a significant difference (two-way analysis of variance [ANOVA] and Tukey's multiple comparisons test, with P values ranging between ≤0.001 and ≤0.05) between bacterial species and time points (Fig. 2; see Table S2 and Table S3). A few examples of species-specific MS/MS signatures were observed for S. parasanguinis and L. fermentum SHI-2, as their PSM profiles from two different time points clustered together for the respective species. At 24 h of growth, Streptococcus salivarius SHI-3 and Veillonella parvula SHI-1 clustered most closely to the species complex biofilm community from which they were originally isolated (Fig. 2) (30). These two species also showed concurrently high gene transcription activity at the genome level in low pH in a previous study (20). To address to what extent V. parvula SHI-1, S. salivarius SHI-3, and the in vitro biofilm community produced similar PSMs, we conducted GNPS MS/MS network analysis of ion fragment spectra from monocultures, including the two species and the in vitro biofilm. This revealed a total of 2,151 MS/MS features, of which the biofilms shared 418 (~20%) with S. salivarius SHI-3, 250 (~12%) with V. parvula SHI-1, and 233 (~11%) with both S. salivarius SHI-3 and V. parvula SHI-1, while 362 MS/MS features (~17%) were unique (see Table S4 posted at ftp://massive.ucsd.edu/MSV000079151/updates/2017-06-30_aedlund_64677506/other/). These results demonstrate that it is potentially feasible to use MS/MS spectra derived from monocultures of bacterial species to putatively annotate species-unique PSMs in highly complex and understudied biofilm communities. However, a limitation to this approach is that bacteria can have different physiologies when grown as planktonic versus biofilm or monospecies versus multispecies cultures, and therefore they may produce conditionally specific PSMs, leading to low overlap. Regardless of this discrepancy, our results suggest that cultivated lab strains of bacteria can serve as a guide to putatively annotate PSMs taxonomically in complex microbial communities, such as oral biofilms.
Such an annotation approach can have important applications in future disease diagnostics to verify the presence of specific pathogens. A cometabolic relationship between oral Veillonella and Streptococcus was previously identified in dual-species interaction studies where Veillonella uses Streptococcus-produced lactate as the sole carbon source (40). However, recent studies suggest that we are only just beginning to understand this relationship, as dual-species biofilms of the two are less susceptible to antimicrobial treatments than individual monoculture biofilms, suggesting a more complex metabolic interaction (41). Based on these previous studies, our findings here, and our previous metatranscriptomics study of the same in vitro biofilm model system (20), we suggest that a specific interaction between S. salivarius SHI-3 and V. parvula SHI-1 is represented by cometabolic interactions at the level of PSMs. In conclusion, by using the applied clustering approaches, no clear phylogenetic congruency could be revealed from MS/MS profile comparisons (e.g., Streptococcus species did not form a separate cluster). Therefore, we suggest that the overall production of PSMs in this study may reflect PSM biosynthesis via minor enzymatic modifications (e.g., methylation, acetylation, etc.) of peptides and amino acids and not biosynthesis via complex PK and NRP gene clusters with phylogenetic signatures. The overall presence of PSMs with no phylogenetic signature is also in line with our previous metatranscriptome study of the complex oral in vitro biofilm model system (20), where activity changes of the genes and pathways (KEGG orthology level) were not in total agreement with phylogeny in response to carbohydrate amendment and pH stress; e.g., S. agalactiae's differentially expressed pathways clustered closely with L. fermentum SHI-2, while S. parasanguinis activities clustered with Klebsiella sp. and not with other Streptococcus species.
To better understand the phylogenetic relatedness of PSMs in future studies, a larger number of bacterial isolates from major taxonomic groups, grown in different growth media under different growth conditions, should be considered.
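The Venn-style comparison of MS/MS features between the biofilm community and the two monocultures can be sketched as a simple set partition. The Python example below is an illustration only, not the GNPS implementation: it assumes that features have already been aligned across samples so that each one carries a shared identifier (for example, a network cluster index), and the sample names and feature identifiers are hypothetical.

```python
def venn_regions(feature_sets):
    """Count features by the exact combination of samples in which they occur.

    feature_sets maps a sample name to the set of feature identifiers detected
    in that sample. The result maps a frozenset of sample names to the number
    of features found in exactly those samples and no others.
    """
    regions = {}
    all_features = set().union(*feature_sets.values())
    for feature in all_features:
        members = frozenset(
            name for name, feats in feature_sets.items() if feature in feats
        )
        regions[members] = regions.get(members, 0) + 1
    return regions


def percent_of_total(regions):
    """Express each Venn region as a percentage of all distinct features."""
    total = sum(regions.values())
    return {members: 100.0 * count / total for members, count in regions.items()}


if __name__ == "__main__":
    # Hypothetical toy data: integers stand in for aligned MS/MS feature IDs.
    samples = {
        "biofilm": {1, 2, 3, 4, 5},
        "S_salivarius": {3, 4, 6},
        "V_parvula": {4, 5, 7},
    }
    for members, pct in percent_of_total(venn_regions(samples)).items():
        print(sorted(members), f"{pct:.1f}%")
```

Partitioning by exact membership keeps the regions mutually exclusive, so a feature shared by all three samples is counted once rather than in each pairwise overlap.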
A broad diversity of secreted PSMs is revealed across time. By applying a comparative network approach to ionized parent mass (MS/MS) profiles from different time points of growth, we found that bacteria secrete a wider variety of PSMs than is apparent from a single time point (Fig. 3 and 4), which suggests that natural product discovery endeavors may benefit from screening multiple growth stages of bacterial isolates. Approximately 400 to 900 PSMs with MS/MS profiles were produced by each isolate per time point (Fig. 4). Comparative MS/MS analysis of samples that contain a high diversity of metabolites is not an exact quantitative measure, because some parent masses may not be selected for collision-induced dissociation (CID). In addition, the method we applied here specifically targets peptidic SMs within the size range of approximately 100 to 2,000 Da, which excludes many other SMs. In this study, to circumvent loss of parent masses that do not dissociate, we applied collision energy stepping coupled with TOF transfer stepping, which is known to provide thorough fragmentation of a diversity of molecules in a single LC-MS run (42).
When analyzing all MS/MS profiles together, using the GNPS network tool, only 153 profiles had matching annotations (more details of these annotations are presented in the discussion below) (see Table S1 posted at ftp://massive.ucsd.edu/MSV000079151/updates/2017-06-30_aedlund_64677506/other/), which suggests that most PSMs produced by oral bacteria are unknown. Parent mass distributions of the identified masses varied from m/z 110 to 1,865 (see Fig. S3 in the supplemental material). Approximately one-third of parent masses ranged between m/z 110 and 299, while the remaining masses ranged between m/z 300 and 899. Only a smaller fraction was of larger sizes: m/z 900 to 1,865 (Fig. S3). An hourly comparison of secreted SMs from the in vitro biofilm community revealed a shift in SM profiles starting as early as 3 h after inoculation (Fig. 3). Relatively similar molecular masses were detected after 6 h; however, clear shifts occurred at 9 and 17 h of biofilm growth (Fig. 3). The implications of this dynamic behavior of PSMs for human health are unknown, which suggests that more emphasis should be put on understanding the role of PSM changes over time. Such information would provide a deeper knowledge of detailed short-term mechanisms that foster overall key functions of microbial communities.
GNPS and DEREPLICATOR annotation pipelines reveal PSM candidates with structural similarities to compounds with known bioactivity. By analyzing all obtained MS/MS spectra representative of the 27 bacterial species using GNPS network (22) and DEREPLICATOR approaches (23), we were able to identify putative analogs of PSMs belonging to known classes of compounds, which will be discussed here. It is important to highlight that even small changes in chemical structure (here viewed as differences in ion fragments between query MS/MS spectra and matching benchmark MS/MS spectra) can greatly impact biological function. Since we rely on automated annotation and a level 2 classification standard to identify PSMs, the following discussion regarding their biological role is not absolute.
Our previous study of the oral in vitro biofilm model system indicated the presence of PSMs such as single amino acid derivatives, dipeptides, and lactone-like compounds during growth at low pH (20), and therefore we chose to explore their production in more detail here. Results from GNPS networks revealed a total of 32 putative dipeptide annotations (see Table S1 posted at ftp://massive.ucsd.edu/MSV000079151/updates/2017-06-30_aedlund_64677506/other/). These were structurally similar to 13 known peptides with defined MS/MS profiles in the GNPS database (cyclo-Gly-Leu, cyclo-Leu-Pro, cyclo-Pro-Val, cyclo-Leu-Phe, cyclo-Pro-Gly, cyclo-Thr-Pro, cyclo-Val-Phe, cyclo-Ala-Leu, cyclo-Phe-4-Hyp, cyclo-Phe-Pro, cyclo-Trp-Pro, cyclo-Tyr-Pro, cyclo-Val-Pro) (see Table S1 posted at the above URL). The analysis showed a mixed taxonomic origin of these peptides (i.e., several phylogenetically distinct bacterial isolates were able to produce them). Although the functional role of most of these dipeptides is unknown, a few earlier studies showed they have critical roles in regulating the production of homoserine lactones (43,44). Other studies identified that they are capable of interacting with other bacterial community members at the level of gene transcription as well as regulating bacterial population sizes and survival (45,46). The previously identified PKS-NRPS metabolite mutanobactin A (m/z 721.4350), which was isolated from S. mutans UA159 (14), was also identified at all growth stages in S. mutans UA159 biofilms here (see Tables S1 and S5 at the above URL). Ion fragments from the growth extracts were compared to the pure compound (m/z 721.4380), which was available in our lab as a gift from the Qi lab (14) (see Table S5 at ftp://massive.ucsd.edu/MSV000079151/updates/2017-06-30_aedlund_64677506/other/).
Both GNPS and DEREPLICATOR were able to identify this PSM, showing that our experimental protocols were optimized for extraction and identification of such PKS-NRPS metabolites (see Tables S5 and S6 at the above URL). Other interesting putative PSMs, produced both by bacterial isolates and the in vitro biofilm community, were platelet activating factor (PAF) C-16 (m/z 524.368 to 524.378) and lyso-PAF (m/z 482.362 to 482.362)-like compounds (Fig. 5a; see Table S1 posted at the above URL). PAF-like PSMs could be identified in growth extracts from multiple bacterial species belonging to the genera Fusobacterium, Streptococcus, and Actinomyces and the periodontal pathogen Porphyromonas gingivalis. A lyso-PAF-like PSM was only identified in P. gingivalis growth extracts (Fig. 5a; see Table S1 posted at the above URL). PAF is a potent lipid mediator with various biological activities, including platelet and leukocyte activation. It is produced by eukaryotes; however, the final step in its biosynthesis (i.e., the conversion of lyso-PAF to PAF) has also been observed to be carried out by various bacterial strains, e.g., Escherichia coli (E. coli) K-12 (47,48), as well as Salmonella enterica serovar Typhimurium and Helicobacter pylori (49). Since a blood-based medium was used to seed each bacterial culture in this study, it is possible that PAF was synthesized from medium-derived lyso-PAF. PAF was not identified in the growth medium controls, and therefore our results support earlier findings that oral bacteria can convert eukaryotic lyso-PAF to PAF, which could play additional roles in platelet aggregation and inflammation. To verify that bacteria can conduct the last step of PAF synthesis in future studies, isotope-labeled lyso-PAF could be used as a substrate and traced through metabolic pathways in monocultures of bacterial members from the human microbiome.
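Comparisons between an observed parent mass and a reference standard, such as the mutanobactin A values reported above (m/z 721.4350 in growth extracts versus m/z 721.4380 for the pure compound), are commonly expressed as a mass error in parts per million (ppm). The Python sketch below shows that calculation; the function names and the 10-ppm tolerance are illustrative assumptions and are not taken from the study's methods.

```python
def ppm_error(observed_mz, reference_mz):
    """Signed mass error of an observed m/z relative to a reference, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def within_tolerance(observed_mz, reference_mz, tol_ppm=10.0):
    """True if the absolute mass error does not exceed tol_ppm.

    The default of 10 ppm is an assumed, illustrative threshold.
    """
    return abs(ppm_error(observed_mz, reference_mz)) <= tol_ppm


if __name__ == "__main__":
    # Values reported in the text for mutanobactin A.
    observed, reference = 721.4350, 721.4380
    print(f"mass error: {ppm_error(observed, reference):.2f} ppm")
    print("within 10 ppm:", within_tolerance(observed, reference))
```

For these two values the difference of 0.0030 m/z units corresponds to an error of roughly 4 ppm, which illustrates how small absolute differences translate at this mass range.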
Another putative PSM that was identified by the GNPS network analysis, from growth extracts of the periodontal pathogen P. gingivalis, was a 1-methyladenosine (m1a)-like molecule (m/z 282.120 to 282.122) (Fig. 5b; see Table S1 at the above URL). This modified nucleoside is known as a major mRNA and DNA modifier in both eukaryotic and prokaryotic cells (50). It occurs on thousands of different transcripts in eukaryotic cells at an estimated average transcript stoichiometry of 20% in humans (51). It responds to physiological conditions and correlates positively with protein production, which indicates a strong functional role of m1a in promoting translation of methylated mRNA (52). Compared to regular adenosine, m1a has an additional methyl group at the Watson-Crick interface, and due to this structural change, m1a can not only lead to truncated cDNA but also cause misincorporation at the site of read-through of cDNA (50). Based on these findings and the fact that P. gingivalis has the potential to invade human cells (e.g., oral epithelial cells), it is possible that a P. gingivalis-secreted nucleoside can interfere with the human DNA and RNA machinery. The role of bacterially produced nucleosides in eukaryotic DNA and RNA synthesis is completely unexplored and could be further investigated by studying human intracellular pathogens (e.g., species belonging to Fusobacterium, Porphyromonas, and the Chlamydia/Chlamydophila group).
An N-acetylserotonin-like molecule (m/z 219.112 to 219.115) was identified in multiple oral isolates belonging to the Streptococcus genus (i.e., S. sanguinis, S. mutans, S. salivarius SHI-3, and S. pneumoniae) (Fig. 5c; see Table S1 posted at ftp://massive.ucsd.edu/MSV000079151/updates/2017-06-30_aedlund_64677506/other/). N-Acetylserotonin is an intermediate in melatonin production and is now recognized as ubiquitous among living organisms, including humans, animals, plants, bacteria, fungi, and macroalgae (53–56). N-Acetylserotonin was recently shown to be produced by both photosynthetic bacteria and endophytic bacteria from grapevine roots (57,58). In the latter study, the bacterium Bacillus amyloliquefaciens SB-9 exhibited the highest level of in vitro melatonin secretion and also produced three intermediates of the melatonin biosynthesis pathway: 5-hydroxytryptophan, serotonin, and N-acetylserotonin (59). Our study is the first to observe an N-acetylserotonin-like molecule produced by oral bacterial community members. This is a particularly interesting finding, as it indicates that the human oral microbiome could possibly impact hormonal levels related to human mood and sleeping patterns. Interactions between bacterial production of N-acetylserotonin-like molecules and human cells could be tested in future experiments, and in vivo production can be monitored in saliva and plaque samples.
GNPS network analysis also identified a putative 3-indolepropionic acid (IPA)-like molecule (m/z 190.085 to 190.087) in growth extracts from various isolates belonging to the Streptococcus, Lactobacillus, Actinomyces, and Fusobacterium genera (Fig. 5c; see Table S1 posted at the above URL). IPA is already known to be produced by the human microbiome, which is interesting since it has shown neuroprotective abilities (60,61) and is an even more potent scavenger of hydroxyl radicals than melatonin, the most potent scavenger of hydroxyl radicals synthesized by human enzymes (47). Similar to melatonin but unlike other antioxidants, IPA scavenges radicals without subsequently generating reactive and prooxidant intermediate compounds (48,49).
To further elucidate PSM annotations of the obtained MS/MS spectra, we also analyzed all MS/MS spectra with the DEREPLICATOR infrastructure (23). DEREPLICATOR compares mass spectra to predicted spectra of peptide natural products obtained from available structure-based databases (e.g., PubChem and AntiMarin) and is therefore also based on putative annotations that need experimental validation for exact identification. In this study, DEREPLICATOR provided a list of annotated molecules (see Table S6 at the above URL), which was sorted based on matching score and false discovery rate (FDR). Six parent masses showed annotations that were significant (P < 0.001) with FDR values of 0% (see Table S6). As previously mentioned, mutanobactin A (m/z 720.424) could be identified as being secreted from S. mutans UA159. A putative BZR-cotoxin II (m/z 964.573), earlier identified from endophytic fungi (62), was also identified in growth extracts from Streptococcus gordonii ATCC 10558 (FDR, 0%). Moreover, two cyclic peptides (i.e., putative annotations eptidemanamide [m/z 853.383] and anacyclamide A10 [m/z 1,052.53]) were also identified from growth extracts of Streptococcus pneumoniae TCH8431 and S. sanguinis VMS66, respectively.
Also, a putative L-valyl-L-leucyl-L-prolyl-L-valyl-L-prol peptide (m/z 652.408; rt, 215.65) was identified in 52 different growth extracts representative of bacterial isolates of various taxonomic origins (i.e., Veillonella, Streptococcus, and Actinomyces) (see Table S6 at the above URL). Similar results were obtained in GNPS, where this PSM was identified in 41 different spectra (m/z 652.407; rt, 215.650). Future research to identify and annotate this "unknown" PSM needs to be conducted to gain a deeper understanding of its role in oral microbial ecology. We also employed the VarQuest algorithm in DEREPLICATOR to search for analogues of known natural products in the obtained MS/MS data. This resulted in identification of additional putative PSMs from growth extracts of S.
In conclusion, by using a comparative analysis approach based on the annotation of parent masses in growth cultures of individual oral bacterial isolates in parallel with the highly complex in vitro biofilm community, we identified unique signatures of PSMs over time, which suggests that the oral microbiome has a highly dynamic chemotype that is potentially involved in the regulation of community succession and bacterium-to-bacterium interactions as well as cell-to-host signaling. As of today, little information exists on the role and the underlying drivers of the differential production of PSMs. The fact that only a few of the produced PSMs (2.2%) (see Table S1 posted at ftp://massive.ucsd.edu/MSV000079151/updates/2017-06-30_aedlund_64677506/other/) could be putatively annotated highlights that this area of microbiome research remains a black box and needs significant attention to provide a deeper understanding of key interactions between the human host and its microbiome. The GNPS and DEREPLICATOR annotation pipelines allowed us to identify patterns in metabolite production that can be linked to specific taxonomic units and culture conditions.
Furthermore, these annotation approaches also provided a survey of the biosynthetic capacity of common oral bacterial community members and a method to compare isolates based on the variety of their SMs.

MATERIALS AND METHODS
Description of growth media, bacterial strains, and saliva inoculum. Chemically defined medium (cdm) was modified after previous protocols (51,53). SHI medium was prepared after the protocol of Tian et al. (54). Detailed cdm and SHI medium preparation protocols are available at http://depts.washington.edu/jsmlab/downloads/protocols/. The following bacterial strains were obtained from the American Type Culture Collection (ATCC) and included in this study:  (8), Lactobacillus fermentum SHI-2, Veillonella parvula SHI-1, and Streptococcus salivarius SHI-3. Species identity was verified for each isolate by sequencing of the 16S rRNA-encoding gene at the Genewiz, Inc. (La Jolla, CA), sequence facility using the fD1-16S rRNA primer (55) encompassing approximately 600 bp of the gene. A saliva inoculum for in vitro biofilm growth was collected and pooled from six healthy subjects (ages 25 to 35 years) as described by Edlund and colleagues (19).
Incubation conditions for oral bacterial isolates and saliva-derived in vitro biofilms. Individual isolates (except those belonging to the Veillonella genus) and pooled saliva inoculum (19) were seeded into separate replicate growth wells (2 replicates per isolate or inoculum) in SHI medium and sucrose (0.5%) within sterile 24-well plates (20). Members of the Veillonella genus, which cannot metabolize sucrose, were seeded with lactate as a carbon source. Cell-free saliva was used for coating wells prior to growing biofilms (20). After 16 h of growth at 37°C under anaerobic conditions, SHI medium was carefully removed from each growth well and the bottom of each well was screened with the naked eye for biofilm formation. For cultures in which no visible biofilms had formed at this time point, incubation continued for another 24 or 52 h (see SHI 40 and 68 h in Fig. S1 at ftp://massive.ucsd.edu/MSV000079151/updates/2017-06-30_aedlund_64677506/other/) prior to continuing with the remaining biofilm wash and carbohydrate amendment steps. When biofilm establishment was confirmed and SHI medium had been removed, the biofilms were washed with extra care with buffered chemically defined medium (cdm). After the washing step, biofilms were starved in fresh cdm (pH 7) for 2 h at 37°C under anaerobic conditions. After 2 h of starvation, the spent cdm was carefully removed, and 1 ml of fresh cdm (pH 7) supplemented with either glucose (0.5%) or lactate (27.8 mM) was added to each growth well.
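The two carbon amendments above are given in different units (percent for glucose, millimolar for lactate). The following is a minimal sketch of the unit conversion, assuming that the 0.5% glucose is weight per volume (the text does not state this) and using the standard molar mass of glucose; the function name is illustrative only. Under that assumption, the glucose amendment corresponds to roughly the same molar concentration as the 27.8 mM lactate amendment.

```python
GLUCOSE_MOLAR_MASS = 180.156  # g/mol for C6H12O6


def percent_wv_to_mM(percent_wv, molar_mass_g_per_mol):
    """Convert a weight/volume percentage to a millimolar concentration."""
    # 1% (w/v) is 1 g per 100 ml, i.e., 10 g per liter.
    grams_per_liter = percent_wv * 10.0
    # g/l divided by g/mol gives mol/l; multiply by 1000 for mmol/l.
    return grams_per_liter / molar_mass_g_per_mol * 1000.0


# Assumption: the 0.5% glucose amendment is weight per volume.
print(round(percent_wv_to_mM(0.5, GLUCOSE_MOLAR_MASS), 1))
```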
Sample collection and organic solvent compound extraction for LC-MS/MS analyses. For each sample presented in Fig. S1, replicate biofilm samples were collected from two growth wells for organic solvent extraction in separate sterile 2.0-ml Eppendorf tubes. Samples were immediately frozen on dry ice inside the anaerobic chamber and then transferred to −80°C storage until organic solvent extraction. PSMs were extracted by using the following protocol: ethyl acetate (3:1 ratio) was added to each frozen sample (final volume, 1,200 µl). Directly after thawing, samples were resuspended by pipetting and sonication at maximum frequency for 10 min in a water bath followed by incubation at room temperature for 15 min. Samples were then centrifuged at 14,000 rpm for 5 min. Supernatants were transferred to clean tubes and dried under vacuum in a lyophilizer. When completely dry, 1 volume of acetonitrile and methanol (1:1 ratio) was added to each of the dried extracts, which were resuspended by vortexing, and a second round of sonication, incubation at room temperature, and centrifugation was performed, as described above. After the last centrifugation step, sample extracts were concentrated and dried under vacuum in a lyophilizer. Concentrated samples were stored at −80°C until they were ready to be analyzed by LC-MS/MS.

LC-MS/MS analysis. For LC-MS/MS analysis, the dried samples were dissolved in 150 µl 80% methanol and diluted 100-fold. The resuspended extracts were analyzed with an UltiMate 3000 ultrahigh-performance liquid chromatography (UHPLC) system (Thermo Fisher Scientific, Carlsbad, CA) using a Kinetex 1.7-µm C18 reversed-phase UHPLC column (50 by 2.1 mm) and a Maxis qTOF mass spectrometer (Bruker Daltonics, Billerica, MA) equipped with an electrospray ionization (ESI) source. The chromatography was performed at a flow rate of 0.5 ml/min throughout the run. MS spectra were acquired in positive-ion mode in the mass range of m/z 100 to 2,000. An external calibration with ESI-L low-concentration tuning mix (Agilent Technologies, La Jolla, CA) was performed prior to data collection, and internal calibrant hexakis(1H,1H,3H-tetrafluoropropoxy)phosphazene was used throughout the runs. A capillary voltage of 4,500 V, nebulizer gas pressure (nitrogen) of 160 kPa, ion source temperature of 200°C, dry gas flow of 7 liters/min at source temperature, and spectral rates of 3 Hz for MS1 and 10 Hz for MS2 were used. To acquire MS/MS fragmentation, the 10 most intense ions per MS1 were selected. A basic stepping function was used to fragment ions at 50% and 125% of the collision-induced dissociation (CID) energy calculated for each m/z (56) with a timing of 50% for each step. Similarly, basic stepping of collision radio frequency (RF) of 550 and 800 V peak to peak (Vpp) with a timing of 50% for each step and transfer time stepping of 57 and 90 µs with a timing of 50% for each step was employed. The MS/MS active exclusion parameter was set to 3 and was released after 30 s. The mass of the internal calibrant was excluded from the MS2 list.
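The precursor selection described above (10 most intense ions per MS1, active exclusion after 3 spectra with release after 30 s, calibrant mass excluded) can be illustrated with a short sketch. This is one plausible reading of those settings, not the vendor acquisition software's logic; the class and function names, the m/z binning to two decimals, and the 0.02-Da window for the static exclusion are assumptions made for illustration.

```python
from collections import defaultdict


class ActiveExclusion:
    """Tracks how often each precursor was fragmented and when it was excluded."""

    def __init__(self, exclude_after=3, release_after_s=30.0, decimals=2):
        self.exclude_after = exclude_after
        self.release_after_s = release_after_s
        self.decimals = decimals        # m/z binning precision (assumption)
        self.counts = defaultdict(int)  # fragmentation events per m/z bin
        self.excluded_at = {}           # m/z bin -> time exclusion started (s)

    def is_excluded(self, mz, now_s):
        key = round(mz, self.decimals)
        start = self.excluded_at.get(key)
        if start is None:
            return False
        if now_s - start >= self.release_after_s:
            # Exclusion window elapsed: release the precursor, reset its count.
            del self.excluded_at[key]
            self.counts[key] = 0
            return False
        return True

    def record(self, mz, now_s):
        key = round(mz, self.decimals)
        self.counts[key] += 1
        if self.counts[key] >= self.exclude_after:
            self.excluded_at[key] = now_s


def select_precursors(ms1_peaks, now_s, exclusion, top_n=10,
                      static_exclusions=(), mz_tol=0.02):
    """Pick up to top_n most intense MS1 ions that are not currently excluded.

    ms1_peaks is a list of (m/z, intensity) tuples from one MS1 scan.
    """
    chosen = []
    for mz, _intensity in sorted(ms1_peaks, key=lambda p: p[1], reverse=True):
        if any(abs(mz - x) <= mz_tol for x in static_exclusions):
            continue  # e.g., the internal calibrant mass
        if exclusion.is_excluded(mz, now_s):
            continue
        exclusion.record(mz, now_s)
        chosen.append(mz)
        if len(chosen) == top_n:
            break
    return chosen
```

With such a scheme, an abundant ion is fragmented a few times and then set aside for 30 s, which gives less intense coeluting ions a chance to be selected for MS/MS.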
Mapping PSMs in GNPS and DEREPLICATOR. Molecular networking analyses were performed at the UCSD-hosted Global Natural Products Social Molecular Networking web server (http://gnps.ucsd.edu/) (22). This platform provides an overview of the molecular features in mass spectrometry-based metabolomics by comparing ion fragmentation patterns to identify chemical relationships. This comparison is based upon cosine similarity scoring of MS/MS spectra and the visualization of those relationships in a 2-dimensional network in the Cytoscape software v.3.4.0 (63). A single chemical species is represented as a node, and the relatedness between spectra is represented as an edge. Molecular network analysis was performed separately on raw mzXML files obtained from growth medium extracts of individual bacterial isolates. The in vitro biofilm time series (0 to 21 h of growth) was analyzed in network analysis separately. The following network settings were applied: minimum cosine setting, 0.5; Network TopK, 10; minimal matched peaks, 3; minimum cluster size, 2. Run MS cluster was selected. Parent mass tolerance was set to 0.02 Da, and fragment ion mass tolerance was set to 0.02 Da. GNPS analysis parameters, networking statistics, and network summarizing graphs are available on the GNPS site at https://gnps.ucsd.edu/ProteoSAFe/result.jsp?task=48e75e72f8294f25b947a4ff5e828249&view=view_all_clusters_withID_beta.
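To make the edge criterion concrete, the following minimal sketch scores two centroided MS/MS spectra by cosine similarity and applies the thresholds listed above (fragment tolerance, 0.02 Da; minimum cosine, 0.5; minimal matched peaks, 3). It is a simplified illustration and not the GNPS implementation: it pairs peaks greedily, uses raw intensities, and ignores precursor mass shifts. The function names are hypothetical.

```python
import math


def cosine_score(spec_a, spec_b, fragment_tol=0.02):
    """Greedy cosine similarity between two spectra.

    Each spectrum is a list of (m/z, intensity) tuples. Peaks are paired when
    their m/z values differ by at most fragment_tol (Da); each peak is used
    at most once. Returns (score, number_of_matched_peaks).
    """
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0, 0

    # All peak pairs within tolerance, largest intensity products first.
    candidates = [
        (ia * ib, x, y)
        for x, (mza, ia) in enumerate(spec_a)
        for y, (mzb, ib) in enumerate(spec_b)
        if abs(mza - mzb) <= fragment_tol
    ]
    candidates.sort(reverse=True)

    used_a, used_b = set(), set()
    dot, matched = 0.0, 0
    for product, x, y in candidates:
        if x in used_a or y in used_b:
            continue
        used_a.add(x)
        used_b.add(y)
        dot += product
        matched += 1
    return dot / (norm_a * norm_b), matched


def is_edge(spec_a, spec_b, min_cosine=0.5, min_matched=3, fragment_tol=0.02):
    """Apply the network thresholds from the text to a pair of spectra."""
    score, matched = cosine_score(spec_a, spec_b, fragment_tol)
    return score >= min_cosine and matched >= min_matched
```

Two spectra are connected only when both conditions hold, so a high cosine score that rests on one or two shared fragments does not create an edge.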
The DEREPLICATOR program (23), a peptidic natural product workflow, was also employed to compare experimental MS/MS spectra against chemical structure databases (e.g., PubChem and AntiMarin). The following settings were used for DEREPLICATOR analysis: precursor and fragment ion mass tolerance, 0.02 Da; maximum charge, 3; and accurate P values, "yes." All other parameters were set to the default values. We also employed the VarQuest algorithm to search for analogues of known natural products in the obtained MS/MS data. The following running parameters were used: precursor and fragment ion mass tolerance, 0.02 Da; maximum charge, 2; maximum allowed modification mass, 150 Da; minimum matched peaks with known compound, 4; and accurate P values, "yes." All other parameters were set to the default values.
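The difference between the two searches can be illustrated at the level of precursor masses. The sketch below is only a mass prefilter under stated assumptions: it treats ions as protonated species, converts an observed m/z to candidate neutral masses for each allowed charge, and labels database entries as exact candidates (within 0.02 Da) or, when a maximum modification mass is given, as variant candidates (within 150 Da). It does not reproduce the spectral scoring or P value calculation of DEREPLICATOR or VarQuest, and the function names and database entries are hypothetical placeholders.

```python
PROTON_MASS = 1.007276  # Da


def neutral_masses(mz, max_charge=3):
    """Neutral masses implied by an [M+zH]z+ ion for z = 1..max_charge."""
    return [z * (mz - PROTON_MASS) for z in range(1, max_charge + 1)]


def prefilter(mz, database, tol=0.02, max_charge=3, max_mod=None):
    """List database compounds whose mass is compatible with the precursor.

    database is a list of (name, neutral_mass) tuples. Returns tuples of
    (name, charge, kind, mass_difference), where kind is "exact" or "variant".
    Variant candidates are reported only when max_mod is given.
    """
    hits = []
    for z, mass in enumerate(neutral_masses(mz, max_charge), start=1):
        for name, db_mass in database:
            delta = mass - db_mass
            if abs(delta) <= tol:
                hits.append((name, z, "exact", delta))
            elif max_mod is not None and abs(delta) <= max_mod:
                hits.append((name, z, "variant", delta))
    return hits


# Hypothetical database entries; the masses are placeholders, not real values.
example_db = [("compound_A", 719.417), ("compound_B", 800.0)]
exact_only = prefilter(720.424, example_db, max_charge=1)
with_variants = prefilter(720.424, example_db, max_charge=1, max_mod=150.0)
```

In this sketch, an exact search corresponds to calling `prefilter` without `max_mod`, and an analogue search corresponds to passing `max_mod=150.0` with `max_charge=2`.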