ABSTRACT
The ecological drivers that concurrently act upon both a virus and its host and that drive community assembly are poorly understood despite known interactions between viral populations and their microbial hosts. Hydraulically fractured shale environments provide access to a closed ecosystem in the deep subsurface where constrained microbial and viral community assembly processes can be examined. Here, we used metagenomic analyses of time-resolved-produced fluid samples from two wells in the Appalachian Basin to track viral and host dynamics and to investigate community assembly processes. Hypersaline conditions within these ecosystems should drive microbial community structure to a similar configuration through time in response to common osmotic stress. However, viral predation appears to counterbalance this potentially strong homogeneous selection and pushes the microbial community toward undominated assembly. In comparison, while the viral community was also influenced by substantial undominated processes, it assembled, in part, due to homogeneous selection. When the overall assembly processes acting upon both these communities were directly compared with each other, a significant relationship was revealed, suggesting an association between microbial and viral community development despite differing selective pressures. These results reveal a potentially important balance of ecological dynamics that must be in maintained within this deep subsurface ecosystem in order for the microbial community to persist over extended time periods. More broadly, this relationship begins to provide knowledge underlying metacommunity development across trophic levels.
IMPORTANCE Interactions between viral communities and their microbial hosts have been the subject of many recent studies in a wide range of ecosystems. The degree of coordination between ecological assembly processes influencing viral and microbial communities, however, has been explored to a much lesser degree. By using a combined null modeling approach, this study investigated the ecological assembly processes influencing both viral and microbial community structure within hydraulically fractured shale environments. Among other results, significant relationships between the structuring processes affecting both the viral and microbial community were observed, indicating that ecological assembly might be coordinated between these communities despite differing selective pressures. Within this deep subsurface ecosystem, these results reveal a potentially important balance of ecological dynamics that must be maintained to enable long-term microbial community persistence. More broadly, this relationship begins to provide insight into the development of communities across trophic levels.
INTRODUCTION
The identification of ecological drivers that shape microbial communities is a challenge in many environments, where high microbial diversity, a lack of time-resolved samples, and variable geochemistry limit a full understanding of community assembly. Hydraulically fractured deep subsurface environments can act as model systems for investigating ecological principles governing assembly processes. The process of hydraulic fracturing involves high-pressure injection of water, chemical additives, and proppant (usually sand) into deep hydrocarbon-bearing shale formations to generate fracture networks and recover oil and gas resources. During this activity, diverse microbial communities of bacteria and archaea are inadvertently injected into the deep subsurface and exposed to high pressure and temperature and rapidly increasing salinity driven by the dissolution of shale-derived minerals (1). These consistent physicochemical pressures decrease microbial diversity by selecting for a subset of introduced halophilic microorganisms that are able to subsequently colonize this new fractured shale ecosystem and persist for extended periods of time (>350 days) (2–4). Microbial community dynamics are additionally influenced by the presence of abundant viral populations that drive both top-down control via predation of community members and bottom-up control on microbial activity through the release of cellular nutrients via cell lysis (3, 5). Importantly, the subsurface fracture network is essentially a closed system following the hydraulic fracturing process, allowing the study of time-resolved community dynamics in the absence of new microbial and viral inputs. Together, this information suggests that hydraulically fractured wells can serve as model systems to investigate constrained community assembly processes.
Microbial communities are shaped by different ecological assembly processes (Table 1) (6). First, two types of selection can lead to deterministic shifts within microbial communities (7). When some environmental pressure drives two communities to be significantly divergent, this observed difference is due to “variable selection.” Variable selection occurs when turnover between two communities is greater than expected by random chance and has been observed when communities experience varied geochemical conditions or organic matter composition (8, 9). In comparison, “homogeneous selection” occurs when some common stressor pushes two communities to be convergent (e.g., when turnover is lower than expected by random chance alone) (7). For example, microbial communities from a successional soil environment were driven to more similar configurations (10). Second, differences in organismal movement through space can significantly influence community structure. When organisms are capable of easily dispersing, “homogenizing dispersal” drives communities to be more similar than under “dispersal limited” scenarios, which are characterized by low organismal movement leading to communities which drift apart (7, 11). When comparing communities separated by time rather than space, “dispersal limitation” suggests that communities are changing due to drift rather than the inability to mix. Last, when no process dominates (e.g., there is moderate dispersal and weak selection), the turnover between communities is considered “undominated” (7, 11).
Definition of terms used throughout the article
Relative to the processes that drive microbial community assembly, the factors influencing viral community structure are poorly understood (12). The close association between viruses and their hosts could lead to viral community assembly, which mirrors those processes experienced by the hosts (12–14). For example, the viral community may assemble according to variable selection when environmental filtering drives highly dynamic host abundances. Conversely, if the microbial community is phylogenetically consistent through time, the viral community may be affected by “homogenous selection,” leading to convergent community composition. However, viruses are still separately subjected to environmental processes, such as grazing, salinity, UV exposure, or lifestyle (lytic versus lysogenic), which can complicate predictions (13–16). The interplay between these factors and viral community assembly is underexplored, partly due to the lack of universal marker genes within viruses which makes viral communities challenging to study (17). Using a combination of marker genes and reference-based metagenomic sequencing, one study used null models to attribute deterministic (“nonrandom”) assembly to a viral subcommunity from macaque feces but did not identify specific processes (18). Furthermore, the interplay between microbial and viral community assembly processes is completely unexplored, to our knowledge. By investigating the processes governing the assembly of paired microbial and viral communities and their coordination, we can better contextualize interactions across different trophic levels.
Hydraulically fractured shales provide a closed system to investigate potentially constrained ecological assembly processes affecting both microbial and viral communities. Within these shales, we expect that microbial communities would initially experience significant variable selection (e.g., will be driven to divergent configurations) due to the rapid environmental changes encountered upon introduction into the subsurface (e.g., a freshwater system transitioning to brine-level salinity). Once the community acclimates to shale conditions, however, we might expect that a common community configuration will be maintained due to strong homogeneous selection (barring external perturbations) because ill-adapted members would have been filtered out by high pressure, temperature, and salinity. Such pattern is observed in many fractured shale ecosystems where Halanaerobium sp. is consistently a dominant microbial community member in late-produced water samples (2, 3, 5). In addition to abiotic controls, Borton et al. (2018) demonstrated that numerous metabolic handoffs likely help maintain these microbial communities, which could further enhance homogenous selection due to increased ecological interconnections (8, 19, 20). Additional studies have revealed that viral predation could exert a significant top-down control on the persisting microbial communities that may also result in community composition being relatively consistent through time (1, 3, 5). Given these past observations and known environmental forcing driven by in situ physicochemical conditions (e.g., salinity), we expect the microbial community structure will be primarily shaped by homogeneous, deterministic processes after injection into the deep subsurface. In turn, we expect that viral communities will assemble according to similar processes encountered by the host microbial community due to their narrow host range (21, 22). Specifically, selection should first deterministically cause significant shifts in viral community composition of the input community (i.e., variable selection) and then constrain community composition to be consistent through time (i.e., homogeneous selection).
In this study, we applied the ecological null modeling tools β-nearest taxon index (βNTI) and Raup-Crick (Bray-Curtis) (RCBC), to investigate the ecological assembly processes affecting both microbial and viral populations in deep fractured shale ecosystems. We leveraged time-resolved metagenomic data sets across multiple shale wells and used ribosomal protein S3 (rps3) as a marker gene to infer microbial community assembly analyses. The recovery of viral sequences from the same data sets also allowed the same ecological null modeling tools to be applied to over 80% of the observed viral community, a proportion previously inaccessible to these methods through the use of viral marker genes. From these data, we show that the assembly processes that govern these two distinct communities are significantly related to each other, supporting their strong interdependence. These results further demonstrate the need to study assembly processes across multiple trophic levels simultaneously to fully understand factors driving spatiotemporal dynamics of any one trophic level.
RESULTS AND DISCUSSION
Viral predation counterbalances environmental filtering in governing microbial community assembly processes.Samples collected from two hydraulically fractured wells in the Utica shale—termed Utica-2 (U-2) and Utica-3 (U-3)—are explored in Daly et al. in greater detail (5). Briefly, these wells exhibited geochemical characteristics similar to previously described fractured shales (3). Following hydraulic fracturing, produced waters from both wells became increasingly saline over time with U-2 fluids increasing from 1.3 g/liter to 172 g/liter total dissolved solids (TDSs) on day 302, and U-3 fluids increasing from 0.08 g/liter to 204 g/liter TDSs on day 159 (see Fig. S1 in the supplemental material). In order to measure the ecological assembly processes governing microbial community structure within these brine-like samples, two null modeling analyses, namely, β-nearest taxon index (βNTI) and Raup-Crick (Bray-Curtis) (RCBC) were performed. These approaches leverage randomized community structures to determine whether observed turnover between communities is higher or lower than would be expected by random chance. βNTI is a phylogenetic null model capable of distinguishing deterministic and stochastic assembly processes (23). When |βNTI| is greater than 2, deterministic processes have driven observed community differences. These deterministic processes can be further investigated. If βNTI is greater than 2, variable selection has driven the observed communities to be more dissimilar. Such environmental filtering has been observed when divergent geochemical conditions affected community composition (24). If βNTI is less than −2, homogenous selection pushed the observed communities toward a common configuration, as can be observed when river water exerts a potentially common set of stressors in some riverbeds (25, 26). When |βNTI| is less than 2, however, communities are as different as expected by random chance due to stochastic processes. In combination with the taxonomic null model RCBC, we can further distinguish these stochastic processes just as βNTI can distinguish deterministic processes (7, 23). If RCBC is greater than 0.95, a decreased ability to mix has led to significant community drift leading a dispersal limitation signal. In contrast, when a system experiences significantly high exchange rates, a homogenizing dispersal signal (RCBC of less than −0.95) can arise. Lastly, if |RCBC| is less than 0.95, no single assembly process was strong enough to exert control and thus an undominated signal would be observed (7). Given that RCBC values are primarily useful when βNTI indicates stochastic processes are responsible for community differences, we only present RCBC values when |βNTI| values are less than 2. By examining the dynamics of these two null models through space and time, we can investigate the ecological assembly processes affecting both the microbial and viral communities.
FIG S1
Copyright © 2020 Danczak et al.This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.
Input fluids for the hydraulic-fracturing process are frequently generated using local freshwater sources, with microbial communities distinct from those that subsequently colonize and persist in the deep shale ecosystem. Consistently, we observe that microbial communities in freshwater inputs were significantly different from those in produced fluids due to variable selection (βNTI of >2) (Fig. 1A and B, Fig. 2A and B). This variable selection was likely driven by the development of high-salinity conditions in the deep fractured shales caused by the dissolution of mineral phases. Recycled produced fluids (RTs) from an older hydraulically fractured well were used as one input source in well U-3 and differed from later time points, primarily through undominated processes. This indicates that upon injection, microbial RT communities from well U-3 were influenced by some combination of deterministic and stochastic processes whereby neither dominates. This result is likely due to the microbial community being preadapted for saline down-hole conditions in the recycled input water. By surviving in a similar habitat in a different shale well, the RT communities differ due to isolation but maintain similarities through common selective pressures, which, when combined, lead to an undominated signal.
Ecological null modeling results for the Utica-2 samples through time; the −5 and 0 time points are samples collected from input fluids. (A) βNTI results for the microbial community; (B) RCBC results for the microbial community; (C) βNTI results for the viral community; (D) RCBC results for the viral community. Given that these are pairwise metrics, the boxplots represent values collected between the given time point and all others. In A and C, the red-shaded regions are those values interpreted as variable selection (βNTI of more than 2), the blue-shaded regions are those values interpreted as homogeneous selection (βNTI of less than −2), and the white regions indicate stochastic processes (|βNTI| of <2). In B and D, the green-shaded regions indicate dispersal limitation (RCBC greater than 0.95), the purple represents homogeneous dispersal (RCBC less than −0.95), and the white regions are undominated. Given RCBC is only interpretable for stochastic βNTI results, only RCBC values for insignificant βNTI comparisons are presented. The red asterisk/star indicates a discrepancy in interpretation between rarefied and nonrarefied data.
Ecological null modeling results for the Utica-3 samples through time; the −25 time point represents samples collected from the freshwater input fluid (solid) and the recycled input fluid (gray diagonal lines). (A) βNTI results for the microbial community; (B) RCBC results for the microbial community; (C) βNTI results for the viral community; (D) RCBC results for the viral community. Given that these are pairwise metrics, the boxplots represent values collected between the given time point and all others. In A and C, the red-shaded regions are those values interpreted as variable selection (βNTI greater than 2), the blue-shaded regions are those values interpreted as homogeneous selection (βNTI less than −2), and the white regions indicate stochastic processes (|βNTI| less than 2). In B and D, the green-shaded regions indicate dispersal limitation (RCBC greater than 0.95), the purple represents homogeneous dispersal (RCBC less than −0.95), and the white regions are undominated. Given RCBC is only interpretable for stochastic βNTI results, only RCBC values for insignificant βNTI comparisons are presented.
As hypothesized above, intense environmental filtering should occur on produced fluid samples, due to the strong environmental pressures experienced within the fractured shale (i.e., salinity, temperature, and pressure). In contrast to our expectation that homogeneous selection would maintain a consistent community configuration through time, the majority of community comparisons in wells U-2 and U-3 were in the stochastic range (|βNTI| of <2) (Fig. 1A and 2A), with RCBC results suggesting that undominated processes were the major driver of community assembly (Fig. 1B and 2B). An undominated signal typically consists of some mixture of weak assembly processes, such as weak selection combined with a moderate level of dispersal (7, 11). However, previously observed widespread viral predation within these fractured shale environments (3, 5) might be the primary driver for this undominated signal. This signal may arise from a “kill-the-winner” (KTW) scenario counter-acting otherwise strong homogeneous selection exerted by the hypersaline conditions (5, 12, 27). A KTW scenario could be influential via two potential mechanisms. First, KTW could result in variable selection by continuously driving microbial community turnover through the elimination of the most abundant community members (28). Under this mechanism, the lack of a dominant community assembly process might be the result of strong variable selection imposed by viral predation being canceled out by a strong homogeneous selective pressure imposed by high salinity. Alternatively, KTW could directly result in stochastic assembly processes in the absence of strong homogeneous selection (29). In this case, we expect that dominant organisms would still be deterministically targeted by predation, but the resulting niche space would be filled stochastically. This mixture of deterministic and stochastic processes is expected to lead to an undominated signal (7). Regardless of which of these KTW mechanisms is the most accurate, the absence of a strong signal for homogeneous selection, despite a strong homogenizing pressure (e.g., salinity), suggests that viral predation may run counter to potential environmental filtering in fractured shale ecosystems.
Viral populations develop according to homogenous selection and undominated processes.Given the dependence of viruses on their microbial hosts for reproduction, we hypothesize that viral communities would be influenced by similar assembly processes as those that affected the microbial community. In this case, variable selection will act upon input samples while undominated processes structure later time points. As with the microbial community, we analyzed the viral community by measuring both βNTI and RCBC. Contrary to our expectations, null modeling results indicated that viral communities in freshwater input fluids in both wells differed significantly from those in produced fluids due to a greater influence of dispersal limitation (Fig. 1C and D, Fig. 2C and D), while viral populations in the recycled produced fluids in well U-3 were different due to undominated forces (Fig. 2C and D). The comparative absence of variable selection indicates that changes in environmental conditions after injection did not influence viral communities as strongly as they influenced microbial communities. Instead, separation in time (and thus lack of physical mixing) combined with stochastic ecological drift in the community (leading to a signal of dispersal limitation) was the primary driver of initial community change.
Viral communities encountered assembly processes that did not strictly align with those experienced by the host microbial community. In the initial produced waters of U-2, viral communities appear to be controlled by undominated forces followed by a period of strong homogeneous selection (Fig. 1C and D) linked to the Halanaerobium sp.-dominated microbial community (see Fig. S2 in the supplemental material) (5). Although similar assembly patterns have been observed within a mammalian ecosystem (18), this is the first time that this methodology has been attempted on such a large (>80%) proportion of an observed viral community using viral sequences identified de novo. Moreover, this observation suggests that the relatedness index used to calculate βNTI actually contains evolutionarily relevant information because this homogeneity coincides with the dominance of Halanaerobium sp. (5). After this point, the viral community never reached a similarly homogenous state, instead being governed by either homogeneous selection or undominated processes (Fig. 1C and D). In contrast, well U-3 was primarily controlled by undominated processes throughout the sampling period, never reaching a comparably homogenous condition (Fig. 2C and D). While the inferred viral assembly processes deviate from our initial hypothesis, the variability in assembly processes acting upon viral and microbial communities could be driven by the low host diversity within these ecosystems. Low-diversity ecosystems could drive viral populations to be more closely related due to changes in microdiversity (i.e., strain-level fluctuations), rather than total diversity (i.e., along broad phylogenetic lineages) (12, 27, 30, 31).
FIG S2
Copyright © 2020 Danczak et al.This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.
The development of viral communities is intrinsically linked to the microbial community in fractured shale ecosystems.While the specific community assembly processes experienced by the viral and microbial communities did not completely align, significant coordination appears to occur. To determine the relationship between the ecological processes acting upon viral and microbial communities, the null modeling results from both communities were correlated with each other using both a permuted Procrustes analyses and Spearman-based Mantel test (Fig. 3, see Table S2 in the supplemental material). Both the permuted Procrustes (U-2 m2 of 0.460, P = 0.0015; U-3 m2 of 0.479, P= 0.009) and Mantel tests (Fig. 3) revealed that viral and microbial βNTI results within each well were significantly correlated. This indicates that the deterministic processes shaping the phylogeny/relatedness of both communities are interlinked. Notably, the viral community appears to experience stronger homogenizing selection than the microbial community, as evidenced by low βNTI values (less than −30), likely related to a singular selective force primarily affecting viral structuring (i.e., host availability).
Correlations between microbial and viral βNTI values in wells U-2 (A) and U-3 (B). The red dashed line represents the Loess curve fit. In both panels, the Mantel correlation is presented in the bottom right.
TABLE S2
Copyright © 2020 Danczak et al.This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.
While previous research has noted the connection between viral and microbial community dynamics (5, 13, 22, 32), this demonstrates that the ecological pressures structuring these communities are intrinsically linked; when one community experiences increased turnover due to shifts in selective pressures (indicated by higher βNTI), the other will likely experience it as well. These results potentially are the result of the dynamic feedback loop existing between the top-down and bottom-up effects a virus community exerts on the host microbial community, although an in silico analysis would be necessary for confirmation. The RCBC Mantel tests provide further evidence of coordination in the assembly processes influencing the microbial and viral communities (U-2 r of 0.815, P = 0.0001; U-3 r of 0.745, P = 0.033) (Table S2). Collectively, our results indicate strong linkages in the ecological assembly processes affecting viral and microbial communities, although the underlying mechanisms leading to this coordination are not yet clear.
Conclusions.While the interactions between viruses and their hosts in marine, soil, and subsurface environments have received increasing attention in recent years, no study to date has investigated the relationship between the ecological processes coordinating the assembly of each community. Here, we demonstrate that while each community is affected by different specific assembly processes at individual time points (Fig. 1 and 2), the overall assembly processes influencing these two communities are linked (Fig. 3). This supports past observations that across viral and host communities, fluctuations in one community lead to changes in the other (5, 13, 22). We suggest that widespread viral infection may act as a counterbalance to otherwise strong, homogenous environmental filtering, with the action of viral populations overriding strong deterministic factors from the abiotic environment (e.g., high salinity) that could otherwise lead to homogeneous selection (Fig. 4).
Conceptual model and potential explanations for observed microbial and viral community dynamics. (A) Contrast between expected and observed microbial dynamics. (B, C, and D) Detail the possible reasons for observed differences. (B) Within scenario 1, strong deterministic processes counterbalance, giving rise to undominated assembly for both the microbial and viral communities. (C) Within scenario 2, however, a multi-infection event leaves open niche space that is then occupied due to weak dispersal-based assembly processes, resulting in an undominated signal. (D) Regardless of the scenario, the result is the same, namely, coordination between the ecological assembly processes of microbial and viral communities. While the magnitudes might vary, these results suggest that one community will experience dynamics in response to the other.
The interpretation of viral ecological modeling still poses a number of challenges. Viruses primarily have either a lysogenic or lytic lifestyle which could have ramifications for viral community development (12). Lysogenic viruses would likely experience different pressures than free lytic viruses given that they can integrate within host genomes until becoming virulent. For example, in a surface marine environment where high UV exposure may lead to free virus degradation, lysogenic phage could be protected and potentially more closely mirror microbial community dynamics (33). Once induced, however, lysogenic viruses might encounter lagged processes relative to their hosts, as observed within some marine environments (13). Alternatively, a community dominated by viruses with a lytic lifestyle might experience more divergent assembly than the microbial community (i.e., faster dispersal and inaccessibility to sufficient hosts). While the viral community within a fractured shale is a roughly equal mixture of these lifestyles (5), other ecosystems might have a different ratio of lysogenic and lytic viruses. Better quantification of viral lifestyles will help improve our understanding of the contributions that viruses make to ecological assembly processes. Another complication is that all of the viruses within this study were obtained via shotgun metagenomic sequencing and likely represent only a subset of the overall viral community (34). By collecting viromes, rather than metagenomes, a more complete viral community could be available, enabling us to better distinguish different viral groups (17, 34).
Beyond broader ecological considerations, these viral-host relationships have implications for the functioning of fractured shale ecosystems. Our results demonstrate that the two shale ecosystems examined here exhibit coordination between viral and microbial community assembly. Other studies have indicated that additional shale ecosystems develop along similar trajectories, although less information about viral community dynamics is available (2, 3, 35, 36). The apparent coordination and feedback between microbial and viral communities in the systems studied here suggest that such processes may be conserved across shale systems. If these feedback are disrupted, there may be impacts to the functioning of the shale ecosystem. For example, extensive viral predation could disrupt microbe-microbe interactions (e.g., between Halanaerobium sp. and Methanohalophilus sp.) and lead to changes in biogeochemical fluxes, such as the generation of new biogenic methane (19, 37). While hypothetical, these potential outcomes demonstrate the necessity of investigating how different assembly processes are partitioned within shale ecosystems. Furthermore, insights gained in this constrained shale ecosystem may be applicable to other environments that host more complex microbial and viral populations.
Overall, our results both confirm previous observations of the relationship between viral and microbial communities and demonstrate a strong association with respect to assembly processes, which has not been previously recognized. Not yet clear, however, are mechanisms that underlie coordination in assembly processes. To ultimately predict the effects of viral-host interplay across diverse ecosystems, such mechanisms need to be revealed through modeling efforts and lab-based experiments.
MATERIALS AND METHODS
Sample collection.Hydraulic fracturing input fluids and shale-produced fluids were collected from wellheads and oil-gas-water separators in sterile bottles with no headspace (1 to 2 liters). When fluids were collected from the separator tanks, the separator tanks were flushed immediately before sample collection to minimize community changes due to incubation in the separator. Flow rates ranged from ∼400,000 liters per day at early time points to ∼170,000 liters per day at later time points, with a separator capacity of ∼5,500 liters. These fluids were collected from two wells in the Utica-Point Pleasant shale formation in Ohio (n = 2, Utica formation). Fluids were filtered on 0.22-μm pore size polyethersulfone filters (Millipore; Fisher Scientific). Produced fluids contained high concentrations of ferrous iron (100 to 180 mg liter−1). During the filtering process, the oxidizing iron resulted in a natural analog of the chemical flocculation protocol used to concentrate free viral particles (38). Thus, the viruses sampled in this study are probably a combination of free viral particles, integrated proviruses, and active, replicating viruses. Total nucleic acids were extracted from filters using lysis buffer I (39), purified with two phenol-chloroform and one chloroform-isoamyl alcohol extraction, and precipitated with NaCl and ice-cold ethanol. Nucleic acids were stored at –20°C until sequencing.
Metagenomic sequencing and assembly.Libraries were prepared using the Nextera XT library system according to manufacturer’s instructions. Sequencing adapters were ligated, and library fragments were amplified with 12 to 15 cycles of PCR before quantification with the KAPA Biosystems next-generation sequencing library kit. Following library preparation with a TruSeq paired-end cluster kit (v4), sequencing was performed on the Illumina HiSeq 2500 platform with HiSeq TruSeq sequencing by synthesis (SBS) sequencing kits (v4) following a 2 × 150-bp indexed run recipe. Fastq files were generated with CASSAVA 1.8.2. Fastq files were trimmed from the 5′ and 3′ ends with Sickle, and each sample was assembled individually with IDBA-UD using default parameters (5, 40, 41). Metagenome statistics, including the amount of sequencing are detailed in Table S1 in the supplemental material. Scaffolds of ≥5.0 kb were included in subsequent viral analyses.
TABLE S1
Copyright © 2020 Danczak et al.This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.
Microbial analysis.An hidden Markov model (HMM) obtained from PFAM (PF00189) was utilized to find rps3 sequences in each metagenomic assembly (42). Resulting sequences were clustered at 100% using the UCLUST option within USEARCH (43). Reads per kilobase million (RPKM) was calculated by mapping metagenomic reads to the clustered sequences using Bowtie2 and then normalizing read counts to sequencing depth (44). An alignment for these sequences was generated using MUSCLE and trimmed in Geneious to remove any position consisting of ≥95% gaps, after which a RAxML tree was generated using an evolutionary model approximated by PROTTEST (45–48) (see Text S1 in the supplemental material).
TEXT S1
Copyright © 2020 Danczak et al.This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.
Viral analysis.Assemblies were mined for viruses using the methods outlined in Daly et al. (2019). Briefly, VirSorter was run using the “virome” database to identify viral sequences (49). Sequences resulting in category 1 or 2 status were retained and clustered into viral OTUs using the “ClusterGenomes” (v1.1.3) app in CyVerse, using 95% average nucleotide identity and 80% alignment fraction of the smallest contig (50). Length-normalized viral operational taxonomic unit (OTU) abundances were then determined by mapping reads to viral sequences using Bowtie2, dividing read counts by length, and adjusting individual abundances by summed values within a sample (5, 44). Due to the lack of marker genes across viruses, a protein clustering approach was utilized to generate a tree which approximates taxonomic relationships as follows. First, vConTACT was used to compare gene content across viral genomes/contigs and obtain all-versus-all pairwise similarity values (51). Viral sequences which shared no protein content were assigned a similarity score of 0 in order to allow their incorporation into the tree. These similarity values were then standardized between 0 and 1 in order to capture subtle differences which were masked by large scores. These similarity values were finally converted to dissimilarity values and used to generate a unweighted pair group method using average linkages (UPGMA) tree (upgma command, “phangorn” package v2.4.0) (see Text S2 in the supplemental material) (52).
TEXT S2
Copyright © 2020 Danczak et al.This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.
Ecological null modeling.Ecological null modeling was performed in order to investigate the ecological drivers affecting both microbial and viral communities (7, 23, 53). Specifically, β-nearest taxon index (βNTI) and Raup-Crick (Bray-Curtis) (RCBC) were used to determine contributions from selective and dispersal-based processes, respectively. First, β-mean nearest taxon distance (βMNTD) was calculated for each possible pairwise comparison within either the microbial or viral communities (comdistnt, “picante” package v1.8) (54). By comparing these observed βMNTD values to those obtained from 999 community randomizations, βNTI was calculated according to Stegen et al. (7, 23). These βNTI results can then be used to investigate the phylogenetic turnover within both communities and to understand whether deterministic (i.e., selection) or stochastic (i.e., random) processes affected community composition. If a |βNTI| value exceeds 2, a deterministic process shaped the microbial or viral community; if a |βNTI| value is less than 2, a stochastic process affected the community. Deterministic processes can then be distinguished based upon the sign of the βNTI value. When βNTI is greater than 2, communities are significantly more different than would be explained by random chance due to variable selection. Such a process occurs when environmental conditions between the compared communities are different (e.g., differences in organic carbon quality), leading to variation in community composition (55). If βNTI is less than −2, communities are significantly more similar than would be expected by random chance due to homogeneous selection. This type of selection would occur when relatively spatiotemporally consistent environmental conditions drive communities toward a common configuration, such as in the case of microbial community succession in geochemically stable soil environments (10). Correlations between microbial and viral βNTI values were assessed using both a permuted Procrustes test (protest command, “vegan” package v2.5.4) and a Spearman-based Mantel test (mantel command, vegan package v2.5.4) (56).
In order to calculate βNTI, a phylogenetic tree is necessary. Given that metagenomic sequencing was primarily used to investigate the microbial communities, the rps3 RAxML tree generated above was used to assess phylogenetic turnover. Rarefaction to the sample with the lowest sequencing depth was used to verify that results were not an artifact of sequencing depth, and any differences were noted. Viral null modeling was performed using the >80% of the viruses placed on the viral relationship tree following the methods detailed above. In some cases, different samples showed identical viral composition (i.e., the same viruses were present in some pairs of samples). While we would normally expect a homogeneous selection signal in comparisons between these communities, βMNTD calculations result in no nearest neighbor distance, as the nearest neighbor is identical in all cases. When calculating βNTI, this results in a divide-by-zero error, as the null models are also identical. In order to alleviate these problems, small amounts of nearest neighbor “noise” were added in the form of very small values (10−5) to the null βMNTD measurements to indicate that a null phylogenetic tree has slightly augmented distances.
In addition to βNTI calculations, RCBC was used to further distinguish observed stochastic processes according to Stegen et al. (23). Using 9,999 iterations per pairwise comparison, null communities were probabilistically generated based upon observed microbial and viral abundances. From these communities, a null distribution of Bray-Curtis values was calculated and then compared to observed Bray-Curtis values. Resulting deviations of the observed values from the null expectation represent the RCBC metric once normalized from +1 to −1. Significant RCBC values (|RCBC| of >0.95) suggest that communities are governed by either dispersal limitation or homogenous dispersal. Dispersal limitation (RCBC of >0.95) occurs when two communities are significantly different due to being physically unable to mix in either time or space, leading to significant ecological drift (random changes in organismal abundance). Homogenous dispersal (RCBC of less than −0.95) happens when two communities are significantly more similar due to the ability to freely mix throughout a given environment. Lastly, if |RCBC| is less than 0.95, the differences between communities arose due to undominated processes where no single ecological pressure is able to exert a dominating effect (i.e., weak dispersal and weak selection) (7). Given that these results are only useful in distinguishing stochastic processes, RCBC values are only presented for insignificant βNTI values (i.e., |βNTI| of <2).
Plot generation and code availability.The program R (v3.5.4) was used to perform statistical tests and plot generation (57). All data plots were generated using “ggplot2” (package v3.1.1) (58). R scripts used within this study are available at https://github.com/danczakre/ShaleViralEcology.
Data availability.The metagenomes used in this study are publicly available through JGI; accession numbers can be found in Table S1.
ACKNOWLEDGMENTS
This work was supported by National Science Foundation grant EAR-1847684 to M.J.W. The research described in this study was partially supported by the U.S. Department of Energy (DOE), Office of Biological and Environmental Research (BER), as part of Subsurface Biogeochemical Research Program’s Scientific Focus Area (SFA) at the Pacific Northwest National Laboratory (PNNL). Sequencing was performed through a Community Sequencing Proposal award to K.C.W. at the Joint Genome Institute. The work was conducted by the U.S. DOE Joint Genome Institute, a DOE Office of Science User Facility, supported by the Office of Science of the U.S. Department of Energy under contract no. DE-AC02-05CH11231.
We declare no conflicts of interest.
FOOTNOTES
- Received January 31, 2020.
- Accepted February 24, 2020.
- Copyright © 2020 Danczak et al.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license.