Département de Biologie Cellulaire et Infection, Institut Pasteur, Unité des Interactions Bactéries-Cellules, Paris, France
INSERM, U604, Paris, France
INRA, USC2020, Paris, France
Institut Pasteur–Bioinformatics and Biostatistics Hub, C3BI, USR 3756 IP CNRS, Paris, France
Overview of the Listeriomics platform. (Center) The five major tools of Listeriomics, i.e., gene conservation and synteny, coexpression network, genome viewer, expression, and protein atlas. (Left) Summary of all the genomic information available on the website. (Right) List of all the transcriptomic information available in Listeriomics. (Bottom) View of all the proteomic information that can be accessed.
Transcriptomic and proteomic data sets available in the Listeriomics database. (A) Summary of all the transcriptomic data sets available at the Listeriomics website. In parentheses is the number of transcriptomics data sets available in the Listeriomics database for a specific biological condition. (B) Schematic representation of all the L. monocytogenes mutants for which transcriptomic data sets are available in the Listeriomics database. (C) Schematic representation of the number of transcriptomics data sets available for each L. monocytogenes growth phase. (D) Summary of all the proteomics data sets available at the Listeriomics website. In parentheses is the number of proteomics data sets available in the Listeriomics database for a specific biological condition.
Multi-omics genome viewer and coexpression network tool. (A) Genome viewer of representative omics data sets for L. monocytogenes EGD-e grown in BHI at 37°C to the exponential and stationary growth phases as indicated in the text. The genome viewer shows positive genome strand genes (in red), negative genome strand genes (blue), tRNAs and rRNAs (in yellow), small RNAs (in purple), riboswitches (in green), asRNAs (in light green), predicted operons (in orange) from reference 56, and predicted transcription terminators (22) (in blue circles). Exp, exponential. (B) Coexpression network of the virulence locus genes (lmo0200 to lmo0207) of L. monocytogenes EGD-e. Network nodes are genome elements (genes and noncoding RNAs) with the same color code as in the genome viewer tool. (C) Circular graph visualization of the coexpression network of the virulence locus genes (lmo0200 to lmo0207). Coexpression edges are displayed overlaid on a circular representation of the EGD-e genome.
Meta-analysis of the Listeriomics transcriptomic data sets. (A) Relational network built on the 362 transcriptomic biological conditions found in the Listeriomics database. Each node corresponds to a growth condition. The size of each node is proportional to the number of occurrences of that condition in the whole database. A link is drawn between two growth conditions if they are present in the same transcriptomic data set. (B) Heat map of the 15 genes with the highest ratio of differential expression. The value used for coloring is the number of data sets in which each gene has been found to be differentially expressed. (C) Heat map of the six genes with no variability. (D) Pathway enrichment analysis of the 651 genes of L. monocytogenes EGD-e that are found to be differentially expressed in >10% of the 279 data sets. We performed the pathway enrichment analysis by using COG information and the Fisher exact test P value.
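The enrichment test named in panel D can be illustrated with a minimal sketch. It assumes a one-sided Fisher exact test, computed as the upper tail of the hypergeometric distribution; the function name `fisher_enrichment_p` and the toy counts are hypothetical and are not taken from the Listeriomics code or data.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher exact P value for over-representation.

    N: annotated genes in the genome (background)
    K: background genes that belong to the category (e.g., one COG class)
    n: genes in the selected list
    k: selected genes that belong to the category
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)  # largest overlap that is possible
    total = comb(N, n)  # number of ways to draw the selected list
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy example (hypothetical counts): 10 genes, 3 in the category,
# 4 genes selected, all 3 category genes among them.
p_value = fisher_enrichment_p(k=3, n=4, K=3, N=10)
print(f"P = {p_value:.4f}")
```

In practice, one P value would be computed per COG category and the results corrected for multiple testing; whether such a correction was applied is not stated in the legend.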
Flowchart of omics data set integration in the Listeriomics database. (A) Complete genome sequences from the RefSeq and GenBank databases were downloaded and integrated into Listeriomics, along with pathway information and small RNAs. (B) MAGE-TAB data sets were downloaded from ArrayExpress. Metadata on the data sets were manually curated, and processed gene expression array tables were added. Raw RNA-Seq data were downloaded and mapped to a reference genome. After log fold change calculation, all of the data sets were variance normalized to set the standard deviation to 1 and ensure comparability. (C) Proteomics data sets were manually curated from the core articles and related supplementary data. Download FIG S1, EPS file, 1.1 MB.
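The normalization step in panel B can be sketched as follows. This is a minimal illustration, assuming that each data set is scaled by its own population standard deviation without centering, so that the sign of every log fold change is preserved; the legend does not state whether centering was applied, and the function name `variance_normalize` and the values are hypothetical.

```python
from statistics import pstdev


def variance_normalize(log_fold_changes):
    """Scale one data set so that its standard deviation equals 1.

    Assumption: values are divided by the population standard deviation
    and are not centered, so zero stays zero and signs are unchanged.
    """
    sd = pstdev(log_fold_changes)
    if sd == 0:
        raise ValueError("cannot normalize a data set with zero variance")
    return [value / sd for value in log_fold_changes]


# Hypothetical log fold changes for four genome elements in one data set.
raw = [0.5, -2.0, 3.0, 1.0]
normalized = variance_normalize(raw)
print(normalized, pstdev(normalized))
```

Because every data set ends up on the same scale, a single log fold change cutoff can then be applied across data sets produced with different technologies.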
List of Listeria genomes integrated into the website. A total of 83 complete Listeria genomes were downloaded from NCBI RefSeq and GenBank. All of the available chromosomes are listed with their names, sequence IDs, release dates, sizes, the numbers of coding elements, and the databases from which they were downloaded. Download TABLE S1, XLS file, 0.05 MB.
List of all 304 small RNAs discovered in L. monocytogenes EGD-e. Sheet 1 presents the set of 154 sRNAs, sheet 2 presents the set of 46 cisRegs, and sheet 3 presents the set of 104 asRNAs. For each set, the table shows the names and synonyms of the noncoding RNA elements, their positions, the first publication in which they were discovered, the other publications in which they were detected, and whether they were detected in the TSS study. Download TABLE S2, XLS file, 0.1 MB.
Summary of the transcriptomic data sets. Three Excel spreadsheets are shown. The first one shows the 64 transcriptomic studies included in the Listeriomics database. For each study downloaded from ArrayExpress, the name, accession number, date, type of technology, Listeria strain used, and download URL are shown. The second sheet lists the 38 available differential-expression data sets with related technologies (gene expression array, tiling array, RNA-Seq). The third sheet shows a summary of the RNA-Seq remapping performed, with the percentage of covered reads and the sequencing platform used. Download TABLE S3, XLS file, 0.1 MB.
Box plots of the transcriptomics data sets before and after variance normalization. (A) Box plot before variance normalization of the 255 relative expression (log fold change) transcriptomics data sets from L. monocytogenes EGD-e available in the Listeriomics database. (B) Box plot of the same 255 data sets after variance normalization. After normalization, a default log fold change cutoff of 1.5 better discriminates the genome elements that are truly differentially expressed. Download FIG S2, EPS file, 1.5 MB.
The 42 data sets used for reconstruction of the coexpression network of L. monocytogenes EGD-e. Displayed are the 42 transcriptomics data sets used for reconstruction of the coexpression network and all of the information about the biological conditions used. Download TABLE S4, XLS file, 0.03 MB.
The 23 proteomics data sets. Listed are the publications from which proteomics data were extracted. Publication dates, PubMed accession numbers, and download links are provided. Download TABLE S5, XLS file, 0.04 MB.
Transcriptomic and proteomic metadata analysis of L. monocytogenes EGD-e genes. Three Excel spreadsheets are shown. The first shows the number of transcriptomic data sets in which each gene is differentially expressed. The second contains the results of the pathway enrichment analysis performed on the 651 genes of L. monocytogenes EGD-e that are found to be differentially expressed in >10% of the 279 data sets. We performed the pathway enrichment analysis by using COG information and a Fisher exact test. The third sheet displays the number of data sets in which each L. monocytogenes EGD-e protein has been detected. A histogram summarizes the distribution of protein detection. Download TABLE S6, XLS file, 1.5 MB.
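The counting behind the first sheet can be sketched as follows. This is a minimal illustration, assuming that a gene is called differentially expressed in a data set when the absolute value of its normalized log fold change reaches the default cutoff of 1.5, and that the >10% threshold is applied to the fraction of data sets; the function names, gene names, and values are hypothetical.

```python
def count_differential(datasets, cutoff=1.5):
    """Count, for each gene, the data sets in which |log fold change| >= cutoff.

    datasets: list of dicts mapping gene name -> normalized log fold change.
    """
    counts = {}
    for dataset in datasets:
        for gene, log_fc in dataset.items():
            counts.setdefault(gene, 0)  # keep genes that are never called
            if abs(log_fc) >= cutoff:
                counts[gene] += 1
    return counts


def frequently_differential(counts, n_datasets, min_fraction=0.10):
    """Return the genes called in more than min_fraction of the data sets."""
    return sorted(
        gene for gene, count in counts.items() if count / n_datasets > min_fraction
    )


# Hypothetical toy input: three data sets, two genes.
toy = [
    {"geneA": 2.0, "geneB": 0.1},
    {"geneA": -1.7, "geneB": 0.3},
    {"geneA": 0.2, "geneB": 1.0},
]
toy_counts = count_differential(toy)
print(toy_counts, frequently_differential(toy_counts, len(toy)))
```

The resulting gene list is the kind of input that a COG-based enrichment test would then take as its selected set.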