Commentary | Novel Systems Biology Techniques

Caught between Two Genes: Accounting for Operonic Gene Structure Improves Prokaryotic RNA Sequencing Quantification

Taylor Reiter
Department of Population Health and Reproduction, University of California, Davis, Davis, California, USA
DOI: 10.1128/mSystems.01256-20

ABSTRACT

RNA sequencing (RNA-seq) has matured into a reliable and low-cost assay for transcriptome profiling and has been deployed across a range of systems. The computational tool space for the analysis of RNA-seq data has kept pace with advances in sequencing. Yet tool development has largely centered around the human transcriptome. While eukaryotic and prokaryotic transcriptomes are similar, key differences in transcribed units limit the transfer of wet-lab and computational tools between the two domains. The article by M. Chung, R. S. Adkins, J. S. A. Mattick, K. R. Bradwell, et al. (mSystems 6:e00917-20, 2021, https://doi.org/10.1128/mSystems.00917-20), demonstrates that integrating prokaryote-specific strategies into existing RNA-seq analyses improves read quantification. Unlike in eukaryotes, polycistronic transcripts derived from operons lead to sequencing reads that span multiple neighboring genes. Chung et al. introduce FADU, a software tool that performs a correction for such reads and thereby improves read quantification and biological interpretation of prokaryotic RNA sequencing.

The views expressed in this article do not necessarily reflect the views of the journal or of ASM.

COMMENTARY

Over the last 15 years, RNA sequencing (RNA-seq) has offered a high-resolution view of the presence and abundance of transcripts at a given time (1, 2). Transcriptome sequencing has revealed the functional elements of genomes and their relationships to cellular environments across a wide range of organisms. Accurate estimation of gene abundances underlies many discoveries from RNA sequencing, including those relying on differential expression analysis and gene coexpression networks (3). Because accurate transcript estimates play this foundational role, both experimental and computational techniques have been developed to improve the accuracy of read-based quantification methods. While advances have been driven disproportionately by work on eukaryotic transcriptomes, many of them improve read quantification across domains of life. For example, because genomes contain similar sequences, such as paralogous genes, some transcriptome reads map ambiguously to multiple genes or transcripts at distant locations in the genome. To better assign read counts in these situations, expectation maximization algorithms use counts from unambiguously mapped reads to estimate the true abundance of multimapped reads (4–7). This method improves read quantification in any genome that contains paralogous genes.
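The expectation maximization idea described above can be illustrated with a minimal sketch. It is not the implementation used by any of the cited tools (4–7): it ignores gene length, sequencing bias, and alignment scores, and the function name `em_assign` and the toy read sets are hypothetical. Each read is represented only by the set of genes it is compatible with.

```python
def em_assign(reads, n_iter=50):
    """Estimate expected read counts per gene with a toy EM loop.

    reads: list of tuples, each holding the gene names a read maps to.
    Assumption: all genes have equal length, so abundance is just the
    share of reads assigned to each gene.
    """
    genes = sorted({g for compat in reads for g in compat})
    # Start from a uniform abundance estimate.
    theta = {g: 1.0 / len(genes) for g in genes}
    counts = dict.fromkeys(genes, 0.0)
    for _ in range(n_iter):
        # E-step: split each read among its compatible genes in
        # proportion to the current abundance estimates.
        counts = dict.fromkeys(genes, 0.0)
        for compat in reads:
            total = sum(theta[g] for g in compat)
            for g in compat:
                counts[g] += theta[g] / total
        # M-step: re-estimate abundances from the expected counts.
        n = sum(counts.values())
        theta = {g: counts[g] / n for g in genes}
    return counts


# 8 reads unique to gene A, 2 unique to gene B, 10 mapping to both.
toy_reads = [("A",)] * 8 + [("B",)] * 2 + [("A", "B")] * 10
print({g: round(c, 2) for g, c in em_assign(toy_reads).items()})
# prints {'A': 16.0, 'B': 4.0}
```

In this toy case the unambiguous reads favor gene A four to one, so the ten multimapped reads are split in roughly the same ratio rather than evenly.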

Fundamental differences in transcription limit the transfer of innovation in the quantification of transcripts between biological domains. Eukaryotic transcripts contain a single product (monocistronic), while many prokaryotic transcripts contain multiple products (polycistronic, e.g., all genes in an operon). Computational prediction of operon structures is still an active area of research. Laboratory-based techniques like 5′ and 3′ rapid amplification of cDNA ends (RACE) and direct RNA sequencing therefore remain the gold standard for operon prediction, but they cannot be applied to the majority of RNA sequencing experiments. Without well-annotated reference transcriptomes that contain polycistronic transcripts, genome-based alignment strategies better capture reads that span genes in an operon. Like multimapped reads, reads that span multiple genes create problems in read quantification. However, because it is unique to prokaryotes, this problem has received little attention.

With the development of the FADU software tool, Chung and colleagues (8) present a simple and elegant correction for the quantification of reads from prokaryotic transcriptomes that span multiple genes when mapped against a reference genome. The algorithm assigns read counts that are proportional to the length of the overlap between a read and a gene, corrected by the length of the gene itself. This method alleviates the undercounting and overcounting of operonic genes and of small genes in gene-dense regions that occur with other approaches, thereby improving the accuracy of downstream analyses that rely on gene counts, such as differential expression. The concept is illustrated on simulated and real data, demonstrating that proportional correction for reads that span multiple genes affects the biological interpretation of prokaryotic sequencing data.
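A minimal sketch of proportional assignment may help make this concrete. It is not FADU's code: it assumes that each read contributes to every gene it overlaps a fraction equal to the overlapping bases divided by the read length, and it leaves out the gene-length correction, whose exact form is not specified here. The function names, gene names, and coordinates are hypothetical.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Shared bases between two half-open intervals [start, end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def fractional_counts(reads, genes):
    """Assign each read to genes in proportion to its overlap.

    reads: list of (start, end) alignments on one strand of one contig.
    genes: dict mapping gene name to (start, end).
    Assumption: contribution = overlapping bases / read length.
    """
    counts = {name: 0.0 for name in genes}
    for r_start, r_end in reads:
        read_len = r_end - r_start
        for name, (g_start, g_end) in genes.items():
            counts[name] += overlap(r_start, r_end, g_start, g_end) / read_len
    return counts


# Two adjacent genes in a hypothetical operon and four 100-base reads.
genes = {"geneA": (0, 300), "geneB": (300, 900)}
reads = [(100, 200), (250, 350), (280, 380), (500, 600)]
print({g: round(c, 2) for g, c in fractional_counts(reads, genes).items()})
# prints {'geneA': 1.7, 'geneB': 2.3}
```

The two reads that cross the gene boundary are split 50/50 and 20/80 rather than being counted once for each gene or discarded, so the total across genes still equals the number of reads.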

This correction is a valuable contribution that is poised for incorporation into general prokaryotic RNA-seq analyses as well as the broader read counting tool space. As a stand-alone tool that operates on BAM alignment files, FADU can be integrated into RNA-seq analysis as an alternative to software like featureCounts or HTSeq. FADU has a small memory and CPU footprint, which permits its integration into routine RNA-seq analysis pipelines, including large-scale analyses. Alternatively, many other tools that perform read quantification already provide optional parameters to control behavior around multimapped reads and thus likely contain the infrastructure to support this new correction technique (9, 10). Integrating the correction step into other read quantification tools would support widespread adoption. Adoption of this correction will reduce systematic biases in the quantification of operonic genes and of small genes in gene-dense coding regions, supporting improved biological insights from prokaryotic transcriptomics.

While Chung and colleagues (8) explored their approach in the prokaryotic transcriptome space, correction for reads that map to multiple genes may additionally improve metagenome, metatranscriptome, and single-cell read quantifications. These applications remain to be tested but may improve insights into operonic and gene-dense coding regions in yet-to-be-cultured prokaryotes from diverse environments.

  • Copyright © 2021 Reiter.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license.

REFERENCES

  1. Wang Z, Gerstein M, Snyder M. 2009. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10:57–63. doi:10.1038/nrg2484.
  2. Croucher NJ, Thomson NR. 2010. Studying bacterial transcriptomes using RNA-seq. Curr Opin Microbiol 13:619–624. doi:10.1016/j.mib.2010.09.009.
  3. Teng M, Love MI, Davis CA, Djebali S, Dobin A, Graveley BR, Li S, Mason CE, Olson S, Pervouchine D, Sloan CA, Wei X, Zhan L, Irizarry RA. 2016. A benchmark for RNA-seq quantification pipelines. Genome Biol 17:74. doi:10.1186/s13059-016-0940-1.
  4. Li B, Dewey CN. 2011. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12:323. doi:10.1186/1471-2105-12-323.
  5. Roberts A, Pachter L. 2013. Streaming fragment assignment for real-time analysis of sequencing experiments. Nat Methods 10:71–73. doi:10.1038/nmeth.2251.
  6. Bray NL, Pimentel H, Melsted P, Pachter L. 2016. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol 34:525–527. doi:10.1038/nbt.3519.
  7. Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. 2017. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods 14:417–419. doi:10.1038/nmeth.4197.
  8. Chung M, Adkins RS, Mattick JSA, Bradwell KR, Shetty AC, Sadzewicz L, Tallon LJ, Fraser CM, Rasko DA, Mahurkar A, Dunning Hotopp JC. 2021. FADU: a quantification tool for prokaryotic transcriptomic analyses. mSystems 6:e00917-20. doi:10.1128/mSystems.00917-20.
  9. Liao Y, Smyth GK, Shi W. 2014. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30:923–930. doi:10.1093/bioinformatics/btt656.
  10. Anders S, Pyl PT, Huber W. 2015. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31:166–169. doi:10.1093/bioinformatics/btu638.
mSystems Jan 2021, 6 (1) e01256-20; DOI: 10.1128/mSystems.01256-20

KEYWORDS

prokaryote
software
transcriptomics

Online ISSN: 2379-5077