TABLE 1 

SHI7 learning module produces meaningful QC parametrizations on internal and publicly available data sets

| Data set | Availability | Learned parameters |
| --- | --- | --- |
| HMP tongue; shotgun (Illumina HS PE TS2) | Public, SRS014271 (8) | --adaptor TruSeq2 --flash True --allow_outies False --filter_qual 36 --trim_qual 36 |
| Immigrant Microbiome Project; amplicon (mixed Illumina PE Nextera) | Internal | --adaptor Nextera --flash True --allow_outies False --filter_qual 34 --trim_qual 32 |
| Small bowel aspirate; amplicon (Illumina PE Nextera) | Internal | --adaptor Nextera --flash True --allow_outies False --filter_qual 36 --trim_qual 34 |
| Primate Microbiome Project stomach; amplicon (Illumina PE TS2) | Internal | --adaptor TruSeq2 --flash True --allow_outies False --filter_qual 36 --trim_qual 33 --min_overlap 239 --max_overlap 269 |
| Longitudinal diet study; shotgun (Illumina HS SE Nextera) | Internal | -SE --adaptor Nextera --flash False --allow_outies False --filter_qual 36 --trim_qual 34 |
| HMP stool; amplicon (454 SE)^a | Public, stool (17) | -SE --adaptor None --flash False --allow_outies False --filter_qual 34 --trim_qual 31 |
| Mouse tutorial; amplicon (Illumina PE Nextera) | Public (18) | --adaptor None --flash True --allow_outies False --filter_qual 34 --trim_qual 34 --min_overlap 154 --max_overlap 172 |
| Irritable bowel syndrome cohort; shotgun (Illumina HS SE Nextera) | Internal | -SE --adaptor Nextera --flash False --allow_outies False --filter_qual 37 --trim_qual 35 |
| Human microbiome; RNA-Seq (Illumina HS PE ScriptSeq) | Internal | --adaptor TruSeq3-2 --flash True --allow_outies False --filter_qual 39 --trim_qual 36 |
^a sff_extract -Q was used for the initial conversion of .sff to .fastq format (19).
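
To compare the learned parameter sets in Table 1 across data sets, it can help to parse each parameter string into a key-value structure. The following is a minimal sketch and not part of SHI7 itself: the function name `parse_learned_params` is hypothetical, and treating `-SE` as a bare switch with no value is an assumption based only on how it appears in the table.

```python
import shlex


def parse_learned_params(param_string):
    """Parse a learned-parameter string from Table 1 into a dict.

    Hypothetical helper for illustration; it only interprets the
    text of the table and does not call SHI7.
    """
    tokens = shlex.split(param_string)
    params = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if not token.startswith("-"):
            raise ValueError(f"value without a preceding flag: {token!r}")
        name = token.lstrip("-")
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
        if not has_value:
            # Assumption: a flag with no value (e.g. -SE) is a bare switch.
            params[name] = True
            i += 1
            continue
        raw = tokens[i + 1]
        if raw in ("True", "False"):
            params[name] = raw == "True"
        elif raw.isdigit():
            params[name] = int(raw)
        else:
            # Adaptor names such as "Nextera" or "None" are kept as strings.
            params[name] = raw
        i += 2
    return params


# Parameter strings copied from two rows of Table 1.
ibs = parse_learned_params(
    "-SE --adaptor Nextera --flash False --allow_outies False "
    "--filter_qual 37 --trim_qual 35"
)
mouse = parse_learned_params(
    "--adaptor None --flash True --allow_outies False "
    "--filter_qual 34 --trim_qual 34 --min_overlap 154 --max_overlap 172"
)
```

With the rows in this form, differences such as the single-end switch or the learned overlap bounds can be read off directly, for example `mouse["max_overlap"] - mouse["min_overlap"]`.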