Reannotate an OTU table with an updated taxonomy
--input-archive-otu-table INPUT_ARCHIVE_OTU_TABLE
Renew this table
-p, --taxonomic-profile FILE
output a 'condensed' taxonomic profile for each sample based on the OTU table. Taxonomic profiles output can be further converted to other formats using singlem summarise.
--taxonomic-profile-krona FILE
output a 'condensed' taxonomic profile for each sample based on the OTU table
--otu-table filename
output OTU table
--threads num_threads
number of CPUS to use [default: 1]
--assignment-method {smafa_naive_then_diamond,scann_naive_then_diamond,annoy_then_diamond,scann_then_diamond,diamond,diamond_example,annoy,pplacer}
Method of assigning taxonomy to OTUs and taxonomic profiles [default: smafa_naive_then_diamond]
Method | Description |
---|---|
smafa_naive_then_diamond | Search for the most similar window sequences <= 3bp different using a brute force algorithm (using the smafa implementation) over all window sequences in the database, and if none are found use DIAMOND blastx of all reads from each OTU. |
scann_naive_then_diamond | Search for the most similar window sequences <= 3bp different using a brute force algorithm over all window sequences in the database, and if none are found use DIAMOND blastx of all reads from each OTU. |
annoy_then_diamond | Same as scann_naive_then_diamond, except search using ANNOY rather than using brute force. Requires a non-standard metapackage. |
scann_then_diamond | Same as scann_naive_then_diamond, except search using SCANN rather than using brute force. Requires a non-standard metapackage. |
diamond | DIAMOND blastx best hit(s) of all reads from each OTU. |
diamond_example | DIAMOND blastx best hit(s) of all reads from each OTU, but report the best hit as a sequence ID instead of a taxonomy. |
annoy | Search for the most similar window sequences <= 3bp different using ANNOY, otherwise no taxonomy is assigned. Requires a non-standard metapackage. |
pplacer | Use pplacer to assign taxonomy of each read in each OTU. Requires a non-standard metapackage. |
--output-extras
give extra output for each sequence identified (e.g. the read(s) each OTU was generated from) in the output OTU table [default: not set]
--archive-otu-table filename
output OTU table in archive format for making DBs etc. [default: unused]
--output-jplace filename
Output a jplace format file for each singlem package to a file starting with this string, each with one entry per OTU. Requires 'pplacer' as the --assignment_method [default: unused]
--metapackage METAPACKAGE
Set of SingleM packages to use [default: use the default set]
--singlem-packages SINGLEM_PACKAGES [SINGLEM_PACKAGES ...]
SingleM packages to use [default: use the set from the default metapackage]
--assignment-singlem-db ASSIGNMENT_SINGLEM_DB
Use this SingleM DB when assigning taxonomy [default: not set, use the default]
--diamond-taxonomy-assignment-performance-parameters DIAMOND_TAXONOMY_ASSIGNMENT_PERFORMANCE_PARAMETERS
Performance-type arguments to use when calling 'diamond blastx' during the taxonomy assignment step. [default: use setting defined in metapackage when set, otherwise use '--block-size 0.5 --target-indexed -c1']
--evalue EVALUE
HMMSEARCH e-value cutoff to use for sequence gathering [default: 1e-05]
--min-orf-length length
When predicting ORFs require this many base pairs uninterrupted by a stop codon [default: 72 when input is reads, 300 when input is genomes]
--restrict-read-length length
Only use this many base pairs at the start of each sequence searched [default: no restriction]
--translation-table number
Codon table for translation. By default, translation table 4 is used, which is the same as translation table 11 (the usual bacterial/archaeal one), except that the TGA codon is translated as tryptophan, not as a stop codon. Using table 4 means that the minority of organisms which use table 4 are not biased against, without a significant effect on the majority of bacteria and archaea that use table 11. See http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=tgencodes for details on specific tables. [default: 4]
--filter-minimum-protein length
Ignore reads aligning in less than this many positions to each protein HMM when using --no-diamond-prefilter [default: 24]
--max-species-divergence INT
Maximum number of different bases acids to allow between a sequence and the best hit in the database so that it is assigned to the species level. [default: 2]
--exclude-off-target-hits
Exclude hits that are not in the target domain of each SingleM package
--min-taxon-coverage FLOAT
Minimum coverage to report in a taxonomic profile. [default: 0.35 for reads, 0.1 for genomes]
--debug
output debug information
--version
output version information and quit
--quiet
only output errors
--full-help
print longer help message
--full-help-roff
print longer help message in ROFF (manpage) format
Ben J. Woodcroft, Centre for Microbiome Research, School of Biomedical Sciences, Faculty of Health, Queensland University of Technology Samuel Aroney, Centre for Microbiome Research, School of Biomedical Sciences, Faculty of Health, Queensland University of Technology Raphael Eisenhofer, Centre for Evolutionary Hologenomics, University of Copenhagen, Denmark Rossen Zhao, Centre for Microbiome Research, School of Biomedical Sciences, Faculty of Health, Queensland University of Technology
Powered by Doctave