Opened 7 years ago

Closed 7 years ago

Last modified 3 years ago

#743 closed task (performed)

Implement tool for SAI filtered sequence export

Reported by: westram Owned by: westram
Priority: major Milestone: arb7.0
Component: External_tools Version: SVN
Keywords: silva Cc:

Description (last modified by westram)

Specification:

  1. load ARB database
  2. filter sequences by existing SAI(s)
  3. export remaining columns of all sequences to FASTA file
    • skip sequences with fewer than --min-bases base positions
    • log skipped species to stderr (using field 'name')
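
Steps 2 and 3 of the specification can be sketched roughly as follows. This is only an illustration, not the code of the actual tool: loading the ARB database is not modelled (a plain dict of name to aligned sequence stands in for it), the function names are hypothetical, and the default "A..Z" of --count-bases is assumed to mean the uppercase letters A to Z.

```python
import string
import sys


def apply_column_filter(sequence, keep):
    """Return only the characters of `sequence` whose column is True in `keep`."""
    return "".join(ch for ch, k in zip(sequence, keep) if k)


def export_filtered(sequences, keep, out, min_bases=0,
                    count_chars=string.ascii_uppercase, log=sys.stderr):
    """Write filtered sequences as FASTA; skip and log those with too few bases.

    `sequences` maps a species name to its aligned sequence (hypothetical
    stand-in for the ARB database). `keep` holds one bool per alignment column.
    Returns the number of exported sequences.
    """
    exported = 0
    for name, seq in sequences.items():
        filtered = apply_column_filter(seq, keep)
        # Count only characters listed in count_chars (assumed meaning of --count-bases).
        bases = sum(1 for ch in filtered if ch in count_chars)
        if bases < min_bases:
            # Skipped species are reported by name, not written to the FASTA file.
            print(f"skipped {name}: {bases} bases", file=log)
            continue
        out.write(f">{name}\n{filtered}\n")
        exported += 1
    return exported
```

Here the FASTA header is simply the species name, which corresponds to the default --id "readdb(name)"; evaluating arbitrary ACI expressions is outside the scope of this sketch.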

CLI:

Switch                      Description
--db "database.arb"         name of ARB input database
--ali "aliname"             name of alignment in ARB DB
--fasta "flat.file"         name of FASTA output file
--id "ACI"                  ACI defining FASTA header line (default: "readdb(name)"; see ACI manual)
--min-bases NUM             do not export sequences with < NUM bases left after applying filter
--count-bases "ACGTU"       list of base characters to count for --min-bases (default: "A..Z")
(the following arguments may be specified multiple times)
--filterby "SAIname"        name of SAI used to filter sequence data
--pass [allbut] "chars"     characters that will forward column to output if found in SAI
--block [allbut] "chars"    characters that will block column from output if found in SAI
  • --pass and --block may only occur after a --filterby and always apply to the last preceding --filterby.
  • the optional parameter [allbut] inverts the character set, i.e. if only one filter is specified, the following pairs are equivalent:
    • --pass "ABC" and --block allbut "ABC"
    • --block "XYZ" and --pass allbut "XYZ"
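
The equivalence stated for [allbut] can be checked with a small sketch. The helper column_mask below is hypothetical and not part of the tool; it only assumes that each SAI character decides the fate of its own column.

```python
def column_mask(sai, chars, mode, allbut=False):
    """Return one bool per SAI column: True means the column reaches the output.

    mode is "pass" or "block"; allbut inverts the character set.
    """
    if mode not in ("pass", "block"):
        raise ValueError("mode must be 'pass' or 'block'")
    charset = set(chars)
    mask = []
    for ch in sai:
        # allbut flips membership: the set becomes "everything except chars".
        in_set = (ch in charset) != allbut
        # pass forwards matching columns, block removes them.
        mask.append(in_set if mode == "pass" else not in_set)
    return mask


sai = "ABCXYZ"  # made-up SAI content, for illustration only
same_1 = column_mask(sai, "ABC", "pass") == column_mask(sai, "ABC", "block", allbut=True)
same_2 = column_mask(sai, "XYZ", "block") == column_mask(sai, "XYZ", "pass", allbut=True)
```

With a single filter, both same_1 and same_2 come out true, which is the equivalence described above. Once several filters are combined, pass and block behave differently, as the filter combination section shows.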

Filter combination:

OR:  --filterby "PV1" --pass allbut ".-=0123" --filterby "PV2" --pass allbut ".-=012345"
AND: --filterby "PV1" --block ".-=0123" --filterby "PV2" --block ".-=012345"

Assuming the SAIs contain positional variability:

  • OR: will export all columns where PV1 is ≥ 4 or PV2 is ≥ 6
  • AND: will export all columns where PV1 is ≥ 4 and PV2 is ≥ 6
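
A minimal sketch of how the two combinations above could behave. It assumes that a column is exported if any --pass filter forwards it (OR), and, when only --block filters are given, if no filter blocks it (AND). How the real tool treats a mix of --pass and --block is not stated here and is not modelled; all function names and the SAI contents are made up.

```python
def in_charset(sai, chars, allbut=False):
    """Per column: is the SAI character in the (optionally inverted) set?"""
    charset = set(chars)
    return [(ch in charset) != allbut for ch in sai]


def combine_pass(filters):
    """OR: export a column if at least one pass filter forwards it."""
    masks = [in_charset(sai, chars, allbut) for sai, chars, allbut in filters]
    return [any(col) for col in zip(*masks)]


def combine_block(filters):
    """AND: export a column only if no block filter blocks it."""
    masks = [in_charset(sai, chars, allbut) for sai, chars, allbut in filters]
    return [not any(col) for col in zip(*masks)]


# Made-up positional variability values, one character per alignment column.
pv1 = "054732-"
pv2 = "665291."

or_mask = combine_pass([(pv1, ".-=0123", True), (pv2, ".-=012345", True)])
and_mask = combine_block([(pv1, ".-=0123", False), (pv2, ".-=012345", False)])
```

For these values or_mask keeps the first five columns (PV1 ≥ 4 or PV2 ≥ 6), while and_mask keeps only the second column, the only one where PV1 ≥ 4 and PV2 ≥ 6 both hold.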

Change History (10)

comment:1 Changed 7 years ago by westram

  • Description modified (diff)

comment:2 Changed 7 years ago by westram

  • Description modified (diff)

comment:3 Changed 7 years ago by westram

  • Description modified (diff)

comment:4 Changed 7 years ago by westram

  • Milestone changed from wishlist to r17q2
  • Owner changed from devel to westram
  • Status changed from new to assigned

comment:5 Changed 7 years ago by westram

  • Status changed from assigned to _started

comment:6 Changed 7 years ago by westram

  • Description modified (diff)
  • proposes additional switch --count-bases

comment:7 Changed 7 years ago by westram

  • Description modified (diff)
  • Resolution set to performed
  • Status changed from _started to closed

by [16119]

comment:8 Changed 7 years ago by westram

  • Resolution performed deleted
  • Status changed from closed to _started
  • add sequence post-processing

comment:9 Changed 7 years ago by westram

  • Resolution set to performed
  • Status changed from _started to closed

by [16234]

comment:10 Changed 3 years ago by westram

  • Milestone changed from r17q2 to arb7.0