Opened 4 years ago

Last modified 4 years ago

#813 new task

tweak fasta filters to better support metadata transport

Reported by: westram Owned by: devel
Priority: normal Milestone: wishlist
Component: Import / Export Version: SVN
Keywords: import Cc:

Description (last modified by westram)

Motivation:

The FASTA format wasn't designed to transport metadata. Nevertheless, header lines often contain data that users would like to import. Modifying import filters to extract that data is too complicated, even for more advanced users.

Proposed solution:

Import:

  1. add import filters that parse common types of fasta headerlines, cutting them into pieces e.g.
    • at separators or
    • at fixed column positions
  2. store all these pieces into automatically generated field names (e.g. p1, p2, p3, …)
  3. user may use such a filter together with FTS to quickly select which data shall be imported where.
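
As a rough illustration of import steps 1 and 2, here is a minimal sketch in Python. It is not ARB import-filter syntax. The function names (`strip_marker`, `split_at_separator`, `split_at_columns`, `to_fields`), the `|` separator and the sample header are all assumptions made up for this example.

```python
def strip_marker(header):
    """Remove the leading '>' of a FASTA header line, if present."""
    return header[1:] if header.startswith(">") else header


def split_at_separator(header, sep="|"):
    """Cut a header line into pieces at every occurrence of a separator."""
    return strip_marker(header).split(sep)


def split_at_columns(header, positions):
    """Cut a header line at fixed 0-based column positions."""
    text = strip_marker(header)
    bounds = [0] + list(positions) + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]


def to_fields(pieces):
    """Store the pieces under generated field names p1, p2, p3, ..."""
    return {f"p{i}": piece for i, piece in enumerate(pieces, start=1)}


# Hypothetical header line, split at '|':
fields = to_fields(split_at_separator(">AB123456|Escherichia coli|16S rRNA"))
# fields == {"p1": "AB123456", "p2": "Escherichia coli", "p3": "16S rRNA"}
```

The user would then pick via FTS which of the generated fields (p1, p2, …) go where; that selection step is not modelled here.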

Export:

  1. add export filters which concatenate the same fieldnames as above into the headerline (e.g. separated by some character or tabbed)
  2. user may again use that together with FTS
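
The export direction could be sketched like this (again plain Python, not filter syntax). `join_fields` is a hypothetical name, and the sketch assumes every field name has the form `p<number>`.

```python
def join_fields(fields, sep="|"):
    """Concatenate the fields p1, p2, ... in numeric order into one header line."""
    # Sort by the number after the 'p', so that p10 comes after p9, not after p1.
    names = sorted(fields, key=lambda name: int(name[1:]))
    return ">" + sep.join(fields[name] for name in names)


header = join_fields({"p2": "Escherichia coli", "p1": "AB123456", "p3": "16S rRNA"})
# header == ">AB123456|Escherichia coli|16S rRNA"

tabbed = join_fields({"p1": "AB123456", "p2": "16S rRNA"}, sep="\t")
# tabbed == ">AB123456\t16S rRNA"
```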

Implementation notes:

  • refactor class implementing MATCH keyword (i.e. import_match):
    • split into Modifier (handling ACI + SRT)
      and Writer (handling TAG, WRITE (plus type-info), APPEND, SETVAR(?))
  • introduce new class Splitter (=split+loop over parts incrementing loop-variable)
    • allow splitting by fixed string, by regular expression, or by explicit position list (⇒ 3 different split-commands)
  • introduce new base class Consumer:
    • MATCH and Splitter are Consumers
    • each Consumer consists of a Modifier plus either a Writer or a Splitter
  • what to do with empty fields during export?
    • just skipping them may be bad, because all data written after them then shifts left by one position (e.g. when read back later with a complementary import filter)
    • instead introduce new keyword DROPEMPTY usable in every Writer
      (Splitters won't act on empty data)
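
The empty-field problem can be shown with a small round trip. This is only a sketch: `export_header` and `import_header` are hypothetical helpers, and the `drop_empty` flag is just one possible reading of the proposed DROPEMPTY keyword (skip empty values only when explicitly requested). The ticket does not fix its exact semantics.

```python
def export_header(values, sep="|", drop_empty=False):
    """Join values into one line; with drop_empty=True empty values are skipped."""
    if drop_empty:
        values = [v for v in values if v != ""]
    return sep.join(values)


def import_header(line, sep="|"):
    """Complementary import: split the line and number the parts p1, p2, ..."""
    parts = line.split(sep)
    return {f"p{i}": part for i, part in enumerate(parts, start=1)}


values = ["AB123456", "", "16S rRNA"]  # the second field is empty

kept = import_header(export_header(values))
dropped = import_header(export_header(values, drop_empty=True))
# kept    == {"p1": "AB123456", "p2": "", "p3": "16S rRNA"}
# dropped == {"p1": "AB123456", "p2": "16S rRNA"}
```

With the empty value kept, "16S rRNA" comes back as p3. With it dropped, the same value comes back as p2, which is the left shift described above.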

related: #562 (implemented FTS), #686

Change History (2)

comment:1 Changed 4 years ago by westram

dropped some examples to ria:Documents/arb/ticket-813

comment:2 Changed 4 years ago by westram

  • Description modified (diff)