Opened 8 years ago
Last modified 7 years ago
#755 new enhancement
probe match shall support more types of "weak matches"
Reported by: | westram | Owned by: | devel |
---|---|---|---|
Priority: | major | Milestone: | wishlist |
Component: | PT server | Version: | SVN |
Keywords: | | Cc: | |
Description (last modified by westram)
Types of weak matches currently supported by PT-Server:
- mismatches
- N-mismatches
It should also support these types:
- IUPAC codes in 'Target string'
- insertions
- deletions
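As a rough illustration of what IUPAC codes in the target string would mean for matching, here is a minimal Python sketch. It is not taken from the PT-Server sources: the names `IUPAC`, `iupac_matches` and `expand_target` are hypothetical, and treating 'U' like 'T' is an assumption of this sketch.

```python
# Standard IUPAC nucleotide ambiguity codes and the plain bases they cover.
# Assumption of this sketch: 'U' is normalized to 'T'.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_matches(target_code, base):
    """Return True if a plain sequence base is covered by the target's code."""
    if base == "U":
        base = "T"
    return base in IUPAC[target_code]


def expand_target(target):
    """List every plain-base string that an IUPAC target string stands for."""
    variants = [""]
    for code in target:
        # Each ambiguity code multiplies the number of variants.
        variants = [v + b for v in variants for b in IUPAC[code]]
    return variants
```

Expanding the target is only practical for a few ambiguity codes, since the number of variants grows multiplicatively; checking each position with `iupac_matches` during traversal avoids that growth.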
Additional parameters needed:
- max. number of insertions and deletions allowed
- mismatch-penalty for inserts + deletes (each using 2 penalties, e.g.: one for initial insert, another for extending that insert)
- mismatch-penalty for IUPAC-matches?
- maybe specify a general weight-factor (for IUPAC and N-matches):
  - calculate some kind of weighted mismatch from
    - the probabilities of the specific bases (defined by IUPAC/N) and
    - their transition/transversion penalties, and
  - add weight-factor * mismatch-weight to the current (weighted) mismatch value. Setting the weight-factor to zero would then ignore IUPAC-mismatches; any other value would count (weight) them as mismatches.
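One possible reading of the proposed parameters, sketched in Python. Everything here is an assumption for illustration: the bases covered by an IUPAC code are taken to be equally probable, the transition/transversion penalties (1.0 and 1.5) are placeholder values, and the function names are hypothetical rather than part of the PT-Server.

```python
# Bases covered by each IUPAC code (plain bases map to themselves).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
PURINES = set("AG")


def base_penalty(a, b, transition=1.0, transversion=1.5):
    """Penalty for pairing two plain bases (placeholder penalty values)."""
    if a == b:
        return 0.0
    same_class = (a in PURINES) == (b in PURINES)
    return transition if same_class else transversion


def iupac_mismatch_weight(code, seq_base):
    """Expected penalty, assuming all bases covered by `code` are equally likely."""
    bases = IUPAC[code]
    return sum(base_penalty(b, seq_base) for b in bases) / len(bases)


def add_iupac_mismatch(current, code, seq_base, weight_factor):
    """Add weight-factor * mismatch-weight to the current weighted mismatch value."""
    return current + weight_factor * iupac_mismatch_weight(code, seq_base)


def gap_penalty(length, open_penalty, extend_penalty):
    """Two-part penalty: one for the initial insert/delete, one per extension."""
    if length <= 0:
        return 0.0
    return open_penalty + extend_penalty * (length - 1)
```

With a `weight_factor` of zero, `add_iupac_mismatch` returns the current value unchanged, which corresponds to ignoring IUPAC-mismatches entirely.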
Prefix (PT) traversal:
- needs to carry much more state-information, e.g. number of inserts/deletes performed (absolute + extends),
- has to be able to properly undo state-modifications (performed during descent) while ascending and
- should descend into all additional prefixes that become possible by inserting/deleting bases.
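The traversal requirements above could look roughly like the following Python sketch. It uses a plain nested-dict trie as a stand-in for the PT, and all names (`Limits`, `State`, `build_tree`, `search`) are hypothetical. Further simplifying assumptions: an "insertion" is an extra base in the tree path, a "deletion" is a target base without counterpart, gaps are only allowed inside the target, and penalties are plain counts.

```python
from dataclasses import dataclass, field


@dataclass
class Limits:
    max_mismatches: int = 0
    max_indels: int = 0


@dataclass
class State:
    mismatches: int = 0
    gap_opens: int = 0      # initial inserts/deletes
    gap_extends: int = 0    # inserts/deletes that directly continue a gap
    path: list = field(default_factory=list)  # bases consumed from the tree


def build_tree(prefixes):
    """Build a nested-dict trie; a stand-in for the real prefix tree."""
    root = {}
    for prefix in prefixes:
        node = root
        for base in prefix:
            node = node.setdefault(base, {})
    return root


def _count_gap(state, extending, delta):
    if extending:
        state.gap_extends += delta
    else:
        state.gap_opens += delta


def search(node, target, pos, state, limits, hits, gap=None):
    """Depth-first search; every state change is undone while ascending."""
    if pos == len(target):
        found = (state.mismatches, state.gap_opens, state.gap_extends)
        path = "".join(state.path)
        if path not in hits or sum(found) < sum(hits[path]):
            hits[path] = found
        return
    can_gap = state.gap_opens + state.gap_extends < limits.max_indels
    for base, child in node.items():
        # Match or mismatch: consume one tree base and one target base.
        cost = 0 if base == target[pos] else 1
        if state.mismatches + cost <= limits.max_mismatches:
            state.mismatches += cost
            state.path.append(base)
            search(child, target, pos + 1, state, limits, hits)
            state.path.pop()
            state.mismatches -= cost
        # Insertion: consume a tree base without advancing in the target.
        if can_gap and pos > 0:
            _count_gap(state, gap == "ins", +1)
            state.path.append(base)
            search(child, target, pos, state, limits, hits, gap="ins")
            state.path.pop()
            _count_gap(state, gap == "ins", -1)
    # Deletion: skip a target base without descending in the tree.
    if can_gap and 0 < pos < len(target) - 1:
        _count_gap(state, gap == "del", +1)
        search(node, target, pos + 1, state, limits, hits, gap="del")
        _count_gap(state, gap == "del", -1)
```

A call such as `search(build_tree(["ACGT", "ACT", "ACCGT"]), "ACGT", 0, State(), Limits(0, 1), hits)` fills the dict `hits` with each matching tree path and its (mismatches, gap opens, gap extends) counts. A single mutable `State` is modified before each descent and restored afterwards, so no state is copied per node.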
While traversing the PT, matches are collected and need to be categorized as
- impossible matches,
- definite matches and
- possible matches (which need further inspection).
Reasons why a hit may be classified as a possible match:
- target string is longer than PT-depth (⇒ reaches cut-off at tips of PT)
- whenever any other cut-off in PT is reached (e.g. 'N' or IUPAC-code occurred in input sequence data)
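The three categories and the cut-off reasons could be expressed along these lines. This is a sketch under assumptions: the names `MatchClass` and `classify` are hypothetical, mismatches are a plain count, and `None` is used here to mean "undecided, traversal can still descend".

```python
from enum import Enum


class MatchClass(Enum):
    IMPOSSIBLE = "impossible"
    DEFINITE = "definite"
    POSSIBLE = "possible"  # needs inspection of the actual sequence data


def classify(mismatches, max_mismatches, target_left, at_cutoff):
    """Categorize a hit at the current traversal position.

    target_left: number of target characters not yet evaluated.
    at_cutoff:   True if the tree ends here (tip of the tree reached, or an
                 'N'/IUPAC code occurred in the input sequence data).
    """
    if mismatches > max_mismatches:
        return MatchClass.IMPOSSIBLE
    if target_left == 0:
        return MatchClass.DEFINITE
    if at_cutoff:
        return MatchClass.POSSIBLE
    return None  # not decided yet: keep descending
```

Checking the mismatch limit first means a hit that already exceeds it is discarded even at a cut-off, so no sequence data needs to be read for it.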
For possible matches, the actual sequence data has to be inspected and the "rest" of the target string evaluated against it.
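A minimal sketch of that follow-up check, assuming plain mismatch counting only (no insertions, deletions or IUPAC weighting) and a hypothetical function name `verify_rest`:

```python
def verify_rest(sequence, seq_pos, target, target_pos, mismatches, max_mismatches):
    """Compare the unevaluated rest of the target against the sequence data.

    Returns the total mismatch count if the hit stays within the limit,
    or None if it exceeds the limit or the sequence ends too early.
    """
    for offset, wanted in enumerate(target[target_pos:]):
        i = seq_pos + offset
        if i >= len(sequence):
            return None  # sequence too short for the rest of the target
        if sequence[i] != wanted:
            mismatches += 1
            if mismatches > max_mismatches:
                return None  # turned out to be an impossible match
    return mismatches
```

Here `mismatches` is the count accumulated during tree traversal, so the limit applies to the whole target and not only to the part checked against the sequence.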