source: branches/lib/HELP_SOURCE/source/pos_var_pars.hlp

Last change on this file was 19532, checked in by westram, 4 months ago
#       main topics:
UP      arb.hlp
UP      glossary.hlp

#       sub topics:
#SUB     subtopic.hlp

# format described in ../help.readme


TITLE           Column statistic

OCCURRENCE      ARB_NT/SAI/Create SAI from Sequences/Positional Variability ...

DESCRIPTION     Calculates the base frequencies and the positional variability for
                each column independently.

                It uses the parsimony method to find the minimum number of
                mutations for each site, as determined by the specified
                topology.

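                The parsimony count described above can be illustrated with a
                minimal sketch of Fitch's small-parsimony algorithm. This is
                not ARB's implementation. The nested-tuple tree format, the
                function names and the toy data are assumptions made for
                illustration only, and every character (gaps included) is
                treated as an ordinary state.

```python
def fitch_min_mutations(tree, column):
    """Return the minimum number of state changes for one alignment column.

    tree   -- strictly binary tree as nested 2-tuples, leaves are species
              names (hypothetical format, chosen for this sketch)
    column -- dict mapping species name -> character at this column
    """
    def visit(node):
        # Leaf: the state set is the observed character, no mutation needed.
        if isinstance(node, str):
            return {column[node]}, 0
        left, right = node
        left_states, left_cost = visit(left)
        right_states, right_cost = visit(right)
        common = left_states & right_states
        if common:
            # Children agree on at least one state: no extra mutation.
            return common, left_cost + right_cost
        # Children disagree: one mutation is required on this subtree.
        return left_states | right_states, left_cost + right_cost + 1

    return visit(tree)[1]


def min_mutations_per_column(tree, sequences):
    """Apply the Fitch count to every column independently."""
    length = len(next(iter(sequences.values())))
    return [
        fitch_min_mutations(tree, {name: seq[i] for name, seq in sequences.items()})
        for i in range(length)
    ]


# Toy example (made-up data): four species, three columns.
tree = (("a", "b"), ("c", "d"))
sequences = {"a": "ACG", "b": "ACT", "c": "ATG", "d": "ATT"}
print(min_mutations_per_column(tree, sequences))
```

                In this toy example the third column needs more mutations than
                the second because its character pattern conflicts with the
                topology. That dependence on the tree is what distinguishes
                this measure from a plain frequency count.
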
                The calculation is performed for the sequences of all species in the tree.
                For best results you should use one of the biggest trees available.
                The tree should have been optimized using ARB_PARSIMONY.

                The result can be used by:

                - Parsimony, to weight the characters properly
                - Neighbour joining, to estimate the distances more accurately
                - Filter (read notes below)

                The resulting SAI will contain the following character codes:

                        '.'                          Less than 10% valid characters
                        '-'                          No mutations
                        '0123456789ABCDE...'         Mutation rate category

                  The higher the digit/character of the mutation rate category,
                  the more conserved the site. Stepping 2 positions to the right
                  in the list of characters approximately halves the mutation
                  rate (see the explicit mapping below).

                Valid characters are "ACGTUacgtu" for DNA/RNA (or all amino acid codes for AA sequences).

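                The '.' code can be illustrated with a small sketch that
                computes the fraction of valid characters in one column. The
                function names are hypothetical. Using the number of
                sequences in the column as denominator is an assumption, as
                the text above does not state how the 10% is measured.

```python
# Valid characters for DNA/RNA as listed above.
VALID_NUCLEOTIDES = set("ACGTUacgtu")


def valid_fraction(column, valid=VALID_NUCLEOTIDES):
    """Fraction of characters in `column` that count as valid."""
    if not column:
        return 0.0
    return sum(ch in valid for ch in column) / len(column)


def is_undetermined(column, valid=VALID_NUCLEOTIDES, threshold=0.10):
    """True if the column would get the '.' code (less than 10% valid)."""
    return valid_fraction(column, valid) < threshold


# One valid base among 20 sequences: 5% valid, so the column is marked '.'.
print(is_undetermined("A" + "-" * 19))
```

                For protein alignments the same check would be run with the
                set of amino acid codes passed as `valid`.
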
NOTES           In contrast to consensus and max-frequency SAIs, the positional variability SAI
                is calculated based on the specified topology.

                Later, that PVP-SAI might be used as a filter to further optimize that topology.
                When you filter out columns with high variability, topology changes that imply
                an increased number of mutations in these columns will receive no penalty.

                Repeating several iterations of these 2 steps might lead to a systematic error:
                 * variable columns will tend to become even more variable and
                 * conserved columns will tend to become even more conserved.

                This systematic error will probably mostly emphasize
                topological errors of the initial tree.
                To avoid that problem, a tree should also be optimized using other filters
                (e.g. max-frequency). This is especially true for the initial tree optimization.

WARNINGS        If you only have small trees (<100 species),
                using this function does not make much sense.

SECTION         Mapping of site mutation rate to categories:

                    mutation rate      category

                    45.8% .. 75%          0     (max. possible mutation rate is ~75%)
                    36.5% .. 45.8%        1
                    28.2% .. 36.5%        2
                    21.3% .. 28.2%        3
                    15.7% .. 21.3%        4
                    11.5% .. 15.7%        5
                     8.3% .. 11.5%        6
                     6.0% ..  8.3%        7
                     4.3% ..  6.0%        8
                     3.1% ..  4.3%        9
                     2.2% ..  3.1%        A
                     1.5% ..  2.2%        B
                     1.1% ..  1.5%        C
                     0.78% .. 1.1%        D
                     0.55% .. 0.78%       E
                     0.39% .. 0.55%       F
                     0.28% .. 0.39%       G
                     0.20% .. 0.28%       H
                     0.14% .. 0.20%       I

                    mutations/million  category

                     976 .. 1400          J
                     691 ..  975          K
                     489 ..  690          L
                     346 ..  488          M
                     245 ..  345          N
                     173 ..  244          O
                     123 ..  172          P
                      87 ..  122          Q
                      62 ..   86          R
                      44 ..   61          S
                      31 ..   43          T
                      22 ..   30          U
                      16 ..   21          V
                      11 ..   15          W
                       8 ..   10          X
                       6 ..    7          Y
                       1 ..    5          Z


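                The mapping above can be expressed as a simple lookup. The
                following sketch transcribes the lower bound of each category
                into mutations per million (0.14% equals 1400 per million).
                The function name is hypothetical. How values exactly on a
                shared boundary and rates below 1 per million are assigned is
                an assumption, as the table does not specify it.

```python
# Lower bound of each category in mutations per million,
# transcribed from the table above (e.g. 45.8% -> 458000).
LOWER_BOUNDS = [
    458000, 365000, 282000, 213000, 157000, 115000, 83000, 60000, 43000, 31000,  # 0..9
    22000, 15000, 11000, 7800, 5500, 3900, 2800, 2000, 1400,                     # A..I
    976, 691, 489, 346, 245, 173, 123, 87, 62, 44, 31, 22, 16, 11, 8, 6, 1,      # J..Z
]
CATEGORIES = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
assert len(LOWER_BOUNDS) == len(CATEGORIES)


def rate_category(mutations_per_million):
    """Return the SAI character for a site mutation rate (per million)."""
    if mutations_per_million <= 0:
        return "-"  # no mutations
    for lower, category in zip(LOWER_BOUNDS, CATEGORIES):
        # Assumption: a rate equal to a shared boundary goes to the
        # category whose range starts there.
        if mutations_per_million >= lower:
            return category
    return "Z"  # assumption: rates below 1 per million use the last category


for rate in (100000, 50000, 1000, 0):
    print(rate, rate_category(rate))
```

                With these bounds a rate of 10% falls into category 6 and a
                rate of 5% into category 8, which matches the rule that two
                steps to the right approximately halve the mutation rate.
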
BUGS            No bugs known