source: branches/stable/HELP_SOURCE/oldhelp/pos_var_pars.hlp

Last change on this file was 17100, checked in by westram, 7 years ago
  • document valid characters for PVP
#Please insert up references in the next lines (line starts with keyword UP)
UP      arb.hlp
UP      glossary.hlp

#Please insert subtopic references  (line starts with keyword SUB)
#SUB    subtopic.hlp

# Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain}

#************* Title of helpfile !! and start of real helpfile ********
TITLE           Column statistic

OCCURRENCE      ARB_NT/SAI/Create SAI from Sequences/Positional Variability ...

DESCRIPTION     Calculates the base frequencies and the positional variability
                for each column independently.

                It uses the parsimony method to find the minimum number of
                mutations for each site, as determined by the specified
                topology.
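
                The per-site count can be illustrated with Fitch's small-parsimony
                algorithm, a standard way to obtain the minimum number of mutations
                for one column on a fixed binary tree. This is only a sketch: the text
                above does not say which algorithm ARB uses, and the tree encoding
                (nested 2-tuples), the function name and the sample data are
                hypothetical. It also assumes every species has a valid character
                in the column.

```python
def fitch_min_mutations(tree, column):
    """Return the minimum number of state changes for one alignment column.

    tree   -- nested 2-tuples for inner nodes, species names (str) for leaves
    column -- dict mapping species name to the character in this column
    """
    def visit(node):
        if isinstance(node, str):
            # Leaf: the only possible state is the observed character.
            return {column[node]}, 0
        left, right = node
        left_states, left_cost = visit(left)
        right_states, right_cost = visit(right)
        common = left_states & right_states
        if common:
            # Both subtrees can share a state: no additional mutation needed.
            return common, left_cost + right_cost
        # No shared state: one mutation is required on a branch below this node.
        return left_states | right_states, left_cost + right_cost + 1

    return visit(tree)[1]


# Hypothetical topology with five species and one alignment column.
tree = (("sp1", "sp2"), ("sp3", ("sp4", "sp5")))
column = {"sp1": "A", "sp2": "A", "sp3": "G", "sp4": "G", "sp5": "A"}
print(fitch_min_mutations(tree, column))
```

                In this example the column needs at least 2 mutations on the given
                topology; a column with the same character in all species needs 0.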

                The calculation is performed for the sequences of all species in the tree.
                For best results you should use one of the biggest trees available.
                The tree should have been optimized using ARB_PARSIMONY.

                The result can be used by:

                - Parsimony, to weight the characters properly
                - Neighbour joining, to estimate the distances more accurately
                - Filter (read notes below)

                The resulting SAI will contain the following character codes:

                        '.'                          Less than 10% valid characters
                        '-'                          No mutations
                        '0123456789ABCDE...'         Mutation rate category

                  The higher the digit/character of the mutation rate category,
                  the more conserved the site is. Stepping 2 positions rightwards
                  in the list of given characters approximately halves the mutation
                  rate (see below for the explicit mapping).

                Valid characters are "ACGTUacgtu" for DNA/RNA (or all amino acid codes for AA sequences).
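
                How the two special codes could be assigned is sketched below. The
                sketch assumes that "valid characters" refers to the fraction of
                characters in the column that are valid, and that the '.' check takes
                precedence over '-'; neither detail is stated above. All function
                names are hypothetical.

```python
VALID_NUCLEOTIDES = frozenset("ACGTUacgtu")  # as listed in the description


def valid_fraction(column_chars, valid=VALID_NUCLEOTIDES):
    """Fraction of the characters in one column that count as valid."""
    if not column_chars:
        return 0.0
    return sum(ch in valid for ch in column_chars) / len(column_chars)


def special_code(column_chars, min_mutations):
    """Return '.', '-' or None (None: a rate category has to be assigned)."""
    if valid_fraction(column_chars) < 0.10:   # less than 10% valid characters
        return '.'
    if min_mutations == 0:                    # no mutations on the topology
        return '-'
    return None


print(special_code("A-----------", 0))   # 1 of 12 characters is valid
print(special_code("AAAA--AAAA", 0))     # enough valid characters, no mutations
print(special_code("AAGA--AAGA", 2))     # needs a mutation rate category
```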

NOTES           In contrast to consensus and max-frequency SAIs, the positional variability SAI
                (PVP-SAI) is calculated based on the specified topology.

                Later that PVP-SAI might be used as a filter to further optimize that topology.
                When you filter out columns with high variability, topology changes that imply
                an increased number of mutations in these columns will receive no penalty.

                Repeating these 2 steps (calculating the PVP-SAI, then optimizing the tree
                with it as filter) for several iterations might lead to a systematic error:
                 * variable columns will tend to become even more variable and
                 * conserved columns will tend to become even more conserved.

                This systematic error will probably mostly emphasize topological
                errors of the initial tree.
                To avoid that problem, a tree should also be optimized using other filters
                (e.g. max-frequency). This is especially true for the initial tree optimization.

WARNINGS        If you only have small trees (<100 species),
                using this function does not make much sense.

SECTION         Mapping of site mutation rate to categories:

                    mutation rate      category

                    45.8% .. 75%          0     (max. possible mutation rate is ~75%)
                    36.5% .. 45.8%        1
                    28.2% .. 36.5%        2
                    21.3% .. 28.2%        3
                    15.7% .. 21.3%        4
                    11.5% .. 15.7%        5
                     8.3% .. 11.5%        6
                     6.0% ..  8.3%        7
                     4.3% ..  6.0%        8
                     3.1% ..  4.3%        9
                     2.2% ..  3.1%        A
                     1.5% ..  2.2%        B
                     1.1% ..  1.5%        C
                     0.78% .. 1.1%        D
                     0.55% .. 0.78%       E
                     0.39% .. 0.55%       F
                     0.28% .. 0.39%       G
                     0.20% .. 0.28%       H
                     0.14% .. 0.20%       I

                    mutations/million  category

                     976 .. 1400          J
                     691 ..  975          K
                     489 ..  690          L
                     346 ..  488          M
                     245 ..  345          N
                     173 ..  244          O
                     123 ..  172          P
                      87 ..  122          Q
                      62 ..   86          R
                      44 ..   61          S
                      31 ..   43          T
                      22 ..   30          U
                      16 ..   21          V
                      11 ..   15          W
                       8 ..   10          X
                       6 ..    7          Y
                       1 ..    5          Z
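
                The table can be read as a simple lookup. The following sketch copies
                the lower bound of each category from the table and returns the first
                category whose lower bound is reached. The function name is
                hypothetical, and treating the lower bounds as inclusive is an
                assumption; the table does not say how exact boundary values are
                handled.

```python
CATEGORIES = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Lower bound of each category in mutations per million, copied from the
# table above (1% corresponds to 10000 mutations per million).
LOWER_BOUNDS_PPM = [
    458000, 365000, 282000, 213000, 157000,   # 0-4
    115000, 83000, 60000, 43000, 31000,       # 5-9
    22000, 15000, 11000, 7800, 5500,          # A-E
    3900, 2800, 2000, 1400,                   # F-I
    976, 691, 489, 346, 245, 173, 123, 87,    # J-Q
    62, 44, 31, 22, 16, 11, 8, 6, 1,          # R-Z
]


def rate_category(mutations_per_million):
    """Return the category character for a site mutation rate, or None."""
    for category, lower_bound in zip(CATEGORIES, LOWER_BOUNDS_PPM):
        if mutations_per_million >= lower_bound:
            return category
    return None  # below the lowest listed rate; not covered by the table


print(rate_category(25000))   # 2.5% mutation rate
print(rate_category(100))     # 100 mutations per million
```

                For the low-rate categories the bounds agree with the rule of thumb
                given in the description: the lower bounds of J and L are 976 and 489,
                so two steps to the right roughly halve the rate.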


BUGS            No bugs known