1 | #Please insert up references in the next lines (line starts with keyword UP) |
---|
2 | UP arb.hlp |
---|
3 | UP glossary.hlp |
---|
4 | |
---|
5 | #Please insert subtopic references (line starts with keyword SUB) |
---|
6 | #SUB subtopic.hlp |
---|
7 | |
---|
8 | # Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain} |
---|
9 | |
---|
10 | #************* Title of helpfile !! and start of real helpfile ******** |
---|
11 | TITLE Column statistic |
---|
12 | |
---|
13 | OCCURRENCE ARB_NT/SAI/Create SAI from Sequences/Positional Variability ... |
---|
14 | |
---|
15 | DESCRIPTION Calculates the base frequencies and the positional variability for |
---|
16 | each column independently. |
---|
17 | |
---|
18 | It uses the parsimony method to find the minimum number of |
---|
19 | mutations for each site, as they are determined by the specified |
---|
20 | topology. |
---|
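To illustrate what "minimum number of mutations for a site under a given topology" means, here is a minimal sketch of Fitch's small-parsimony count for a single alignment column. It is not ARB's implementation. The tree representation (nested 2-tuples of species names), the function names `fitch` and `min_mutations`, and the `column` dictionary are assumptions made for this example; gaps and ambiguity codes are not handled.

```python
def fitch(node, column):
    """Return (possible_states, mutations) for the subtree rooted at node.

    node   -- a species name (leaf) or a (left, right) tuple (inner node)
    column -- hypothetical dict mapping species name -> base at this column
    """
    if isinstance(node, str):
        # Leaf: the only possible state is the observed base.
        return {column[node].upper()}, 0

    left_states, left_cost = fitch(node[0], column)
    right_states, right_cost = fitch(node[1], column)

    common = left_states & right_states
    if common:
        # Children agree on at least one state: no extra mutation needed.
        return common, left_cost + right_cost
    # Children disagree: one mutation is required on one of the two branches.
    return left_states | right_states, left_cost + right_cost + 1


def min_mutations(tree, column):
    """Minimum number of mutations for one column on the given topology."""
    return fitch(tree, column)[1]


tree = (("a", "b"), ("c", "d"))
print(min_mutations(tree, {"a": "A", "b": "A", "c": "G", "d": "G"}))
print(min_mutations(tree, {"a": "A", "b": "G", "c": "A", "d": "G"}))
```

The same four bases need one mutation when they are grouped by the topology and two when they are not, which is why the result depends on the tree that is used.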
21 | |
---|
22 | The calculation is performed for the sequences of all species in the tree. |
---|
23 | For best results you should use one of the biggest trees available. |
---|
24 | The tree should have been optimized using ARB_PARSIMONY. |
---|
25 | |
---|
26 | The result can be used by: |
---|
27 | |
---|
28 | - Parsimony to weight the characters properly |
---|
29 | - Neighbour joining to estimate the distances more accurately. |
---|
30 | - Filter (read notes below) |
---|
31 | |
---|
32 | The resulting SAI will contain the following character codes: |
---|
33 | |
---|
34 | '.' Less than 10% valid characters |
---|
35 | '-' No mutations. |
---|
36 | '0123456789ABCDE...' Mutation rate category |
---|
37 | |
---|
38 | The higher the digit/character of the mutation rate category, |
---|
39 | the more conserved the site is. Stepping 2 positions to the right |
---|
40 | in the list of characters approximately halves the mutation |
---|
41 | rate (see the explicit mappings below). |
---|
42 | |
---|
43 | Valid characters are "ACGTUacgtu" for DNA/RNA (or all amino acid codes for AA sequences). |
---|
44 | |
---|
45 | NOTES In contrast to consensus and max-frequency SAIs, the positional variability SAI |
---|
46 | is calculated based on the specified topology. |
---|
47 | |
---|
48 | Later, that PVP-SAI might be used as a filter to further optimize that topology. |
---|
49 | When you filter out columns with high variability, topology changes that imply |
---|
50 | an increased number of mutations in these columns will receive no penalty. |
---|
51 | |
---|
52 | Repeating several iterations of these 2 steps might lead to a systematic error: |
---|
53 | * variable columns will tend to become even more variable and |
---|
54 | * conserved columns will tend to become even more conserved. |
---|
55 | |
---|
56 | The systematic error caused by this effect will most likely |
---|
57 | emphasize topological errors of the initial tree. |
---|
58 | To avoid that problem, a tree should also be optimized using other filters |
---|
59 | (e.g. max-frequency). This is especially true for the initial tree optimization. |
---|
60 | |
---|
61 | WARNINGS If you only have small trees (<100 species), |
---|
62 | using this function does not make much sense. |
---|
63 | |
---|
64 | SECTION Mapping of site mutation rate to categories: |
---|
65 | |
---|
66 | mutation rate category |
---|
67 | |
---|
68 | 45.8% .. 75% 0 (max. possible mutation rate is ~75%) |
---|
69 | 36.5% .. 45.8% 1 |
---|
70 | 28.2% .. 36.5% 2 |
---|
71 | 21.3% .. 28.2% 3 |
---|
72 | 15.7% .. 21.3% 4 |
---|
73 | 11.5% .. 15.7% 5 |
---|
74 | 8.3% .. 11.5% 6 |
---|
75 | 6.0% .. 8.3% 7 |
---|
76 | 4.3% .. 6.0% 8 |
---|
77 | 3.1% .. 4.3% 9 |
---|
78 | 2.2% .. 3.1% A |
---|
79 | 1.5% .. 2.2% B |
---|
80 | 1.1% .. 1.5% C |
---|
81 | 0.78% .. 1.1% D |
---|
82 | 0.55% .. 0.78% E |
---|
83 | 0.39% .. 0.55% F |
---|
84 | 0.28% .. 0.39% G |
---|
85 | 0.20% .. 0.28% H |
---|
86 | 0.14% .. 0.20% I |
---|
87 | |
---|
88 | mutations/million category |
---|
89 | |
---|
90 | 976 .. 1400 J |
---|
91 | 691 .. 975 K |
---|
92 | 489 .. 690 L |
---|
93 | 346 .. 488 M |
---|
94 | 245 .. 345 N |
---|
95 | 173 .. 244 O |
---|
96 | 123 .. 172 P |
---|
97 | 87 .. 122 Q |
---|
98 | 62 .. 86 R |
---|
99 | 44 .. 61 S |
---|
100 | 31 .. 43 T |
---|
101 | 22 .. 30 U |
---|
102 | 16 .. 21 V |
---|
103 | 11 .. 15 W |
---|
104 | 8 .. 10 X |
---|
105 | 6 .. 7 Y |
---|
106 | 1 .. 5 Z |
---|
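The two tables above can be read as one lookup, since 0.14% equals 1400 mutations per million. The sketch below encodes the lower bound of each category in mutations per million and returns the matching character. It is only an illustration of the table, not code taken from ARB. The name `category`, the choice to treat lower bounds as inclusive, and the handling of rates between 0 and 1 per million (mapped to 'Z') are assumptions.

```python
# (lower bound in mutations per million, category character),
# copied from the tables above; 1% corresponds to 10000 per million.
LOWER_BOUNDS = [
    (458000, "0"), (365000, "1"), (282000, "2"), (213000, "3"),
    (157000, "4"), (115000, "5"), (83000, "6"), (60000, "7"),
    (43000, "8"), (31000, "9"), (22000, "A"), (15000, "B"),
    (11000, "C"), (7800, "D"), (5500, "E"), (3900, "F"),
    (2800, "G"), (2000, "H"), (1400, "I"),
    (976, "J"), (691, "K"), (489, "L"), (346, "M"), (245, "N"),
    (173, "O"), (123, "P"), (87, "Q"), (62, "R"), (44, "S"),
    (31, "T"), (22, "U"), (16, "V"), (11, "W"), (8, "X"),
    (6, "Y"), (1, "Z"),
]


def category(rate_per_million):
    """Map a site mutation rate (mutations per million) to its SAI character."""
    if rate_per_million <= 0:
        return "-"  # no mutations
    for lower, char in LOWER_BOUNDS:
        if rate_per_million >= lower:  # assumption: lower bound is inclusive
            return char
    return "Z"  # assumption: rates below 1 per million fall into the last category


# 4% and 2% differ by a factor of two and land two categories apart.
print(category(40000), category(20000))
```

Halving the rate from 4% to 2% moves the category from '9' to 'B', which matches the rule that two steps to the right roughly halve the mutation rate.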
107 | |
---|
108 | |
---|
109 | BUGS No bugs known |
---|