# main topics:
UP      arb.hlp
UP      glossary.hlp

# sub topics:

# format described in ../help.readme


TITLE           How is the consensus calculated?

OCCURRENCE      ARB_NT/SAI/Create SAI using../Consensus
                ARB_EDIT4/Properties/Consensus definition

DESCRIPTION     What to do with gaps?

                    Define whether to count gaps or to ignore them entirely:

                    If gaps are counted and the gap frequency exceeds the specified
                    threshold, the consensus will show a '-'.

                    If the switch is 'off', the algorithm virtually removes all gaps.
                    For example, a column with 10 'A's and 500 gaps is treated as
                    100% 'A' (with the switch 'on', the relative number of 'A's
                    would be about 2%).

                    Regardless of how gaps are handled,
                    if a column contains only gaps, the consensus will always show a '='.

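The effect of the gap switch on the percentages can be sketched in Python. This is only an illustration, not ARB's implementation: the function name `base_percentages` is hypothetical, and the sketch assumes that only '-' counts as a gap.

```python
from collections import Counter

GAP = "-"  # assumption: only '-' is treated as a gap in this sketch


def base_percentages(column, count_gaps):
    """Return the percentage of each non-gap character in one alignment column.

    Returns None for a gap-only column (the help text says the consensus
    shows '=' there, regardless of the gap setting).
    """
    counts = Counter(column)
    gaps = counts.pop(GAP, 0)
    bases = sum(counts.values())
    if bases == 0:
        return None
    # With the switch 'on' gaps stay in the denominator, with 'off' they are dropped.
    total = bases + gaps if count_gaps else bases
    return {char: 100.0 * n / total for char, n in counts.items()}


column = "A" * 10 + "-" * 500
print(base_percentages(column, count_gaps=False))                # {'A': 100.0}
print(round(base_percentages(column, count_gaps=True)["A"], 2))  # 1.96
```

The second value is the "about 2%" mentioned above: 10 of 510 positions.
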
                Simplify using base character groups

                    If grouping is 'off', the most frequent character is used in the
                    consensus (even if another character has the same frequency).

                    If grouping is 'on', base characters are grouped as follows:

                    * RNA/DNA alignments use the IUPAC ambiguity codes (MRWSYKVHDBN)
                    * Amino acid alignments use amino acid classes (ADHIFC)

                    Click the "Show IUPAC" button to display detailed information
                    about these character groups.

                    The threshold defines how characters are grouped:

                    * In RNA/DNA alignments the threshold specifies whether a
                      non-ambiguous character is considered for grouping (i.e.
                      all characters below the threshold are removed and the
                      rest are grouped; see also the example below).

                    * In amino acid alignments the threshold specifies whether
                      all characters of a group together are frequent enough to
                      show that group in the consensus. If not, an 'X' is displayed
                      in the consensus.

                    Example:

                      If you have 40% 'A', 10% 'C', 40% 'G' and 10% 'T' and
                      'threshold for character' is set to 20%, ARB will
                      use the IUPAC code representing 'A' and 'G' (i.e. 'R').

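The nucleotide example can be reproduced with a small Python sketch. It is not taken from the ARB sources: the function name `group_nucleotides` is hypothetical, a base is assumed to be kept when its percentage is greater than or equal to the threshold, and ties for the most frequent base are broken by input order, which may differ from what ARB does.

```python
# Standard IUPAC nucleotide codes, keyed by the set of bases each one represents.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AC"): "M", frozenset("AG"): "R", frozenset("AT"): "W",
    frozenset("CG"): "S", frozenset("CT"): "Y", frozenset("GT"): "K",
    frozenset("ACG"): "V", frozenset("ACT"): "H",
    frozenset("AGT"): "D", frozenset("CGT"): "B",
    frozenset("ACGT"): "N",
}


def group_nucleotides(percent, threshold):
    """Drop bases below the threshold and return the code for the remaining ones."""
    kept = frozenset(base for base, p in percent.items() if p >= threshold)
    if not kept:
        # No base reaches the threshold: fall back to the most frequent base.
        kept = frozenset(max(percent, key=percent.get))
    return IUPAC[kept]


print(group_nucleotides({"A": 40, "C": 10, "G": 40, "T": 10}, 20))  # R
```

With a threshold of 51 the same column yields 'A', since no base reaches the threshold and the fallback picks the first of the two most frequent bases.
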
                    Reasonable thresholds explained:

                    * amino acid:
                      * 51% means: if most characters belong to one amino acid group, show that group (otherwise show 'X')
                    * RNA/DNA:
                      * 26% means: group up to 3 nucleotides in an ambiguity code (otherwise show the most frequent base)
                      * 25% means: group up to 4 nucleotides, i.e. will produce 'N' in the consensus (only if the nucleotides are distributed EXACTLY evenly)
                      * 20% is similar to 25%, but slightly uneven distributions will also produce 'N'
                      * 51% will effectively turn off IUPAC grouping for nucleotides
                      * 50% will group 2 nucleotides (only if they are distributed EXACTLY evenly)
                      * 33% will group up to 3 nucleotides (only if they are distributed EXACTLY evenly)

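The nucleotide thresholds listed above follow from simple arithmetic: k bases can each reach a threshold of t percent only if k * t <= 100. The Python sketch below makes this explicit. The names `max_groupable` and `slack` are hypothetical, and the sketch assumes a base is kept when its percentage is greater than or equal to the threshold.

```python
def max_groupable(threshold, alphabet_size=4):
    """Largest number of bases that can each reach `threshold` percent in one column."""
    return min(alphabet_size, int(100 // threshold))


def slack(threshold, alphabet_size=4):
    """Percentage points left over when that many bases sit exactly at the threshold."""
    return 100 - max_groupable(threshold, alphabet_size) * threshold


for t in (51, 50, 33, 26, 25, 20):
    print(f"{t}%: up to {max_groupable(t)} base(s), slack {slack(t)}")
# 51%: up to 1 base(s), slack 49
# 50%: up to 2 base(s), slack 0
# 33%: up to 3 base(s), slack 1
# 26%: up to 3 base(s), slack 22
# 25%: up to 4 base(s), slack 0
# 20%: up to 4 base(s), slack 20
```

A slack of 0 (or nearly 0) means the maximum number of bases is grouped only for an exactly even distribution. A larger slack, as with 20%, leaves room for uneven distributions.
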
                Show as upper or lower case?

                    Define whether the character is displayed in upper or lower case
                    or whether a dot is displayed.

                    Define upper and lower limit:

                    If the percentage of a character is above or equal to the upper
                    limit, the character is displayed in upper case.

                    If the percentage of a character is above or equal to the lower
                    limit and below the upper limit, the character is displayed in lower case.

                    Otherwise a dot ('.') is displayed.

                    If gaps are ignored (as explained at top), the percentage is calculated relative
                    to all existing bases in the column.
                    If gaps are NOT ignored, the percentage is calculated relative to the number of species.

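The upper/lower limit rule can be written as a short Python sketch. The function name `display_char` and the limit values used in the calls are hypothetical. The percentage is assumed to have been calculated beforehand, relative to the bases or to the number of species, depending on the gap setting.

```python
def display_char(char, percent, upper_limit, lower_limit):
    """Pick the displayed form of a consensus character from its percentage."""
    if percent >= upper_limit:
        return char.upper()  # at or above the upper limit: upper case
    if percent >= lower_limit:
        return char.lower()  # between the two limits: lower case
    return "."               # below the lower limit: dot


print(display_char("A", 95, upper_limit=95, lower_limit=70))  # A
print(display_char("A", 80, upper_limit=95, lower_limit=70))  # a
print(display_char("A", 50, upper_limit=95, lower_limit=70))  # .
```
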
NOTES           You can save/load the consensus settings to/from a file using the config-manager icon.
                This allows you to exchange
                 * the consensus settings used in EDIT4 and
                 * the settings used to calculate the SAI 'CONSENSUS' from the ARB main window.

EXAMPLES        None

WARNINGS        None

BUGS            None
