| 1 | #Please insert up references in the next lines (line starts with keyword UP) |
|---|
| 2 | UP arb.hlp |
|---|
| 3 | UP glossary.hlp |
|---|
| 4 | |
|---|
| 5 | #Please insert subtopic references (line starts with keyword SUB) |
|---|
| 6 | #SUB subtopic.hlp |
|---|
| 7 | |
|---|
| 8 | # Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain} |
|---|
| 9 | |
|---|
| 10 | #************* Title of helpfile !! and start of real helpfile ******** |
|---|
| 11 | TITLE Calculate sequence quality |
|---|
| 12 | |
|---|
| 13 | OCCURRENCE ARB_NT/Sequence/Calculate sequence quality |
|---|
| 14 | |
|---|
| 15 | DESCRIPTION 'Calculate sequence quality' tries to measure the quality of sequences and |
|---|
| 16 | the quality of their alignment. |
|---|
| 17 | |
|---|
| 18 | HANDLING: |
|---|
| 19 | |
|---|
| 20 | Fill in the values you think are appropriate. |
|---|
| 21 | The default values are the values that worked best in the first test runs. |
|---|
| 22 | Many criteria are evaluated (see 'THE VALUES' below for details). |
|---|
| 23 | |
|---|
| 24 | A final "quality-value" (percentage) for each sequence is calculated |
|---|
| 25 | and all sequences below the given threshold may get marked. |
|---|
| 26 | |
|---|
| 27 | HOW IT WORKS: |
|---|
| 28 | |
|---|
| 29 | In the section "weights" you have quite a few options to fill in. |
|---|
| 30 | |
|---|
| 31 | These are some of the criteria used to evaluate the quality of the sequences. |
|---|
| 32 | Each value represents the share of its criterion in the final evaluation formula. |
|---|
| 33 | All values represent percentages, therefore all values together should sum up to 100. |
|---|
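The weighting described above can be sketched as follows. This is only an illustration, not ARB's actual formula: the function name `combine_scores`, the criterion names, the example weights and the assumption that each criterion yields a score between 0.0 and 1.0 are all hypothetical.

```python
def combine_scores(scores, weights):
    """Combine per-criterion scores (assumed 0.0..1.0) into a percentage.

    'weights' maps criterion name -> percentage share; the shares are
    expected to sum up to 100, as the help text asks.
    """
    total = sum(weights.values())
    if total != 100:
        raise ValueError(f"weights sum to {total}, expected 100")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical weights (not the ARB defaults), consensus weighted highest.
example_weights = {
    "bases": 5, "deviation": 15, "no_helices": 15,
    "consensus": 50, "iupac": 5, "gc": 10,
}
```

A sequence whose scores are all 1.0 would reach 100 percent under this sketch; sequences whose result falls below the chosen threshold would be the ones to mark.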
| 34 | |
|---|
| 35 | THE VALUES: |
|---|
| 36 | |
|---|
| 37 | Base analysis: |
|---|
| 38 | |
|---|
| 39 | This is the number of bases that are stored in the sequence. "-" and "." are |
|---|
| 40 | not counted. |
|---|
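As a minimal sketch of this count (the function name `count_bases` is hypothetical and not part of ARB), every character of the aligned sequence except "-" and "." is counted:

```python
def count_bases(aligned_seq):
    """Count the characters of an aligned sequence that are not '-' or '.'."""
    return sum(1 for ch in aligned_seq if ch not in "-.")
```

For example, `count_bases("..AC-GU--A.")` counts the five characters A, C, G, U and A.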
| 41 | |
|---|
| 42 | Deviation: |
|---|
| 43 | |
|---|
| 44 | This is the deviation of a sequence's number of bases from the average number |
|---|
| 45 | of bases in its group. |
|---|
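One possible reading of this criterion is sketched below. The help text does not say whether the deviation is absolute or relative; the sketch assumes a relative deviation, and the function name `base_count_deviation` is hypothetical.

```python
def base_count_deviation(aligned_seq, group):
    """Relative deviation of a sequence's base count from the group average.

    Assumption: deviation = |own count - group mean| / group mean.
    'group' is a non-empty list of aligned sequences.
    """
    def n_bases(seq):
        # '-' and '.' are not counted as bases
        return sum(1 for ch in seq if ch not in "-.")

    counts = [n_bases(seq) for seq in group]
    mean = sum(counts) / len(counts)
    if mean == 0:
        return 0.0  # group without any bases: nothing to deviate from
    return abs(n_bases(aligned_seq) - mean) / mean
```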
| 46 | |
|---|
| 47 | No Helices: |
|---|
| 48 | |
|---|
| 49 | This is the number of positions in a sequence where no helix structure can be built. |
|---|
| 50 | |
|---|
| 51 | Consensus: |
|---|
| 52 | |
|---|
| 53 | For each named group found in the tree (selected below) |
|---|
| 54 | a consensus sequence is calculated. |
|---|
| 55 | |
|---|
| 56 | Every species' sequence is compared against the consensus sequences |
|---|
| 57 | of all groups of which the species is a member. |
|---|
| 58 | |
|---|
| 59 | That comparison uses conformity with and deviation from the consensus sequence. |
|---|
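The idea can be illustrated with a simple majority consensus. This is a sketch under stated assumptions, not the method ARB implements: all sequences of a group are assumed to have the same aligned length, gaps are treated like any other character, and the names `consensus` and `conformity` are hypothetical.

```python
from collections import Counter


def consensus(group):
    """Majority consensus: the most frequent character in each alignment column."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*group)
    )


def conformity(aligned_seq, cons):
    """Fraction of columns in which the sequence agrees with the consensus."""
    matches = sum(a == b for a, b in zip(aligned_seq, cons))
    return matches / len(cons)
```

A species that belongs to several nested groups would be compared against the consensus of each of them; the deviation is then simply `1 - conformity(...)` in this sketch.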
| 60 | |
|---|
| 61 | # A consensus is computed from sequences in one group and then from subgroups to groups. |
|---|
| 62 | # So "multilevel" consensi are generated. |
|---|
| 63 | # The value consists of two analyses: Every sequence is tested against every level of the consensus. |
|---|
| 64 | # Conformity and deviation from the consensus are measured. |
|---|
| 65 | |
|---|
| 66 | IUPAC: |
|---|
| 67 | |
|---|
| 68 | This is the number of IUPAC codes stored in a sequence. |
|---|
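A minimal sketch of such a count is shown below. It assumes that "IUPAC codes" refers to the ambiguity codes (R, Y, S, W, K, M, B, D, H, V, N) and not to the plain bases A, C, G, T and U, which strictly speaking are IUPAC codes as well. The function name `count_iupac_codes` is hypothetical.

```python
# IUPAC nucleotide ambiguity codes (everything except A, C, G, T, U)
AMBIGUITY_CODES = set("RYSWKMBDHVN")


def count_iupac_codes(aligned_seq):
    """Count ambiguity codes in a sequence, ignoring upper/lower case."""
    return sum(1 for ch in aligned_seq.upper() if ch in AMBIGUITY_CODES)
```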
| 69 | |
|---|
| 70 | GC proportion: |
|---|
| 71 | |
|---|
| 72 | This is the deviation of a sequence's GC proportion from that of its group. |
|---|
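The following sketch shows one way to read this criterion. It assumes that the GC proportion is the share of G and C among all bases (gaps excluded) and that the group value is the mean over the group's sequences; neither detail is stated in this help text, and both function names are hypothetical.

```python
def gc_proportion(aligned_seq):
    """Share of G and C among all bases; '-' and '.' are ignored."""
    bases = [ch for ch in aligned_seq.upper() if ch not in "-."]
    if not bases:
        return 0.0
    return sum(1 for ch in bases if ch in "GC") / len(bases)


def gc_deviation(aligned_seq, group):
    """Absolute difference between a sequence's GC proportion and the group mean."""
    group_mean = sum(gc_proportion(seq) for seq in group) / len(group)
    return abs(gc_proportion(aligned_seq) - group_mean)
```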
| 73 | |
|---|
| 74 | NOTES Generally speaking, the consensus is the most powerful criterion for evaluating quality. Keep its |
|---|
| 75 | percentage high unless you know what you're doing or you want to evaluate with just one or |
|---|
| 76 | two criteria. |
|---|
| 77 | |
|---|
| 78 | Be aware that the computation is very complex and can easily take hours to finish. |
|---|
| 79 | If you don't see the status bar moving during the first ten minutes, it just means |
|---|
| 80 | that you are analyzing a huge database. |
|---|
| 81 | |
|---|
| 82 | EXAMPLES None |
|---|
| 83 | |
|---|
| 84 | WARNINGS None |
|---|
| 85 | |
|---|
| 86 | BUGS No bugs known |
|---|