| 1 | #Please insert up references in the next lines (line starts with keyword UP) |
|---|
| 2 | UP arb.hlp |
|---|
| 3 | UP glossary.hlp |
|---|
| 4 | |
|---|
| 5 | #Please insert subtopic references (line starts with keyword SUB) |
|---|
| 6 | #SUB subtopic.hlp |
|---|
| 7 | |
|---|
| 8 | # Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain} |
|---|
| 9 | |
|---|
| 10 | #************* Title of helpfile !! and start of real helpfile ******** |
|---|
| 11 | TITLE Graph Aligner |
|---|
| 12 | |
|---|
| 13 | OCCURRENCE ARB Editor -> Edit -> Prototypical Graph Aligner |
|---|
| 14 | |
|---|
| 15 | DESCRIPTION This is an alternative to the integrated aligners developed for the SILVA |
|---|
| 16 | project. Like those aligners, it uses aligned sequences from your |
|---|
| 17 | current database as a reference to align the selected sequences. Unlike |
|---|
| 18 | them, it employs full dynamic programming to create the alignment. It also |
|---|
| 19 | considers all selected relatives at once, instead of falling back to less |
|---|
| 20 | similar sequences only if the current sequence is missing bases (e.g. |
|---|
| 21 | because it is a partial sequence). |
|---|
| 22 | |
|---|
| 23 | SECTION OPTIONS |
|---|
| 24 | |
|---|
| 25 | Select the sequences to be aligned as usual ("Current Species", "Selected |
|---|
| 26 | Species", "Marked Species"). |
|---|
| 27 | |
|---|
| 28 | |
|---|
| 29 | Select a PT-Server to be used. Make sure it is up to date and contains all |
|---|
| 30 | sequences you want to be considered as reference. |
|---|
| 31 | |
|---|
| 32 | HINT: Unless you deselect the "Realign" button in the advanced menu, no |
|---|
| 33 | sequence will be used as a reference for itself. |
|---|
| 34 | |
|---|
| 35 | HINT: Sequences with fewer than 10 gaps are considered unaligned and |
|---|
| 36 | are not used as a reference. |
|---|
| 37 | |
|---|
| 38 | |
|---|
| 39 | Select a positional variability filter. If possible, use the filter |
|---|
| 40 | appropriate for the type of sequences you want aligned. Positional |
|---|
| 41 | variability statistics will be considered when placing the individual bases. |
|---|
| 42 | |
|---|
| 43 | |
|---|
| 44 | Decide what to do with possible overhang. If your sequence extends beyond the |
|---|
| 45 | reference sequences on either side of the alignment, those bases cannot be |
|---|
| 46 | aligned properly. Three ways of handling this situation are supported: |
|---|
| 47 | |
|---|
| 48 | "keep attached" |
|---|
| 49 | |
|---|
| 50 | Leave them dangling, directly attached to the last base that |
|---|
| 51 | could be aligned properly |
|---|
| 52 | |
|---|
| 53 | "move to edge" |
|---|
| 54 | |
|---|
| 55 | Move them out to the very beginning and end of |
|---|
| 56 | the alignment. This allows you to easily spot sequences |
|---|
| 57 | with overhang, and decide what to do yourself. Recommended, |
|---|
| 58 | but only if you check your sequences after alignment! |
|---|
| 59 | |
|---|
| 60 | "remove" |
|---|
| 61 | |
|---|
| 62 | Automatically remove these bases. |
|---|
| 63 | |
|---|
| 64 | |
|---|
| 65 | Select a protection level higher than that of the sequences if you want the |
|---|
| 66 | alignment software to actually modify the bases. Choose a lower protection |
|---|
| 67 | level to execute a "dry run", not changing anything. Note that sequences |
|---|
| 68 | with a protection level of zero will always be changed. |
|---|
| 69 | |
|---|
| 70 | |
|---|
| 71 | The Logging Level option allows you to change the noisiness of the alignment |
|---|
| 72 | program. All output will be printed to the console from which you started |
|---|
| 73 | ARB. The option "debug_graph" may produce several large files for every |
|---|
| 74 | sequence aligned and is not recommended for the uninitiated. |
|---|
| 75 | |
|---|
| 76 | SECTION TRICKS |
|---|
| 77 | |
|---|
| 78 | If you want to see how the alignment that would be produced by the graph |
|---|
| 79 | aligner differs from your current alignment, and why the program would |
|---|
| 80 | act that way, you can set the protection level to "0" and the Logging Level |
|---|
| 81 | to "debug". The output on the console will now include all differing sections |
|---|
| 82 | of the alignment and the matching parts of the reference sequences. |
|---|
| 83 | |
|---|
| 84 | SECTION ADVANCED OPTIONS |
|---|
| 85 | |
|---|
| 86 | Select the "Show advanced options" button at the top to gain access to |
|---|
| 87 | the you-may-now-shoot-yourself-in-the-foot-severely dialog window. |
|---|
| 88 | |
|---|
| 89 | Don't be surprised if the graph aligner crashes after you enter silly |
|---|
| 90 | values here. No sanity checks are performed on your options. |
|---|
| 91 | |
|---|
| 92 | |
|---|
| 93 | Turn check: |
|---|
| 94 | |
|---|
| 95 | If selected (default), sequences will be automatically reversed |
|---|
| 96 | and/or complemented if this will likely improve the alignment. |
|---|
| 97 | |
|---|
| 98 | |
|---|
| 99 | Realign: |
|---|
| 100 | |
|---|
| 101 | If selected, the sequence itself is excluded from the result of |
|---|
| 102 | the executed PT-Server family search. If deselected, the alignment |
|---|
| 103 | of an identical sequence found by the PT-Server is copied. |
|---|
| 104 | |
|---|
| 105 | |
|---|
| 106 | Load reference sequence from PT Server: |
|---|
| 107 | |
|---|
| 108 | Do not read alignment data from your current database, but from the |
|---|
| 109 | database the PT-Server was built from. This makes starting the |
|---|
| 110 | graph aligner much slower, but allows you to align against external |
|---|
| 111 | databases or PT-Servers with different sequence names than your |
|---|
| 112 | current database. |
|---|
| 113 | |
|---|
| 114 | |
|---|
| 115 | (Copy and) mark sequence used as reference: |
|---|
| 116 | |
|---|
| 117 | Mark the sequences that were used as a reference during alignment. |
|---|
| 118 | This allows you to easily load them into the editor to review the |
|---|
| 119 | decisions made by the graph aligner. |
|---|
| 120 | If you also selected the "Load reference" option, sequences will be |
|---|
| 121 | copied into your current database prior to being marked. |
|---|
| 122 | |
|---|
| 123 | |
|---|
| 124 | Gap insertion/extension penalties: (default is 5/2) |
|---|
| 125 | |
|---|
| 126 | You can change the penalties associated with opening and extending |
|---|
| 127 | gaps. |
|---|
| 128 | |
|---|
| 129 | |
|---|
| 130 | Family search min/min_score/max: (default 15,0.7,40) |
|---|
| 131 | |
|---|
| 132 | The first value tells the graph aligner how many sequences it should |
|---|
| 133 | always try to use. The second value determines the minimal identity |
|---|
| 134 | with the target sequence that additional reference sequences must have. |
|---|
| 135 | The third value selects the maximal number of sequences to be used |
|---|
| 136 | as a reference. |
|---|
| 137 | |
|---|
| 138 | |
|---|
| 139 | Use at least X sequences with at least Y bases: (SSU-default is 1, 1400) |
|---|
| 140 | |
|---|
| 141 | This option allows you to require that the reference include at least X |
|---|
| 142 | sequences with a length of at least Y bases. |
|---|
| 143 | |
|---|
| 144 | |
|---|
| 145 | Aligner threads / Queue size: |
|---|
| 146 | |
|---|
| 147 | Up to 4 threads can be used to align simultaneously. If your workstation |
|---|
| 148 | sports multiple CPUs, this will speed up alignment of many sequences. |
|---|
| 149 | Increase the queue size (the buffer between the graph aligner components) to |
|---|
| 150 | about 15 when using 4 threads. |
|---|
| 151 | |
|---|
| 152 | |
|---|
| 153 | WARNINGS None |
|---|
| 154 | |
|---|
| 155 | BUGS No bugs known |
|---|