| 1 | #Please insert up references in the next lines (line starts with keyword UP) |
|---|
| 2 | UP arb.hlp |
|---|
| 3 | UP glossary.hlp |
|---|
| 4 | UP arb_edit4.hlp |
|---|
| 5 | |
|---|
| 6 | #Please insert subtopic references (line starts with keyword SUB) |
|---|
| 7 | SUB islandhopping.hlp |
|---|
| 8 | |
|---|
| 9 | # Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain} |
|---|
| 10 | |
|---|
| 11 | #************* Title of helpfile !! and start of real helpfile ******** |
|---|
| 12 | TITLE The integrated aligners |
|---|
| 13 | |
|---|
| 14 | OCCURRENCE ARB_EDIT4/Edit/Integrated Aligners |
|---|
| 15 | |
|---|
| 16 | DESCRIPTION Currently there are two integrated aligners: |
|---|
| 17 | |
|---|
| 18 | 1. Fast Aligner |
|---|
| 19 | |
|---|
| 20 | 2. Island Hopper (see Subtopic) |
|---|
| 21 | |
|---|
| 22 | The following adjustments and features should apply to both aligners. |
|---|
| 23 | |
|---|
| 24 | Not everything has been tested with Island Hopper yet, so some of them may be broken. |
|---|
| 25 | Please report any problems to LINK{devel@arb-home.de}. |
|---|
| 26 | |
|---|
| 27 | |
|---|
| 28 | SECTION ADJUSTMENTS |
|---|
| 29 | |
|---|
| 30 | Align |
|---|
| 31 | |
|---|
| 32 | Align current, marked or selected sequences. |
|---|
| 33 | |
|---|
| 34 | If you press 'CTRL-A' in the main editor window, |
|---|
| 35 | this option is set to align the current species and |
|---|
| 36 | the aligner is called. |
|---|
| 37 | |
|---|
| 38 | Reference |
|---|
| 39 | |
|---|
| 40 | The aligner needs a sequence as reference. |
|---|
| 41 | You can use either |
|---|
| 42 | |
|---|
| 43 | * a fixed species selected by name, |
|---|
| 44 | * the consensus of the group containing the aligned species or |
|---|
| 45 | * the next relative(s) found by the selected PT-Server. |
|---|
| 46 | |
|---|
| 47 | If you choose 'Species by name', you may press the 'COPY' button |
|---|
| 48 | to copy the name of the 'Current Species' to the 'Reference' species. |
|---|
| 49 | Alternatively, you may press CTRL-R while the focus is inside the sequence view |
|---|
| 50 | (Note: CTRL-R does not work if LINK{viewdiff.hlp} is active). |
|---|
| 51 | |
|---|
| 52 | If you choose 'Auto search by pt_server', the |
|---|
| 53 | aligner will use the next relative(s) as reference. |
|---|
| 54 | |
|---|
| 55 | * Please read the section 'Protein alignment with pt_server' below. |
|---|
| 56 | |
|---|
| 57 | * If the nearest relative has gaps where the sequence |
|---|
| 58 | to align has bases, the aligner will use the 2nd nearest |
|---|
| 59 | relative or, if that one has gaps too, the 3rd nearest, etc. |
|---|
| 60 | You can define the maximum number of relatives considered. |
|---|
| 61 | |
|---|
| 62 | * All used relatives and the number of base positions used from each relative |
|---|
| 63 | will be written into the field 'used_rels' (see also LINK{markbyref.hlp}). |
|---|
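The fallback to further relatives can be pictured with a small Python sketch. It is a simplified model, not the aligner's actual code: it assumes the relatives are already aligned strings of equal length, ordered nearest first, and it simply fills gap columns of the nearest relative with bases from the next ones. The function name `merge_relatives` and the gap characters are hypothetical, and the real content format of 'used_rels' is not shown here.

```python
GAPS = "-."  # assumed gap characters


def merge_relatives(relatives, max_relatives):
    """Build a reference from several relatives (simplified model).

    relatives: list of (name, aligned_sequence), nearest relative first.
    Returns the merged reference and the number of base positions
    taken from each relative that was considered.
    """
    used = relatives[:max_relatives]
    if not used:
        raise ValueError("at least one relative is required")
    length = len(used[0][1])
    counts = {name: 0 for name, _ in used}
    merged = []
    for col in range(length):
        for name, seq in used:
            if seq[col] not in GAPS:
                # first relative with a base in this column wins
                merged.append(seq[col])
                counts[name] += 1
                break
        else:
            # no considered relative has a base in this column
            merged.append("-")
    return "".join(merged), counts


reference, used_counts = merge_relatives(
    [("rel1", "AC--GU"), ("rel2", "ACGA-U"), ("rel3", "ACGAGU")],
    max_relatives=2,
)
```

With a maximum of two relatives, the third relative is never consulted, which corresponds to the maximum number of relatives considered.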
| 64 | |
|---|
| 65 | If you enter '0' in 'Data from range only, plus', relative |
|---|
| 66 | search only uses data from the aligned range. If you enter |
|---|
| 67 | a value different from '0', the used range will be |
|---|
| 68 | expanded (positive values) or limited (negative values). |
|---|
| 69 | When the input field is empty, the complete sequence will be used. |
|---|
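The three cases of 'Data from range only, plus' can be sketched in Python as follows. This is only an illustration under assumptions: columns are taken as 1-based, the value is assumed to apply to both ends of the aligned range, and the function name `search_range` is hypothetical.

```python
def search_range(aligned_range, seq_len, plus_field):
    """Return the (start, end) columns used for relative search.

    aligned_range: (start, end) of the aligned range, 1-based inclusive.
    plus_field: content of the input field as text ('' means empty).
    """
    if plus_field.strip() == "":
        # empty field: the complete sequence is used
        return (1, seq_len)
    plus = int(plus_field)
    start, end = aligned_range
    # positive values expand the range, negative values limit it
    # (assumption: applied symmetrically to both ends)
    start = max(1, start - plus)
    end = min(seq_len, end + plus)
    if start > end:
        raise ValueError("range is empty after limiting")
    return (start, end)
```

For an aligned range of 100-200 in a sequence of 1500 columns, '0' would give 100-200, '10' would give 90-210, '-10' would give 110-190 and an empty field would give 1-1500.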
| 70 | |
|---|
| 71 | Press 'More settings' to define how relative search works |
|---|
| 72 | in detail. See LINK{next_neighbours_common.hlp}. |
|---|
| 73 | |
|---|
| 74 | Range |
|---|
| 75 | |
|---|
| 76 | Align only a part of or the whole sequence. |
|---|
| 77 | |
|---|
| 78 | Several possibilities exist for aligning just a part of the sequence: |
|---|
| 79 | - select 'Positions around cursor' and specify how many positions to |
|---|
| 80 | include in each direction from the cursor position (Example: if you align |
|---|
| 81 | 10 columns around position 100, columns 90-110 will be aligned). |
|---|
| 82 | - if you use 'Selected range' the column range of the selected block will |
|---|
| 83 | be used. |
|---|
| 84 | - if you select 'Multi-Range by SAI', the specified SAI will be interpreted as |
|---|
| 85 | a list of ranges. A list of characters defines what is considered a range. |
|---|
| 86 | All ranges will be aligned. |
|---|
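The two range calculations can be illustrated with a short Python sketch. It is not taken from the ARB sources: the function names `around_cursor` and `sai_ranges` are hypothetical, and columns are assumed to be 1-based and inclusive.

```python
def around_cursor(cursor, width, seq_len):
    """Columns covered by 'Positions around cursor' (clamped to the sequence)."""
    return (max(1, cursor - width), min(seq_len, cursor + width))


def sai_ranges(sai, range_chars):
    """Interpret a SAI as a list of ranges.

    Every maximal run of columns whose SAI character is contained in
    range_chars counts as one range (1-based, inclusive).
    """
    ranges = []
    start = None
    for col, ch in enumerate(sai, start=1):
        if ch in range_chars:
            if start is None:
                start = col  # a new range begins here
        elif start is not None:
            ranges.append((start, col - 1))  # the range ended one column earlier
            start = None
    if start is not None:
        ranges.append((start, len(sai)))  # range runs to the end of the SAI
    return ranges


cursor_range = around_cursor(100, 10, 1500)
multi_ranges = sai_ranges("..xx..xxx.", "x")
```

Here `cursor_range` reproduces the example from the text (columns 90-110), and `multi_ranges` holds the two ranges 3-4 and 7-9.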
| 87 | |
|---|
| 88 | See also LINK{e4_modsai.hlp} for how to create suitable SAIs. |
|---|
| 89 | |
|---|
| 90 | Turn check |
|---|
| 91 | |
|---|
| 92 | The aligner is able to detect sequences which |
|---|
| 93 | were entered in the wrong direction. With this |
|---|
| 94 | switch you can select whether the aligner should |
|---|
| 95 | turn such sequences and whether it should ask you first. |
|---|
| 96 | |
|---|
| 97 | NOTE: Turn checking is not reasonable in two |
|---|
| 98 | cases: |
|---|
| 99 | |
|---|
| 100 | if you align only a part of a sequence, or if you |
|---|
| 101 | do not search the reference via pt_server. In both |
|---|
| 102 | cases turn checking will be disabled. |
|---|
| 103 | |
|---|
| 104 | Report |
|---|
| 105 | |
|---|
| 106 | The aligner can generate reports for the aligned |
|---|
| 107 | sequence and for the reference sequence. These |
|---|
| 108 | reports can be viewed in EDIT4 if you choose |
|---|
| 109 | File/Load Configuration/DEFAULT_CONFIGURATION. |
|---|
| 110 | |
|---|
| 111 | The report for the reference sequence (AMI) |
|---|
| 112 | contains a '>' for every position where the aligner |
|---|
| 113 | needed an insert in the reference sequence. |
|---|
| 114 | |
|---|
| 115 | The report for the aligned sequence (ASC) contains |
|---|
| 116 | the following characters: |
|---|
| 117 | |
|---|
| 118 | '-' for matching positions |
|---|
| 119 | |
|---|
| 120 | '+' for inserts (in aligned sequence and in reference sequence) |
|---|
| 121 | |
|---|
| 122 | '~' for matching, but not equal bases (A aligned to G, C aligned to T or U) |
|---|
| 123 | |
|---|
| 124 | '#' for mismatching positions |
|---|
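How the ASC report characters relate to a pair of aligned sequences can be sketched in Python. This is an illustration only, not the aligner's implementation: the function names are hypothetical, '-' and '.' are assumed to be the gap characters, T and U are assumed to count as equal, and columns where both sequences have a gap are assumed to be reported as matching.

```python
GAPS = "-."  # assumed gap characters
# "matching, but not equal" pairs from the text: A/G and C/T (U is treated as T)
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def asc_char(aligned, reference):
    """Report character for one column of the aligned sequence."""
    a = aligned.upper().replace("U", "T")
    r = reference.upper().replace("U", "T")
    a_gap, r_gap = a in GAPS, r in GAPS
    if a == r or (a_gap and r_gap):
        return "-"  # matching position
    if a_gap or r_gap:
        return "+"  # insert in one of the two sequences
    if frozenset((a, r)) in TRANSITIONS:
        return "~"  # matching, but not equal bases
    return "#"      # mismatching position


def asc_report(aligned, reference):
    """Build the report line for two sequences of equal length."""
    if len(aligned) != len(reference):
        raise ValueError("sequences must have the same length")
    return "".join(asc_char(a, r) for a, r in zip(aligned, reference))


report = asc_report("ACGU-AA", "ATGCGAC")
```

For these two sequences the sketch yields the report line `-~-~+-#`.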
| 125 | |
|---|
| 126 | SECTION Protein alignment with pt_server |
|---|
| 127 | |
|---|
| 128 | If you want to align protein sequences and use a PT-Server (to detect |
|---|
| 129 | the next relative for each sequence), |
|---|
| 130 | you need to |
|---|
| 131 | |
|---|
| 132 | * have two alignments in your database (a protein alignment and a |
|---|
| 133 | corresponding DNA alignment). ARB has functions to synchronize these |
|---|
| 134 | alignments (see LINK{aaali.hlp}), |
|---|
| 135 | * build a pt_server based on the DNA-alignment, |
|---|
| 136 | select that pt_server in the aligner window and |
|---|
| 137 | * specify the name of the DNA-alignment in the 'Alignment' field. |
|---|
| 138 | |
|---|
| 139 | NOTES This aligner knows about and uses all extended base characters |
|---|
| 140 | (ACGTUMRWSYKVHDN) for the alignment. |
|---|
| 141 | In other words: M aligned to R costs no penalty. |
|---|
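The statement about M and R can be illustrated with the standard IUPAC meanings of the listed characters. The rule used below (two characters cost no penalty if their base sets share at least one base) is an assumption made for this sketch, not a description of the aligner's scoring, and the function name `compatible` is hypothetical.

```python
# Standard IUPAC meanings of the characters listed in the text (U treated as T).
# The IUPAC code B (C, G or T) does not appear in that list and is left out here.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "N": "ACGT",
}


def compatible(a, b):
    """True if both characters can stand for at least one common base."""
    return bool(set(IUPAC[a.upper()]) & set(IUPAC[b.upper()]))


m_vs_r = compatible("M", "R")  # M = A or C, R = A or G: both may be A
m_vs_k = compatible("M", "K")  # M = A or C, K = G or T: no common base
```

Under this assumption `m_vs_r` is true, in line with the note that M aligned to R costs no penalty, while `m_vs_k` is false.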
| 142 | |
|---|
| 143 | The config-manager icon handles the settings in the 'Integrated Aligners' window and those in |
|---|
| 144 | its subwindows 'Parameters for Island Hopping' and 'Family search parameters'. |
|---|
| 145 | |
|---|
| 146 | EXAMPLES None |
|---|
| 147 | |
|---|
| 148 | WARNINGS None |
|---|
| 149 | |
|---|
| 150 | BUGS If you select the menu entry 'remove all aligner entries', ARB_EDIT4 |
|---|
| 151 | crashes in most cases. |
|---|
| 152 | |
|---|
| 153 | Workaround: |
|---|
| 154 | |
|---|
| 155 | 1. Close all groups containing species with aligner entries, so that no aligner entries are visible. |
|---|
| 156 | 2. Remove all aligner entries |
|---|
| 157 | 3. Reload configuration |
|---|
| 158 | |
|---|