| 1 | # main topics: |
|---|
| 2 | UP arb.hlp |
|---|
| 3 | UP glossary.hlp |
|---|
| 4 | |
|---|
| 5 | # sub topics: |
|---|
| 6 | SUB exec_bug.hlp |
|---|
| 7 | |
|---|
| 8 | # format described in ../help.readme |
|---|
| 9 | |
|---|
| 10 | |
|---|
| 11 | |
|---|
| 12 | TITLE ARB Command Interpreter (ACI) |
|---|
| 13 | |
|---|
| 14 | OCCURRENCE NDS |
|---|
| 15 | [ export db ] |
|---|
| 16 | [ ARB_NT/Species/search/parse_fields ] |
|---|
| 17 | |
|---|
| 18 | DESCRIPTION ACI is a simple command interpreter that uses streams of data as its central concept. |
|---|
| 19 | |
|---|
| 20 | Many ACI commands take parameters, which are specified in |
|---|
| 21 | parentheses after the command. |
|---|
| 22 | |
|---|
| 23 | All ACI commands |
|---|
| 24 | * take the data from (one or multiple) input streams, |
|---|
| 25 | * modify that data and |
|---|
| 26 | * write that data to (one or multiple) output streams. |
|---|
| 27 | |
|---|
| 28 | e.g. the command 'count("a")' counts every 'a' for each input stream and |
|---|
| 29 | generates one output stream (containing the char count) for every input stream. |
|---|
| 30 | |
|---|
| 31 | The first input stream always is a single stream, |
|---|
| 32 | often the value of a database field (e.g. when ACI is used in LINK{props_nds.hlp}). |
|---|
| 33 | |
|---|
| 34 | The number of output streams depends on the command used: |
|---|
| 35 | * most commands produce one output stream for each input stream (as in the count example above) |
|---|
| 36 | * some commands combine two input streams into one output stream (e.g. see binary operators below) |
|---|
| 37 | * some commands ignore all input streams and create one output stream (e.g. 'readdb(fieldname)') |
|---|
| 38 | * Note: special stream related commands are documented in section 'STREAM HANDLING' |
|---|
| 39 | |
|---|
| 40 | Multiple commands can be separated by two operator symbols: ';' and '|'. |
|---|
| 41 | * ';' binds stronger than '|' |
|---|
| 42 | * commands separated by ';' form a command-list and operate independently from each other: |
|---|
| 43 | - all(!) commands use all(!) input streams |
|---|
| 44 | - each command generates its own output streams |
|---|
| 45 | * the '|' operator acts as processing sequence point, i.e. |
|---|
| 46 | - all output streams generated by the command-list on the left side of the '|' will be passed |
|---|
| 47 | as input streams to the command-list on the right side of the '|'. |
|---|
| 48 | |
|---|
| 49 | Finally (at the end of the overall ACI expression) all generated output streams get concatenated. |
|---|
| 50 | |
|---|
| 51 | Typical uses are to |
|---|
| 52 | * show text at the tips of the tree (LINK{props_nds.hlp}) |
|---|
| 53 | * write information into database fields (LINK{mod_field_list.hlp}) |
|---|
| 54 | |
|---|
| 55 | Instead of using ACI commands (as described in this document) you may always use any of the other |
|---|
| 56 | integrated data processing languages. Simply prefix the command |
|---|
| 57 | |
|---|
| 58 | * with a ':' to use LINK{srt.hlp} |
|---|
| 59 | * with a '/' to use LINK{reg.hlp} |
|---|
| 60 | |
|---|
| 61 | Both are also available inside ACI via the commands 'srt' and 'command', see below. |
|---|
| 62 | |
|---|
| 63 | SECTION Examples |
|---|
| 64 | |
|---|
| 65 | # PREFORMATTED 1 |
|---|
| 66 | count("A");count("AG") |
|---|
| 67 | |
|---|
| 68 | creates two streams: |
|---|
| 69 | |
|---|
| 70 | 1. how many A's |
|---|
| 71 | 2. and how many A's and G's |
|---|
| 72 | |
|---|
| 73 | # PREFORMATTED 1 |
|---|
| 74 | count("A");count("G")|per_cent |
|---|
| 75 | |
|---|
| 76 | per_cent is a command that divides two numbers |
|---|
| 77 | (number of 'A's / number of 'G's) and returns the result |
|---|
| 78 | as percent. |
|---|
| 79 | |
|---|
| 80 | SECTION Simple example to illustrate the data flow |
|---|
| 81 | |
|---|
| 82 | # PREFORMATTED 1 |
|---|
| 83 | count("A");count("G")|"a/g = "; per_cent |
|---|
| 84 | |
|---|
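| 85 | Data flow for the input "AGG": |
|---|
| 86 | 1. "AGG" is passed to both count("A") and count("G"); they produce the streams "1" and "2". |
|---|
| 87 | 2. The '|' passes both streams as input to every command of the right command-list. |
|---|
| 88 | 3. "a/g = " produces the stream "a/g = "; per_cent produces the stream "50". |
|---|
| 89 | 4. Both output streams are concatenated. |
|---|
| 90 | 5. Output: 'a/g = 50' |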
|---|
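The data flow of ';' and '|' can be modelled in a few lines of Python. This is only a sketch of the semantics described above, not ARB's implementation: the names `run`, `count`, `echo` and `per_cent` are hypothetical Python helpers, an expression is written as a list of command-lists (one per '|' stage), and per_cent is assumed to use integer arithmetic.

```python
from typing import Callable, List

Streams = List[str]
Command = Callable[[Streams], Streams]

def count(chars: str) -> Command:
    """Model of count("..."): one output stream per input stream."""
    wanted = set(chars)
    return lambda streams: [str(sum(1 for c in s if c in wanted)) for s in streams]

def echo(*texts: str) -> Command:
    """Model of echo(...) / "text": ignores input, one stream per parameter."""
    return lambda streams: list(texts)

def per_cent(streams: Streams) -> Streams:
    """Model of per_cent without arguments: pairs up streams, returns a/b*100.
    Assumption: integer arithmetic; yields 0 if the second value is zero."""
    out = []
    for a, b in zip(streams[0::2], streams[1::2]):
        x, y = int(a), int(b)
        out.append(str(x * 100 // y if y else 0))
    return out

def run(expression: List[List[Command]], first_input: str) -> str:
    streams = [first_input]                # the first input is a single stream
    for command_list in expression:        # stages separated by '|'
        outputs: Streams = []
        for command in command_list:       # commands separated by ';'
            outputs.extend(command(streams))   # every command sees all streams
        streams = outputs
    return "".join(streams)                # final concatenation

# count("A");count("G")|"a/g = ";per_cent  applied to "AGG"
result = run([[count("A"), count("G")], [echo("a/g = "), per_cent]], "AGG")
print(result)  # a/g = 50
```

Each inner list corresponds to one command-list; moving to the next inner list corresponds to crossing a '|'.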
| 91 | |
|---|
| 92 | |
|---|
| 93 | SECTION PARAMETERS |
|---|
| 94 | |
|---|
| 95 | Several commands expect or accept additional parameters in |
|---|
| 96 | parentheses (e.g. 'remove(aA)'). |
|---|
| 97 | |
|---|
| 98 | Multiple parameters have to be separated by ',' or ';'. |
|---|
| 99 | |
|---|
| 100 | There are two distinct ways to specify such a parameter: |
|---|
| 101 | - unquoted |
|---|
| 102 | |
|---|
| 103 | Unquoted parameters are taken as specified, with the following exceptions: |
|---|
| 104 | - any character in ',;"|\)' needs to be escaped by prefixing one '\'. |
|---|
| 105 | - spaces will get removed if not prefixed by '\'. |
|---|
| 106 | |
|---|
| 107 | - quoted |
|---|
| 108 | |
|---|
| 109 | Quoted parameters begin and end with a '"'. You can use any character, |
|---|
| 110 | but you need to escape '\' and '"' by prefixing a '\'. |
|---|
| 111 | |
|---|
| 112 | Examples: |
|---|
| 113 | |
|---|
| 114 | remove("\"") will remove all double quotes from input. |
|---|
| 115 | remove("\\") will remove all backslashes from input. |
|---|
| 116 | |
|---|
| 117 | [@@@ behavior currently not strictly implemented] |
|---|
| 118 | |
|---|
| 119 | SECTION COMMANDLIST |
|---|
| 120 | |
|---|
| 121 | If not explicitly mentioned, every command |
|---|
| 122 | creates one output stream for each input stream. |
|---|
| 123 | |
|---|
| 124 | STREAM HANDLING |
|---|
| 125 | |
|---|
| 126 | echo(x1;x2;x3...) creates one output stream from each specified parameter |
|---|
| 127 | (parameters are separated by ';'). |
|---|
| 128 | |
|---|
| 129 | "text" same as 'echo("text")' |
|---|
| 130 | |
|---|
| 131 | dd copies all input streams to output streams |
|---|
| 132 | |
|---|
| 133 | cut(N1,N2,N3) copies the Nth input stream(s) |
|---|
| 134 | |
|---|
| 135 | drop(N1,N2) copies all but the Nth input stream(s) |
|---|
| 136 | |
|---|
| 137 | dropempty drops all empty input streams |
|---|
| 138 | |
|---|
| 139 | dropzero drops all non-numeric or zero input streams |
|---|
| 140 | |
|---|
| 141 | swap(N1,N2) swaps two input streams |
|---|
| 142 | (w/o parameters: swaps last two streams) |
|---|
| 143 | |
|---|
| 144 | toback(X) moves the Xth input stream |
|---|
| 145 | to the end of output streams |
|---|
| 146 | |
|---|
| 147 | tofront(X) moves the Xth input stream |
|---|
| 148 | to the start of output streams |
|---|
| 149 | |
|---|
| 150 | merge([sep]) merges all input streams into one output stream. |
|---|
| 151 | If 'sep' is specified, it's inserted between them. |
|---|
| 152 | If no input streams are given, it returns 1 empty |
|---|
| 153 | output stream. |
|---|
| 154 | |
|---|
| 155 | split([sep[,mode]]) splits all input streams at separator string 'sep' |
|---|
| 156 | (default: split at linefeed). |
|---|
| 157 | |
|---|
| 158 | Modes: |
|---|
| 159 | |
|---|
| 160 | 0 remove found separators (default) |
|---|
| 161 | 1 split before separator |
|---|
| 162 | 2 split after separator |
|---|
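The three split modes can be illustrated with a small Python sketch. It shows one plausible reading of the description above and is not taken from ARB's source; the function name `split_stream` is hypothetical, a non-empty separator is assumed, and how ARB treats empty pieces is not covered.

```python
from typing import List

def split_stream(text: str, sep: str = "\n", mode: int = 0) -> List[str]:
    """Split one input stream into several output streams.

    mode 0: separators are removed
    mode 1: split before each separator (separator starts the next piece)
    mode 2: split after each separator (separator ends the previous piece)
    """
    parts = text.split(sep)
    if mode == 0:
        return parts
    if mode == 1:
        return [parts[0]] + [sep + p for p in parts[1:]]
    if mode == 2:
        return [p + sep for p in parts[:-1]] + [parts[-1]]
    raise ValueError("mode must be 0, 1 or 2")

print(split_stream("a,b,c", ",", 0))  # ['a', 'b', 'c']
print(split_stream("a,b,c", ",", 1))  # ['a', ',b', ',c']
print(split_stream("a,b,c", ",", 2))  # ['a,', 'b,', 'c']
```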
| 163 | |
|---|
| 164 | colsplit([width]) splits each input stream into multiple streams of the |
|---|
| 165 | specified width (or shorter for last output stream). |
|---|
| 166 | The default width is 1. |
|---|
| 167 | |
|---|
| 168 | streams returns the number of input streams |
|---|
| 169 | |
|---|
| 170 | STRING |
|---|
| 171 | |
|---|
| 172 | head(n) the first n characters |
|---|
| 173 | left(n) the first n characters |
|---|
| 174 | |
|---|
| 175 | tail(n) the last n characters |
|---|
| 176 | right(n) the last n characters |
|---|
| 177 | |
|---|
| 178 | the above functions return an empty string for n<=0 |
|---|
| 179 | |
|---|
| 180 | len the length of the input |
|---|
| 181 | |
|---|
| 182 | len("chr") the length of the input excluding the |
|---|
| 183 | characters in 'chr' |
|---|
| 184 | |
|---|
| 185 | mid(x,y) the substring from position x to y |
|---|
| 186 | |
|---|
| 187 | Allowed positions are |
|---|
| 188 | - [1..N] for mid() |
|---|
| 189 | - [0..N-1] for mid0() |
|---|
| 190 | |
|---|
| 191 | A position below that range is relative to the end of the string, |
|---|
| 192 | i.e. mid(-2,0) and mid0(-3,-1) are equivalent to tail(3) |
|---|
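The position rules of mid() and mid0() can be sketched in Python as follows. This is only a model of the rules stated above (1-based vs. 0-based positions, inclusive end, positions below the range counted from the end of the string). The Python functions are hypothetical helpers, and positions beyond the end of the string are simply clipped here, which is an assumption.

```python
def mid0(text: str, start: int, end: int) -> str:
    """0-based, inclusive substring; negative positions count from the end."""
    n = len(text)
    if start < 0:
        start += n
    if end < 0:
        end += n
    return text[max(start, 0):max(end + 1, 0)]

def mid(text: str, start: int, end: int) -> str:
    """1-based variant: positions <= 0 count from the end (0 is the last character)."""
    return mid0(text, start - 1, end - 1)

seq = "ABCDEFGH"
print(mid(seq, 2, 4))     # BCD
print(mid(seq, -2, 0))    # FGH  (same as tail(3))
print(mid0(seq, -3, -1))  # FGH  (same as tail(3))
```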
| 193 | |
|---|
| 194 | crop("str") removes characters of 'str' from |
|---|
| 195 | both ends of the input |
|---|
| 196 | |
|---|
| 197 | remove("str") removes all characters of 'str' |
|---|
| 198 | e.g. remove(" ") removes all blanks |
|---|
| 199 | |
|---|
| 200 | keep("str") keep is the opposite of remove: |
|---|
| 201 | remove all chars that are not a member of 'str' |
|---|
| 202 | |
|---|
| 203 | isEmpty return '1' for each empty input stream, '0' for others |
|---|
| 204 | |
|---|
| 205 | srt("orig=dest",...) replace command, invokes SRT |
|---|
| 206 | (see LINK{srt.hlp}) |
|---|
| 207 | |
|---|
| 208 | # PREFORMATTED 1 |
|---|
| 209 | translate("old","new"[,"other"]) |
|---|
| 210 | |
|---|
| 211 | translates all characters from input that occur in the |
|---|
| 212 | first argument ("old") by the corresponding character of the |
|---|
| 213 | second argument ("new"). |
|---|
| 214 | |
|---|
| 215 | An optional third argument (one character only) means: |
|---|
| 216 | replace all other characters with the third argument. |
|---|
| 217 | |
|---|
| 218 | Example: |
|---|
| 219 | |
|---|
| 220 | Input: "--AabBCcxXy--" |
|---|
| 221 | translate("abc-","xyz-") "--AxyBCzxXy--" |
|---|
| 222 | translate("abc-","xyz-",".") "--.xy..z...--" |
|---|
| 223 | |
|---|
| 224 | This can be used to replace illegal characters in sequence data |
|---|
| 225 | (see predefined expressions in 'Modify fields of listed species'). |
|---|
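The behavior of translate can be reproduced with a short Python sketch. It models only what is described above and checks itself against the documented example. The Python function is a hypothetical helper, and treating characters of "old" that have no counterpart in "new" as "other" characters is an assumption.

```python
from typing import Optional

def translate(text: str, old: str, new: str, other: Optional[str] = None) -> str:
    """Replace each character found in 'old' by the character at the same
    position in 'new'. If 'other' is given, every remaining character is
    replaced by 'other'; otherwise it is copied unchanged."""
    table = dict(zip(old, new))
    out = []
    for ch in text:
        if ch in table:
            out.append(table[ch])
        elif other is not None:
            out.append(other)
        else:
            out.append(ch)
    return "".join(out)

data = "--AabBCcxXy--"
print(translate(data, "abc-", "xyz-"))       # --AxyBCzxXy--
print(translate(data, "abc-", "xyz-", "."))  # --.xy..z...--
```

Mapping '-' to itself, as in the example, protects gap characters from being replaced by the third argument.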
| 226 | |
|---|
| 227 | |
|---|
| 228 | tab(n) append n-len(input) spaces |
|---|
| 229 | |
|---|
| 230 | pretab(n) prepend n-len(input) spaces |
|---|
| 231 | |
|---|
| 232 | upper converts string to upper case |
|---|
| 233 | lower converts string to lower case |
|---|
| 234 | caps capitalizes string |
|---|
| 235 | |
|---|
| 236 | # PREFORMATTED 1 |
|---|
| 237 | format(options) |
|---|
| 238 | |
|---|
| 239 | takes a long string and breaks it into several lines |
|---|
| 240 | |
|---|
| 241 | option (default) description |
|---|
| 242 | ========================================================== |
|---|
| 243 | width=# (50) line width |
|---|
| 244 | firsttab=# (10) first line left indent |
|---|
| 245 | tab=# (10) left indent (not first line) |
|---|
| 246 | "nl=chrs" (" ") list of characters that specify |
|---|
| 247 | a possible point for a line break; |
|---|
| 248 | the line break characters get removed! |
|---|
| 249 | "forcenl=chrs" ("\n") Force a newline at these characters. |
|---|
| 250 | |
|---|
| 251 | (see also format_sequence below) |
|---|
| 252 | |
|---|
| 253 | # PREFORMATTED 1 |
|---|
| 254 | extract_words("chars",val) |
|---|
| 255 | |
|---|
| 256 | Search for all words (separated by ',' ';' ':' ' ' or 'tab') that |
|---|
| 257 | contain more characters of type chars than val, sort them |
|---|
| 258 | alphabetically and write them separated by ' ' to the output |
|---|
| 259 | |
|---|
| 260 | ESCAPING AND QUOTING |
|---|
| 261 | |
|---|
| 262 | escape escapes all occurrences of '\' and '"' by prefixing a '\' |
|---|
| 263 | quote quotes the input in '"' |
|---|
| 264 | |
|---|
| 265 | unescape inverse of escape |
|---|
| 266 | unquote removes quotes (if present); otherwise returns the input |
|---|
| 267 | |
|---|
| 268 | |
|---|
| 269 | STRING COMPARISON |
|---|
| 270 | |
|---|
| 271 | compare(a,b) return -1 if a<b, 0 if a=b, 1 if a>b |
|---|
| 272 | equals(a,b) return 1 if a=b, 0 otherwise |
|---|
| 273 | contains(a,b) if a contains b, this returns the position of |
|---|
| 274 | b inside a (1..N) and 0 otherwise. |
|---|
| 275 | Always returns 0 if b is empty. |
|---|
| 276 | partof(a,b) if a is part of b, this returns the position of |
|---|
| 277 | a inside b (1..N) and 0 otherwise. |
|---|
| 278 | |
|---|
| 279 | For each of these functions a case-insensitive alternative |
|---|
| 280 | exists (icompare, iequals, ...). |
|---|
| 281 | |
|---|
| 282 | Note: all these functions may be used as binary operators |
|---|
| 283 | (see section 'BINARY OPERATORS' below for concept). |
|---|
| 284 | |
|---|
| 285 | |
|---|
| 286 | NUMERIC COMPARISON |
|---|
| 287 | |
|---|
| 288 | All functions here operate with floating-point numbers. |
|---|
| 289 | |
|---|
| 290 | isBelow(a,b) return 1 if a<b, 0 otherwise |
|---|
| 291 | isAbove(a,b) return 1 if a>b, 0 otherwise |
|---|
| 292 | isEqual(a,b) return 1 if a=b, 0 otherwise |
|---|
| 293 | |
|---|
| 294 | Note: all functions above may be used as binary operators |
|---|
| 295 | (see section 'BINARY OPERATORS' below for concept). |
|---|
| 296 | |
|---|
| 297 | # PREFORMATTED 1 |
|---|
| 298 | inRange(low,high) |
|---|
| 299 | |
|---|
| 300 | For the values of all input streams, this returns |
|---|
| 301 | * 1 if low <= value <= high, |
|---|
| 302 | * 0 otherwise. |
|---|
| 303 | |
|---|
| 304 | CALCULATOR |
|---|
| 305 | |
|---|
| 306 | plus add arguments |
|---|
| 307 | minus subtract arguments |
|---|
| 308 | mult multiply arguments |
|---|
| 309 | div divide arguments |
|---|
| 310 | per_cent divide arguments * 100 |
|---|
| 311 | (not rounded; use "fper_cent|round(0)") |
|---|
| 312 | rest divide arguments, take remainder |
|---|
| 313 | |
|---|
| 314 | The above functions perform calculation with integer numbers. |
|---|
| 315 | |
|---|
| 316 | For most of these functions there also exists a floating-point variant: |
|---|
| 317 | * fplus |
|---|
| 318 | * fminus |
|---|
| 319 | * fmult |
|---|
| 320 | * fdiv |
|---|
| 321 | * fper_cent |
|---|
| 322 | |
|---|
| 323 | To avoid 'division by zero'-errors, the operators 'div', 'per_cent' and 'rest' |
|---|
| 324 | (and their f-variants) return 0, if the second argument is zero. |
|---|
| 325 | |
|---|
| 326 | Note: all functions above may be used as binary operators |
|---|
| 327 | (see section 'BINARY OPERATORS' below for concept). |
|---|
| 328 | |
|---|
| 329 | # PREFORMATTED 1 |
|---|
| 330 | round(digits) |
|---|
| 331 | |
|---|
| 332 | rounds a floating-point input to the specified number of digits |
|---|
| 333 | after the decimal point. |
|---|
| 334 | Specify zero to round to an integer number. |
|---|
| 335 | Specify negative digits to round to multiples of 10, 100, 1000, ... |
|---|
| 336 | |
|---|
| 337 | |
|---|
| 338 | BOOLEAN OPERATORS |
|---|
| 339 | |
|---|
| 340 | All input streams are converted to boolean |
|---|
| 341 | values (i.e. 0 or 1) as follows: |
|---|
| 342 | |
|---|
| 343 | "0" -> 0 |
|---|
| 344 | any number -> 1 |
|---|
| 345 | any text -> 0 (even empty text!) |
|---|
| 346 | |
|---|
| 347 | Operators: |
|---|
| 348 | |
|---|
| 349 | Not invert values of all input streams (0<->1) |
|---|
| 350 | And return 1 if all input streams are 1, 0 otherwise |
|---|
| 351 | Or return 1 if at least one input stream is 1, 0 otherwise |
|---|
| 352 | |
|---|
| 353 | Use "|or|not" or "|and|not" to execute NOR or NAND. |
|---|
| 354 | |
|---|
| 355 | BINARY OPERATORS |
|---|
| 356 | |
|---|
| 357 | Several operators work as so called 'binary operators'. |
|---|
| 358 | These operators may be used in various ways, which are |
|---|
| 359 | shown using the operator 'plus': |
|---|
| 360 | |
|---|
| 361 | ACI OUTPUT STREAMS |
|---|
| 362 | plus(a,b) a+b input:0 output:1 |
|---|
| 363 | a;b|plus a+b input:2 output:1 |
|---|
| 364 | a;b;c;d|plus a+b;c+d input:4 output:2 |
|---|
| 365 | a;b;c|plus(x) a+x;b+x;c+x input:3 output:3 |
|---|
| 366 | |
|---|
| 367 | That means, if the binary operator |
|---|
| 368 | |
|---|
| 369 | - has no arguments, it expects an even number of input streams. The operator is |
|---|
| 370 | applied to the first 2 streams, then to the next 2 streams and so on. |
|---|
| 371 | The number of output streams is half the number of input streams. |
|---|
| 372 | - has 1 argument, it accepts one to many input streams. The operator |
|---|
| 373 | is applied to each input stream together with the argument. |
|---|
| 374 | For each input stream one output stream is generated. |
|---|
| 375 | - has 2 arguments, it is applied to these. The arguments are interpreted as |
|---|
| 376 | ACI commands and are applied for each input stream. The results of |
|---|
| 377 | the commands are passed as arguments to the binary operator. For each input |
|---|
| 378 | stream one output stream is generated. |
|---|
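How the number of arguments changes the behavior of a binary operator can be sketched in Python. This models only the first two cases listed above (no argument, one argument). The names `binary` and `plus` are hypothetical Python helpers, and raising an error for an odd number of streams is an assumption, not documented ARB behavior.

```python
from typing import Callable, List

def plus(a: str, b: str) -> str:
    """Integer addition on stream contents."""
    return str(int(a) + int(b))

def binary(op: Callable[[str, str], str], streams: List[str], *args: str) -> List[str]:
    if len(args) == 0:
        # no argument: combine streams pairwise, halving their number
        if len(streams) % 2 != 0:
            raise ValueError("expected an even number of input streams")
        return [op(a, b) for a, b in zip(streams[0::2], streams[1::2])]
    if len(args) == 1:
        # one argument: combine each stream with the argument
        return [op(s, args[0]) for s in streams]
    raise NotImplementedError("the 2-argument form is not modelled in this sketch")

print(binary(plus, ["1", "2"]))              # ['3']        like  a;b|plus
print(binary(plus, ["1", "2", "3", "4"]))    # ['3', '7']   like  a;b;c;d|plus
print(binary(plus, ["1", "2", "3"], "10"))   # ['11', '12', '13']  like  a;b;c|plus(x)
```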
| 379 | |
|---|
| 380 | CONDITIONAL |
|---|
| 381 | |
|---|
| 382 | select(a,b,c,...) each input stream is converted into a number |
|---|
| 383 | (non-numeric text converts to zero). That number is |
|---|
| 384 | used to select one of the given arguments: |
|---|
| 385 | 0 selects 'a', |
|---|
| 386 | 1 selects 'b', |
|---|
| 387 | 2 selects 'c' and so on. |
|---|
| 388 | The selected argument is interpreted as ACI command |
|---|
| 389 | and is applied to an empty input stream. |
|---|
| 390 | |
|---|
| 391 | DEBUGGING |
|---|
| 392 | |
|---|
| 393 | trace(onoff) toggle tracing of ACI actions to standard output. |
|---|
| 394 | Parameter: 0 or 1 (switch off or on) |
|---|
| 395 | |
|---|
| 396 | All streams are copied (like 'dd'). |
|---|
| 397 | |
|---|
| 398 | Example: |
|---|
| 399 | # PREFORMATTED 1 |
|---|
| 400 | cmd1 | cmd2 | trace(1) | tracedCmd1 | tracedCmd2 | trace(0) | untracedCmd |
|---|
| 401 | |
|---|
| 402 | To see the output from trace, either |
|---|
| 403 | * start arb from a terminal or |
|---|
| 404 | * use LINK{console.hlp} |
|---|
| 405 | |
|---|
| 406 | |
|---|
| 407 | DATABASE AND SEQUENCE |
|---|
| 408 | |
|---|
| 409 | readdb(field_name) the contents of the field 'field_name' |
|---|
| 410 | |
|---|
| 411 | sequence the sequence in the current alignment. |
|---|
| 412 | |
|---|
| 413 | Note: older ARB versions returned 'no sequence' |
|---|
| 414 | if the current alignment contained no sequence. |
|---|
| 415 | Now it returns an empty string. |
|---|
| 416 | |
|---|
| 417 | For genes it returns only the corresponding part |
|---|
| 418 | of the sequence. If the field complement = 1 then the |
|---|
| 419 | result is the reverse-complement. |
|---|
| 420 | |
|---|
| 421 | sequence_type the sequence type of the selected alignment: |
|---|
| 422 | 'rna', 'dna' or 'ami' |
|---|
| 423 | ali_name the name of the selected alignment (e.g. 'ali_16s') |
|---|
| 424 | |
|---|
| 425 | Note: Because they ignore all input streams, the commands above make more |
|---|
| 426 | sense at the beginning of an ACI expression (or subexpression). |
|---|
| 427 | |
|---|
| 428 | checksum(options) calculates a CRC checksum |
|---|
| 429 | options: |
|---|
| 430 | "exclude=chrs" remove 'chrs' before calculation |
|---|
| 431 | "toupper" make everything uppercase first |
|---|
| 432 | |
|---|
| 433 | gcgchecksum a gcg compatible checksum |
|---|
| 434 | |
|---|
| 435 | # PREFORMATTED 1 |
|---|
| 436 | format_sequence(options) |
|---|
| 437 | |
|---|
| 438 | takes a long string (e.g. sequence) and breaks it into several lines. |
|---|
| 439 | |
|---|
| 440 | option (default) description |
|---|
| 441 | ============================================================= |
|---|
| 442 | width=# (50) sequence line width |
|---|
| 443 | firsttab=# (10) first line left indent |
|---|
| 444 | tab=# (10) left indent (not first line) |
|---|
| 445 | numleft (NO) numbers on the left side |
|---|
| 446 | numright=# (NO) numbers on the right side (#=width) |
|---|
| 447 | gap=# (10) insert a gap every # seq. characters. |
|---|
| 448 | |
|---|
| 449 | (see also 'format' above) |
|---|
| 450 | |
|---|
| 451 | # PREFORMATTED 1 |
|---|
| 452 | extract_sequence("chars",rel_len) |
|---|
| 453 | |
|---|
| 454 | like extract_words, but does not sort the words; rel_len is the minimum |
|---|
| 455 | percentage of characters of a word that must match a character in 'chars' |
|---|
| 456 | before the word is taken. All words will be separated by white space. |
|---|
| 457 | |
|---|
| 458 | # PREFORMATTED 1 |
|---|
| 459 | taxonomy([treename,] depth) |
|---|
| 460 | |
|---|
| 461 | Returns the taxonomy of the current species or group as defined by a tree. |
|---|
| 462 | |
|---|
| 463 | If 'treename' is specified, it is used as the tree; otherwise the 'default tree' |
|---|
| 464 | is used (which in most cases is the tree displayed in the ARB_NT main window). |
|---|
| 465 | |
|---|
| 466 | 'depth' specifies how many "levels" of the taxonomy are used. |
|---|
| 467 | |
|---|
| 468 | FILTERING |
|---|
| 469 | |
|---|
| 470 | There are several functions to filter sequential data: |
|---|
| 471 | |
|---|
| 472 | - filter |
|---|
| 473 | - diff |
|---|
| 474 | - change |
|---|
| 475 | |
|---|
| 476 | All these functions use the following COMMON OPTIONS to define |
|---|
| 477 | what is used as filter sequence: |
|---|
| 478 | |
|---|
| 479 | - species=name |
|---|
| 480 | |
|---|
| 481 | Use species 'name' as filter. |
|---|
| 482 | |
|---|
| 483 | - SAI=name |
|---|
| 484 | |
|---|
| 485 | Use SAI 'name' as filter. |
|---|
| 486 | |
|---|
| 487 | - first=1 |
|---|
| 488 | |
|---|
| 489 | Use 1st input stream as filter for all other input streams. |
|---|
| 490 | |
|---|
| 491 | - pairwise=1 |
|---|
| 492 | |
|---|
| 493 | Use 1st input stream as filter for 2nd stream, |
|---|
| 494 | 3rd stream as filter for 4th stream, and so on. |
|---|
| 495 | |
|---|
| 496 | - align=ali_name |
|---|
| 497 | |
|---|
| 498 | Use alignment 'ali_name' instead of current default |
|---|
| 499 | alignment (only meaningful together with 'species' or 'SAI'). |
|---|
| 500 | |
|---|
| 501 | Note: Only one of the parameters 'species', 'SAI', 'first' or 'pairwise' may be used. |
|---|
| 502 | |
|---|
| 503 | # PREFORMATTED 1 |
|---|
| 504 | diff(options) |
|---|
| 505 | |
|---|
| 506 | Calculates the difference between the filter (see common options above) and the input stream(s) and |
|---|
| 507 | writes the result to output stream(s). |
|---|
| 508 | |
|---|
| 509 | Additional options: |
|---|
| 510 | |
|---|
| 511 | - equal=x |
|---|
| 512 | |
|---|
| 513 | Character written to output if filter and stream are equal at |
|---|
| 514 | a position (defaults to '.'). To copy the stream contents for |
|---|
| 515 | equal columns, specify 'equal=' (directly followed by ',' or ')') |
|---|
| 516 | |
|---|
| 517 | - differ=y |
|---|
| 518 | |
|---|
| 519 | Character written to output if filter and stream don't match at one column position. |
|---|
| 520 | Default is to copy the character from the stream. |
|---|
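The column-wise comparison done by diff can be sketched in Python. This models only the 'equal' and 'differ' options as described above. The Python function and its parameter names are hypothetical, and comparing only up to the length of the shorter sequence is an assumption.

```python
from typing import Optional

def diff(filter_seq: str, stream: str, equal: str = ".", differ: Optional[str] = None) -> str:
    """Compare 'stream' column by column against 'filter_seq'.

    equal:  character written where both match ('' copies the stream character)
    differ: character written where they differ (None copies the stream character)
    """
    out = []
    for f, c in zip(filter_seq, stream):
        if f == c:
            out.append(equal if equal else c)
        else:
            out.append(c if differ is None else differ)
    return "".join(out)

print(diff("ACGT", "ACCT"))              # ..C.
print(diff("ACGT", "ACCT", equal=""))    # ACCT
print(diff("ACGT", "ACCT", differ="x"))  # ..x.
```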
| 521 | |
|---|
| 522 | # PREFORMATTED 1 |
|---|
| 523 | filter(options) |
|---|
| 524 | |
|---|
| 525 | Keeps only the specified columns of the input stream(s). You need to |
|---|
| 526 | specify either |
|---|
| 527 | |
|---|
| 528 | - exclude=xyz |
|---|
| 529 | |
|---|
| 530 | to use all columns, where the filter (see common options above) has none |
|---|
| 531 | of the characters 'xyz' |
|---|
| 532 | |
|---|
| 533 | or |
|---|
| 534 | |
|---|
| 535 | - include=xyz |
|---|
| 536 | |
|---|
| 537 | to use only columns, where the filter has one of the characters 'xyz' |
|---|
| 538 | |
|---|
| 539 | All used columns are concatenated and written to the output stream(s). |
|---|
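The column selection of filter can be sketched in Python. This only illustrates the 'include' and 'exclude' options described above. The function name `filter_columns` is hypothetical, and requiring exactly one of the two options as well as stopping at the shorter sequence are assumptions.

```python
from typing import Optional

def filter_columns(filter_seq: str, stream: str,
                   include: Optional[str] = None,
                   exclude: Optional[str] = None) -> str:
    """Keep the columns of 'stream' selected by the characters of 'filter_seq'."""
    if (include is None) == (exclude is None):
        raise ValueError("specify exactly one of include / exclude")
    kept = []
    for f, c in zip(filter_seq, stream):
        if include is not None:
            use = f in include        # filter has one of the characters
        else:
            use = f not in exclude    # filter has none of the characters
        if use:
            kept.append(c)
    return "".join(kept)              # used columns are concatenated

print(filter_columns("x--x", "ACGT", exclude="-"))  # AT
print(filter_columns("x--x", "ACGT", include="-"))  # CG
```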
| 540 | |
|---|
| 541 | |
|---|
| 542 | # PREFORMATTED 1 |
|---|
| 543 | change(options) |
|---|
| 544 | |
|---|
| 545 | Randomly modifies the content of columns selected |
|---|
| 546 | by the filter (see common options above). |
|---|
| 547 | Only columns containing letters will be modified. |
|---|
| 548 | |
|---|
| 549 | The options 'include=xyz' and 'exclude=xyz' work like |
|---|
| 550 | with 'filter()', but here they select the columns to modify - all other |
|---|
| 551 | columns get copied unmodified. |
|---|
| 552 | |
|---|
| 553 | How the selected columns are modified, is specified by the following |
|---|
| 554 | parameters: |
|---|
| 555 | |
|---|
| 556 | - change=percent |
|---|
| 557 | |
|---|
| 558 | percentage of changed columns (default: silently change nothing, to make |
|---|
| 559 | it more difficult for you to ignore this helpfile) |
|---|
| 560 | |
|---|
| 561 | - to=xy |
|---|
| 562 | |
|---|
| 563 | randomly change to one of the characters 'xy'. |
|---|
| 564 | |
|---|
| 565 | Hints: |
|---|
| 566 | |
|---|
| 567 | - Use 'xyy' to produce 33% 'x' and 66% 'y' |
|---|
| 568 | - Use 'xxxxxxxxxy' to produce 90% 'x' and 10% 'y' |
|---|
| 569 | - Use 'x' to replace all matching columns by 'x' |
|---|
| 570 | |
|---|
| 571 | I think the intention for this (long undocumented) command is to easily generate |
|---|
| 572 | artificial sequences with different GC-content, in order to test treeing-software. |
|---|
| 573 | |
|---|
| 574 | SPECIALS |
|---|
| 575 | |
|---|
| 576 | # PREFORMATTED 1 |
|---|
| 577 | exec(command[,param1,param2,...]) |
|---|
| 578 | |
|---|
| 579 | Execute external (unix) command. |
|---|
| 580 | |
|---|
| 581 | Given params will be single-quoted and passed to the command. |
|---|
| 582 | |
|---|
| 583 | All input streams will be concatenated and piped into the command. |
|---|
| 584 | |
|---|
| 585 | When the command itself is a pipe, put it in parenthesis (e.g. "(sort|uniq)"). |
|---|
| 586 | Note: This won't work together with params. |
|---|
| 587 | |
|---|
| 588 | The result is the output of the command. |
|---|
| 589 | |
|---|
| 590 | WARNING!!! |
|---|
| 591 | |
|---|
| 592 | Avoid using this command in NDS: |
|---|
| 593 | any slow command will block all editing, so you may never be able |
|---|
| 594 | to remove this command from the NDS again. Even arb_panic will not |
|---|
| 595 | easily help you. |
|---|
| 596 | |
|---|
| 597 | # PREFORMATTED 1 |
|---|
| 598 | command(action) |
|---|
| 599 | |
|---|
| 600 | applies 'action' to all input streams using |
|---|
| 601 | |
|---|
| 602 | - ACI, |
|---|
| 603 | - SRT (if starts with ':') (see LINK{srt.hlp}) |
|---|
| 604 | - or as REG (if starts with '/') (see LINK{reg.hlp}). |
|---|
| 605 | |
|---|
| 606 | If you nest calls (i.e. if 'action' contains further calls to 'command') you have to apply |
|---|
| 607 | escaping multiple times (e.g. inside an export filter - which is in fact an |
|---|
| 608 | SRT expression - you'll have to use double escapes). |
|---|
| 609 | |
|---|
| 610 | # PREFORMATTED 1 |
|---|
| 611 | eval(exprEvalToAction) |
|---|
| 612 | |
|---|
| 613 | the 'exprEvalToAction' is evaluated (using an empty string as input) |
|---|
| 614 | and the result is interpreted as action and gets applied to all |
|---|
| 615 | input streams (as in 'command' above). |
|---|
| 616 | |
|---|
| 617 | Example: Say you have two numeric positions stored in database |
|---|
| 618 | fields 'pos1' and 'pos2' for each species. Then the following |
|---|
| 619 | command extracts the sequence data from pos1 to pos2: |
|---|
| 620 | |
|---|
| 621 | # PREFORMATTED 1 |
|---|
| 622 | sequence|eval(" \"mid(\";readdb(pos1);\";\";readdb(pos2);\")\" ") |
|---|
| 623 | |
|---|
| 624 | How the example works: |
|---|
| 625 | |
|---|
| 626 | The argument is the escaped version of the |
|---|
| 627 | command |
|---|
| 628 | # PREFORMATTED 1 |
|---|
| 629 | "mid(" ; readdb(pos1) ; ";" ; readdb(pos2) ; ")" |
|---|
| 630 | |
|---|
| 631 | If pos1 contains '10' and pos2 contains '20' that command will |
|---|
| 632 | evaluate to 'mid(10;20)'. |
|---|
| 633 | |
|---|
| 634 | For these positions the executed ACI behaves like 'sequence|mid(10;20)'. |
|---|
| 635 | |
|---|
| 636 | # PREFORMATTED 1 |
|---|
| 637 | define(name,escapedCommand) |
|---|
| 638 | |
|---|
| 639 | defines an ACI-macro 'name'. 'escapedCommand' contains an escaped |
|---|
| 640 | ACI command sequence. This command sequence can be executed with |
|---|
| 641 | do(name). |
|---|
| 642 | |
|---|
| 643 | # PREFORMATTED 1 |
|---|
| 644 | do(name) |
|---|
| 645 | |
|---|
| 646 | applies a previously defined ACI-macro to all input streams (see 'define'). |
|---|
| 647 | |
|---|
| 648 | 'define(a,action)' followed by 'do(a)' works similarly to 'command(action)'. |
|---|
| 649 | |
|---|
| 650 | See embl.eft for an example using 'define' and 'do'. |
|---|
| 651 | |
|---|
| 652 | # PREFORMATTED 1 |
|---|
| 653 | findspec(action) |
|---|
| 654 | |
|---|
| 655 | Each input stream is interpreted as species 'name' (ID) and a species |
|---|
| 656 | with that 'name' is searched (aborts with error if species could not be found; |
|---|
| 657 | silently ignores empty streams). |
|---|
| 658 | |
|---|
| 659 | Otherwise 'action' is applied (to one empty stream). |
|---|
| 660 | Instead of the current item, all database commands inside 'action' use the found species. |
|---|
| 661 | |
|---|
| 662 | # PREFORMATTED 1 |
|---|
| 663 | findacc(action) |
|---|
| 664 | |
|---|
| 665 | like findspec, but search for 'acc' instead of 'name'. |
|---|
| 666 | |
|---|
| 667 | # PREFORMATTED 1 |
|---|
| 668 | findgene(action) |
|---|
| 669 | |
|---|
| 670 | like findspec, but searches for genes (starting at organism or |
|---|
| 671 | at other gene of same organism). |
|---|
| 672 | |
|---|
| 673 | # PREFORMATTED WIDTH DEFAULT |
|---|
| 674 | origin_organism(action) |
|---|
| 675 | origin_gene(action) |
|---|
| 676 | # PREFORMATTED RESET |
|---|
| 677 | |
|---|
| 678 | like command() but readdb() etc. reads all data from the |
|---|
| 679 | origin organism/gene of a gene-species (not from the gene-species itself). |
|---|
| 680 | |
|---|
| 681 | This function applies only to gene-species! |
|---|
| 682 | |
|---|
| 683 | SECTION Future features |
|---|
| 684 | |
|---|
| 685 | # PREFORMATTED 1 |
|---|
| 686 | statistic |
|---|
| 687 | |
|---|
| 688 | creates a character statistic of the sequence |
|---|
| 689 | (not implemented yet) |
|---|
| 690 | |
|---|
| 691 | EXAMPLES |
|---|
| 692 | |
|---|
| 693 | Some random ACI expression examples: |
|---|
| 694 | |
|---|
| 695 | # PREFORMATTED 1 |
|---|
| 696 | sequence|format_sequence(firsttab=0;tab=10)|"SEQUENCE_";dd |
|---|
| 697 | |
|---|
| 698 | fetches the default sequence, formats it, |
|---|
| 699 | and prepends 'SEQUENCE_'. |
|---|
| 700 | |
|---|
| 701 | # PREFORMATTED 1 |
|---|
| 702 | sequence|remove(".-")|format_sequence |
|---|
| 703 | |
|---|
| 704 | get the default sequence, remove all '.-' and |
|---|
| 705 | format it |
|---|
| 706 | |
|---|
| 707 | # PREFORMATTED 1 |
|---|
| 708 | sequence|remove(".-")|len |
|---|
| 709 | |
|---|
| 710 | the number of non '.-' symbols (sequence length) |
|---|
| 711 | |
|---|
| 712 | # PREFORMATTED 1 |
|---|
| 713 | "[";taxonomy(tree_other,3);" -> ";taxonomy(3);"]" |
|---|
| 714 | |
|---|
| 715 | shows for each species how their taxonomy |
|---|
| 716 | changed between "tree_other" and current tree |
|---|
| 717 | |
|---|
| 718 | # PREFORMATTED 1 |
|---|
| 719 | equals(readdb(tmp),readdb(acc))|select(echo("tmp and acc differ"),) |
|---|
| 720 | |
|---|
| 721 | returns 'tmp and acc differ' if the content of |
|---|
| 722 | the database fields 'tmp' and 'acc' differs. Empty result |
|---|
| 723 | otherwise. |
|---|
| 724 | |
|---|
| 725 | # PREFORMATTED 1 |
|---|
| 726 | readdb(full_name)|icontains(bacillus)|compare(0)|select(echo(..),readdb(full_name)) |
|---|
| 727 | |
|---|
| 728 | returns the content of the 'full_name' database entry if it contains |
|---|
| 729 | the substring 'bacillus'. Otherwise returns '..' |
|---|
| 730 | |
|---|
| 731 | |
|---|
| 732 | BUGS The output of taxonomy() is not always instantly refreshed. |
|---|