| 1 | # main topics: |
|---|
| 2 | UP arb.hlp |
|---|
| 3 | UP glossary.hlp |
|---|
| 4 | |
|---|
| 5 | # sub topics: |
|---|
| 6 | #SUB subtopic.hlp |
|---|
| 7 | |
|---|
| 8 | # format described in ../help.readme |
|---|
| 9 | |
|---|
| 10 | |
|---|
| 11 | TITLE Searching |
|---|
| 12 | |
|---|
| 13 | OCCURRENCE ARB_NT/Species/Search and Query |
|---|
| 14 | ARB_NT/Genome/Search and Query |
|---|
| 15 | ARB_NT/Tree/Search groups.. |
|---|
| 16 | |
|---|
| 17 | DESCRIPTION This describes the search feature in ARB as used in the following |
|---|
| 18 | search and query modules: |
|---|
| 19 | * LINK{sp_search.hlp} |
|---|
| 20 | * LINK{group_search.hlp} |
|---|
| 21 | * LINK{gene_search.hlp} |
|---|
| 22 | |
|---|
| 23 | When we talk about 'items' below, we mean e.g. 'species', 'genes', 'taxonomic groups' |
|---|
| 24 | etc., depending on which search tool you are currently using. |
|---|
| 25 | |
|---|
| 26 | |
|---|
| 27 | SECTION SEARCH FIELD |
|---|
| 28 | |
|---|
| 29 | Each search expression applies either |
|---|
| 30 | |
|---|
| 31 | - to a specific item field (e.g. 'full_name') or |
|---|
| 32 | - to some criterion calculated on the fly (e.g. number of marked species inside a taxonomic group) or |
|---|
| 33 | - to any or all item fields, if you select one of the entries in "[...]". |
|---|
| 34 | |
|---|
| 35 | The following special search fields may be available: |
|---|
| 36 | |
|---|
| 37 | * "[any field]" reports a match if any direct field matches the expression. |
|---|
| 38 | * "[all fields]" reports a match if all direct fields match the expression. |
|---|
| 39 | * "[any recursive]" reports a match if any direct or hierarchical field matches the expression. |
|---|
| 40 | * "[all recursive]" reports a match if all direct and hierarchical fields match the expression. |
|---|
| 41 | |
|---|
| 42 | Notes: |
|---|
| 43 | |
|---|
| 44 | * searching is much slower with one of the 'recursive' fields, mostly |
|---|
| 45 | because sequence data is searched as well. |
|---|
| 46 | * "[all fields]" is often used together with "not equal" (see below), |
|---|
| 47 | making it equivalent to "no field matches expression". |
|---|
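The equivalence mentioned in the last note can be illustrated with a minimal Python sketch. This is not ARB code: the item is modelled as a plain dictionary of direct fields, and the field names, field contents and the `contains_coli` predicate are hypothetical.

```python
# Sketch only: an item is modelled as a dict mapping field names to contents.
def any_field_matches(fields, matches):
    """Models "[any field]": at least one direct field matches."""
    return any(matches(content) for content in fields.values())


def all_fields_match(fields, matches):
    """Models "[all fields]": every direct field matches."""
    return all(matches(content) for content in fields.values())


# Hypothetical item and match expression.
item = {"name": "ExaSpec1", "full_name": "Examplia coli", "acc": "ABC123"}
contains_coli = lambda content: "coli" in content.lower()
not_contains_coli = lambda content: not contains_coli(content)

# "[all fields]" combined with "not equal" asks that every field fails to match,
# which is the same as "no field matches the expression".
all_not_equal = all_fields_match(item, not_contains_coli)
no_field_matches = not any_field_matches(item, contains_coli)
```

Here `all_not_equal` and `no_field_matches` always agree, whatever the item contains.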
| 48 | |
|---|
| 49 | SECTION SEARCH OPERATORS |
|---|
| 50 | |
|---|
| 51 | There are two kinds of search operators directly available for queries: |
|---|
| 52 | |
|---|
| 53 | 1. the "equal" sign between the field and the match expression means that |
|---|
| 54 | the selected field (or any field) should match the expression. |
|---|
| 55 | Clicking on the sign inverts it into a "not equal" sign, which means |
|---|
| 56 | the selected field shall not match the expression. |
|---|
| 57 | |
|---|
| 58 | 2. the search operators at the beginning of the 2nd and 3rd line |
|---|
| 59 | allow you to connect the 3 search expressions available for each query. |
|---|
| 60 | Possible values are 'and', 'or' or 'ign'. |
|---|
| 61 | |
|---|
| 62 | - 'ign' stands for "ignore" (the rest of the line will be ignored) |
|---|
| 63 | - selecting 'and' means both the preceding expression and the following one have to match |
|---|
| 64 | - selecting 'or' means the preceding expression or the following one has to match |
|---|
| 65 | |
|---|
| 66 | There is no operator precedence, i.e. |
|---|
| 67 | - "1st and 2nd or 3rd" is interpreted as "(1st and 2nd) or 3rd" AND |
|---|
| 68 | - "1st or 2nd and 3rd" is interpreted as "(1st or 2nd) and 3rd" |
|---|
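The strict left-to-right evaluation described above can be sketched in Python as follows. The function name `combine` and the representation of the three lines as booleans are assumptions made for illustration; this is not how ARB is implemented.

```python
def combine(results, operators):
    """Combine per-line match results strictly from left to right.

    results   -- booleans for the 1st, 2nd and 3rd search expression
    operators -- 'and', 'or' or 'ign' for the 2nd and 3rd line
    """
    value = results[0]
    for operator, result in zip(operators, results[1:]):
        if operator == "ign":
            continue  # this line is ignored
        if operator == "and":
            value = value and result
        elif operator == "or":
            value = value or result
        else:
            raise ValueError(f"unknown operator: {operator!r}")
    return value


# "1st or 2nd and 3rd" is evaluated as "(1st or 2nd) and 3rd".
left_to_right = combine([True, True, False], ["or", "and"])
# With the usual precedence ('and' binds tighter) the result would differ.
usual_precedence = True or (True and False)
```

With these inputs `left_to_right` is `False` while `usual_precedence` is `True`, so the order of the three lines matters.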
| 69 | |
|---|
| 70 | More search operators are available to connect multiple (consecutive) queries: |
|---|
| 71 | |
|---|
| 72 | - using 'Add species' provides a global OR operator (uniting the results |
|---|
| 73 | of the preceding and the next query), |
|---|
| 74 | - using 'Keep species' provides a global AND operator (intersecting the results |
|---|
| 75 | of the preceding and the next query) and |
|---|
| 76 | - using "that don't match the q." provides a global NOT operator for the next query |
|---|
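In terms of sets, the three global operators behave roughly like union, intersection and complement. The following Python sketch assumes query results are plain sets of item names; the function names and the species names are hypothetical.

```python
def add_species(listed, next_result):
    """Models 'Add species': global OR, unites both results."""
    return listed | next_result


def keep_species(listed, next_result):
    """Models 'Keep species': global AND, intersects both results."""
    return listed & next_result


def that_dont_match(all_items, next_result):
    """Models "that don't match the q.": global NOT for the next query."""
    return all_items - next_result


# Hypothetical database content and query results.
all_species = {"SpecA", "SpecB", "SpecC", "SpecD"}
first_query = {"SpecA", "SpecB"}
second_query = {"SpecB", "SpecC"}

united = add_species(first_query, second_query)
intersected = keep_species(first_query, second_query)
negated = that_dont_match(all_species, second_query)
```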
| 77 | |
|---|
| 78 | Results of queries can be transformed into a set of 'marked species' |
|---|
| 79 | using "Mark listed unmark rest" and the marked species can be stored |
|---|
| 80 | as LINK{species_configs.hlp}. Multiple stored configurations can be |
|---|
| 81 | logically combined to new sets of marked species. |
|---|
| 82 | To create a query result from all marked species again, simply use |
|---|
| 83 | "Search species ... that are marked". |
|---|
| 84 | |
|---|
| 85 | |
|---|
| 86 | SECTION MATCH EXPRESSION |
|---|
| 87 | |
|---|
| 88 | - Each expression tries to match the complete field content |
|---|
| 89 | (or the result of the underlying calculation), |
|---|
| 90 | i.e. searching for 'test' will match only fields which |
|---|
| 91 | exactly contain 'test' (not 'my test' or 'testing'). |
|---|
| 92 | |
|---|
| 93 | - If you search for '' (empty expression), all fields without data, i.e. all |
|---|
| 94 | non-existing fields, will be found. |
|---|
| 95 | |
|---|
| 96 | - if you want to match all fields that contain some substring |
|---|
| 97 | then use wildcards: |
|---|
| 98 | |
|---|
| 99 | - '*' |
|---|
| 100 | |
|---|
| 101 | will match any number of characters (including no characters). |
|---|
| 102 | |
|---|
| 103 | - '?' |
|---|
| 104 | |
|---|
| 105 | will match exactly one character |
|---|
| 106 | |
|---|
| 107 | If the whole search expression is '*', then it is handled like '?*' (which |
|---|
| 108 | means 'at least one character'). That means searching for '*' will match any |
|---|
| 109 | non-empty field. |
|---|
| 110 | |
|---|
| 111 | Examples: |
|---|
| 112 | |
|---|
| 113 | '*pseu*' matches all fields with the substring 'pseu' |
|---|
| 114 | 'pyrococcus*' matches all fields starting with 'pyrococcus' |
|---|
| 115 | '*bact*ther*' matches all fields with the substring 'bact' followed by 'ther' |
|---|
| 116 | (there may be many characters in-between or none, i.e. it does |
|---|
| 117 | match 'bactther' as well as 'Corynebacterium diphtheriae', |
|---|
| 118 | where 'bact' is found in 'Corynebacterium' and 'ther' in 'diphtheriae') |
|---|
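The wildcard rules above can be approximated in Python by translating the expression into a regular expression that has to cover the complete field content. This is only a sketch of the described behaviour, not ARB's matcher; `wildcard_match` is a hypothetical name, and case is ignored because the NOTES state that wildcard and exact searches are case insensitive.

```python
import re


def wildcard_match(expression, content):
    """Return True if 'content' matches 'expression' as a whole."""
    if expression == "*":
        expression = "?*"  # a lone '*' means 'at least one character'
    pattern = []
    for char in expression:
        if char == "*":
            pattern.append(".*")   # any number of characters, including none
        elif char == "?":
            pattern.append(".")    # exactly one character
        else:
            pattern.append(re.escape(char))
    flags = re.IGNORECASE | re.DOTALL
    return re.fullmatch("".join(pattern), content, flags) is not None
```

For example, `wildcard_match('test', 'my test')` is `False`, while `wildcard_match('*bact*ther*', 'Corynebacterium diphtheriae')` is `True`.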
| 119 | |
|---|
| 120 | - if the first character is '<' or '>' and the rest is a number, |
|---|
| 121 | then a numerical comparison is performed: |
|---|
| 122 | |
|---|
| 123 | - '<7' |
|---|
| 124 | |
|---|
| 125 | matches all fields containing a number smaller than 7 |
|---|
| 126 | |
|---|
| 127 | - '>10' |
|---|
| 128 | |
|---|
| 129 | matches all fields containing a number greater than 10 |
|---|
| 130 | |
|---|
| 131 | Be careful: |
|---|
| 132 | |
|---|
| 133 | Negating '<7' does NOT match only numbers greater than or equal to seven. It |
|---|
| 134 | also matches all non-numeric contents. Use something like '>6.999' instead. |
|---|
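The pitfall with negated numeric comparisons can be reproduced with a small Python sketch. It assumes that a field whose content is not a number simply fails every numeric comparison; `numeric_match` is a hypothetical helper, and how ARB actually parses numbers is not described here.

```python
def numeric_match(expression, content):
    """Evaluate expressions like '<7' or '>10' against a field content."""
    operator, threshold = expression[0], float(expression[1:])
    try:
        value = float(content)
    except ValueError:
        return False  # non-numeric content never satisfies a comparison
    if operator == "<":
        return value < threshold
    return value > threshold


# Negating '<7' also hits non-numeric content ...
negated_hits_text = not numeric_match("<7", "unknown")
# ... whereas '>6.999' only hits numbers.
greater_hits_text = numeric_match(">6.999", "unknown")
greater_hits_seven = numeric_match(">6.999", "7")
```

Here `negated_hits_text` is `True`, but `greater_hits_text` is `False` and `greater_hits_seven` is `True`.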
| 135 | |
|---|
| 136 | - if the first character is '/' then the following regular expression is used |
|---|
| 137 | for the query (see LINK{reg.hlp}). |
|---|
| 138 | |
|---|
| 139 | - if the first character is '|' then the following ACI expression is evaluated |
|---|
| 140 | and the query hits if the result of the evaluation is not "0". |
|---|
| 141 | See LINK{aci.hlp}. |
|---|
| 142 | |
|---|
| 143 | - if the query string is completely empty, it hits if the selected field does |
|---|
| 144 | not exist (or if a calculation produces no/empty result). |
|---|
| 145 | |
|---|
| 146 | SECTION SORTING RESULTS |
|---|
| 147 | |
|---|
| 148 | Search results are displayed unsorted by default. You can sort them by selecting |
|---|
| 149 | a different order with the sort radio button. |
|---|
| 150 | |
|---|
| 151 | The provided sort criteria depend on the kind of query. The following list shows |
|---|
| 152 | the sort criteria available in LINK{sp_search.hlp}: |
|---|
| 153 | |
|---|
| 154 | unsorted     display items as they are stored in the database |
|---|
| 155 | by value     sort by content of first query field |
|---|
| 156 | by number    same as "by value", but sort numerically |
|---|
| 157 |              (for string-type fields this sorts multiple columns of numbers) |
|---|
| 158 | by id        sort by unique item id (e.g. 'name' for species) |
|---|
| 159 | by parent    sort by globally unique id of parent item |
|---|
| 160 |              (e.g. 'name' of organism for genes) |
|---|
| 161 | by marked    sort marked before unmarked items |
|---|
| 162 | by hit       sort by (and display) hit description (the hit description tells you |
|---|
| 163 |              why an item was hit by the query) |
|---|
| 164 | reverse      reverses the previously selected sort order |
|---|
| 165 | |
|---|
| 166 | ARB remembers and uses all the sort criteria you apply. |
|---|
| 167 | |
|---|
| 168 | Example: Selecting 'by id' will sort the items by their id (e.g. 'name'). If you |
|---|
| 169 | select 'by value' afterwards, ARB will sort items by the content of the first query |
|---|
| 170 | field - if the contents of some items are equal, it will still sort them by name. |
|---|
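The effect of applying several sort criteria one after another resembles repeated stable sorting, as this Python sketch shows. The items and their field contents are hypothetical, and whether ARB uses a stable sort internally is an assumption made for illustration.

```python
# Hypothetical query results: 'name' is the id, 'value' the first query field.
items = [
    {"name": "SpecB", "value": "soil"},
    {"name": "SpecA", "value": "soil"},
    {"name": "SpecC", "value": "marine"},
]

# First criterion selected: 'by id'.
by_id = sorted(items, key=lambda item: item["name"])

# Second criterion selected: 'by value'. Python's sort is stable, so items
# with equal values keep their previous order, i.e. they stay sorted by name.
by_value = sorted(by_id, key=lambda item: item["value"])

order = [item["name"] for item in by_value]
```

The resulting `order` is `['SpecC', 'SpecA', 'SpecB']`: sorted by value first, and by name where the values are equal.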
| 171 | |
|---|
| 172 | NOTES Wildcard and exact searches are always case-insensitive. |
|---|
| 173 | Regular expression searches are always case-sensitive. |
|---|
| 174 | |
|---|
| 175 | EXAMPLES see LINK{sp_search.hlp} |
|---|
| 176 | |
|---|
| 177 | WARNINGS Using ACI is a bit tricky here, because you cannot see what happens. |
|---|
| 178 | |
|---|
| 179 | Using 'trace(1)' somewhere in the ACI expression starts to print an |
|---|
| 180 | ACI trace to the console. To view the console refer to LINK{console.hlp}. |
|---|
| 181 | |
|---|
| 182 | BUGS No bugs known |
|---|