| 1 | #Please insert up references in the next lines (line starts with keyword UP) |
|---|
| 2 | UP arb.hlp |
|---|
| 3 | UP e4.hlp |
|---|
| 4 | UP pfold.hlp |
|---|
| 5 | UP glossary.hlp |
|---|
| 6 | |
|---|
| 7 | #Please insert subtopic references (line starts with keyword SUB) |
|---|
| 8 | |
|---|
| 9 | # Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain} |
|---|
| 10 | |
|---|
| 11 | #************* Title of helpfile !! and start of real helpfile ******** |
|---|
| 12 | TITLE Protein Match Settings |
|---|
| 13 | |
|---|
| 14 | OCCURRENCE ARB_EDIT4/Properties/Protein Match Settings |
|---|
| 15 | |
|---|
| 16 | DESCRIPTION |
|---|
| 17 | |
|---|
| 18 | In the 'Protein Match Settings' window the protein structure match |
|---|
| 19 | computation can be configured. The settings are described in the following |
|---|
| 20 | section. |
|---|
| 21 | |
|---|
| 22 | SECTION Configuration |
|---|
| 23 | |
|---|
| 24 | Show protein structure match: Toggles the display of protein match symbols. |
|---|
| 25 | |
|---|
| 26 | Selected Protein Structure SAI: The protein secondary structure SAI used as |
|---|
| 27 | reference for match computation. The default is 'PFOLD'. |
|---|
| 28 | |
|---|
| 29 | Filter SAI names for: Restricts the SAIs shown in the option menu to |
|---|
| 30 | those whose names contain the specified string. This helps to quickly |
|---|
| 31 | find the desired SAI when a large number of SAIs exist. The default is |
|---|
| 32 | 'pfold'. |
|---|
| 33 | |
|---|
| 34 | Match Method: The method used for protein structure match computation. |
|---|
| 35 | The default is 'Secondary Structure <-> Sequence', which is most probably the |
|---|
| 36 | method of choice. Details on the different methods can be found below in |
|---|
| 37 | section 'Description of Match Methods'. |
|---|
| 38 | |
|---|
| 39 | Match Symbols (only relevant for the match method 'Secondary Structure <-> |
|---|
| 40 | Sequence'): Ten symbols that represent the match quality ranging from |
|---|
| 41 | 0 - 100% in steps of 10%. Take care to enter exactly ten symbols. |
|---|
| 42 | Note that spaces (' ') are symbols, too. |
|---|
| 43 | |
|---|
| 44 | Pair definitions (only relevant for the match methods 'Secondary Structure |
|---|
| 45 | <-> Secondary Structure' and 'Secondary Structure <-> Sequence (Full |
|---|
| 46 | Prediction)'). Each line contains two textfields: |
|---|
| 47 | - The left textfield contains one or more amino acid pairs. Each |
|---|
| 48 | pair contains two characters (amino acids, gap characters, ...). |
|---|
| 49 | Pairs are separated by spaces (' '). |
|---|
| 50 | - The right textfield contains the match symbol used for each |
|---|
| 51 | of the specified pairs. |
|---|
| 52 | |
|---|
| 53 | SECTION Description of Match Methods |
|---|
| 54 | |
|---|
| 55 | Match Method 'Secondary Structure <-> Secondary Structure' |
|---|
| 56 | |
|---|
| 57 | Use this method if you want to compare protein secondary structures |
|---|
| 58 | only. The characters representing species secondary structures are |
|---|
| 59 | compared one by one with the ones of the selected secondary structure |
|---|
| 60 | SAI using the pair definitions and the defined match symbols. If |
|---|
| 61 | an undefined pair is encountered, the 'Unknown_match' symbol is |
|---|
| 62 | displayed. |
|---|
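The character-by-character comparison described above can be sketched roughly as follows. This is only an illustration in Python, not the ARB implementation: the pair definitions, the match symbols, the '?' stand-in for the 'Unknown_match' symbol and the function names are all hypothetical, and it is an assumption that a pair is written as species character followed by SAI character.

```python
# Hypothetical pair definitions: (left textfield, right textfield).
# The pairs and symbols are made up for illustration only.
PAIR_DEFINITIONS = [
    ("HH EE TT", "="),     # identical structures
    ("HT TH ET TE", "~"),  # loosely compatible structures
    ("HE EH", "#"),        # conflicting structures
]
UNKNOWN_MATCH = "?"  # stand-in for the 'Unknown_match' symbol


def build_pair_table(definitions):
    """Map each two-character pair to its symbol; the first definition wins."""
    table = {}
    for pairs_field, symbol in definitions:
        for pair in pairs_field.split(" "):
            if len(pair) == 2:  # ignore empty or malformed entries
                table.setdefault(pair, symbol)
    return table


def match_structures(species, reference, definitions=PAIR_DEFINITIONS):
    """Compare two structure strings column by column.

    Assumption: a pair is the species character followed by the SAI character.
    """
    table = build_pair_table(definitions)
    return "".join(
        table.get(s + r, UNKNOWN_MATCH) for s, r in zip(species, reference)
    )
```

With these made-up definitions, match_structures("HHET-", "HEETH") marks the last column with the unknown-match stand-in because the pair '-H' is not defined.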
| 63 | |
|---|
| 64 | Match Method 'Secondary Structure <-> Sequence' |
|---|
| 65 | |
|---|
| 66 | Species amino acid sequences are compared with the selected secondary |
|---|
| 67 | structure SAI by taking contiguous parts of the structure (gaps in the |
|---|
| 68 | alignment are skipped) and computing a match quality from 0 - 100% (in |
|---|
| 69 | steps of 10%) for each part, which is mapped to the defined match |
|---|
| 70 | symbols. The whole part is marked with that symbol. Note that bends |
|---|
| 71 | ('S') are assumed to fit everywhere (=> best match symbol). If a |
|---|
| 72 | structure is encountered without a corresponding amino acid, the worst |
|---|
| 73 | match symbol is displayed. |
|---|
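How a match quality could be turned into the symbol shown for a whole part is sketched below. This is a rough Python illustration under stated assumptions, not the ARB code: the symbol string and function names are hypothetical, the symbols are assumed to be ordered from worst to best, and the binning of the quality values onto ten symbols (100% sharing the last symbol with 90%) is a guess. The quality value itself is taken as given; its Chou-Fasman based computation is not shown.

```python
MATCH_SYMBOLS = "0123456789"  # hypothetical: exactly ten symbols, worst to best


def symbol_for_quality(percent, symbols=MATCH_SYMBOLS):
    """Map a match quality of 0..100 (%) to one of ten symbols.

    Assumed binning: 0-9% -> first symbol, ..., 90-100% -> last symbol.
    """
    if len(symbols) != 10:
        raise ValueError("exactly ten match symbols are required")
    if not 0 <= percent <= 100:
        raise ValueError("match quality must be between 0 and 100")
    return symbols[min(int(percent) // 10, 9)]


def mark_part(structure, length, percent, symbols=MATCH_SYMBOLS):
    """Return the display string for one contiguous structure part."""
    if structure == "S":  # bends are assumed to fit everywhere
        return symbols[-1] * length
    # The whole part is marked with the same symbol.
    return symbol_for_quality(percent, symbols) * length
```

For example, mark_part("H", 4, 70) marks all four columns of a helix part with the symbol for 70%, while a bend part always receives the best symbol regardless of the quality value.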
| 74 | |
|---|
| 75 | Match Method 'Secondary Structure <-> Sequence (Full Prediction)' |
|---|
| 76 | |
|---|
| 77 | Species amino acid sequences are compared with the selected secondary |
|---|
| 78 | structure SAI using a full prediction of secondary structures from |
|---|
| 79 | their sequences (via the Chou-Fasman algorithm) and comparing the |
|---|
| 80 | characters one by one with the reference structure SAI. Note that the |
|---|
| 81 | comparison does not use the structure summaries but the individually |
|---|
| 82 | predicted alpha-helices ('H'), beta-sheets ('E') and beta-turns ('T'). |
|---|
| 83 | The pair definitions are searched in ascending order, i.e. good |
|---|
| 84 | matches first, then the worse ones. If a match is found, the |
|---|
| 85 | corresponding match symbol is displayed. Note that if a structure is |
|---|
| 86 | encountered without a corresponding amino acid, the worst match symbol is |
|---|
| 87 | displayed. |
|---|
| 88 | |
|---|
| 89 | NOTES |
|---|
| 90 | |
|---|
| 91 | - The menu entry 'Properties -> Protein Match Settings' is only shown for |
|---|
| 92 | protein alignments ('Alignment Information -> <Type of Sequences>: pro', |
|---|
| 93 | see LINK{ad_align.hlp}). |
|---|
| 94 | - The match computation for sequences and secondary structures is based on |
|---|
| 95 | the Chou-Fasman algorithm or adaptations of it. See LINK{pfold.hlp} for |
|---|
| 96 | explanation and reference. |
|---|
| 97 | |
|---|
| 98 | SECTION TODO |
|---|
| 99 | |
|---|
| 100 | The settings window should only show the fields that are relevant |
|---|
| 101 | for the current match method. |
|---|
| 102 | |
|---|
| 103 | EXAMPLES None |
|---|
| 104 | |
|---|
| 105 | WARNINGS |
|---|
| 106 | |
|---|
| 107 | !!! The match computation can only give a rough indication of whether a given |
|---|
| 108 | amino acid sequence matches a certain secondary structure. Do not rely on it |
|---|
| 109 | fully, but use it as a hint for aligning your amino acid sequences. !!! |
|---|
| 110 | |
|---|
| 111 | !!! The match method 'Secondary Structure <-> Sequence (Full Prediction)' is |
|---|
| 112 | experimental. It is probably not very reliable and requires a lot of |
|---|
| 113 | computation. Thus, it should not be used for a large number of species loaded |
|---|
| 114 | in the editor. !!! |
|---|
| 115 | |
|---|
| 116 | BUGS |
|---|
| 117 | |
|---|
| 118 | The editor might be unstable and can crash if sequences are not formatted. |
|---|