1 | #Please insert up references in the next lines (line starts with keyword UP) |
---|
2 | UP arb.hlp |
---|
3 | UP arb_ntree.hlp |
---|
4 | UP arb_import.hlp |
---|
5 | UP pfold.hlp |
---|
6 | UP glossary.hlp |
---|
7 | |
---|
8 | #Please insert subtopic references (line starts with keyword SUB) |
---|
9 | |
---|
10 | # Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain} |
---|
11 | |
---|
12 | #************* Title of helpfile !! and start of real helpfile ******** |
---|
13 | TITLE NOTES: dssp |
---|
14 | |
---|
15 | OCCURRENCE ARB_IMPORT |
---|
16 | |
---|
17 | DESCRIPTION |
---|
18 | |
---|
19 | See NOTES in LINK{arb_import.hlp} for HOWTO reactivate disabled import filters. |
---|
20 | |
---|
21 | The three filters 'dssp_all.ift', 'dssp_2nd_struct.ift' and 'dssp_sequence.ift' |
---|
22 | import protein secondary structure information and/or amino acid sequences |
---|
23 | from DSSP files. Some of the associated metadata is extracted as well. |
---|
24 | The following fields are created (see also the example below): |
---|
25 | - name: [PDB ID]_[Chain char] (extracted from 'HEADER' and the optional chain |
---|
26 | character in 'RESIDUE') |
---|
27 | - full_name: [PDB ID] (extracted from 'HEADER') Chain [Chain char] (extracted |
---|
28 | from the optional chain character in 'RESIDUE'); [Description] (extracted |
---|
29 | from 'HEADER' and 'COMPND') |
---|
30 | - tax: [Organism description] (extracted from 'SOURCE') |
---|
31 | - author: [Author(s)] (extracted from 'AUTHOR') |
---|
32 | - date: [Date] (extracted from 'HEADER') |
---|
33 | - remark: [Remark] (extracted from headline and 'REFERENCE') |
---|
34 | - ali_[alignment name]/data: [Amino acid sequence or secondary structure] |
---|
35 | (extracted from 'AA' or 'STRUCTURE') |
---|
36 | - sec_struct: [Secondary structure] (extracted from 'STRUCTURE') |
---|
37 | |
---|
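How 'name', 'date' and the first part of 'full_name' relate to the 'HEADER' line can be sketched in Python. This is only an illustration and not the code of the import filters; the helpers `parse_header` and `make_name` are hypothetical names, and the sketch assumes that the PDB ID is the last token of the line and the date the token before it.

```python
def parse_header(line):
    """Split a DSSP 'HEADER' line into description, date and PDB ID.

    Assumption: tokens are whitespace separated, the line may end in a
    lone '.', the PDB ID is the last token and the date precedes it.
    """
    tokens = line.split()
    if tokens and tokens[-1] == ".":
        tokens = tokens[:-1]          # drop the trailing '.' column marker
    if len(tokens) < 3 or tokens[0] != "HEADER":
        raise ValueError("not a DSSP HEADER line")
    return {
        "pdb_id": tokens[-1],
        "date": tokens[-2],
        "description": " ".join(tokens[1:-2]),
    }


def make_name(pdb_id, chain):
    """Build the 'name' field as [PDB ID]_[Chain char]."""
    return f"{pdb_id}_{chain}"


header = parse_header("HEADER RNA BINDING PROTEIN 22-NOV-99 1DG1 .")
print(make_name(header["pdb_id"], "G"), header["date"])
```

For the 'HEADER' line of the example entry 1DG1 and chain 'G', this yields the name '1DG1_G' and the date '22-NOV-99'.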
38 | SECTION The DSSP code |
---|
39 | |
---|
40 | - H = alpha helix |
---|
41 | - B = residue in isolated beta-bridge |
---|
42 | - E = extended strand, participates in beta ladder |
---|
43 | - G = 3-helix (3/10 helix) |
---|
44 | - I = 5-helix (pi helix) |
---|
45 | - T = hydrogen bonded turn |
---|
46 | - S = bend |
---|
47 | |
---|
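As a small illustration, the code table above can be written as a Python lookup and used to summarise an imported 'sec_struct' string. The names `DSSP_CODES` and `summarize_structure` are hypothetical, and the entry for '-' reflects the gap character that the filters insert where no secondary structure is present.

```python
from collections import Counter

# One-letter DSSP codes as listed in this helpfile.
DSSP_CODES = {
    "H": "alpha helix",
    "B": "residue in isolated beta-bridge",
    "E": "extended strand, participates in beta ladder",
    "G": "3-helix (3/10 helix)",
    "I": "5-helix (pi helix)",
    "T": "hydrogen bonded turn",
    "S": "bend",
    "-": "no secondary structure (gap character)",
}


def summarize_structure(sec_struct):
    """Return {code: (description, count)} for a secondary structure string."""
    counts = Counter(sec_struct)
    return {
        code: (DSSP_CODES.get(code, "unknown code"), n)
        for code, n in counts.items()
    }


for code, (meaning, n) in summarize_structure("--EEEEEEE-STTSSHHHH").items():
    print(code, n, meaning)
```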
48 | NOTES |
---|
49 | |
---|
50 | - If a protein consists of several chains, each chain is extracted individually |
---|
51 | and stored as a separate species. |
---|
52 | - The filter 'dssp_2nd_struct.ift' fills 'ali_[alignment name]/data' with |
---|
53 | the protein secondary structure, whereas 'dssp_sequence.ift' and |
---|
54 | 'dssp_all.ift' fill it with the amino acid sequence. |
---|
55 | - The field 'sec_struct' is only used by the filter 'dssp_all.ift'. |
---|
56 | - Gap characters ('-') are inserted where no secondary structure is present. |
---|
57 | - The DSSP files are first piped through the script 'format_dssp.pl' |
---|
58 | (in "$ARBHOME/ARB/PERL_SCRIPTS/ARBTOOLS/IFTHELP") to format the files |
---|
59 | for use with the filters 'dssp_all2.ift2', 'dssp_2nd_struct2.ift2' and |
---|
60 | 'dssp_sequence2.ift2'. |
---|
61 | - The reference for DSSP is listed as [2] in section 'REFERENCES' |
---|
62 | of LINK{pfold.hlp}. |
---|
63 | |
---|
64 | EXAMPLES |
---|
65 | |
---|
66 | The DSSP format looks like this: |
---|
67 | |
---|
68 | ==== Secondary Structure Definition by the program DSSP, updated CMBI version by ElmK / April 1,2000 ==== DATE=27-JUN-2003 . |
---|
69 | REFERENCE W. KABSCH AND C.SANDER, BIOPOLYMERS 22 (1983) 2577-2637 . |
---|
70 | HEADER RNA BINDING PROTEIN 22-NOV-99 1DG1 . |
---|
71 | COMPND 2 MOLECULE: ELONGATION FACTOR TU; . |
---|
72 | SOURCE 2 ORGANISM_SCIENTIFIC: ESCHERICHIA COLI; . |
---|
73 | AUTHOR K.ABEL,M.YODER,R.HILGENFELD,F.JURNAK . |
---|
74 | ... |
---|
75 | ... |
---|
76 | # RESIDUE AA STRUCTURE BP1 BP2 ACC N-H-->O O-->H-N N-H-->O O-->H-N TCO KAPPA ALPHA PHI PSI X-CA Y-CA Z-CA |
---|
77 | 1 9 G K 0 0 143 0, 0.0 65,-0.2 0, 0.0 64,-0.1 0.000 360.0 360.0 360.0 143.2 13.7 48.3 -15.2 |
---|
78 | 2 10 G P - 0 0 38 0, 0.0 65,-2.6 0, 0.0 2,-0.6 -0.404 360.0-137.6 -64.4 148.4 12.2 51.7 -14.1 |
---|
79 | 3 11 G H E -a 67 0A 88 63,-0.2 2,-0.3 191,-0.1 65,-0.2 -0.949 24.0-180.0-114.6 117.8 10.1 51.4 -10.9 |
---|
80 | 4 12 G V E -a 68 0A 0 63,-2.0 65,-2.3 -2,-0.6 2,-0.5 -0.855 18.2-141.7-116.0 149.5 6.8 53.4 -10.8 |
---|
81 | 5 13 G N E +a 69 0A 36 -2,-0.3 86,-2.5 63,-0.2 87,-1.2 -0.949 31.8 154.0-113.1 127.7 4.2 53.5 -8.0 |
---|
82 | 6 14 G V E -ab 70 92A 0 63,-2.5 65,-2.0 -2,-0.5 2,-0.3 -0.820 16.5-171.5-139.8-179.7 0.5 53.6 -8.8 |
---|
83 | 7 15 G G E -ab 71 93A 0 85,-0.5 87,-2.2 63,-0.3 2,-0.3 -0.969 28.1-103.8-167.9 164.5 -2.7 52.6 -7.2 |
---|
84 | 8 16 G T E + b 0 94A 0 63,-1.9 65,-0.4 -2,-0.3 2,-0.3 -0.735 37.7 175.7 -97.6 147.4 -6.4 52.2 -7.9 |
---|
85 | 9 17 G I E + b 0 95A 3 85,-1.9 87,-2.6 -2,-0.3 2,-0.2 -0.962 21.1 98.6-147.2 156.0 -8.8 54.9 -6.7 |
---|
86 | 10 18 G G - 0 0 0 -2,-0.3 87,-0.1 85,-0.2 97,-0.1 -0.829 69.1 -41.6 148.3 177.9 -12.6 55.4 -7.1 |
---|
87 | 11 19 G H S > S- 0 0 35 85,-0.4 3,-1.4 -2,-0.2 5,-0.3 -0.330 72.6 -81.6 -74.1 153.0 -15.9 54.9 -5.4 |
---|
88 | 12 20 G V T 3 S+ 0 0 58 1,-0.2 -1,-0.1 2,-0.1 94,-0.1 -0.094 111.7 15.7 -52.8 150.0 -16.8 51.7 -3.5 |
---|
89 | 13 21 G D T 3 S+ 0 0 145 1,-0.1 -1,-0.2 -3,-0.1 -2,-0.1 0.575 91.3 115.4 59.3 12.4 -17.9 48.6 -5.4 |
---|
90 | 14 22 G H S < S- 0 0 4 -3,-1.4 -2,-0.1 82,-0.1 85,-0.1 0.789 92.2 -97.1 -80.7 -25.6 -16.7 50.1 -8.6 |
---|
91 | 15 23 G G S > S+ 0 0 11 -4,-0.2 4,-2.5 81,-0.1 5,-0.2 0.577 75.0 138.4 123.2 16.6 -14.0 47.5 -9.1 |
---|
92 | 16 24 G K H > S+ 0 0 12 -5,-0.3 4,-2.7 1,-0.2 5,-0.1 0.931 82.3 40.1 -55.3 -48.3 -10.7 48.7 -7.8 |
---|
93 | 17 25 G T H > S+ 0 0 17 2,-0.2 4,-2.3 1,-0.2 -1,-0.2 0.887 114.0 51.1 -70.8 -40.4 -9.8 45.4 -6.2 |
---|
94 | 18 26 G T H > S+ 0 0 30 2,-0.2 4,-2.4 1,-0.2 -1,-0.2 0.899 113.8 47.6 -64.1 -39.4 -11.1 43.1 -9.0 |
---|
95 | 19 27 G L H X S+ 0 0 0 -4,-2.5 4,-2.6 2,-0.2 -2,-0.2 0.955 107.8 53.8 -66.2 -49.6 -9.1 |
---|
96 | ... |
---|
97 | ... |
---|
98 | |
---|
99 | The extracted ARB database entry looks like this (for alignment with the |
---|
100 | name 'ali_prot' and imported with 'dssp_all.ift'): |
---|
101 | |
---|
102 | name S6: 1DG1_G |
---|
103 | full_name S0: 1DG1 Chain G; RNA BINDING PROTEIN; MOLECULE: ELONGATION FACTOR TU |
---|
104 | tax S0: ORGANISM_SCIENTIFIC: ESCHERICHIA COLI |
---|
105 | author S0: K.ABEL,M.YODER,R.HILGENFELD,F.JURNAK |
---|
106 | date S0: 22-NOV-99 |
---|
107 | ali_prot %0: |
---|
108 | ali_prot/data S0: KPHVNVGTIGHVDHGKTTL... |
---|
109 | sec_struct S0: --EEEEEEE-STTSSHHHH... |
---|
110 | remark S0: === Secondary Structure Definition by the program DSSP, updated CMBI version by ElmK / April 1,2000 ==== DATE=22-FEB-2008 |
---|
111 | DSSP program by: W. KABSCH AND C.SANDER, BIOPOLYMERS 22 (1983) 2577-2637 |
---|
112 | ... |
---|
113 | ... |
---|
114 | |
---|
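The per-chain extraction and the gap insertion described in NOTES can be sketched as follows. This is not the logic of 'format_dssp.pl' or of the filters themselves; `split_chains` is a hypothetical helper. It assumes the usual fixed-column layout of DSSP residue lines (0-based: chain character at index 11, amino acid at index 13, structure code at index 16) and that a '!' in the amino acid column marks a break rather than a residue.

```python
def split_chains(residue_lines):
    """Group DSSP residue lines by chain.

    Returns {chain: (amino_acid_sequence, secondary_structure)}.
    Assumed 0-based columns: chain 11, amino acid 13, structure 16.
    """
    chains = {}
    for line in residue_lines:
        if len(line) < 17:
            continue                  # too short to be a residue line
        chain, aa, ss = line[11], line[13], line[16]
        if aa == "!":
            continue                  # break marker, not a residue
        seq, struct = chains.setdefault(chain, ([], []))
        seq.append(aa)
        # A blank structure column becomes the gap character '-'.
        struct.append("-" if ss == " " else ss)
    return {c: ("".join(s), "".join(t)) for c, (s, t) in chains.items()}


def make_line(index, resnum, chain, aa, ss):
    """Build a minimal residue line in the assumed column layout (demo only)."""
    return f"{index:5d}{resnum:5d} {chain} {aa}  {ss}"


demo = [
    make_line(1, 9, "G", "K", " "),
    make_line(2, 10, "G", "P", " "),
    make_line(3, 11, "G", "H", "E"),
    make_line(4, 1, "H", "A", "H"),
]
for chain, (seq, struct) in split_chains(demo).items():
    print(chain, seq, struct)
```

Each key of the result corresponds to one species in the sense of the NOTES: the first string matches what 'dssp_sequence.ift' stores in 'ali_[alignment name]/data', the second what 'dssp_2nd_struct.ift' stores there.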
115 | WARNINGS None |
---|
116 | |
---|
117 | BUGS No bugs known |
---|