source: tags/ms_r16q4/HELP_SOURCE/oldhelp/importift.hlp

Last change on this file was 11968, checked in by westram, 10 years ago
#Please insert up references in the next lines (line starts with keyword UP)
UP      arb.hlp
UP      glossary.hlp

#Please insert subtopic references  (line starts with keyword SUB)
SUB     srt.hlp
SUB     aci.hlp

# Hypertext links in helptext can be added like this: LINK{ref.hlp|http://add|bla@domain}

#************* Title of helpfile !! and start of real helpfile ********
TITLE           How to define new import formats

OCCURRENCE      ARB_NT

SECTION         BRIEF DESCRIPTION

                Import filters delivered with ARB are located in the
                directory '$ARBHOME/lib/import'. Their file extension
                has to be '.ift'!

                When you write your own import filters, store them in the
                directory '~/.arb_prop/filter/'. You only need to copy them
                into '$ARBHOME/lib/import' if several users on this machine
                should be able to use them.

                Each import filter describes how to analyze and read files
                of one specific format.

                A basic import description file (.ift) looks like this:

                        [AUTODETECT     "Matchpattern"]
                        BEGIN           "Matchpattern"
                        [KEYWIDTH       #Columnnumber]
                        [AUTOTAG        ["TAGNAME"]]
                        [IFNOTSET       x "Reason why x is not set"]+
                        [SETGLOBAL      x "global value"]+
                        [INCLUDE        "file"]+
                        [MATCH          "Matchpattern"
                                [SRT            "SRT_STRING"]
                                [ACI            "ACI_STRING"]
                                [TAG            "TAGNAME"]
                                [WRITE          "DB_FIELD_NAME"]
                                [WRITE_INT      "DB_FIELD_NAME"]
                                [WRITE_FLOAT    "DB_FIELD_NAME"]
                                [APPEND         "DB_FIELD_NAME"]
                                [SETVAR         x]
                        ]*
                        SEQUENCESTART   "Matchpattern"
                        SEQUENCECOLUMN  #Columnnumber
                        [SEQUENCESRT    "SRT_STRING"]
                        [SEQUENCEACI    "ACI_STRING"]
                        SEQUENCEEND     "STRING"
                        [CREATE_ACC_FROM_SEQUENCE]
                        [DONT_GEN_NAMES]
                        END             "STRING"

                Alternatively, a filter can pipe the data through an external
                program PROGRAM that converts it to an already supported
                format 'exformat', using the following basic design:

                        [AUTODETECT     "Matchpattern"]
                        SYSTEM          "PROGRAM $< $>"
                        NEW_FORMAT      "lib/import/exformat.ift"

                $< will be replaced by the input file name.
                $> will be replaced by the intermediate file name.
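
                The placeholder replacement can be pictured with a short Python
                sketch. It only illustrates the rule stated above and is not
                ARB's implementation: the helper name build_system_command and
                the file names are hypothetical, and quoting of file names that
                contain spaces is not handled.

```python
def build_system_command(template, input_name, intermediate_name):
    """Fill the SYSTEM placeholders (hypothetical helper).

    $< becomes the input file name, $> the intermediate file name.
    """
    command = template.replace("$<", input_name)
    return command.replace("$>", intermediate_name)


# Hypothetical file names, chosen only for illustration.
command = build_system_command("PROGRAM $< $>", "in.txt", "converted.tmp")
print(command)  # prints: PROGRAM in.txt converted.tmp
```

                The intermediate file written by PROGRAM is then read with the
                filter named by NEW_FORMAT.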

DESCRIPTION     First of all, the converter appends all import files matching
                the file pattern into one file. The files are separated by the
                string defined with the keyword SEQUENCEEND.

                1. Search for the first line matching the pattern defined by BEGIN.

                2. Try to match all MATCH patterns.

                   For every line that matches:

                        - append all following lines that start after
                          column KEYWIDTH

                        - run the commands on the concatenated lines

                   Known commands are (they are executed in the order listed here):

                         - SRT "SRT_STRING"

                               run the string replace tool on the current result and
                               set its output as the current result (see LINK{srt.hlp}).

                         - ACI "ACI_STRING"

                               run the ARB command interpreter to change the current
                               result (see LINK{aci.hlp}).

                         - TAG "TAGNAME"

                               tag the information with TAGNAME
                               (e.g. "[EBI] 1997 [RDP] 1998")

                         - WRITE "DB_FIELD_NAME"

                               write the current result into DB_FIELD_NAME

                         - WRITE_INT "DB_FIELD_NAME"

                               like WRITE, but expects an integer target field

                         - WRITE_FLOAT "DB_FIELD_NAME"

                               like WRITE, but expects a floating-point target field

                         - APPEND "DB_FIELD_NAME"

                               append the current result to DB_FIELD_NAME

                         - SETVAR x

                               store the current result in the variable x, where x is a
                               single letter. Once set, the variable can be referenced
                               as $x in any command expression
                               (SRT_STRING, ACI_STRING, TAGNAME, DB_FIELD_NAME).

                               For each variable used, an error reason has to be defined
                               that describes what is wrong if the variable has NOT been set.
                               Define error reasons using

                                      IFNOTSET x "Reason why x is not set"

                               Note: use '$$' to insert a single '$'.

                               Allowed variable names are 'a' to 'z'.

                   Note: Each of these commands may occur only once in a MATCH rule.
                         To run a command several times, create several MATCH rules.

                3. If the line matches the SEQUENCESTART pattern, assume that
                   all following lines up to, but not including, the line
                   matching the SEQUENCEEND pattern contain the sequence data.

                4. Continue with step 1.
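
                Steps 1 to 4 can be pictured with the following Python sketch.
                It is a toy model under several assumptions, not ARB's code:
                patterns are treated as shell-style wildcards (via fnmatch), a
                continuation line is taken to be a line that is blank in its
                first KEYWIDTH columns, SEQUENCEEND is compared as a plain
                string, and SEQUENCECOLUMN as well as the commands (SRT, ACI,
                WRITE, ...) are left out. The names read_records and
                is_continuation and the sample data are hypothetical.

```python
from fnmatch import fnmatchcase


def is_continuation(line, keywidth):
    """Assumption: a continuation line is blank in its first `keywidth`
    columns and has content after them."""
    return line[:keywidth].strip() == "" and line.strip() != ""


def read_records(lines, begin, match_patterns, keywidth, seq_start, seq_end):
    """Return one (fields, sequence) pair per record found in `lines`."""
    records = []
    i = 0
    while i < len(lines):
        # Step 1: search for the next line matching BEGIN.
        if not fnmatchcase(lines[i], begin):
            i += 1
            continue
        fields, sequence = {}, []
        while i < len(lines):
            if fnmatchcase(lines[i], seq_start):
                # Step 3: everything up to, but not including, the
                # SEQUENCEEND line is sequence data.
                i += 1
                while i < len(lines) and lines[i].strip() != seq_end:
                    sequence.append(lines[i].strip())
                    i += 1
                i += 1  # skip the SEQUENCEEND line itself
                break
            # Step 2: join the line with its continuation lines ...
            text = lines[i]
            j = i + 1
            while j < len(lines) and is_continuation(lines[j], keywidth):
                text += " " + lines[j].strip()
                j += 1
            # ... and hand the result to every MATCH rule that matches.
            # A real filter would now run the rule's commands on `text`.
            for pattern in match_patterns:
                if fnmatchcase(lines[i], pattern):
                    fields[pattern] = text
            i = j
        records.append((fields, "".join(sequence)))
        # Step 4: the outer loop continues with step 1.
    return records


# Hypothetical input resembling a flat file with 5-column keywords.
sample = [
    "ID   seq1",
    "DE   first part",
    "     second part",
    "SQ",
    "     acgt",
    "     ttga",
    "//",
    "ID   seq2",
    "SQ",
    "     gg",
    "//",
]
records = read_records(sample, "ID*", ["ID*", "DE*"], 5, "SQ*", "//")
for fields, sequence in records:
    print(fields, sequence)
```

                In this model the two-line DE entry of the first record reaches
                its MATCH rule as one concatenated string, and the sequence is
                collected without the terminating '//' line.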

                Postprocessing:

                        CREATE_ACC_FROM_SEQUENCE:

                                Generate a checksum for every sequence that has no
                                accession entry ('acc' field) and write it as the
                                accession number.

                        DONT_GEN_NAMES:

                                Do not try to generate unique identifiers (shortnames) for
                                the species using the full_name field.

                General commands:

                        INCLUDE "filename"

                                Inserts the contents of "filename" at the current position.

                                Variables declared in the file containing the INCLUDE
                                can be used in the included file (examples:
                                longebi.ift, longgenbank.ift and feature_table.ift in
                                the subdirectory 'nonformats').

                        SETGLOBAL x "value"

                                Sets the global variable 'x' to 'value'.

                        AUTOTAG ["TAGNAME"]

                                If set, act as if each MATCH rule had a
                                   TAG "TAGNAME"
                                entry. Use AUTOTAG without a parameter to reset
                                to the default behavior.
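
                The variable rules described for SETVAR, IFNOTSET and SETGLOBAL
                ($x inserts the value of x, '$$' inserts a single '$', names are
                'a' to 'z') can be pictured with a short Python sketch. It is an
                illustration only: the function name expand_variables is
                hypothetical, and raising an error that carries the IFNOTSET
                reason is an assumption about how an unset variable is reported.

```python
import re


def expand_variables(expression, variables, reasons):
    """Replace $x (x in 'a'..'z') by its value and $$ by a single '$'."""

    def substitute(match):
        name = match.group(1)
        if name == "$":
            return "$"
        if name not in variables:
            # Assumption: an unset variable is reported with the reason
            # given in the corresponding IFNOTSET line.
            reason = reasons.get(name, "no reason given")
            raise ValueError(f"variable '{name}' is not set: {reason}")
        return variables[name]

    return re.sub(r"\$([a-z$])", substitute, expression)


variables = {"t": "EBI"}  # as if set by SETGLOBAL t "EBI" or by SETVAR t
reasons = {"t": "no tag defined", "d": "no date line found"}
print(expand_variables("[$t] costs $$5", variables, reasons))
# prints: [EBI] costs $5
```

                Referencing $d in this sketch raises an error that carries the
                reason "no date line found", which mirrors why every variable
                used needs an IFNOTSET entry.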

EXAMPLES        Look at the files in '$ARBHOME/lib/import'.

WARNINGS        Format detection does not always work.