SEQUENCE DATA SUBMISSION FORM


This form solicits the information needed for a nucleotide or amino acid
sequence database entry. It can be filled in using any text editor or
printed and filled in by hand. By completing and returning it to us
promptly you help us to enter your data in the database accurately and
rapidly. These data will be shared among the following databases: EMBL
Data Library (Heidelberg, W. Germany); GenBank (Los Alamos, NM, U.S.A. and
Mountain View, CA, U.S.A.); DNA Data Bank of Japan (DDBJ; Tokyo, Japan);
National Biomedical Research Foundation Protein Identification Resource
(NBRF-PIR; Washington, D.C., U.S.A.); Martinsried Institute for Protein
Sequence Data (MIPS; Martinsried, W. Germany) and International Protein
Information Database in Japan (JIPID; Tokyo).

Please answer all questions which apply to your data. If you submit 2 or
more non-contiguous sequences, copy and fill out this form for each
additional sequence. When submitting nucleic acid sequences containing
protein coding regions, please include a translation (SEPARATELY from the
nucleic acid sequence). Then send us (1) this form, (2) a pre- or reprint
of any manuscript which pertains to these data, and (3) your sequence data.
You can send these materials (a) electronically via computer network, (b) on
magnetic tape, or (c) on a floppy diskette. More detailed information about
formats for submitted data is included at the end of this form.

     our mailing address:  EMBL Data Library Submissions, Postfach 10.2209
                           D-6900 Heidelberg, West Germany
     telephone:            (06221) 387 258
     computer network:     datasubs@embl.earn (for data submissions)
                           datalib@embl.earn (for general inquiries)

Please include in your submission any additional sequence data which are not
reported in your manuscript but which have been reliably determined (for
example, introns or flanking sequences).

When we receive this material we will assign the data an accession number,
which serves as a reference that permanently identifies them in the
database. We will inform you what accession number your data have been
given and we recommend that you cite this number when referring to these
data in publications.

If new data become available which would make the database entry more
informative (e.g., function of the gene product or location of important
sites within the sequence), or if you discover errors in the sequence, we
urge you to contact us so that we can update your entry.

Thank you.


I. GENERAL INFORMATION
==============================================================================
Your name  $(YOUR NAME)
------------------------------------------------------------------------------
Institution  $(INSTITUTION)
------------------------------------------------------------------------------
Address  $(ADDRESS)
------------------------------------------------------------------------------
Computer mail address  $(MAIL)            Telex number
------------------------------------------------------------------------------
Telephone  $(PHONE)                       Telefax number  $(TELEFAX)
==============================================================================
On what medium and in what format are you sending us your sequence data?
(see instructions at the end of this form)
[X] electronic mail
[ ] diskette
      computer: Commodore     operating system: MS DOS     editor:
[ ] magnetic tape
      record length:          blocksize:          label type:
      density          [ ] 800     [ ] 1600    [ ] 6250
      character code   [ ] ASCII   [ ] EBCDIC
==============================================================================


II. CITATION INFORMATION
==============================================================================
These data are  [ ] published  [X] in press  [ ] submitted  [ ] in preparation
                [ ] no plans to publish
------------------------------------------------------------------------------
authors  $(author)
------------------------------------------------------------------------------
title of paper  $(title)
------------------------------------------------------------------------------
journal volume, first-last pages,  $(journal)
------------------------------------------------------------------------------
Do you agree that these data can be made available in the database before
they appear in print?
[X] yes   [ ] no, they should be made available only after publication.
              estimated date: $(DATE)
==============================================================================
Does the sequence which you are sending with this form include data that
do NOT appear in the above citation?
[X] no   [ ] yes, from position ______ to ______   [ ] base pairs OR
                                                    [ ] amino acid residues
         (If your sequence contains 2 or more such spans, use the feature
         table in section IV to indicate their positions)
If so, how should these data be cited in the database?
[ ] published   [ ] in press   [ ] submitted   [ ] in preparation
[ ] no plans to publish
------------------------------------------------------------------------------
authors
------------------------------------------------------------------------------
address (if different from that given in section I)


------------------------------------------------------------------------------
title of paper

------------------------------------------------------------------------------
journal volume, first-last pages, year
==============================================================================
List references to papers and/or database entries which report sequences
overlapping with that submitted here.

1st author      journal, vol., pages, year and/or database, accession number
------------------------------------------------------------------------------

------------------------------------------------------------------------------

==============================================================================


III. DESCRIPTION OF SEQUENCED SEGMENT

Wherever possible, please use standard nomenclature or conventions. If a
question is not applicable to your sequence, answer by writing N.A. in the
appropriate space; if the information is relevant but not available, write
a question mark (?).
==============================================================================
What kind of molecule did you sequence? (check all boxes which apply)

[X] genomic DNA      [ ] genomic RNA            [ ] virus or [ ] provirus
[ ] cDNA to mRNA     [ ] cDNA to genomic RNA
[ ] organelle DNA    [ ] organelle RNA          please specify organelle:
[ ] tRNA   [ ] rRNA   [ ] snRNA   [ ] scRNA
[ ] other nucleic acid. please specify:
[ ] peptide   [ ] sequence assembled by   [ ] overlap of sequenced fragments
                                          [ ] homology with related sequence
                                          [ ] other. please specify:

              [ ] partial:   [ ] N-terminal
                             [ ] C-terminal
                             [ ] internal fragment
==============================================================================
length of sequence  $(SEQ_LEN)   [X] base pairs or [ ] amino acid residues
------------------------------------------------------------------------------
gene name(s) (e.g., lacZ)  $(gene)
------------------------------------------------------------------------------
gene product name(s) (e.g., beta-D-galactosidase)  $(gene)
------------------------------------------------------------------------------
Enzyme Commission number (e.g., EC 3.2.1.23)
------------------------------------------------------------------------------
gene product subunit structure (e.g., hemoglobin alpha-2 beta-2)
==============================================================================
The following items refer to the original source of the molecule you have
sequenced.
organism ---- name  $(full_name)
------------------------------------------------------------------------------
sub-species                    strain  $(strain)
------------------------------------------------------------------------------
name/number of individual or isolate (e.g., patient 123; influenza virus
A/PR/8/34)
------------------------------------------------------------------------------
developmental stage            [ ] germ line   [ ] rearranged
------------------------------------------------------------------------------
haplotype                 tissue type                 cell type
==============================================================================
The following items refer to the immediate experimental source of the
submitted sequence.
name of cell line (e.g., HeLa; 3T3-L1)
------------------------------------------------------------------------------
library (type; name)                     clone(s)
==============================================================================
The following items refer to the position of the submitted sequence in the
genome.
chromosome (or segment) name/number
------------------------------------------------------------------------------
map position               units:  [ ] genome %   [ ] nucleotide number
                                   [ ] other:
==============================================================================
Using single words or short phrases, describe the properties of the sequence
in terms of:

- its associated phenotype(s);
- the biological/enzymatic activity of its product;
- the general functional classification of the gene and/or gene product;
- macromolecules to which the gene product can bind (e.g., DNA, calcium,
  other proteins);
- subcellular localization of the gene product;
- any other relevant information.

Example (for the viral erbB nucleotide sequence): transforming capacity; EGF
receptor-related; tyrosine kinase; oncogene; transmembrane protein.


==============================================================================


IV. FEATURES OF THE SEQUENCE

Please list below the types and locations of all significant features
experimentally identified within the sequence. Be sure that your sequence
is numbered beginning with "1."

In the column marked   fill in

     feature           type of feature (see information below)
     from              number of first base/amino acid in the feature
     to                number of last base/amino acid in the feature
     bp                x, if numbering refers to position of a base pair in
                       a nucleotide sequence
     aa                x, if numbering refers to position of an amino acid
                       residue in a peptide sequence
     id                indicate method by which the feature was identified.
                       E = experimentally; S = by similarity to known
                       sequence or to an established consensus sequence; P =
                       by similarity to some other pattern, such as an
                       open reading frame
     comp              x, if feature is located on the nucleic acid strand
                       complementary to that reported here
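
For submitters who prepare the feature table with a program rather than by
hand, the columns described above can be modelled roughly as follows. This is
only a sketch: the class name `Feature`, its field names, the column widths,
and the rule that `from` may not exceed `to` are assumptions made for
illustration and are not part of the form.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """One row of the feature table (hypothetical representation)."""
    name: str            # "feature": type of feature, e.g. "TATA box"
    start: int           # "from": number of first base/amino acid
    end: int             # "to": number of last base/amino acid
    unit: str            # "bp" or "aa", i.e. which column receives the x
    method: str = ""     # "id": E, S, P, or blank if not stated
    comp: bool = False   # "comp": feature lies on the complementary strand

    def __post_init__(self):
        if self.unit not in ("bp", "aa"):
            raise ValueError("unit must be 'bp' or 'aa'")
        if self.method not in ("", "E", "S", "P"):
            raise ValueError("method must be E, S, P or blank")
        # Assumption: numbering starts at 1 and 'from' does not exceed 'to'.
        if not 1 <= self.start <= self.end:
            raise ValueError("positions must satisfy 1 <= from <= to")
        if self.comp and self.unit != "bp":
            raise ValueError("comp applies only to nucleic acid features")

    def row(self):
        """Render the feature as one line of the table."""
        bp = "x" if self.unit == "bp" else ""
        aa = "x" if self.unit == "aa" else ""
        comp = "x" if self.comp else ""
        line = (f"{self.name:<32}{self.start:>8}{self.end:>8}"
                f"{bp:^6}{aa:^6}{self.method:^6}{comp:^6}")
        return line.rstrip()


# The two example rows printed in the table of this section.
print(Feature("TATA box", 1, 8, "bp", "S").row())
print(Feature("exon 1", 9, 264, "bp").row())
```

Validating each row when it is created catches an entry marked in neither
or both of the bp and aa columns before the form is sent.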

Significant features include:

- regulatory signals (e.g., promoters, attenuators, enhancers)
- transcribed regions (e.g., mRNA, rRNA, tRNA); indicate the reading frame
  if start and stop codons are not present
- regions subject to post-transcriptional modification (e.g., introns,
  modified bases)
- translated regions
- extent of signal peptide, prepropeptide, propeptide, mature peptide
- regions subject to post-translational modification (e.g., glycosylated
  or phosphorylated sites)
- other domains/sites of interest (e.g., extracellular domain,
  DNA-binding domain, active site, inhibitory site)
- sites involved in bonding (disulfide, thiolester, intrachain, interchain)
- regions of protein secondary structure (e.g., alpha helix or beta sheet)
- conflicts with sequence data reported by other authors
- variations and polymorphisms

The first 2 lines of the table are filled in with examples.

==============================================================================
Numbering for features on submitted sequence   [X] matches manuscript
                                               [ ] does not match manuscript
==============================================================================
feature                          from      to     bp    aa    id    comp
------------------------------------------------------------------------------
EXAMPLE  TATA box                   1       8     x           S
------------------------------------------------------------------------------
EXAMPLE  exon 1                     9     264     x
==============================================================================
$(gene)                             1  $(SEQ_LEN) x

------------------------------------------------------------------------------
$(tax)
------------------------------------------------------------------------------

------------------------------------------------------------------------------

------------------------------------------------------------------------------

------------------------------------------------------------------------------

------------------------------------------------------------------------------

------------------------------------------------------------------------------

==============================================================================



FORMATS FOR SUBMITTED DATA

We are happy to accept data submitted in any of the following formats:

(1) Electronic file transfer: files can be sent via computer network to:
DATASUBS@EMBL.EARN. This BITNET/EARN address can be reached via various
gateways from Arpanet, Usenet, JANET, etc. Ask your local network expert
for help or phone us.

(2) Magnetic tapes: 9-track only (fixed-length records preferred); 800,
1600 or 6250 bpi (any blocksize); ASCII or EBCDIC character codes; any label
type or unlabelled.

(3) Floppy disks: we can read Macintosh diskettes and 5-1/4" diskettes from
MS-DOS systems.

Whatever format you choose, we would appreciate receiving the sequence data
in a form which conforms as closely as possible to the following standards.

- Each sequence should include the names of the authors.

- Each distinct sequence should be listed separately using the same number
  of bases/residues per line. The length of each sequence in
  bases/residues should be clearly indicated.

- Enumeration should begin with a "1" and continue in the direction 5' to
  3' (or amino- to carboxy-terminus).

- Amino acid sequences should be listed using the one-letter code.

- Translations of protein coding regions in nucleotide sequences should
  be submitted in a separate computer file from the nucleotide sequences
  themselves.

- The code for representing the sequence characters should conform to the
  IUPAC-IUB standards, which are described in: Nucl. Acids Res. 13:
  3021-3030 (1985) (for nucleic acids) and J. Biol. Chem. 243: 3557-3559
  (1968) and Eur. J. Biochem. 5: 151-153 (1968) (for amino acids).
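
As an illustration of the standards listed above, the following sketch
checks a sequence against the IUPAC-IUB character codes and lists it with a
fixed number of characters per line, numbered from 1. The function name
`format_sequence`, the "Authors:" and "Length:" header lines, and the
default of 60 characters per line are assumptions chosen for the example;
this form does not prescribe any particular layout. The amino acid set is
taken here to be the 20 standard one-letter codes plus B, Z and X.

```python
# IUPAC-IUB nucleotide codes: the bases plus the ambiguity symbols.
NUCLEOTIDE_CODES = set("ACGTURYMKSWHBVDN")
# One-letter amino acid codes: 20 standard residues plus B, Z and X.
AMINO_ACID_CODES = set("ACDEFGHIKLMNPQRSTVWYBZX")


def format_sequence(seq, authors, alphabet=NUCLEOTIDE_CODES, per_line=60):
    """Return a plain-text listing of one sequence (hypothetical layout)."""
    # Remove all whitespace and normalise to upper case.
    cleaned = "".join(seq.split()).upper()
    bad = sorted(set(cleaned) - alphabet)
    if bad:
        raise ValueError("characters outside the code: " + "".join(bad))
    # Author names and total length head the listing.
    lines = [f"Authors: {authors}", f"Length: {len(cleaned)}"]
    for start in range(0, len(cleaned), per_line):
        chunk = cleaned[start:start + per_line]
        # Each line is labelled with the position of its first character;
        # enumeration begins with 1.
        lines.append(f"{start + 1:>8} {chunk}")
    return "\n".join(lines)


# Usage with a made-up sequence and author name.
print(format_sequence("atggcgtacc gttagcta", "Smith J.", per_line=10))
```

A translation of a protein coding region would be passed through the same
function with `alphabet=AMINO_ACID_CODES` and written to a separate file.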

$(SEQUENCE)

