1 | ------------------------------------------------------ |
---|
2 | |
---|
3 | EMBL NUCLEOTIDE SEQUENCE DATABASE SUBMISSION FORM |
---|
4 | |
---|
5 | HOW TO USE THIS FORM - PLEASE READ FIRST |
---|
6 | |
---|
7 | 1) WEBIN: THE WORLD WIDE WEB SUBMISSION TOOL |
---|
8 | ============================================ |
---|
9 | If you have access to the World Wide Web then DO NOT use this form. Use the |
---|
10 | WebIn form on the World Wide Web at |
---|
11 | |
---|
12 | ############################################## |
---|
13 | # http://www.ebi.ac.uk/submission/webin.html # |
---|
14 | ############################################## |
---|
15 | |
---|
16 | If you do not have access to the World Wide Web then please use this form |
---|
17 | and email it to DATASUBS@EBI.AC.UK. |
---|
18 | |
---|
19 | It is only necessary to submit to one database. Public data are exchanged |
---|
20 | between EMBL, GenBank and DDBJ on a daily basis. |
---|
21 | |
---|
22 | 2) MULTIPLE SUBMISSIONS |
---|
23 | ======================= |
---|
24 | If you have more than one but fewer than 25 sequences to submit, copy this |
---|
25 | form and send all the submissions together in one email with a note saying |
---|
26 | how many sequences you are sending. |
---|
27 | |
---|
28 | 3) BULK SUBMISSIONS |
---|
29 | =================== |
---|
30 | If you have more than 25 related sequences to submit DO NOT send them all |
---|
31 | using this form. Instead email DATASUBS@EBI.AC.UK and include the following |
---|
32 | information: |
---|
33 | a) how many sequences you are going to submit |
---|
34 | b) a short explanation of how the sequences are related |
---|
35 | c) what type of differences there are between the entries (e.g. isolate) |
---|
36 | d) one completed email submission form as an example |
---|
37 | You will be contacted by a curator who will create a template for you which |
---|
38 | you should then use to submit all of the sequences. |
---|
39 | |
---|
40 | 4) UPDATES |
---|
41 | ========== |
---|
42 | DO NOT use this form for submitting updates or corrections. |
---|
43 | If you are sending an update please complete the update form available on |
---|
44 | the web at: http://www.ebi.ac.uk/ebi_docs/update.html or get a copy of the |
---|
45 | update form via anonymous FTP: |
---|
46 | ftp://ftp.ebi.ac.uk/pub/databases/embl/release/update.doc |
---|
47 | If you need help with updates contact UPDATE@EBI.AC.UK |
---|
48 | |
---|
49 | 5) PROTEIN SEQUENCES |
---|
50 | ==================== |
---|
51 | DO NOT use this form to submit protein sequences. |
---|
52 | For submissions to the SWISS-PROT protein sequence databank access the |
---|
53 | World Wide Web at http://www.ebi.ac.uk/ebi_docs/swissprot_db/swisshome.html |
---|
54 | or email DATALIB@EBI.AC.UK |
---|
55 | |
---|
56 | 6) ACCESSION NUMBERS AND CONFIDENTIALITY |
---|
57 | ======================================== |
---|
58 | Your data can be made public immediately, or they can be kept confidential |
---|
59 | until a release date which you provide. Confidential data are ALWAYS made |
---|
60 | available to the public after publication. |
---|
61 | |
---|
62 | If your data contain all the information we require we will assign unique |
---|
63 | accession numbers within two working days. We will email you to tell you |
---|
64 | the new accession numbers. |
---|
65 | |
---|
66 | You should submit your sequence data BEFORE you have galley proofs. We |
---|
67 | suggest that the following text be used to cite the accession number(s) in |
---|
68 | publication(s): "The nucleotide sequence data reported in this paper will |
---|
69 | appear in the DDBJ/EMBL/GenBank Nucleotide Sequence Database under the |
---|
70 | accession number(s) ________" |
---|
71 | |
---|
72 | 7) FORM FILLING INSTRUCTIONS |
---|
73 | ============================ |
---|
74 | |
---|
75 | <============== DO NOT EXCEED THIS LINE WIDTH IN YOUR REPLY ==============> |
---|
76 | |
---|
77 | To display this form properly choose a fixed width font (e.g. Courier) in |
---|
78 | your editor. If you are saving files in a word processing program then |
---|
79 | please save the file as TEXT ONLY WITH LINE BREAKS. (To do this in |
---|
80 | Microsoft Word you will need to choose File, Save as, Save file type as, |
---|
81 | and select Text only with line breaks). Please do not send files that are |
---|
82 | saved in Word or WordPerfect format. Processing of the submission may be |
---|
83 | delayed if your email is text wrapped, encoded or binhexed. |
---|
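The line width rule above can be checked before sending. The following is
a minimal sketch only: the function name overlong_lines is hypothetical,
and it assumes that the ruler line shown above marks the maximum width.

```python
# Ruler line copied from the form; its length is assumed to be the limit.
RULER = "<============== DO NOT EXCEED THIS LINE WIDTH IN YOUR REPLY ==============>"


def overlong_lines(text: str, limit: int = len(RULER)) -> list[int]:
    """Return the 1-based numbers of all lines longer than the limit."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]
```

An empty result suggests that no line exceeds the ruler; any numbers
returned point at lines to shorten before the form is emailed.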
84 | |
---|
85 | ######################################################################## |
---|
86 | # Fill in the form as follows: # |
---|
87 | # a) if there is a colon : then enter text (e.g. Last name : Smith) # |
---|
88 | # b) if there is an empty box [ ] and if the answer is yes then fill # |
---|
89 | # the box with an X (e.g. Genomic DNA [X]) # |
---|
90 | # c) if the option is not relevant then do not enter any text and/or # |
---|
91 | # do not write an X in the box. # |
---|
92 | # d) DO NOT delete lines from this form. # |
---|
93 | ######################################################################## |
---|
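Rules a) and b) above can be sketched as two small helpers. This is an
illustration only: enter_text and mark_box are hypothetical names, and the
sketch assumes each field has one colon or one empty box per line.

```python
def enter_text(line: str, value: str) -> str:
    """Rule a): put the value after the first colon of a 'Label :' line."""
    label, colon, _ = line.partition(":")
    if not colon:
        raise ValueError("line has no colon field")
    return f"{label}:{value}"


def mark_box(line: str) -> str:
    """Rule b): fill the first empty box '[ ]' with an X."""
    if "[ ]" not in line:
        raise ValueError("line has no empty box")
    return line.replace("[ ]", "[X]", 1)
```

For example, enter_text("Last name :", "Smith") gives "Last name :Smith"
and mark_box("Genomic DNA [ ]") gives "Genomic DNA [X]". Lines that do not
apply are left untouched, as in rule c).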
94 | |
---|
95 | 8) ENTERING FEATURES AND LOCATIONS |
---|
96 | ================================== |
---|
97 | Enter the feature key from the list given in Appendix I at the end of this |
---|
98 | document. Enter the locations, gene name, product name, and EC number, |
---|
99 | where appropriate. Use < and > in the locations to show whether the feature |
---|
100 | is partial at the 5' end and/or the 3' end. Mark with an X in the box [ ] |
---|
101 | if the feature is on the complementary strand and if you have experimental |
---|
102 | evidence for the feature. |
---|
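The use of < and > in locations can be illustrated with a short sketch. It
assumes a location is a whole number optionally preceded by one marker;
parse_endpoint is a hypothetical helper, not part of any EMBL tool.

```python
def parse_endpoint(text: str) -> tuple[int, bool]:
    """Split a location such as '<1', '>1500' or '201'.

    Returns (position, partial), where partial is True if the location
    carries a '<' or '>' marker.
    """
    text = text.strip()
    partial = text[:1] in ("<", ">")
    position = int(text[1:] if partial else text)
    return position, partial
```

Here parse_endpoint("<1") gives (1, True), meaning the feature is partial
at that end, and parse_endpoint("201") gives (201, False).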
103 | |
---|
104 | If you do not provide any features or adequate locations and names for the |
---|
105 | features you will be contacted for more information before an accession |
---|
106 | number is assigned to the sequence. For CDS features you must provide a gene |
---|
107 | name AND a product name, even if the product name is putative. |
---|
108 | |
---|
109 | If a CDS is partial at the 5' end then write the codon start number. This |
---|
110 | is the position (1, 2 or 3) in the feature of the first base of the first |
---|
111 | complete codon of the translation. For example the following CDS is partial |
---|
112 | and the codon start is 2 because the first complete codon, aca (translated |
---|
113 | as T), starts with the base a, which is the second base in the feature. |
---|
114 | DNA tacatcgatg... |
---|
115 | Translation T S M... |
---|
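The codon start example above can be reproduced with a short sketch. The
names below are hypothetical, and the lookup table holds only the three
codons of this example; it is not a full genetic code table.

```python
# Only the three codons of the example above; not a complete table.
EXAMPLE_CODE = {"aca": "T", "tcg": "S", "atg": "M"}


def codons_from(dna: str, codon_start: int) -> list[str]:
    """Split dna into complete codons, reading from base codon_start."""
    if codon_start not in (1, 2, 3):
        raise ValueError("codon start must be 1, 2 or 3")
    usable = dna[codon_start - 1:]
    whole = len(usable) - len(usable) % 3  # ignore a trailing partial codon
    return [usable[i:i + 3] for i in range(0, whole, 3)]


def translate_example(dna: str, codon_start: int) -> str:
    """Translate using the three-codon example table only."""
    return "".join(EXAMPLE_CODE[c] for c in codons_from(dna, codon_start))
```

With the sequence tacatcgatg and codon start 2, codons_from returns
aca, tcg and atg, which translate_example turns into TSM, matching the
translation shown above.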
116 | |
---|
117 | FEATURE EXAMPLE NO.1 |
---|
118 | Feature key :CDS |
---|
119 | From :201 |
---|
120 | To :500 |
---|
121 | Gene name :abcD |
---|
122 | Product name :ABC repressor protein |
---|
123 | Codon start 1,2 or 3 : |
---|
124 | EC number : |
---|
125 | Complementary strand [ ] |
---|
126 | Experimental evidence [X] |
---|
127 | |
---|
128 | FEATURE EXAMPLE NO.2 |
---|
129 | Feature key :rRNA |
---|
130 | From :<1 |
---|
131 | To :>1500 |
---|
132 | Gene name :16S rRNA |
---|
133 | Product name :16S ribosomal RNA |
---|
134 | Codon start 1,2 or 3 : |
---|
135 | EC number : |
---|
136 | Complementary strand [ ] |
---|
137 | Experimental evidence [ ] |
---|
138 | |
---|
139 | If you have further questions after reading this form please contact |
---|
140 | DATASUBS@EBI.AC.UK |
---|
141 | |
---|
142 | I. CONFIDENTIAL STATUS |
---|
143 | |
---|
144 | Enter an X if you want these data to be confidential [ ] |
---|
145 | If confidential write the release date here : |
---|
146 | (Date format DD-MMM-YYYY e.g. 30-JUN-1998) |
---|
147 | |
---|
148 | |
---|
149 | II. CONTACT INFORMATION |
---|
150 | |
---|
151 | Last name :$(LAST_NAME) |
---|
152 | First name :$(FIRST_NAME) |
---|
153 | Middle initials : |
---|
154 | Department :$(DEPT) |
---|
155 | Institution :$(INSTITUTION) |
---|
156 | Address :$(ADDRESS) |
---|
157 | : |
---|
158 | : |
---|
159 | Country :$(COUNTRY) |
---|
160 | Telephone :$(PHONE) |
---|
161 | Fax :$(TELEFAX) |
---|
162 | Email :$(MAIL) |
---|
163 | |
---|
164 | |
---|
165 | III. CITATION INFORMATION |
---|
166 | |
---|
167 | Author 1 :$(author_1) |
---|
168 | Author 2 :$(author_2) |
---|
169 | Author 3 :$(author_3) |
---|
170 | Author 4 :$(author_4) |
---|
171 | Author 5 :$(author_5) |
---|
172 | Author 6 :$(author_6) |
---|
173 | Author 7 :$(author_7) |
---|
174 | Author 8 :$(author_8) |
---|
175 | Author 9 :$(author_9) |
---|
176 | Author 10 :$(author_10) |
---|
177 | Author 11 :$(author_11) |
---|
178 | Author 12 :$(author_12) |
---|
179 | (e.g. Smith A.B.) |
---|
180 | (Copy line for extra authors) |
---|
181 | Title :$(title) |
---|
182 | Journal :$(journal) |
---|
183 | Volume :$(volume) |
---|
184 | First page :$(page_1) |
---|
185 | Last page :$(page_2) |
---|
186 | Year :$(year_pub) |
---|
187 | Institute (if thesis): |
---|
188 | |
---|
189 | Publication status |
---|
190 | Mark one of the following |
---|
191 | In preparation [ ] |
---|
192 | Accepted [x] |
---|
193 | Published [ ] |
---|
194 | Thesis/Book [ ] |
---|
195 | No plans to publish [ ] |
---|
196 | |
---|
197 | |
---|
198 | IV. SEQUENCE INFORMATION |
---|
199 | |
---|
200 | Sequence length (bp) :$(SEQ_LEN) |
---|
201 | |
---|
202 | Molecule type |
---|
203 | Mark one of the following |
---|
204 | Genomic DNA [ ] |
---|
205 | cDNA to mRNA [ ] |
---|
206 | rRNA [x] |
---|
207 | tRNA [ ] |
---|
208 | Genomic RNA [ ] |
---|
209 | cDNA to genomic RNA [ ] |
---|
210 | |
---|
211 | Mark if either of these apply |
---|
212 | Circular [ ] |
---|
213 | Checked for vector |
---|
214 | contamination [ ] |
---|
215 | |
---|
216 | |
---|
217 | V. SOURCE INFORMATION |
---|
218 | |
---|
219 | Organism :$(full_name) |
---|
220 | Sub species : |
---|
221 | Strain :$(strain) |
---|
222 | Cultivar : |
---|
223 | Variety : |
---|
224 | Isolate/individual : |
---|
225 | Developmental stage : |
---|
226 | Tissue type : |
---|
227 | Cell type : |
---|
228 | Cell line : |
---|
229 | Clone :$(clone) |
---|
230 | Clone (if >1) : |
---|
231 | Clone library : |
---|
232 | Chromosome : |
---|
233 | Map position : |
---|
234 | Haplotype : |
---|
235 | Natural host : |
---|
236 | Laboratory host : |
---|
237 | Macronuclear [ ] |
---|
238 | |
---|
239 | Mark one if immunoglobulin |
---|
240 | or T cell receptor |
---|
241 | Germline [ ] |
---|
242 | Rearranged [ ] |
---|
243 | |
---|
244 | Mark one if viral |
---|
245 | Proviral [ ] |
---|
246 | Virion [ ] |
---|
247 | |
---|
248 | Mark one if from an organelle |
---|
249 | Chloroplast [ ] |
---|
250 | Mitochondrion [ ] |
---|
251 | Chromoplast [ ] |
---|
252 | Kinetoplast [ ] |
---|
253 | Cyanelle [ ] |
---|
254 | Plasmid (not clone) [ ] |
---|
255 | |
---|
256 | Further source information |
---|
257 | (e.g. taxonomy, specimen voucher etc) |
---|
258 | Note :$(tax) |
---|
259 | |
---|
260 | |
---|
261 | VI. FEATURES OF THE SEQUENCE |
---|
262 | |
---|
263 | |
---|
264 | YOU MUST DESCRIBE AT LEAST ONE FEATURE OF THE SEQUENCE OR THERE WILL BE A |
---|
265 | DELAY IN THE PROCESSING OF YOUR SUBMISSION |
---|
266 | |
---|
267 | |
---|
268 | Complete the block below for every feature you need to describe. If you |
---|
269 | have more than one feature copy the block as many times as you require. For |
---|
270 | help see 8) ENTERING FEATURES AND LOCATIONS above. |
---|
271 | |
---|
272 | |
---|
273 | FEATURE NO.1 |
---|
274 | Feature key :$(seq_type) |
---|
275 | From :$(start) |
---|
276 | To :$(end) |
---|
277 | Gene name :$(gene) |
---|
278 | Product name :$(gene_prod) |
---|
279 | Codon start 1,2 or 3 : |
---|
280 | EC number : |
---|
281 | Complementary strand [ ] |
---|
282 | Experimental evidence [ ] |
---|
283 | |
---|
284 | |
---|
285 | VII. SEQUENCE INFORMATION |
---|
286 | |
---|
287 | Enter the sequence data below |
---|
288 | (IUPAC nucleotide base codes, Nucl. Acids Res. 13: 3021-3030, 1985) |
---|
289 | |
---|
290 | BEGINNING OF SEQUENCE: |
---|
291 | $(SEQUENCE) |
---|
292 | |
---|
293 | END OF SEQUENCE |
---|
294 | |
---|
295 | |
---|
296 | Include the translation for each CDS feature below. |
---|
297 | |
---|
298 | |
---|
299 | BEGINNING OF TRANSLATION: |
---|
300 | |
---|
301 | |
---|
302 | END OF TRANSLATION |
---|
303 | |
---|
304 | |
---|
305 | --------------------------------------------------------------------------- |
---|
306 | These data will be shared among the following databases: DDBJ Database |
---|
307 | (DNA Data Bank of Japan; Mishima, Japan); EMBL Nucleotide Sequence Database |
---|
308 | (EBI, Cambridge, UK); GenBank (NCBI, Bethesda, USA); SWISS-PROT Protein |
---|
309 | Sequence Database (Geneva, Switzerland and Heidelberg, FRG); International |
---|
310 | Protein Information Database in Japan (JIPID; Noda, Japan); Martinsried |
---|
311 | Institute For Protein Sequence Data (MIPS; Martinsried, FRG); National |
---|
312 | Biomedical Research Foundation Protein Identification Resource (NBRF-PIR; |
---|
313 | Washington, D.C., USA). |
---|
314 | |
---|
315 | EMBL Data Submissions E-mail datasubs@ebi.ac.uk |
---|
316 | European Bioinformatics Inst. Telephone +44 (0)1223 494499 |
---|
317 | Hinxton Hall, Hinxton Telefax +44 (0)1223 494472 |
---|
318 | Cambridge CB10 1SD, UK |
---|
319 | --------------------------------------------------------------------------- |
---|
320 | |
---|
352 | |
---|
353 | APPENDIX I FEATURE KEYS |
---|
354 | ======================= |
---|
355 | A full description of features is found in the DDBJ/EMBL/GenBank Feature |
---|
356 | Table Definition Document at |
---|
357 | ftp://ftp.ebi.ac.uk/pub/databases/embl/release/ftable.doc |
---|
358 | and on the EBI website at |
---|
359 | http://www.ebi.ac.uk/ebi_docs/embl_db/ft/feature_table.html |
---|
360 | An abbreviated list of feature keys is given below. |
---|
361 | |
---|
362 | C_region constant region of immunoglobulin light and heavy chain, |
---|
363 | and T-cell receptor alpha, beta and gamma chains |
---|
364 | CAAT_signal eukaryotic promoter element; consensus=GG(C or T)CAATCT |
---|
365 | CDS protein coding sequence (includes stop codon) |
---|
366 | conflict the "same" sequence reported by different laboratories |
---|
367 | differ at this site or region |
---|
368 | D_segment diversity segment of immunoglobulin heavy chain and |
---|
369 | T-cell receptor beta-chain |
---|
370 | enhancer cis-acting enhancer of eukaryotic promoter function |
---|
371 | exon region that codes for part of spliced mRNA |
---|
372 | GC_signal eukaryotic promoter element; consensus=GGGCGG |
---|
373 | intron transcribed region excised by mRNA splicing |
---|
374 | J_segment joining segment of immunoglobulin light and heavy chains, |
---|
375 | T-cell receptor alpha, beta and gamma-chains |
---|
376 | LTR long terminal repeat |
---|
377 | mat_peptide mature peptide coding region (does not include stop codon |
---|
378 | or signal peptide) |
---|
379 | misc_feature region of biological interest which cannot be described |
---|
380 | by any other known feature |
---|
381 | mRNA messenger RNA |
---|
382 | mutation a related strain has an abrupt, inheritable change in the |
---|
383 | sequence |
---|
384 | polyA_signal polyadenylation signal recognition region |
---|
385 | polyA_site polyadenylation site to which adenine residues are added |
---|
386 | primer_bind non-covalent primer binding site |
---|
387 | promoter promoter region involved in transcription initiation |
---|
388 | protein_bind non-covalent protein binding site on DNA or RNA |
---|
389 | RBS ribosome binding site |
---|
390 | rep_origin origin of replication |
---|
391 | repeat_region region of genome containing repeating units |
---|
392 | repeat_unit single repeat element |
---|
393 | rRNA ribosomal RNA |
---|
394 | S_region switch region of immunoglobulin heavy chains |
---|
395 | satellite many tandem repeats of a short basic repeating unit |
---|
396 | sig_peptide signal peptide coding region |
---|
397 | stem_loop hair-pin loop structure in DNA or RNA |
---|
398 | STS sequence tagged site |
---|
399 | TATA_signal eukaryotic promoter element; consensus=TATA(A or T)A(A or T) |
---|
400 | terminator transcription termination signal |
---|
401 | transit_peptide transit peptide coding region |
---|
402 | tRNA transfer RNA |
---|
403 | V_region variable region of immunoglobulin light and heavy chains, |
---|
404 | and T-cell receptor alpha, beta, and gamma chains |
---|
405 | V_segment variable segment of immunoglobulin light and heavy chains, |
---|
406 | and T-cell receptor alpha, beta, and gamma chains. |
---|
407 | variation a related strain contains stable mutations from the same |
---|
408 | gene (e.g., RFLPs, polymorphisms) |
---|
409 | 3'UTR region at the 3' end of a mature transcript, following the |
---|
410 | stop codon |
---|
411 | 5'UTR region at the 5' end of a mature transcript, preceding the |
---|
412 | initiation codon |
---|
413 | -10_signal prokaryotic promoter element, consensus=TAtAaT |
---|
414 | -35_signal prokaryotic promoter element, consensus=TTGACa or TGTTGACA |
---|
415 | |
---|
416 | (Last change: 08-DEC-1998) |
---|
417 | (Wendy Baker, EMBL nucleotide sequence database curator) |
---|
418 | |
---|
419 | |
---|
420 | |
---|
421 | |
---|
422 | |
---|
423 | Agnes Leyen |
---|
424 | EMBL Outstation - The European Bioinformatics Institute |
---|
425 | Wellcome Trust Genome Campus |
---|
426 | Cambridge CB10 1SD |
---|
427 | UK |
---|
428 | |
---|
429 | |
---|
430 | DATASUBMISSIONS: |
---|
431 | +44 1223 494499 |
---|
432 | datasubs@ebi.ac.uk |
---|
433 | |
---|
434 | UPDATES: |
---|
435 | +44 1223 494499 |
---|
436 | updates@ebi.ac.uk |
---|
437 | |
---|
438 | PERSONAL: |
---|
439 | +44 1223 494411 |
---|
440 | leyen@ebi.ac.uk |
---|
441 | |
---|