Opened 2 years ago

Last modified 2 years ago

#781 new defect

Aborting import of duplicates is broken

Reported by: westram
Owned by: devel
Priority: normal
Milestone:
Component: Import / Export
Version: SVN
Keywords: import
Cc:

Description (last modified by westram)

Reproduce:

  • import attached rna.fasta twice
    • once using run --import rna.fasta (+ generate new IDs + format ali)
    • once more using File/Import/...from external format
  • during the 2nd import, the nameserver generates the same names as during the 1st import
    ⇒ arb warns about ID conflicts
    • answering "don't import" does not skip these duplicates; instead, duplicate database entries with identical names are created
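
The behaviour expected from the "don't import" answer can be sketched as follows. This is only a minimal Python illustration, not ARB code: the function import_species and the dict-based species records are hypothetical names introduced here, and the sketch assumes that a conflict simply means "a species with this name is already in the database".

```python
def import_species(database, incoming):
    """Append incoming records to database, skipping ID conflicts.

    Both arguments are lists of dicts with a 'name' key (the species ID).
    Hypothetical model of the "don't import" answer; ARB's real importer
    works on its own database structures.
    """
    existing = {rec["name"] for rec in database}
    skipped = []
    for rec in incoming:
        if rec["name"] in existing:
            # Conflict: leave the database untouched for this record.
            skipped.append(rec["name"])
            continue
        database.append(rec)
        existing.add(rec["name"])  # also guards against repeats within 'incoming'
    return skipped
```

With this model, importing the same file twice leaves the database unchanged on the second run and reports every name as skipped, which is what the ticket expects instead of duplicate entries.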

Problems:

  • resulting database contains multiple species with identical IDs (name)
  • if the user has chosen to import species without generating new names (i.e. answered "Use found names"), an attempt to synchronize IDs fails with the following confusing error message:
    Expected that a species named 'MtMMazei' exists (maybe there are duplicate species, database might be corrupt)
  • if new names were generated during import (as described above), synchronize IDs should be able to resolve the problem (by generating IDs with suffixes ".2" etc.)
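
The suffix-based resolution mentioned in the last item could look roughly like the following. It is a sketch under assumptions, not the algorithm used by ARB's nameserver: the function name synchronize_ids is hypothetical, and only the ".2" suffix style is taken from the description above.

```python
def synchronize_ids(names):
    """Return unique IDs, appending '.2', '.3', ... to repeated names.

    Hypothetical illustration: the first occurrence keeps its name, later
    occurrences get the lowest free numeric suffix.
    """
    reserved = set(names)  # names already present; never hand these out as new IDs
    used = set()
    result = []
    for name in names:
        candidate = name
        n = 2
        # Retry while the candidate is taken, or is a generated ID that
        # would collide with a name occurring elsewhere in the input.
        while candidate in used or (candidate != name and candidate in reserved):
            candidate = f"{name}.{n}"
            n += 1
        used.add(candidate)
        result.append(candidate)
    return result
```

For example, synchronize_ids(["MtMMazei", "MtMMazei", "EscColi"]) yields ["MtMMazei", "MtMMazei.2", "EscColi"].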

Note: restarting arb with the defective database detects the duplicate IDs and reports how the problem can be fixed:

Database is corrupted:
Found 4 duplicated species with identical names!
Fix the problem using
   'Search For Equal Fields and Mark Duplicates'
in ARB_NTREE search tool, save DB and restart ARB.
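
A check like the one reported at startup can be illustrated by counting how often each name occurs. This is a hedged sketch only: find_duplicate_names is a hypothetical helper, and how ARB itself arrives at the number in its message is not stated in this ticket.

```python
from collections import Counter


def find_duplicate_names(names):
    """Map each name that occurs more than once to its occurrence count."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}
```

An empty result means all species IDs are unique; any other result lists the names that need to be fixed before the database is consistent again.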

Attachments (3)

rna.fasta (6.8 KB) - added by westram 2 years ago.
dups.arb (12.5 KB) - added by westram 2 years ago.
resulting database (with duplicate IDs)
dups_nosync.arb (12.4 KB) - added by westram 2 years ago.
resulting DB if imported with "use found names"


Change History (4)

Changed 2 years ago by westram

  • Attachment rna.fasta added

Changed 2 years ago by westram

  • Attachment dups.arb added

resulting database (with duplicate IDs)

Changed 2 years ago by westram

  • Attachment dups_nosync.arb added

resulting DB if imported with "use found names"

comment:1 Changed 2 years ago by westram

  • Description modified (diff)
  • Priority changed from major to normal