Opened 4 years ago

Last modified 4 years ago

#812 assigned defect

synchronizing IDs may resurrect zombies in trees

Reported by: westram Owned by: westram
Priority: critical Milestone:
Component: ARB_NTREE Version: SVN
Keywords: Cc:

Description

Background:

After synchronizing IDs of a database with the arb nameserver, arb changes several references to these IDs:

  • leaves of all trees
  • members of configurations (editor configs; marked species)
  • members of saved colorsets
  • links to origin organism (in "gene species")

All these sets of IDs may contain outdated IDs which are no longer used by any species in the database.
This happens e.g. if species have been deleted from the database.
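A minimal sketch of what "outdated ID" means here, assuming a tree is reduced to a plain list of leaf IDs and the database to a set of species IDs. The function name `find_zombies` and the ID `othernam` are hypothetical and are not part of arb:

```python
def find_zombies(referenced_ids, species_ids):
    """Return the referenced IDs that no species in the database uses."""
    return sorted(set(referenced_ids) - set(species_ids))


# Hypothetical state after species S1 (ID "somename") was deleted.
species_ids = {"somenam7", "othernam"}
tree_etc_leaves = ["somename", "somenam7", "othernam"]

zombies = find_zombies(tree_etc_leaves, species_ids)
print(zombies)
```

The same check would apply to the other sets listed above (configurations, colorsets, origin links), since each of them is just a collection of IDs.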

The problem:

  • the database contained species S1 with ID somename
  • S1 was a member of tree_etc (i.e. one leaf of tree_etc contained somename)
  • S1 gets deleted from the database (leaving a zombie <somename> in tree_etc)
  • the nameserver loses its information about somename (see below for when that may happen)
  • now assume that during the next synchronization
    • the ID somename is regenerated and assigned to species S2
      ⇒ the zombie <somename> in tree_etc is resurrected and now points to species S2.
      or assume
    • species S2 has ID somenam7, is also a member of tree_etc, and its ID gets changed to somename
      ⇒ leaf somenam7 gets renamed to somename and the zombie leaf <somename> remains untouched
      ⇒ now two leaves of tree_etc point to S2 (= duplicate in tree)
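Both outcomes can be replayed with a toy model. This is only a sketch under simplifying assumptions: the database is a dict from ID to species, a tree is a list of leaf IDs, and the helpers `resolve` and `rename_leaves` as well as the species S3 with ID `othernam` are hypothetical names introduced for illustration:

```python
def resolve(leaves, database):
    """Pair each leaf ID with the species it points to (None marks a zombie)."""
    return [(leaf, database.get(leaf)) for leaf in leaves]


def rename_leaves(leaves, renames):
    """Apply old-ID -> new-ID renames to the leaves; unknown leaves stay as they are."""
    return [renames.get(leaf, leaf) for leaf in leaves]


# Case 1: the regenerated ID "somename" is assigned to species S2.
tree_case1 = ["somename", "othernam"]              # "somename" is a zombie leaf
before_case1 = resolve(tree_case1, {"othernam": "S3"})
after_case1 = resolve(tree_case1, {"othernam": "S3", "somename": "S2"})

# Case 2: S2 is already in the tree as "somenam7" and is renamed to "somename".
tree_case2 = ["somename", "somenam7"]              # zombie leaf plus the leaf of S2
renamed_case2 = rename_leaves(tree_case2, {"somenam7": "somename"})
after_case2 = resolve(renamed_case2, {"somename": "S2"})

print(before_case1)   # zombie leaf points to nothing
print(after_case1)    # zombie leaf now points to S2
print(after_case2)    # two leaves point to S2
```

In case 1 the tree itself is never modified; the zombie comes back to life only because the database starts using its ID again. In case 2 the rename touches just the leaf that belonged to S2, which is why the zombie leaf survives as a duplicate.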

Conditions triggering the problem:

Normally the nameserver knows about these formerly used IDs and prevents them from being reused.
The nameserver "forgets" these former IDs if you

  • reset the nameserver database,
  • begin or stop using an additional field (i.e. change the nameserver database),
  • give the database to another user (who uses a different nameserver database) or
  • move the database to another host (where you use a different nameserver database).
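Why a forgotten ID matters can be illustrated with a toy nameserver. This sketch does not reproduce the real arb nameserver: the class `ToyNameserver`, its `assign` method and the ID-generation rule (first eight alphanumeric characters of the full name, then a digit replacing the last character) are all assumptions made for illustration:

```python
class ToyNameserver:
    """Toy model: remembers every ID it ever handed out and never reuses one."""

    def __init__(self):
        self.known_ids = set()

    def assign(self, fullname):
        # Assumed rule: first 8 alphanumeric characters of the full name,
        # then variants where a digit replaces the last character.
        base = "".join(ch for ch in fullname.lower() if ch.isalnum())[:8]
        candidates = [base] + [base[:7] + str(n) for n in range(2, 10)]
        for candidate in candidates:
            if candidate not in self.known_ids:
                self.known_ids.add(candidate)
                return candidate
        raise RuntimeError("toy model ran out of candidate IDs")


server = ToyNameserver()
first = server.assign("Some name")    # ID for S1
# S1 is deleted from the database, but the nameserver still remembers its ID.
second = server.assign("Some name")   # ID for S2 while the nameserver remembers

reset_server = ToyNameserver()        # nameserver database reset or replaced
third = reset_server.assign("Some name")

print(first, second, third)
```

As long as the same nameserver database is used, S2 cannot receive the ID formerly held by S1. After a reset (or with a different nameserver database) the old ID is handed out again, which is the precondition for the problem described above.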

Even when the nameserver has forgotten such an ID, several other conditions have to be true to really trigger the problem:

  • you need zombies in trees (or in any of the other sets)
  • the fullnames of the species need to be similar and/or your database has to contain a huge number of species
    (otherwise the generated IDs won't be similar).

Change History (3)

comment:1 Changed 4 years ago by westram

  • Owner changed from devel to westram
  • Status changed from new to _started

comment:2 Changed 4 years ago by westram

Some history:

  • [4052] added detection of duplicates and zombies (run whenever a new tree is selected inside arb).
    That was when we first encountered a tree containing duplicate species. Since that database had passed through the hands of several novice arb users, we assumed somebody had simply broken it and did not suspect a general problem (see sentdate:[20060601 TO 20060631] AND subject:doppelte).

comment:3 Changed 4 years ago by westram

  • Status changed from _started to assigned