Opened 9 years ago

Closed 9 years ago

Last modified 3 years ago

#663 closed defect (fixed)

consensus calculation is defective/inconsistent

Reported by: westram
Owned by: westram
Priority: major
Milestone: arb7.0
Component: Library (other)
Version: SVN
Keywords:
Cc:

Description (last modified by westram)

Arb has two ways to calculate consensi:

  1. NTREE/SAI/Create SAI using …/Consensus
  2. Consensus displayed in EDIT4

The code is not shared, because the EDIT4 consensus calculation is interleaved with alignment compression (i.e. hiding gap columns) for better performance.
Results of both differ! :(
Reproduce:

  • load attached DB
  • start EDIT4 with config 'consensus' (contains all 41 marked species)
  • compare:
    • SAI:consensus (created with method 1.)
    • Consensus of group 'all' (method 2.)
    • ConsAll2 (same; converted to species; use with 'View/Show only differences to selected')

Consensus-settings used: gaps=off; iupac=on,26%; upper/lower=0/75

Current differences in detail:

Pos  SAI  Group  Col.content  Reason
12   M    Y      1x'Y'        'Y' is correct

Current differences with gaps=on:

Pos  SAI  Group  Col.content                Reason
115  g    s      14x'G'+10x'C'+3x'U'        C(37%)+G(51%)=S; 24/41=58%⇒lowercase
120  a    w      14x'A'+3x'C'+6x'G'+9x'U'   A(43%)+U(28%)=W; 23/41=56%⇒lowercase
153  G    g      23x'G'                     23/41=56%⇒lowercase
159  R    r      13x'G'+8x'A'+1x'U'+1x'C'   G(56%)+A(34%)=R; 21/41=51%⇒lowercase
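
The arithmetic in the 'Reason' columns can be replayed with a small sketch. This is not ARB code; it only encodes the rule as it reads from the tables, and all of the following are assumptions: the function name `consensus_char`, the fallback when no base reaches the IUPAC threshold, and the reading of the settings as "a base joins the IUPAC code at >= 26% of the non-gap bases" and "the result is uppercase at >= 75%". With gaps=off the case is decided relative to the non-gap bases in the column, with gaps=on relative to all 41 sequences. Ambiguity codes in the input (such as the single 'Y' at pos 12) are not handled.

```python
# IUPAC nucleotide ambiguity codes, keyed by the sorted set of bases (RNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "AC": "M", "AG": "R", "AU": "W", "CG": "S", "CU": "Y", "GU": "K",
    "ACG": "V", "ACU": "H", "AGU": "D", "CGU": "B", "ACGU": "N",
}


def consensus_char(counts, total_seqs, gaps_on, iupac_pct=26, upper_pct=75):
    """Consensus character for one column (hypothetical helper, not ARB's API).

    counts     -- dict base -> number of sequences showing that base
    total_seqs -- number of sequences in the group (gaps included)
    gaps_on    -- if True, the case is decided relative to total_seqs,
                  otherwise relative to the non-gap bases only
    """
    bases = sum(counts.values())
    if bases == 0:
        return "-"  # assumption: a column without bases yields a gap

    # Bases whose share of the non-gap content reaches the IUPAC threshold.
    chosen = sorted(b for b, n in counts.items() if 100 * n >= iupac_pct * bases)
    if not chosen:
        # Assumption of this sketch: fall back to the most frequent base.
        chosen = [max(counts, key=counts.get)]

    code = IUPAC["".join(chosen)]

    # Share covered by the chosen bases decides upper/lower case.
    covered = sum(counts[b] for b in chosen)
    denominator = total_seqs if gaps_on else bases
    is_upper = 100 * covered >= upper_pct * denominator
    return code if is_upper else code.lower()


# Column 115 from the tables: 14x'G' + 10x'C' + 3x'U' in 41 sequences.
col_115 = {"G": 14, "C": 10, "U": 3}
print(consensus_char(col_115, 41, gaps_on=False))  # 24 of 27 bases
print(consensus_char(col_115, 41, gaps_on=True))   # 24 of 41 sequences
```

Under these assumptions the sketch yields the values the 'Reason' columns argue for at positions 6, 21, 32, 115, 120, 153 and 159; where ARB's output differs, that is the inconsistency reported here.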

More facts:

  • if consensus SAI is calculated with 'gaps=on'
    • the differences listed above disappear (i.e. EDIT4 group consensus seems to use gaps=on implicitly)
    • starting with pos=154 several positions are uppercase in SAI and lowercase in EDIT4
  • make sure remark of CloTyro2 is ignored in method 1!

Differences before [14387]:

Pos  SAI  Group  Col.content                 Reason
6    G    g      14x'G'+3x'A'+1x'U'          G(77%) ⇒ 'G'
12   M    n      1x'Y'                       both wrong, should be 'Y'
21   M    m      2x'A'+1x'C'                 A(66%)+C(33%)=M
32   R    r      18x'A'+12x'G'+6x'C'+3x'U'   A(46%)+G(30%)=R; 46+30=76%⇒uppercase
69   G    g      2x'G'                       only 'G' ⇒ should be uppercase
115  S    s      14x'G'+10x'C'+3x'U'         C(37%)+G(51%)=S; 37+51=88%⇒uppercase

Differences before [14319]:

Pos  SAI  Group  Col.content  Reason
12   M    -      1x'Y'
21   M    -      2x'A'+1x'C'  A(66%)+C(33%)=M; gaps are off ⇒ '-' is wrong
69   G    -      2x'G'        gaps are off ⇒ '-' is wrong!

Differences before [14321]:

Pos  SAI  Group  Col.content                Reason
21   M    a      2x'A'+1x'C'                A(66%)+C(33%)=M
115  S    g      14x'G'+10x'C'+3x'U'        C(37%)+G(51%)=S; 37+51=88%⇒uppercase; gaps are off ⇒ 'g' is wrong!
120  w    a      14x'A'+3x'C'+6x'G'+9x'U'   A(43%)+U(28%)=W; 43+28=71%⇒lowercase

Amino acids

Consensus calculation for amino acids does completely different things in the two methods:

  • EDIT4 calculates simplified amino acid code (as currently shown in IUPAC info window; i.e. 'A' for any of "PAGST" etc.)
  • NTREE consensus calculation does not seem to handle any amino-IUPAC-codes

Common IUPAC codes for amino acids (both supported in ARB; but probably insufficient):

  • B = D | N
  • Z = E | Q
  • X = <any>

Some sources also mention J = L | I.
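
As an illustration of what amino-IUPAC handling could look like, here is a minimal sketch that maps the residues seen in a column onto the codes listed above. It is not taken from ARB; the names `AMINO_IUPAC` and `amino_consensus` are hypothetical, 'J' is included only because some sources mention it, and thresholds, case and gaps are ignored entirely.

```python
# Ambiguity codes from the list above, keyed by the set of residues they cover.
AMINO_IUPAC = {
    frozenset("DN"): "B",
    frozenset("EQ"): "Z",
    frozenset("LI"): "J",  # only mentioned by some sources
}


def amino_consensus(residues):
    """Return a one-letter code for the residues observed in one column."""
    observed = frozenset(residues)
    if len(observed) == 1:
        return next(iter(observed))  # unanimous column: keep the residue
    # Known two-residue ambiguity, otherwise 'X' (= any).
    return AMINO_IUPAC.get(observed, "X")


print(amino_consensus("DDN"))  # D and N only
print(amino_consensus("AD"))   # no specific code exists for this pair
```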

Attachments (1)

consensus.arb (138.7 KB) - added by westram 9 years ago.
demonstrate behavior


Change History (12)

Changed 9 years ago by westram

demonstrate behavior

comment:1 Changed 9 years ago by westram

  • Owner changed from devel to westram
  • Status changed from new to _started

comment:2 Changed 9 years ago by westram

  • Component changed from NoIdea to Library (other)
  • Description modified (diff)

comment:3 Changed 9 years ago by westram

  • Description modified (diff)

comment:4 Changed 9 years ago by westram

  • Description modified (diff)

comment:5 Changed 9 years ago by westram

  • Description modified (diff)

comment:6 Changed 9 years ago by westram

  • Description modified (diff)

comment:7 Changed 9 years ago by westram

  • Description modified (diff)

comment:8 Changed 9 years ago by westram

  • Description modified (diff)

comment:9 Changed 9 years ago by westram

  • Description modified (diff)

comment:10 Changed 9 years ago by westram

  • Resolution set to fixed
  • Status changed from _started to closed

with [14396] (in branch consensus)

comment:11 Changed 3 years ago by westram

  • Milestone changed from arb6.1 to arb7.0

Milestone renamed
