Bug when realigning nucleic acid according to aligned protein

In a protein database, when the nucleic acid sequence contains a codon such as "AA-" or "TGN", that codon is translated to an "X" in the amino acid (aa) sequence.
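To make the behaviour concrete, here is a minimal Python sketch of such a translation. It is not ARB's code: it assumes the standard genetic code, and it assumes that any codon containing a character other than A, C, G or T becomes "X" while an all-gap codon becomes "-". The names `CODON_TABLE`, `translate_codon` and `translate` are made up for this illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_codon(codon: str) -> str:
    """Translate one triplet; incomplete or ambiguous codons give 'X'."""
    if len(codon) != 3:
        raise ValueError("a codon must have exactly 3 characters")
    codon = codon.upper().replace("U", "T")
    # Assumption: a codon made only of gap characters is an alignment gap.
    if set(codon) <= set("-."):
        return "-"
    # Assumption: anything not in the table (N, partial gaps, ...) is 'X'.
    return CODON_TABLE.get(codon, "X")


def translate(nucleotides: str) -> str:
    """Translate a sequence read in frame 1; spaces are ignored."""
    seq = nucleotides.replace(" ", "")
    return "".join(translate_codon(seq[i:i + 3]) for i in range(0, len(seq), 3))
```

With this sketch, both `translate_codon("AA-")` and `translate_codon("TGN")` give "X", which is the situation described above.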

When I then try to realign the nucleotide sequence according to the protein alignment (even if I have only just translated the nucleic acid sequence to the aa sequence), the result is either "Not a codon (not all IUPAC-combinations of 'AAT' translate and Not all IUPAC-combiations of 'GAT' translate)", or "Error while reading 'transl_table' (Illegal (or unsuppported) value (0) in 'transl_table' (species=…))".

I know this is much easier in BioEdit, since it keeps nucleotide codons and aa residues synchronized: I can easily align, or change the alignment of, a nucleotide-based sequence in protein mode. In ARB, the realignment problem makes this impossible.

My suggestion for solving this problem: since ARB keeps nucleotides and amino acids in two different alignment fields, gaps, dots or 'N's in the nucleotide sequence may cause algorithmic problems. You could therefore offer an option to keep the codon reading frames unchanged. For most sequences the user has already corrected these Ns and gaps manually, so if only the protein alignment has changed and not the underlying sequences, the program could do the realignment without having to work out from scratch which nucleotides correspond to which residue.

For example: "AAA AAT AAN GGT GG- --- GGA GGG GG." (the user is sure the reading frame should be arranged like that) will be translated as "K N X G X - G G X". When the user changes the protein alignment (without changing the sequence itself), they should still be able to realign the nucleotides according to the protein sequence.
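The proposed option could look roughly like the following Python sketch. It is only an illustration of the idea, not how ARB works: the function names `split_codons` and `realign_by_protein` are invented here, and the sketch assumes the nucleotide sequence is already in frame (length a multiple of 3). It never retranslates anything; it simply hands the existing codons, in order, to the non-gap columns of the edited protein alignment.

```python
GAP_CHARS = "-."


def split_codons(nucleotides: str) -> list[str]:
    """Split an in-frame nucleotide string into triplets; spaces are ignored."""
    seq = nucleotides.replace(" ", "")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def realign_by_protein(nucleotides: str, aligned_protein: str) -> str:
    """Re-space the codons so that they follow the gaps of aligned_protein."""
    # All-gap codons are only alignment padding; codons such as 'AAN',
    # 'GG-' or 'GG.' still carry data and keep their reading frame.
    codons = [c for c in split_codons(nucleotides) if set(c) - set(GAP_CHARS)]
    residues = [ch for ch in aligned_protein if ch not in GAP_CHARS]
    if len(codons) != len(residues):
        raise ValueError(
            f"{len(codons)} codons but {len(residues)} residues: "
            "the sequence itself was changed, not just the alignment"
        )
    remaining = iter(codons)
    columns = []
    for ch in aligned_protein:
        # A protein gap becomes a gap triplet; a residue takes the next codon.
        columns.append(ch * 3 if ch in GAP_CHARS else next(remaining))
    return " ".join(columns)


if __name__ == "__main__":
    dna = "AAA AAT AAN GGT GG- --- GGA GGG GG."
    # Under the standard genetic code AAA is K and AAT is N.
    edited_protein = "KN-XGX--GGX"  # same residues, gaps moved by the user
    print(realign_by_protein(dna, edited_protein))
```

Because the residues are only counted, not compared against a translation table, codons that translate to "X" pose no problem in this approach. A real implementation would probably also want to verify that each residue is compatible with its codon.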

Best regards and thanks!

Yan Shi (syan@…)

