Opened 4 years ago

#585 new optimization

optimize database sequence compression

Reported by: westram
Owned by: devel
Priority: major
Milestone:
Component: Library (DB)
Version: SVN
Keywords:
Cc:

Description

Problems:

  1. when copying an alignment, master sequence (MS) compression is NOT copied ⇒ the destination alignment uses more space than the source alignment
  2. when sequence data gets modified, MS compression is dropped for the modified sequence

Solutions:

  1. copy the MS information when copying an alignment
  2. reuse the previously used MS when compressing a changed sequence
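
The two solutions can be sketched in a toy model. This is not the ARB database format or API: here "MS compression" is assumed to mean storing only the positions where a sequence differs from its master, and all names (`Alignment`, `Entry`, `copy_alignment`, `compress`, `decompress`) are hypothetical.

```python
from dataclasses import dataclass, field

def compress(seq, master):
    """Toy MS compression: keep only (position, char) where seq differs from master."""
    if len(seq) != len(master):
        raise ValueError("toy model assumes sequence and MS have equal length")
    return [(i, c) for i, (c, m) in enumerate(zip(seq, master)) if c != m]

def decompress(diff, master):
    """Rebuild the sequence by applying the stored differences to the master."""
    chars = list(master)
    for i, c in diff:
        chars[i] = c
    return "".join(chars)

@dataclass
class Entry:
    master_id: int  # which MS this entry was compressed against (assumed bookkeeping)
    diff: list      # differences to that MS

@dataclass
class Alignment:
    masters: dict = field(default_factory=dict)  # master_id -> MS string
    entries: dict = field(default_factory=dict)  # sequence name -> Entry

    def add(self, name, seq, master_id):
        self.entries[name] = Entry(master_id, compress(seq, self.masters[master_id]))

    def read(self, name):
        e = self.entries[name]
        return decompress(e.diff, self.masters[e.master_id])

    def write(self, name, seq):
        # Solution 2: re-compress the changed data against the MS used before,
        # instead of dropping MS compression for this entry.
        e = self.entries[name]
        e.diff = compress(seq, self.masters[e.master_id])

def copy_alignment(src):
    # Solution 1: copy the MS together with the compressed entries,
    # so the destination needs no more space than the source.
    dst = Alignment(masters=dict(src.masters))
    for name, e in src.entries.items():
        dst.entries[name] = Entry(e.master_id, list(e.diff))
    return dst
```

In this model a copied entry keeps its `master_id` and difference list unchanged, and a modified entry stays compressed against the same MS.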

Special case:

  • when inserting/deleting columns, the MS could be adapted (i.e. perform the same insert/delete on the MS as well), so sequences stay compressed against it
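
A minimal sketch of this special case, again in a toy model where a compressed sequence is a list of (position, char) differences to the MS. The functions `insert_column` and `delete_column` are hypothetical and do not exist in the ARB library; the point is only that the MS and the stored differences can be shifted together without decompressing anything.

```python
def decompress(diff, master):
    """Rebuild a sequence from its differences to the master (toy model)."""
    chars = list(master)
    for i, c in diff:
        chars[i] = c
    return "".join(chars)

def insert_column(master, diffs, pos, fill="-"):
    """Insert one column at pos into the MS and into every compressed sequence."""
    new_master = master[:pos] + fill + master[pos:]
    # The new column holds `fill` in the MS and in every sequence, so it adds
    # no difference; positions at or after pos move one column to the right.
    new_diffs = [[(i + 1 if i >= pos else i, c) for i, c in d] for d in diffs]
    return new_master, new_diffs

def delete_column(master, diffs, pos):
    """Delete the column at pos from the MS and from every compressed sequence."""
    new_master = master[:pos] + master[pos + 1:]
    # Differences in the deleted column vanish; later positions move left.
    new_diffs = [[(i - 1 if i > pos else i, c) for i, c in d if i != pos]
                 for d in diffs]
    return new_master, new_diffs
```

For example, with MS `"ACGU"` and a sequence `"ACCU"` stored as `[(2, "C")]`, inserting a column at position 1 gives MS `"A-CGU"` and differences `[(3, "C")]`, which decompress to `"A-CCU"`.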

Possible optimization:

  • currently optimize performs the following steps:
    1. re-compress sequences w/o using MS
    2. delete old and create new MS
    3. re-compress sequences using MS
  • step 1 could be skipped by keeping 2 sets of MS and 2 separate 'compression mode' flags (currently the flag is always GB_COMPRESSION_SEQUENCE). optimize would then proceed as follows:
    1. create new MS
    2. re-compress sequences, using the old MS to decompress and the new MS to compress
    3. delete old MS
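
The proposed three-step optimize can be sketched as follows. This is a toy model under stated assumptions, not ARB code: the two MS sets are modelled as slots `0` and `1`, the per-entry `"slot"` field stands in for the two 'compression mode' flags, and the new MS is built as a column-wise majority consensus, which is only a placeholder for however the MS is actually chosen.

```python
from collections import Counter

def compress(seq, master):
    """Toy MS compression: (position, char) wherever seq differs from master."""
    return [(i, c) for i, (c, m) in enumerate(zip(seq, master)) if c != m]

def decompress(diff, master):
    chars = list(master)
    for i, c in diff:
        chars[i] = c
    return "".join(chars)

def consensus(seqs):
    """Placeholder MS choice: most common character per column."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))

def optimize(masters, entries):
    """masters: {slot: MS string}; entries: list of {"slot": int, "diff": list}.

    Assumes all entries are currently compressed against the same slot.
    """
    old = entries[0]["slot"]
    new = 1 - old
    # 1. create new MS (sequences are read via the old MS, not re-stored)
    masters[new] = consensus([decompress(e["diff"], masters[old]) for e in entries])
    # 2. re-compress: old MS to decompress, new MS to compress
    for e in entries:
        seq = decompress(e["diff"], masters[old])
        e["diff"] = compress(seq, masters[new])
        e["slot"] = new  # entry now refers to the new MS set
    # 3. delete old MS
    del masters[old]
```

Because each entry records which slot it was compressed against, old and new MS can coexist during step 2, and no intermediate pass that stores sequences without MS compression is needed.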

Change History (0)