Last change on this file was 19532, checked in by westram, 4 months ago
- reintegrates 'help' into 'trunk'
  - tweak arb documentation:
    - automatically link
      - ticket references to arb tracker (only affects html version).
      - found URLs.
    - page titles
      - warn about long titles.
      - introduce SUBTITLEs (automatically triggered by multi-line titles in source files).
      - increase allowed length (limited by subwindow width).
    - cleanup header sections in all helpfiles.
    - fix and/or update several help files.
    - document syntax of help sources.
  - build issues:
    - when xml validation fails, next build no longer uses invalid xml ⇒ keeps failing.
    - remove output files on error (including files below ARBHOME/lib).
    - pipe output through logs to ensure proper wrapping in Entering/Leaving lines.
  - moves Tree admin + NDS menu entries to top of menu
- adds: log:branches/help@18783:19531
# main topics:
UP              arb.hlp
UP              glossary.hlp

#SUB            subtopic.hlp


TITLE           Optimize database compression

OCCURRENCE      ARB_NT

DESCRIPTION     Sequence data normally needs a lot of memory. To be able to
                handle thousands of sequences, we implemented online
                compression. All data is kept compressed most of the time
                and is only uncompressed on demand. As a user, all you will
                notice are smaller database files.
                Without understanding the data, the program can compress it
                only by a limited factor. With the help of a tree, aligned
                sequences can be compressed much better by storing only the
                differences to a consensus sequence.
                Once a sequence has been compressed using a tree, it keeps
                this better compression until the sequence is changed. After
                that, only the older, less efficient method is used for it.
                As long as you change only a few (up to 100) sequences, the
                database won't grow very much.

                To compress the entire database, the program needs a tree
                which should cover most of the sequences. The larger and
                better the tree, the better the compression.

EXAMPLE         10000 aligned 16S sequences need 50 megabytes of memory.
                Without your help ARB will reduce them to 10 megabytes,
                and given a tree, no more than 2 megabytes will be needed.


NOTES           Any major database update, especially inserting or deleting
                gaps in an alignment, should be followed by a new
                optimization step.

BUGS            No bugs known
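
The DESCRIPTION above says that aligned sequences compress better when only their differences to a consensus sequence are stored. The following Python sketch illustrates that idea and is not ARB's actual implementation. It assumes a single consensus over all sequences (ARB uses a tree to group related sequences, which this sketch leaves out), and the names `consensus`, `encode` and `decode`, as well as the toy sequences, are hypothetical.

```python
from collections import Counter


def consensus(seqs):
    """Return the most common character of every alignment column.

    Assumes all sequences are aligned, i.e. have the same length.
    """
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def encode(seq, cons):
    """Keep only the (position, character) pairs where seq differs from cons."""
    return [(i, c) for i, (c, k) in enumerate(zip(seq, cons)) if c != k]


def decode(diffs, cons):
    """Rebuild the original sequence from the consensus and its differences."""
    out = list(cons)
    for i, c in diffs:
        out[i] = c
    return "".join(out)


# Toy alignment (hypothetical data, not taken from any ARB database).
aligned = ["ACGU-ACGU", "ACGU-ACGA", "ACGUAACGU", "ACGU-ACGU"]
cons = consensus(aligned)
encoded = [encode(s, cons) for s in aligned]
restored = [decode(d, cons) for d in encoded]
```

Sequences identical to the consensus are stored as an empty difference list, and the others as a handful of positions. This matches the intuition in the text that a larger and better tree, which groups similar sequences together, leaves fewer differences to store.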