Last change on this file was 19532, checked in by westram, 5 weeks ago

- reintegrates 'help' into 'trunk'
  - tweak arb documentation:
    - automatically link
      - ticket references to arb tracker (only affects html version).
      - found URLs.
    - page titles
      - warn about long titles.
      - introduce SUBTITLEs (automatically triggered by multi-line titles in source files).
      - increase allowed length (limited by subwindow width).
    - cleanup header sections in all helpfiles.
    - fix and/or update several help files.
    - document syntax of help sources.
  - build issues:
    - when xml validation fails, next build no longer uses invalid xml ⇒ keeps failing.
    - remove output files on error (including files below ARBHOME/lib).
    - pipe output through logs to ensure proper wrapping in Entering/Leaving lines.
  - moves Tree admin + NDS menu entries to top of menu
- adds: log:branches/help@18783:19531

- Property svn:eol-style set to native
- Property svn:keywords set to Author Date Id Revision

File size: 1.4 KB

# main topics:
UP              arb.hlp
UP              glossary.hlp

#SUB            subtopic.hlp


TITLE           Optimize database compression

OCCURRENCE      ARB_NT

DESCRIPTION     Sequence data normally needs a lot of memory. To be able to
                handle thousands of sequences, we implemented an online
                compression. All data is compressed most of the time and only
                uncompressed on demand. As a user, you will only notice smaller
                database files.
                Without understanding the data, the program can compress it only
                by a limited factor. With the help of a tree, aligned sequences
                can be compressed much better by storing only the differences
                to a consensus sequence.
                Once a sequence has been compressed using a tree, it keeps
                the better compression method until it is changed. After a
                change, only the older method is used for that sequence.
                As long as you change only a few (up to 100) sequences, the
                database won't grow very much.

                To compress the entire database, the program needs a tree
                that should cover most of the sequences. The larger and better
                the tree, the better the compression.

EXAMPLE         10,000 aligned 16S sequences need 50 megabytes of memory.
                Without your help, ARB will reduce them to 10 megabytes;
                given a tree, no more than 2 megabytes will be needed.


NOTES           Any major database update, especially inserting or deleting
                gaps in an alignment, should be followed by a new optimization
                step.

BUGS            No bugs known
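
The idea from the DESCRIPTION of storing only the differences to a consensus sequence can be illustrated with a minimal Python sketch. This is not ARB's actual implementation: the function names (`consensus`, `encode`, `decode`) and the toy sequences are hypothetical, and the sketch uses one consensus for all sequences, whereas the help text implies that a tree is used to choose the consensus (presumably so that related sequences share one).

```python
from collections import Counter


def consensus(seqs):
    """Column-wise majority consensus of equal-length aligned sequences."""
    if not seqs:
        raise ValueError("need at least one sequence")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    # For every alignment column, keep the most frequent character.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def encode(seq, cons):
    """Keep only (position, character) pairs where seq differs from cons."""
    return [(i, c) for i, (c, k) in enumerate(zip(seq, cons)) if c != k]


def decode(diffs, cons):
    """Rebuild the original sequence from the consensus and its differences."""
    chars = list(cons)
    for i, c in diffs:
        chars[i] = c
    return "".join(chars)


# Toy alignment (hypothetical data, for illustration only).
aligned = [
    "ACGU-ACGUACGU",
    "ACGU-ACGAACGU",
    "ACGUUACGUACGU",
    "ACGU-ACGUACGA",
]

cons = consensus(aligned)
encoded = [encode(s, cons) for s in aligned]

# The encoding is lossless: every sequence can be restored exactly.
assert all(decode(d, cons) == s for d, s in zip(encoded, aligned))

raw = sum(len(s) for s in aligned)
stored = len(cons) + sum(len(d) for d in encoded)
print(f"raw characters: {raw}, consensus + differences: {stored}")
```

The more similar the sequences are to their consensus, the fewer differences have to be stored, which is consistent with the remark that a larger and better tree gives better compression.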