|
||
Space utilization is efficient
the underlying permanent file store adds minimal overhead to maintain its structure: store imposes a fixed 46 bytes + 2 bytes per additional 16K; each stream requires 7 bytes.
the database has minimal space requirements to store its schema and data structure.
including the stream overhead, row storage overhead can be less than 2 bytes per row in ideal clustering conditions, even with lower clustering the overhead is usually below 1% of the data volume.
The storage required for a row can be determined as follows:
each non-null value requires the storage for fixed width columns, and the raw storage plus 1 byte for variable width columns.
long columns are stored embedded in the row data when they are small enough, and otherwise they are stored in a separate stream: this makes more efficient use of the both space and speed. When embedded they require just 1 bit more than the short columns, when separated they require 8 bytes in the row data plus any stream overhead.
each nullable column requires an extra 1 bit.
bits are packed into bytes in the row storage.
Indexes are implemented using STORE B+trees. Note in particular that indexes use fixed length keys, so that indexes on longer text fields can consume significant space.
If the key for the index is k bytes, the number of rows to index is n, the index page size is P, and the B-tree packing density is r:
a = [(P-8)/(k+4)] * r
Where [x] is the largest integer <= x. Then the number of pages required, N, is
N = n * (1/a + 1/(a*a))
Each page requires P+7 bytes in the store, so the total indexing overhead, S, is
S = N * (P+7)
For DBMS P=512 and r=0.86.