Building Hash Table, Part 3

Memory management

It’s very important part of the system. Because the system relies on huge memory allocation, I decided to abandon garbage collection as much as possible. The key/value binary data resides in one of the following locations:

  • in-memory block, stored in continuous byte array; data is appended or modified in place; may be allocated; extraction returns just reference to it
  • file builder block; stored in continuous byte array; data is only archived; it may not be allocated; extraction returns just reference to it
  • file disk block, stored in pure file; data is dumped using the builder; it cannot be appended or modified; extraction returns transient read data
MemoryPool:
- block(index) -> MemoryBlock
- builder(index) -> FileBuilder
MemoryManager:
- allocate(size) -> Allocation
- extract(node) -> Tree | Value
- archive(node)
MemoryBlock:
- allocate(size) -> Allocation
- extract(node) -> Tree | Value
FileBuilder:
- allocate(size) -> Allocation
- extract(node) -> Tree | Value
- flush() -> FileBlock
FileBlock:
- extract(node) -> Tree | Value

Hashing

The core concept of hash table relies on hashing and buckets. The number of buckets is always a power of 2, therefore it is very important that the hashing equally distributes values. Additionally it brings one benefit that re-hashing may be done in place, because each bucket can only be split into two entries.

# Let's imaging 4 hashes:A: 0b.00011001.11101000.10010010.11101100 ->    434.672.364
B: 0b.10100101.10111011.10010010.11101100 -> -1.514.433.812
C: 0b.10001010.01110110.10010010.11101100 -> -1.971.940.628
D: 0b.11111100.10000110.10010010.11101100 -> -58.289.428
# 16 bits bucket size will put all hashes into one bucket10010010.11101100: A, B, C, D# Extending buckets into 17 bits, splits bucket into 2 buckets0.10010010.11101100: A, C, D
1.10010010.11101100: B
# Further extending into 18 bits, splits buckets as following00.10010010.11101100: A
11.10010010.11101100: B
10.10010010.11101100: C, D
# 19, 20 bits does not change distribution, but 21 bits does01000.10010010.11101100: A
11011.10010010.11101100: B
10110.10010010.11101100: C
00110.10010010.11101100: D

Trees

The hash entry is not implemented as linked list anymore. It became advanced structure containing information about it size, key data and value descriptors. Such structure is called a tree. If any key/value was accessed by get operation, it is marked as accessed. This information is stored per tree and is used in the archiving process.

Tree Node:
- Block # 2 bytes
- Offset # 4 bytes
- Accessed # 2 bytes
Tree Structure:
- n # 1 bytes, number of entries
- Lengths # n bytes, the length of each key
- Descriptors # n*8 bytes, describes each value
- Block # 2 bytes, block where value resides
- Offset # 4 bytes, offset of the value in the block
- Length # 2 bytes, the length of value in the block
- Keys # binary data of all n keys, one by one

Benchmark

One hour run generated following results. Each run had assigned expected memory usage. I measured how many rows I was able to insert, how much actual memory was used and how was the speed in thousands operations per second. The benchmark was running batches in which 1 million random rows were inserted/updated and 1 million rows were randomly accessed. The key size was 4 bytes.

memory | rows | used   | speed
---------------------------------
4096M | 325M | 5057M | 198 kop/s
8192M | 434M | 9153M | 246 kop/s
16384M | 523M | 17343M | 304 kop/s

Codebase

As always the code is available here.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Adrian Macal

Adrian Macal

Software Developer, Data Engineer with solid knowledge of Business Intelligence. Passionate about programming.