Building Hash Table, Part 3

Dealing with a large number of key/value pairs is not easy. First, huge data sets do not fit in memory; second, offloading to disk slows things down. This is already the third iteration of my playground. Let’s sum up what I have built so far.

Memory management

It is a very important part of the system. Because the system relies on huge memory allocations, I decided to avoid garbage collection as much as possible. The key/value binary data resides in one of the following locations:

  • in-memory block: stored in a contiguous byte array; data is appended or modified in place; it may be allocated; extraction returns just a reference to it
  • file builder block: stored in a contiguous byte array; data is only archived; it may not be allocated; extraction returns just a reference to it
  • file disk block: stored in a pure file; data is dumped using the builder; it cannot be appended or modified; extraction returns transient read data

All three locations are transparently managed by a high-level component which utilizes a memory pool to efficiently reuse already allocated memory.
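The pooling idea can be illustrated with a minimal sketch. The class and method names below are hypothetical, not taken from the original code; the point is only that released blocks go back to a free list instead of being reallocated.

```python
# Minimal sketch of a memory pool handing out fixed-size byte blocks
# and reusing released ones (hypothetical names, not the real API).

class BlockPool:
    def __init__(self, block_size: int):
        self.block_size = block_size
        self.free: list[bytearray] = []

    def rent(self) -> bytearray:
        # Reuse a previously released block when possible to avoid a new allocation.
        if self.free:
            return self.free.pop()
        return bytearray(self.block_size)

    def release(self, block: bytearray) -> None:
        self.free.append(block)

pool = BlockPool(block_size=1 << 20)  # 1 MiB blocks
a = pool.rent()
pool.release(a)
b = pool.rent()
assert a is b  # the released block was reused, not reallocated
```

This keeps large allocations alive across uses, which is exactly what reduces pressure on the garbage collector.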


The core concept of a hash table relies on hashing and buckets. The number of buckets is always a power of 2, so it is very important that the hash function distributes values evenly. The power-of-2 size also brings one benefit: re-hashing may be done in place, because each bucket can only be split into two.

The buckets are abstracted as a contiguous array. To extend it easily without reallocating memory, it is built out of smaller arrays of the same size. Each bucket entry is just a 4-byte number pointing at a tree node or at nothing (zero).
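A segmented directory like this can be sketched as follows; the segment size and class names are assumptions for illustration, not the real implementation:

```python
# Sketch of a bucket directory built from fixed-size segments, so it can
# grow without reallocating or copying existing entries (hypothetical names).
from array import array

SEGMENT_SIZE = 1024  # entries per segment (an assumed value)

class BucketDirectory:
    def __init__(self):
        self.segments: list[array] = []

    def grow(self, new_size: int) -> None:
        # Append zero-filled segments; existing segments stay where they are,
        # so no entry is ever copied during growth.
        while len(self.segments) * SEGMENT_SIZE < new_size:
            self.segments.append(array("I", [0] * SEGMENT_SIZE))

    def __getitem__(self, i: int) -> int:
        return self.segments[i // SEGMENT_SIZE][i % SEGMENT_SIZE]

    def __setitem__(self, i: int, node_offset: int) -> None:
        self.segments[i // SEGMENT_SIZE][i % SEGMENT_SIZE] = node_offset

buckets = BucketDirectory()
buckets.grow(2048)
buckets[1500] = 42  # entry points at a tree node; 0 means "nothing"
assert buckets[1500] == 42 and buckets[0] == 0
```

The `array("I", ...)` type code gives 4-byte unsigned entries on common platforms, matching the 4-byte bucket entries described above.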


The hash entry is no longer implemented as a linked list. It became a more advanced structure containing information about its size, key data and value descriptors. Such a structure is called a tree. If any key/value pair is accessed by a get operation, it is marked as accessed. This information is stored per tree and is used in the archiving process.

The available memory cannot hold all key/value pairs, so some data must be offloaded to disk when memory is exhausted. This process is called archiving and is triggered when the memory used by blocks exceeds a defined threshold. When it happens, the system tries to get rid of 20% of the data in the following order: non-used values, used values, non-used trees, used trees.
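The four-stage preference order can be expressed as a sort key. This is a sketch of the ordering only, with invented field names; the real archiver works on binary blocks, not dictionaries:

```python
# Sketch of the described eviction order: values before whole trees,
# and non-used (never read) items before used ones within each group.

def eviction_order(items):
    def priority(item):
        # Tuples sort lexicographically: False < True, so
        # (value, non-used) -> (value, used) -> (tree, non-used) -> (tree, used).
        return (item["kind"] != "value", item["accessed"])
    return sorted(items, key=priority)

items = [
    {"id": 1, "kind": "value", "accessed": True},
    {"id": 2, "kind": "tree",  "accessed": False},
    {"id": 3, "kind": "value", "accessed": False},
    {"id": 4, "kind": "tree",  "accessed": True},
]
assert [i["id"] for i in eviction_order(items)] == [3, 1, 2, 4]
```

Archiving would then walk this ordering and stop once roughly 20% of the occupied memory has been freed.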

Once archiving is finished, the memory should be rewritten to reclaim wasted space. This requires scanning all trees and moving affected trees/values from the reclaimed block to other blocks.
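A compaction pass of this kind can be sketched as copying live entries out of the reclaimed block and repointing their descriptors. All names and the dictionary representation are assumptions for illustration:

```python
# Sketch of reclaiming a block: live data from the reclaimed block is
# appended to another block and each descriptor is repointed (hypothetical).

def compact(reclaimed: dict, target: dict, descriptors: list) -> None:
    for d in descriptors:
        if d["block"] is reclaimed:
            data = reclaimed["data"][d["offset"]:d["offset"] + d["length"]]
            d["offset"] = len(target["data"])  # new position in target block
            target["data"] += data
            d["block"] = target
    reclaimed["data"] = bytearray()  # the whole block returns to the pool

src = {"data": bytearray(b"aaabbb")}
dst = {"data": bytearray()}
descs = [{"block": src, "offset": 3, "length": 3}]  # only "bbb" is still live
compact(src, dst, descs)
assert bytes(dst["data"]) == b"bbb" and descs[0]["offset"] == 0
```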


A one-hour run generated the following results. Each run had an assigned expected memory usage. I measured how many rows I was able to insert, how much memory was actually used, and the speed in thousands of operations per second. The benchmark ran batches in which 1 million random rows were inserted/updated and 1 million rows were randomly accessed. The key size was 4 bytes.


As always, the code is available here.

