Data Structure: Hash Table

A hash table is a data structure that implements an associative array, a structure that maps keys to values. A hash function converts each key into an index into an array of buckets, which lets insertion, deletion, and lookup run in constant time, O(1), on average. In the worst case, when many keys collide, these operations degrade to O(n). Hash tables are the usual implementation behind associative containers in many programming languages, such as std::unordered_map and std::unordered_set in C++.

Key Concepts of Hash Tables
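
Three ideas recur throughout this page: a hash function that turns a key into an integer, a bucket index obtained by reducing that integer modulo the table size, and a collision, which occurs when two different keys land on the same index. The sketch below illustrates them with the same additive hash used in the examples that follow. The names bucketIndex and collides are hypothetical helpers introduced only for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: additive hash (sum of character codes) reduced to a bucket index.
// bucketCount must be greater than zero.
std::size_t bucketIndex(const std::string& key, std::size_t bucketCount) {
    std::size_t sum = 0;
    for (unsigned char ch : key) {
        sum += ch;
    }
    return sum % bucketCount;
}

// Hypothetical helper: two distinct keys collide when they map to the same bucket.
bool collides(const std::string& a, const std::string& b, std::size_t bucketCount) {
    return a != b && bucketIndex(a, bucketCount) == bucketIndex(b, bucketCount);
}
```

Because addition ignores character order, anagrams always collide under this hash: collides("listen", "silent", 10) is true, and it stays true for any table size. This is one reason practical hash functions mix the bits of the key more thoroughly.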

Basic Operations in a Hash Table
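
The three basic operations are insertion, lookup, and deletion. As a minimal sketch, the function below exercises each of them on std::unordered_map, the standard C++ container mentioned in the introduction. The function name basicOperationsDemo and the sample keys are assumptions made for illustration; insert_or_assign requires C++17.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical demo: runs insert, lookup, and delete, then returns what is left.
std::unordered_map<std::string, int> basicOperationsDemo() {
    std::unordered_map<std::string, int> prices;

    // Insertion: insert() leaves an existing key untouched,
    // while operator[] and insert_or_assign() overwrite it.
    prices.insert({"apple", 100});
    prices["banana"] = 150;
    prices.insert_or_assign("orange", 200);

    // Lookup: find() returns end() when the key is absent.
    auto it = prices.find("banana");
    if (it != prices.end()) {
        it->second += 10;  // update the stored value in place
    }

    // Deletion: erase() by key returns the number of elements removed.
    prices.erase("apple");

    return prices;
}
```

After the call, the returned map holds "banana" with value 160 and "orange" with value 200; "apple" has been erased.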

Example 1: Conceptual Implementation of a Hash Table

Let's explore a simple, conceptual implementation of a hash table in C++ using chaining for collision resolution.

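#include <cstddef>
#include <iostream>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Simple additive hash: sums the character codes, then reduces modulo the table size.
// Easy to follow, but weak: keys with the same letters in a different order collide.
std::size_t hashFunction(const std::string& key, std::size_t tableSize) {
    std::size_t hashValue = 0;
    for (unsigned char ch : key) {  // unsigned char keeps non-ASCII bytes non-negative
        hashValue += ch;
    }
    return hashValue % tableSize;
}

class HashTable {
public:
    // size must be greater than zero, since it is used as the modulus
    explicit HashTable(std::size_t size) : table(size) {}

    // Inserts a new pair, or overwrites the value if the key is already present
    void insert(const std::string& key, int value) {
        auto& bucket = table[hashFunction(key, table.size())];
        for (auto& [k, v] : bucket) {  // structured bindings require C++17
            if (k == key) {
                v = value;
                return;
            }
        }
        bucket.emplace_back(key, value);
    }

    // Returns true and writes the stored value to `value` if the key is found
    bool lookup(const std::string& key, int& value) const {
        const auto& bucket = table[hashFunction(key, table.size())];
        for (const auto& [k, v] : bucket) {
            if (k == key) {
                value = v;
                return true;
            }
        }
        return false;
    }

    // Removes the key if present; does nothing otherwise
    void remove(const std::string& key) {
        auto& bucket = table[hashFunction(key, table.size())];
        bucket.remove_if([&](const std::pair<std::string, int>& item) {
            return item.first == key;
        });
    }

private:
    // One list of key-value pairs per bucket (chaining)
    std::vector<std::list<std::pair<std::string, int>>> table;
};

int main() {
    HashTable ht(10);

    // Insert key-value pairs
    ht.insert("apple", 100);
    ht.insert("banana", 150);
    ht.insert("orange", 200);

    // Lookup values
    int value = 0;
    if (ht.lookup("banana", value)) {
        std::cout << "Value for 'banana': " << value << std::endl;
    } else {
        std::cout << "'banana' not found." << std::endl;
    }

    // Remove a key-value pair
    ht.remove("banana");
    if (ht.lookup("banana", value)) {
        std::cout << "Value for 'banana': " << value << std::endl;
    } else {
        std::cout << "'banana' not found." << std::endl;
    }

    return 0;
}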

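Explanation:

- hashFunction sums the character codes of the key and reduces the total modulo the table size to obtain a bucket index.
- The table is a std::vector of buckets, and each bucket is a std::list of key-value pairs, so keys that hash to the same index share a bucket (chaining).
- insert places the pair in the bucket for its key; lookup and remove compute the same index and scan only that bucket's list.
- The program prints the value 150 for 'banana', then reports that 'banana' is not found once it has been removed.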

Collision Resolution Techniques

There are two primary methods to handle collisions in a hash table:

- Chaining: Each bucket in the hash table contains a list of entries that hash to the same index. This method is simple and easy to implement.
- Open Addressing: When a collision occurs, the hash table probes the array to find the next available bucket. Techniques include linear probing, quadratic probing, and double hashing.
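
The three open-addressing techniques named above differ only in which slot they try on the i-th attempt. The sketch below writes out the common textbook form of each probe sequence. The function names are hypothetical; h is the key's home index, m is the table size, and step stands for the value of a second hash function, assumed here to lie between 1 and m - 1.

```cpp
#include <cstddef>

// Linear probing: step forward one slot per attempt.
std::size_t linearProbe(std::size_t h, std::size_t i, std::size_t m) {
    return (h + i) % m;
}

// Quadratic probing: the offset grows with the square of the attempt number.
std::size_t quadraticProbe(std::size_t h, std::size_t i, std::size_t m) {
    return (h + i * i) % m;
}

// Double hashing: the stride comes from a second hash of the key (passed in as step).
std::size_t doubleHashProbe(std::size_t h, std::size_t step, std::size_t i, std::size_t m) {
    return (h + i * step) % m;
}
```

With h = 8 and m = 10, the first four slots tried are 8, 9, 0, 1 for linear probing, 8, 9, 2, 7 for quadratic probing, and 8, 1, 4, 7 for double hashing with step = 3. Double hashing reaches every slot only when step and m share no common factor, which is why a prime table size is a common choice.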

Example 2: Open Addressing with Linear Probing

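#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

class HashTable {
    // A slot is never-used (Empty), in use (Occupied), or a tombstone (Deleted)
    enum class State { Empty, Occupied, Deleted };

public:
    // size must be greater than zero, since it is used as the modulus
    explicit HashTable(std::size_t size)
        : table(size), values(size, 0), states(size, State::Empty) {}

    // Inserts or updates a key. Returns false if the table is full.
    bool insert(const std::string& key, int value) {
        const std::size_t size = table.size();
        std::size_t index = hashFunction(key);
        std::size_t target = size;  // first reusable slot seen; size means "none yet"
        for (std::size_t i = 0; i < size; ++i) {
            if (states[index] == State::Occupied) {
                if (table[index] == key) {
                    values[index] = value;  // key already present: overwrite
                    return true;
                }
            } else {
                if (target == size) {
                    target = index;
                }
                if (states[index] == State::Empty) {
                    break;  // a never-used slot ends the probe sequence
                }
            }
            index = (index + 1) % size;  // Linear probing
        }
        if (target == size) {
            return false;  // every slot is occupied
        }
        table[target] = key;
        values[target] = value;
        states[target] = State::Occupied;
        return true;
    }

    // Returns true and writes the stored value to `value` if the key is found
    bool lookup(const std::string& key, int& value) const {
        const std::size_t index = find(key);
        if (index == table.size()) {
            return false;
        }
        value = values[index];
        return true;
    }

    // Removes the key if present, leaving a tombstone so later keys stay reachable
    void remove(const std::string& key) {
        const std::size_t index = find(key);
        if (index != table.size()) {
            table[index].clear();
            states[index] = State::Deleted;
        }
    }

private:
    std::size_t hashFunction(const std::string& key) const {
        std::size_t hashValue = 0;
        for (unsigned char ch : key) {
            hashValue += ch;
        }
        return hashValue % table.size();
    }

    // Returns the slot holding key, or table.size() if the key is absent.
    // The loop bound guarantees termination even when the table is full.
    std::size_t find(const std::string& key) const {
        const std::size_t size = table.size();
        std::size_t index = hashFunction(key);
        for (std::size_t i = 0; i < size; ++i) {
            if (states[index] == State::Empty) {
                break;
            }
            if (states[index] == State::Occupied && table[index] == key) {
                return index;
            }
            index = (index + 1) % size;  // Linear probing
        }
        return size;
    }

    std::vector<std::string> table;  // keys
    std::vector<int> values;
    std::vector<State> states;
};

int main() {
    HashTable ht(10);

    ht.insert("apple", 100);
    ht.insert("banana", 150);
    ht.insert("orange", 200);

    int value = 0;
    if (ht.lookup("banana", value)) {
        std::cout << "Value for 'banana': " << value << std::endl;
    }

    ht.remove("banana");
    if (!ht.lookup("banana", value)) {
        std::cout << "'banana' not found." << std::endl;
    }

    return 0;
}

Explanation:

- Keys and values are stored directly in parallel arrays; there are no per-bucket lists.
- insert starts at the key's hash index and steps forward one slot at a time, wrapping around at the end of the array, until it finds the key or a free slot (linear probing).
- lookup follows the same probe sequence and stops at the key or at a never-used slot.
- remove marks the slot as Deleted instead of resetting it to empty. Emptying the slot would cut the probe sequence and make keys stored after it unreachable.
- Every probe loop is bounded by the table size, so operations terminate even when the table is full.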

Advantages and Disadvantages of Hash Tables

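Advantages:

- Insertion, deletion, and lookup run in O(1) time on average.
- Keys can be of any type for which a hash function and an equality comparison are defined.

Disadvantages:

- Operations degrade to O(n) in the worst case, when many keys collide in the same bucket or probe sequence.
- Performance depends on the quality of the hash function and on keeping the table from becoming too full.
- Elements are not kept in sorted key order, so ordered traversal and range queries are not supported directly.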

Practical Applications

Hash tables are widely used in various real-world applications:

- Associative Containers: Implementing maps, dictionaries, and sets in programming languages.
- Database Indexing: Storing indexes for quick access to database records.
- Caching: Implementing caches where key-value pairs are stored for fast retrieval.
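
To make the caching use case concrete, here is a minimal sketch of memoization, a simple form of caching in which previously computed results are stored in a hash table keyed by the input. The function name fibCached and the choice of Fibonacci numbers are assumptions made for illustration only.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical example: Fibonacci with results cached in a hash table.
// Each value is computed once; later requests are served by a lookup.
std::uint64_t fibCached(unsigned n, std::unordered_map<unsigned, std::uint64_t>& cache) {
    if (n < 2) {
        return n;
    }
    auto it = cache.find(n);
    if (it != cache.end()) {
        return it->second;  // cache hit
    }
    const std::uint64_t result = fibCached(n - 1, cache) + fibCached(n - 2, cache);
    cache.emplace(n, result);  // cache miss: store the result for next time
    return result;
}
```

Without the cache, the naive recursion recomputes the same values an exponential number of times; with it, each n is computed once and every repeat request is an average O(1) lookup.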

Summary

Hash tables are a fundamental data structure in computer science, offering a highly efficient way to manage and access data based on keys. Understanding hash tables and their underlying principles is crucial for both theoretical and practical applications in programming.