std::unordered_set


STL container: std::unordered_set

std::unordered_set is a container in the C++ Standard Template Library (STL) that stores unique elements in no particular order. It is implemented as a hash table, which gives insertion, deletion, and search operations constant time complexity on average. In the worst case, when many elements hash to the same bucket, these operations are linear in the size of the container.

Key Characteristics

Example 1: Basic Usage

#include <iostream>
#include <unordered_set>
#include <string>

int main() {
    std::unordered_set<std::string> fruits;

    // Inserting elements
    fruits.insert("apple");
    fruits.insert("banana");
    fruits.insert("cherry");
    fruits.insert("apple"); // Duplicate, won't be inserted

    // Printing the set
    std::cout << "Fruits in the set:" << std::endl;
    for (const auto& fruit : fruits) {
        std::cout << fruit << std::endl;
    }

    // Checking if an element exists
    if (fruits.find("banana") != fruits.end()) {
        std::cout << "Banana is in the set" << std::endl;
    }

    // Size of the set
    std::cout << "Number of fruits: " << fruits.size() << std::endl;

    // Removing an element
    fruits.erase("cherry");

    return 0;
}

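Explanation:

  1. insert() adds an element only if no equal element is already present, so the second insert of "apple" leaves the set unchanged and size() reports 3.

  2. The range-based for loop visits every element, but the iteration order is unspecified and may differ between standard library implementations.

  3. find() returns an iterator to the matching element, or end() if there is none.

  4. erase("cherry") removes the element with that value if it is present.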

Example 2: Custom Type with Custom Hash Function

#include <iostream>
#include <unordered_set>
#include <string>
#include <functional> // std::hash

struct Person {
    std::string name;
    int age;

    bool operator==(const Person& other) const {
        return name == other.name && age == other.age;
    }
};

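// Custom hash function for Person.
// A plain XOR of the two field hashes is symmetric and lets equal hash
// values cancel out, so the second hash is mixed in with a shift-and-add step.
struct PersonHash {
    std::size_t operator()(const Person& p) const {
        std::size_t h = std::hash<std::string>()(p.name);
        h ^= std::hash<int>()(p.age) + 0x9e3779b9 + (h << 6) + (h >> 2);
        return h;
    }
};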

int main() {
    std::unordered_set<Person, PersonHash> people;

    people.insert({"Alice", 30});
    people.insert({"Bob", 25});
    people.insert({"Charlie", 35});
    people.insert({"Alice", 30}); // Duplicate, won't be inserted

    for (const auto& person : people) {
        std::cout << person.name << ": " << person.age << std::endl;
    }

    return 0;
}

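Explanation:

  1. Person defines operator== so the container can decide whether two elements are equal; the default std::equal_to<Person> uses it.

  2. PersonHash is passed as the second template argument and combines the hashes of name and age into a single std::size_t.

  3. The second insert of {"Alice", 30} compares equal to an existing element, so the set ends up with three people.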

Example 3: Performance Comparison with std::set

#include <iostream>
#include <unordered_set>
#include <set>
#include <chrono>
#include <random>

// Function to measure execution time
template<typename Func>
long long measureTime(Func func) {
    auto start = std::chrono::high_resolution_clock::now();
    func();
    auto end = std::chrono::high_resolution_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
}

int main() {
    const int NUM_ELEMENTS = 1000000;
    const int NUM_SEARCHES = 1000000;

    std::unordered_set<int> uset;
    std::set<int> set;

    // Random number generation
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<> dis(1, NUM_ELEMENTS);

    // Insertion
    auto insertUnordered = [&]() {
        for (int i = 0; i < NUM_ELEMENTS; ++i) {
            uset.insert(dis(gen));
        }
    };
    auto insertOrdered = [&]() {
        for (int i = 0; i < NUM_ELEMENTS; ++i) {
            set.insert(dis(gen));
        }
    };

    std::cout << "Insertion time (microseconds):" << std::endl;
    std::cout << "unordered_set: " << measureTime(insertUnordered) << std::endl;
    std::cout << "set: " << measureTime(insertOrdered) << std::endl;

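    // Search (count the hits so the lookups have an observable result
    // and the compiler cannot discard them)
    long long hitsUnordered = 0;
    long long hitsOrdered = 0;
    auto searchUnordered = [&]() {
        for (int i = 0; i < NUM_SEARCHES; ++i) {
            if (uset.find(dis(gen)) != uset.end()) ++hitsUnordered;
        }
    };
    auto searchOrdered = [&]() {
        for (int i = 0; i < NUM_SEARCHES; ++i) {
            if (set.find(dis(gen)) != set.end()) ++hitsOrdered;
        }
    };

    // Run the timed searches before printing so the hit counts are final
    const long long timeUnordered = measureTime(searchUnordered);
    const long long timeOrdered = measureTime(searchOrdered);

    std::cout << "Search time (microseconds):" << std::endl;
    std::cout << "unordered_set: " << timeUnordered << " (" << hitsUnordered << " hits)" << std::endl;
    std::cout << "set: " << timeOrdered << " (" << hitsOrdered << " hits)" << std::endl;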

    return 0;
}

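Explanation:

  1. measureTime() runs a callable once and returns the elapsed time in microseconds.

  2. Each container receives one million random insertions followed by one million random lookups. Because the values are drawn from the range 1 to NUM_ELEMENTS, many insertions are duplicates and are rejected.

  3. unordered_set operations take constant time on average, while std::set operations take logarithmic time, so the hash-based container is usually faster in this test. The actual numbers depend on the compiler, the optimization level, and the hardware, so build with optimizations enabled before comparing.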

Example 4: Bucket Interface

#include <iostream>
#include <unordered_set>
#include <string>

int main() {
    std::unordered_set<std::string> words = {
        "apple", "banana", "cherry", "date", "elderberry",
        "fig", "grape", "honeydew", "imbe", "jackfruit"
    };

    // Print bucket information
    std::cout << "Bucket count: " << words.bucket_count() << std::endl;
    std::cout << "Max bucket count: " << words.max_bucket_count() << std::endl;
    std::cout << "Load factor: " << words.load_factor() << std::endl;
    std::cout << "Max load factor: " << words.max_load_factor() << std::endl;

    // Print contents of each bucket
    for (size_t i = 0; i < words.bucket_count(); ++i) {
        std::cout << "Bucket " << i << " contains:";
        for (auto it = words.begin(i); it != words.end(i); ++it) {
            std::cout << " " << *it;
        }
        std::cout << std::endl;
    }

    // Find which bucket an element is in
    std::string search = "grape";
    size_t bucket = words.bucket(search);
    std::cout << "'" << search << "' is in bucket " << bucket << std::endl;

    // Rehash the container
    std::cout << "\nRehashing to 20 buckets..." << std::endl;
    words.rehash(20);
    std::cout << "New bucket count: " << words.bucket_count() << std::endl;

    return 0;
}

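Explanation:

  1. bucket_count() is the current number of buckets, load_factor() is size() divided by bucket_count(), and max_load_factor() is the threshold (1.0 by default) above which the container grows its bucket array.

  2. begin(i) and end(i) return local iterators over the elements stored in bucket i.

  3. bucket(key) returns the index of the bucket in which key is, or would be, stored.

  4. rehash(20) sets the bucket count to at least 20. The implementation may choose a larger number, so the printed value can exceed 20.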

Additional Considerations

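  1. Iterator Stability: References and pointers to elements remain valid until the element itself is erased. Iterators are invalidated when an insertion triggers a rehash; erasing an element invalidates only the iterators and references to that element.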

  2. Rehashing: When the load factor exceeds the max load factor, the container automatically increases the number of buckets and rehashes the elements.

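  3. Custom Types: When using custom types, you need to provide a hash function (a functor passed as the second template argument, or a specialization of std::hash) and an equality comparison (operator== or a predicate passed as the third template argument).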

  4. No Duplicates: unordered_set automatically handles duplicate elements by not inserting them.

  5. Memory Usage: unordered_set allocates a node per element plus a bucket array, so its per-element memory overhead can be significant. How it compares with set depends on the implementation and the load factor.

Summary

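std::unordered_set is a powerful container in C++ for storing unique elements with fast access times. Key points to remember:

  - Elements are unique and stored in no particular order.
  - Insertion, deletion, and lookup take constant time on average.
  - Custom element types need a hash function and an equality comparison.
  - The bucket interface exposes the load factor and allows manual rehashing.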

unordered_set is ideal for scenarios where fast lookup and uniqueness of elements are required, and the order of elements is not important. It's commonly used in situations like:

  - Removing duplicates from a collection
  - Fast membership testing
  - Implementing certain algorithms that require quick element access and uniqueness

Understanding when to use std::unordered_set versus other containers like std::set or std::vector is crucial for writing efficient and clear C++ code in various application domains. Its constant-time average complexity for key operations makes it a go-to choice for many performance-critical applications where set operations are frequently performed.
