unordered_set

Header: `<unordered_set>`

std::unordered_set is a container in the C++ Standard Library (often loosely called the STL) that stores unique elements in no particular order. It is implemented as a hash table, which gives average constant-time insertion, deletion, and lookup. In the worst case, when many elements land in the same bucket, these operations are linear in the size of the container.

Key Characteristics

Stores unique elements
Elements are not ordered
Fast insertion, deletion, and search operations (average O(1), worst case O(n))
Uses a hash function to map elements to buckets
Allows for custom hash functions and equality comparators
Does not allow in-place modification of elements (they are const, to preserve hash integrity); to change an element, erase it and insert the new value
Part of the C++11 standard and later

Example 1: Basic Usage

#include <iostream>
#include <unordered_set>
#include <string>

int main() {
    std::unordered_set<std::string> fruits;

    // Inserting elements
    fruits.insert("apple");
    fruits.insert("banana");
    fruits.insert("cherry");
    fruits.insert("apple"); // Duplicate, won't be inserted

    // Printing the set
    std::cout << "Fruits in the set:" << std::endl;
    for (const auto& fruit : fruits) {
        std::cout << fruit << std::endl;
    }

    // Checking if an element exists
    if (fruits.find("banana") != fruits.end()) {
        std::cout << "Banana is in the set" << std::endl;
    }

    // Size of the set
    std::cout << "Number of fruits: " << fruits.size() << std::endl;

    return 0;
}

Explanation:

An unordered_set of strings is created to store fruit names
Elements are inserted using the insert() function
Duplicate insertions are ignored (e.g., "apple")
The set is iterated using a range-based for loop
The find() function is used to check for the existence of an element
The size() function returns the number of elements in the set

Example 2: Custom Type with Custom Hash Function

#include <iostream>
#include <unordered_set>
#include <string>

struct Person {
    std::string name;
    int age;

    bool operator==(const Person& other) const {
        return name == other.name && age == other.age;
    }
};

// Custom hash function for Person
struct PersonHash {
    std::size_t operator()(const Person& p) const {
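        // Combine the two hashes. A plain XOR also works but collides more
        // easily, so mix in a constant and shifted copies of the first hash.
        std::size_t h1 = std::hash<std::string>()(p.name);
        std::size_t h2 = std::hash<int>()(p.age);
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));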
    }
};

int main() {
    std::unordered_set<Person, PersonHash> people;

    people.insert({"Alice", 30});
    people.insert({"Bob", 25});
    people.insert({"Charlie", 35});
    people.insert({"Alice", 30}); // Duplicate, won't be inserted

    for (const auto& person : people) {
        std::cout << person.name << ": " << person.age << std::endl;
    }

    return 0;
}

Explanation:

A custom Person struct is defined with name and age
A custom operator== is defined for Person to check equality
A custom hash function PersonHash is created for Person objects
An unordered_set of Person objects is created, using PersonHash
The set uses PersonHash to choose a bucket for each Person and operator== to detect duplicates within that bucket

Example 3: Performance Comparison with `std::set`

#include <iostream>
#include <unordered_set>
#include <set>
#include <chrono>
#include <random>

// Function to measure execution time in microseconds.
// steady_clock is monotonic, so system clock adjustments do not affect it.
template<typename Func>
long long measureTime(Func func) {
    auto start = std::chrono::steady_clock::now();
    func();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
}

int main() {
    const int NUM_ELEMENTS = 1000000;
    const int NUM_SEARCHES = 10000;

    std::unordered_set<int> uset;
    std::set<int> set;

    // Random number generation
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<> dis(1, NUM_ELEMENTS);

    // Insertion
    auto insertUnordered = [&]() {
        for (int i = 0; i < NUM_ELEMENTS; ++i) {
            uset.insert(dis(gen));
        }
    };
    auto insertOrdered = [&]() {
        for (int i = 0; i < NUM_ELEMENTS; ++i) {
            set.insert(dis(gen));
        }
    };

    std::cout << "Insertion time (microseconds):" << std::endl;
    std::cout << "unordered_set: " << measureTime(insertUnordered) << std::endl;
    std::cout << "set: " << measureTime(insertOrdered) << std::endl;

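    // Search (count the hits so the compiler cannot discard the lookups)
    std::size_t hitsUnordered = 0;
    std::size_t hitsOrdered = 0;
    auto searchUnordered = [&]() {
        for (int i = 0; i < NUM_SEARCHES; ++i) {
            if (uset.find(dis(gen)) != uset.end()) ++hitsUnordered;
        }
    };
    auto searchOrdered = [&]() {
        for (int i = 0; i < NUM_SEARCHES; ++i) {
            if (set.find(dis(gen)) != set.end()) ++hitsOrdered;
        }
    };

    std::cout << "Search time (microseconds):" << std::endl;
    std::cout << "unordered_set: " << measureTime(searchUnordered) << std::endl;
    std::cout << "set: " << measureTime(searchOrdered) << std::endl;
    std::cout << "Hits: " << hitsUnordered << " / " << hitsOrdered << std::endl;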

    return 0;
}

Explanation:

This example compares the performance of std::unordered_set and std::set
It measures the time taken for insertion and search operations
A large number of random integers are inserted into both containers
The measureTime function is used to calculate execution time
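std::unordered_set typically shows faster insertion and search times because hash lookups are O(1) on average, while the tree-based std::set needs O(log n); actual numbers depend on the compiler, optimization flags, hash quality, and the data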

Example 4: Bucket Interface

#include <iostream>
#include <unordered_set>
#include <string>

int main() {
    std::unordered_set<std::string> words = {
        "apple", "banana", "cherry", "date", "elderberry",
        "fig", "grape", "honeydew", "imbe", "jackfruit"
    };

    // Print bucket information
    std::cout << "Bucket count: " << words.bucket_count() << std::endl;
    std::cout << "Max bucket count: " << words.max_bucket_count() << std::endl;
    std::cout << "Load factor: " << words.load_factor() << std::endl;
    std::cout << "Max load factor: " << words.max_load_factor() << std::endl;

    // Print contents of each bucket
    for (size_t i = 0; i < words.bucket_count(); ++i) {
        std::cout << "Bucket " << i << " contains:";
        for (auto it = words.begin(i); it != words.end(i); ++it) {
            std::cout << " " << *it;
        }
        std::cout << std::endl;
    }

    // Find which bucket an element is in
    std::string search = "grape";
    size_t bucket = words.bucket(search);
    std::cout << "'" << search << "' is in bucket " << bucket << std::endl;

    return 0;
}

Explanation:

This example demonstrates the bucket interface of std::unordered_set
bucket_count() returns the number of buckets in the container
max_bucket_count() shows the maximum number of buckets the container can have
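load_factor() returns the average number of elements per bucket, that is, size() divided by bucket_count()
max_load_factor() returns the current maximum load factor (1.0 by default); the container adds buckets to keep load_factor() at or below it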
The example iterates through each bucket, showing its contents
bucket(key) is used to find which bucket a specific element is in

Additional Considerations

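Iterator Stability: References and pointers to elements remain valid after other elements are inserted or erased. Iterators are invalidated by an insertion only if it triggers a rehash; erasing an element invalidates only iterators and references to that element.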
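Rehashing: When an insertion would push the load factor above the max load factor, the container automatically increases the number of buckets and rehashes the elements. Calling reserve(n) beforehand allocates enough buckets for n elements and avoids repeated rehashing during bulk insertion.
Custom Types: When using custom types, you need to provide a hash function and an equality comparison (operator== or a comparator passed as the third template argument). Elements that compare equal must hash to the same value.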
No Duplicates: unordered_set automatically handles duplicate elements by not inserting them.

Summary

std::unordered_set is a powerful container in C++ for storing unique elements with fast access times. Key points to remember:

It provides average constant-time complexity for insertion, deletion, and search operations
Elements are unordered and unique
It uses a hash function to distribute elements into buckets
Custom types can be used with custom hash and equality functions
It is often faster than std::set for insertion and lookup, especially with large datasets, but std::set keeps its elements sorted and guarantees O(log n) even in the worst case
The bucket interface allows for fine-grained control and analysis of the underlying hash table

unordered_set is ideal for scenarios where fast lookup and uniqueness of elements are required, and the order of elements is not important. It's commonly used in situations like removing duplicates from a collection, fast membership testing, and implementing certain algorithms that require quick element access and uniqueness.
