std::unordered_set
std::unordered_set is a container in the C++ Standard Library (available since C++11) that stores unique elements in no particular order. It is implemented as a hash table, which gives insertion, deletion, and search average constant-time complexity. In the worst case, when many elements collide in the same bucket, these operations are linear in the size of the container.
#include <iostream>
#include <unordered_set>
#include <string>

int main() {
    std::unordered_set<std::string> fruits;

    // Inserting elements
    fruits.insert("apple");
    fruits.insert("banana");
    fruits.insert("cherry");
    fruits.insert("apple"); // Duplicate, won't be inserted

    // Printing the set (iteration order is unspecified)
    std::cout << "Fruits in the set:" << std::endl;
    for (const auto& fruit : fruits) {
        std::cout << fruit << std::endl;
    }

    // Checking if an element exists
    if (fruits.find("banana") != fruits.end()) {
        std::cout << "Banana is in the set" << std::endl;
    }

    // Size of the set
    std::cout << "Number of fruits: " << fruits.size() << std::endl;

    // Removing an element
    fruits.erase("cherry");

    return 0;
}
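In this example:
- An unordered_set of strings is created to store fruit names
- Elements are added with the insert() function; the duplicate "apple" is ignored
- find() is used to check for the existence of an element
- size() returns the number of elements in the set
- erase() removes an element from the set

The next example stores a user-defined type, which requires a custom hash function:

#include <iostream>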
#include <unordered_set>
#include <string>

struct Person {
    std::string name;
    int age;

    bool operator==(const Person& other) const {
        return name == other.name && age == other.age;
    }
};

// Custom hash function for Person
struct PersonHash {
    std::size_t operator()(const Person& p) const {
        return std::hash<std::string>()(p.name) ^ std::hash<int>()(p.age);
    }
};

int main() {
    std::unordered_set<Person, PersonHash> people;

    people.insert({"Alice", 30});
    people.insert({"Bob", 25});
    people.insert({"Charlie", 35});
    people.insert({"Alice", 30}); // Duplicate, won't be inserted

    for (const auto& person : people) {
        std::cout << person.name << ": " << person.age << std::endl;
    }

    return 0;
}
In this example:
- A Person struct is defined with name and age members
- operator== is defined for Person to check equality
- A custom hash functor, PersonHash, is created for Person objects
- An unordered_set of Person objects is created, using PersonHash as its hash function
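
Combining member hashes with a plain XOR, as PersonHash does, is valid but mixes the bits weakly: on many implementations std::hash<int> returns the integer unchanged, so the age only flips a few low bits of the name hash. The sketch below shows one common alternative, modeled on the mixing formula popularized by Boost's hash_combine. The names hashCombine, PersonHashCombined, and countDistinctPeople are hypothetical and introduced only for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

struct Person {
    std::string name;
    int age;

    bool operator==(const Person& other) const {
        return name == other.name && age == other.age;
    }
};

// Hypothetical helper: folds the hash of `value` into `seed`.
// The shifts and the constant follow the Boost-style hash_combine formula.
template <typename T>
void hashCombine(std::size_t& seed, const T& value) {
    seed ^= std::hash<T>{}(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hypothetical replacement for PersonHash that mixes both members in order.
struct PersonHashCombined {
    std::size_t operator()(const Person& p) const {
        std::size_t seed = 0;
        hashCombine(seed, p.name);
        hashCombine(seed, p.age);
        return seed;
    }
};

// Illustration only: inserts three people, one of them twice,
// and reports how many distinct elements the set holds.
std::size_t countDistinctPeople() {
    std::unordered_set<Person, PersonHashCombined> people;
    people.insert(Person{"Alice", 30});
    people.insert(Person{"Bob", 25});
    people.insert(Person{"Alice", 30});  // equal to the first element, so ignored
    return people.size();
}
```

Because the seed is shifted between steps, the result depends on the order in which members are combined, unlike a plain XOR.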

The following benchmark compares unordered_set with std::set:

#include <iostream>
#include <unordered_set>
#include <set>
#include <chrono>
#include <random>

// Function to measure execution time
template<typename Func>
long long measureTime(Func func) {
    auto start = std::chrono::high_resolution_clock::now();
    func();
    auto end = std::chrono::high_resolution_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
}

int main() {
    const int NUM_ELEMENTS = 1000000;
    const int NUM_SEARCHES = 1000000;

    std::unordered_set<int> uset;
    std::set<int> set;

    // Random number generation
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<> dis(1, NUM_ELEMENTS);

    // Insertion
    auto insertUnordered = [&]() {
        for (int i = 0; i < NUM_ELEMENTS; ++i) {
            uset.insert(dis(gen));
        }
    };
    auto insertOrdered = [&]() {
        for (int i = 0; i < NUM_ELEMENTS; ++i) {
            set.insert(dis(gen));
        }
    };

    std::cout << "Insertion time (microseconds):" << std::endl;
    std::cout << "unordered_set: " << measureTime(insertUnordered) << std::endl;
    std::cout << "set: " << measureTime(insertOrdered) << std::endl;

    // Search
    auto searchUnordered = [&]() {
        for (int i = 0; i < NUM_SEARCHES; ++i) {
            uset.find(dis(gen));
        }
    };
    auto searchOrdered = [&]() {
        for (int i = 0; i < NUM_SEARCHES; ++i) {
            set.find(dis(gen));
        }
    };

    std::cout << "Search time (microseconds):" << std::endl;
    std::cout << "unordered_set: " << measureTime(searchUnordered) << std::endl;
    std::cout << "set: " << measureTime(searchOrdered) << std::endl;

    return 0;
}
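In this example:
- Insertion and search times are compared for std::unordered_set and std::set
- The measureTime function is used to calculate execution time
- std::unordered_set typically shows faster insertion and search times due to its hash-based implementation, although the exact figures depend on the compiler, the optimization level, and the data

The next example inspects the buckets of the underlying hash table:

#include <iostream>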
#include <unordered_set>
#include <string>

int main() {
    std::unordered_set<std::string> words = {
        "apple", "banana", "cherry", "date", "elderberry",
        "fig", "grape", "honeydew", "imbe", "jackfruit"
    };

    // Print bucket information
    std::cout << "Bucket count: " << words.bucket_count() << std::endl;
    std::cout << "Max bucket count: " << words.max_bucket_count() << std::endl;
    std::cout << "Load factor: " << words.load_factor() << std::endl;
    std::cout << "Max load factor: " << words.max_load_factor() << std::endl;

    // Print contents of each bucket
    for (std::size_t i = 0; i < words.bucket_count(); ++i) {
        std::cout << "Bucket " << i << " contains:";
        for (auto it = words.begin(i); it != words.end(i); ++it) {
            std::cout << " " << *it;
        }
        std::cout << std::endl;
    }

    // Find which bucket an element is in
    std::string search = "grape";
    std::size_t bucket = words.bucket(search);
    std::cout << "'" << search << "' is in bucket " << bucket << std::endl;

    // Rehash the container
    std::cout << "\nRehashing to 20 buckets..." << std::endl;
    words.rehash(20);
    std::cout << "New bucket count: " << words.bucket_count() << std::endl;

    return 0;
}
This example demonstrates the bucket interface of std::unordered_set:
- bucket_count() returns the number of buckets in the container
- max_bucket_count() returns the maximum number of buckets the container can have
- load_factor() returns the average number of elements per bucket
- max_load_factor() returns the current maximum load factor
- bucket(key) is used to find which bucket a specific element is in
- rehash() is used to manually rehash the container so that it has at least the requested number of buckets

Iterator Stability: References and pointers to elements in an unordered_set remain valid after insertion or deletion of other elements. Iterators also remain valid, except that an insertion that triggers a rehash invalidates all iterators.
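
To make the stability guarantee concrete, the sketch below keeps a pointer to one element, inserts enough further elements to force the table to grow, and then checks that the pointer still refers to the same element. Note that the C++ standard treats iterators differently from references and pointers: a rehash invalidates iterators, so the code looks the element up again rather than reusing an old iterator. The function name pointerSurvivesRehash is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical check: returns true if a pointer taken before a rehash
// still refers to the same element after the rehash.
bool pointerSurvivesRehash() {
    std::unordered_set<std::string> words;

    // insert() returns a pair; .first is an iterator to the element.
    const std::string* apple = &*words.insert("apple").first;
    const std::size_t bucketsBefore = words.bucket_count();

    // Insert enough elements that the table must grow at least once.
    for (int i = 0; i < 1000; ++i) {
        words.insert("word" + std::to_string(i));
    }
    const bool rehashed = words.bucket_count() > bucketsBefore;

    // Iterators obtained before the rehash are invalid, so search again.
    auto it = words.find("apple");
    return rehashed && it != words.end() && &*it == apple && *apple == "apple";
}
```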
Rehashing: When the load factor exceeds the max load factor, the container automatically increases the number of buckets and rehashes the elements.
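
When the final number of elements is known in advance, calling reserve() before inserting lets the container allocate enough buckets up front, so the automatic rehashing described above is avoided during the insertions. The sketch below is a minimal illustration; countRehashes is a hypothetical helper that treats every change in bucket_count() as one rehash.

```cpp
#include <cstddef>
#include <unordered_set>

// Hypothetical helper: inserts the integers 0..n-1 and counts how often
// bucket_count() changes along the way. If reserveFirst is true,
// reserve(n) is called before the first insertion.
std::size_t countRehashes(std::size_t n, bool reserveFirst) {
    std::unordered_set<int> values;
    if (reserveFirst) {
        // Requests enough buckets for n elements within max_load_factor().
        values.reserve(n);
    }

    std::size_t rehashes = 0;
    std::size_t buckets = values.bucket_count();
    for (std::size_t i = 0; i < n; ++i) {
        values.insert(static_cast<int>(i));
        if (values.bucket_count() != buckets) {
            ++rehashes;
            buckets = values.bucket_count();
        }
    }
    return rehashes;
}
```

Calling countRehashes(10000, true) should report no rehashes, while countRehashes(10000, false) reports several; the exact number depends on the standard library's growth policy.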
Custom Types: When using custom types, you need to provide a hash function and an equality comparison function.
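
Instead of passing a hash functor as a template argument, a program may specialize std::hash for its own type, after which std::unordered_set<T> works without extra arguments. The sketch below assumes a hypothetical Point type; packing both coordinates into one 64-bit key is just one possible hashing choice, not a recommendation from the original text.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>

// Hypothetical user-defined type used only for this illustration.
struct Point {
    int x;
    int y;
};

// Equality comparison, found by the default std::equal_to<Point>.
bool operator==(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}

// Specializing std::hash for a program-defined type is permitted.
namespace std {
template <>
struct hash<Point> {
    size_t operator()(const Point& p) const noexcept {
        // Pack the two coordinates into one 64-bit key, then hash the key.
        const uint64_t hi = static_cast<uint32_t>(p.x);
        const uint64_t lo = static_cast<uint32_t>(p.y);
        return hash<uint64_t>{}((hi << 32) | lo);
    }
};
}  // namespace std

// Illustration only: the set needs no explicit hasher argument.
std::size_t countDistinctPoints() {
    std::unordered_set<Point> points;
    points.insert(Point{1, 2});
    points.insert(Point{2, 1});
    points.insert(Point{1, 2});  // equal to the first element, so ignored
    return points.size();
}
```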
No Duplicates: unordered_set automatically handles duplicate elements by not inserting them.
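
Whether an insertion actually happened can be read from the return value of insert(), which is a pair of an iterator and a bool. The sketch below uses that bool to report repeated items; findRepeats is a hypothetical helper name.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns every item that had already been seen
// earlier in the input, in the order the repeats occur.
std::vector<std::string> findRepeats(const std::vector<std::string>& items) {
    std::unordered_set<std::string> seen;
    std::vector<std::string> repeats;
    for (const auto& item : items) {
        // .second is false when an equal element was already in the set.
        if (!seen.insert(item).second) {
            repeats.push_back(item);
        }
    }
    return repeats;
}
```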
Memory Usage: unordered_set stores a bucket array in addition to its per-element nodes, so it can use more memory than set, depending on the implementation and the load factor. This overhead is the price of faster average access times.
std::unordered_set is a powerful container in C++ for storing unique elements with fast access times. Key points to remember:
- It is typically faster than std::set for most operations, especially with large datasets

unordered_set is ideal for scenarios where fast lookup and uniqueness of elements are required, and the order of elements is not important. It's commonly used in situations like:
- Removing duplicates from a collection
- Fast membership testing
- Implementing certain algorithms that require quick element access and uniqueness
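
The first two uses in this list can be sketched briefly. The helper names countDistinct and commonElements are hypothetical: the first removes duplicates by building a set from a range, and the second uses membership tests to find the values two collections share.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical helper: duplicates vanish when a set is built from a range.
std::size_t countDistinct(const std::vector<int>& values) {
    std::unordered_set<int> unique(values.begin(), values.end());
    return unique.size();
}

// Hypothetical helper: returns the values of `b` that also occur in `a`,
// each reported once, in the order they first appear in `b`.
std::vector<int> commonElements(const std::vector<int>& a,
                                const std::vector<int>& b) {
    std::unordered_set<int> inA(a.begin(), a.end());
    std::unordered_set<int> alreadyReported;
    std::vector<int> result;
    for (int value : b) {
        // count() is 0 or 1 for a set, so it works as a membership test.
        if (inA.count(value) != 0 && alreadyReported.insert(value).second) {
            result.push_back(value);
        }
    }
    return result;
}
```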
Understanding when to use std::unordered_set versus other containers like std::set or std::vector is crucial for writing efficient and clear C++ code in various application domains. Its average constant-time complexity for key operations makes it a go-to choice for many performance-critical applications where set operations are frequently performed.