Custom `std::allocator` in C++: Tailoring Memory Allocation Strategies for Specific Containers

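Hello everyone, and welcome to today's "memory magic show"! I'm your performer for the day, and together we'll explore a mysterious yet powerful character in the C++ standard library: std::allocator.

Many people hear "allocator" and get a headache: too low-level, too complicated, nothing to do with me. In fact, an allocator is like a container's "landlord". It decides where the container's data lives and how comfortably. If you want a container to run faster, use less memory, or follow a special memory-management policy, a custom allocator is your secret weapon.

Today we'll look under the hood of std::allocator: what it actually is, and how a custom allocator can improve your code's performance.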

1. std::allocator: the "landlord" behind your containers

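In C++, the standard containers (std::vector, std::list, std::map, and so on) use an allocator to obtain and release memory. By default they use std::allocator, which essentially gets raw memory from ::operator new and returns it through ::operator delete. Simple and blunt.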

#include <iostream>
#include <vector>

int main() {
    std::vector<int> my_vector; // uses the default std::allocator<int>
    my_vector.push_back(10);
    my_vector.push_back(20);
    my_vector.push_back(30);

    for (int i : my_vector) {
        std::cout << i << " ";
    }
    std::cout << std::endl;

    return 0;
}

This code looks unremarkable, but behind the scenes std::vector<int> is quietly using std::allocator<int> to allocate the memory that holds these integers.
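To make that hidden default visible, here is a minimal sketch. The static_assert checks that the two spellings of the vector type are the same, and the helper allocate_construct_read (a hypothetical name, not a standard function) performs by hand the two steps a container carries out internally: get raw storage from the allocator, then construct an object in it.

```cpp
#include <memory>
#include <type_traits>
#include <vector>

// The second template parameter of std::vector defaults to std::allocator<T>,
// so these two spellings name the same type.
static_assert(std::is_same<std::vector<int>,
                           std::vector<int, std::allocator<int>>>::value,
              "std::vector<int> uses std::allocator<int> by default");

// Hypothetical helper: does manually what a container does for each element.
inline int allocate_construct_read(int value) {
    std::allocator<int> alloc;
    using traits = std::allocator_traits<std::allocator<int>>;

    int* p = traits::allocate(alloc, 1);  // raw, uninitialized storage for one int
    traits::construct(alloc, p, value);   // construct the int in that storage
    int result = *p;
    traits::destroy(alloc, p);            // end the object's lifetime
    traits::deallocate(alloc, p, 1);      // hand the storage back
    return result;
}
```

Going through std::allocator_traits rather than calling the allocator's members directly is what the containers themselves do, which is why the same code keeps working with a custom allocator.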

So why write a custom allocator? Here are the usual reasons:

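Performance: the default std::allocator is not always the most efficient choice, especially when small blocks are allocated and freed very frequently. A custom allocator can use a memory pool or a caching scheme instead.
Control over memory: you may need finer control over allocation, such as capping memory usage or placing data in a specific memory region.
Custom behavior: you may want to do extra work on every allocation, such as recording allocation statistics or running safety checks.
Embedded systems: when memory is very limited, a custom allocator helps you manage it deliberately.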

2. The allocator interface: the landlord's "house rules"

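To be a qualified "landlord", an allocator has to follow certain "house rules", that is, provide a specific interface. Since C++11, containers reach the allocator through std::allocator_traits, which supplies defaults for most members. Only value_type, allocate, deallocate, the equality comparison, and a converting constructor from other specializations are strictly required. The remaining members are optional, and were deprecated in C++17 and removed in C++20 from std::allocator itself.

Member	Description
`value_type`	The type of object the allocator allocates. Required.
`allocate(n)`	Allocates uninitialized storage for `n` objects of `value_type` (`n * sizeof(value_type)` bytes) and returns a pointer to it. Required.
`deallocate(p, n)`	Releases storage previously obtained from `allocate(n)`. `p` is the pointer that `allocate` returned and `n` must be the same count that was passed to `allocate`. Required.
`construct(p, ...)`	Constructs an object at the location `p` points to, forwarding `...` to its constructor. Optional; `allocator_traits` falls back to placement new.
`destroy(p)`	Destroys the object `p` points to. Optional; `allocator_traits` falls back to calling the destructor.
`max_size()`	Returns the largest number of objects the allocator could allocate. Optional.
`operator==`	Compares two allocators. They should compare equal only if memory allocated by one can be deallocated by the other. Required.
`operator!=`	The negation of `operator==`. Required before C++20; from C++20 on the compiler derives it from `operator==`.
`rebind<U>::other`	A member typedef naming the same allocator for objects of type `U`. It lets a container allocate internal types that differ from `value_type` (for example, `std::list` and `std::map` allocate their node types, not bare `T` or `std::pair<const Key, T>` objects). Optional when the allocator is a class template whose first parameter is the value type; `allocator_traits::rebind_alloc` derives it.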
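As a rough illustration of how little is strictly required, the sketch below defines a hypothetical minimal_allocator with only the required members and lets std::allocator_traits fill in construct, destroy, max_size, and rebind. The names minimal_allocator and minimal_allocator_demo are made up for this example.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <vector>

// Hypothetical minimal allocator: only the members that C++11 and later require.
template <typename T>
struct minimal_allocator {
    using value_type = T;

    minimal_allocator() noexcept = default;

    // Converting constructor, needed so containers can rebind the allocator.
    template <typename U>
    minimal_allocator(const minimal_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

// Stateless allocators are always interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const minimal_allocator<T>&, const minimal_allocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const minimal_allocator<T>&, const minimal_allocator<U>&) noexcept {
    return false;
}

// allocator_traits derives the rebound type without a rebind member.
static_assert(std::is_same<
                  std::allocator_traits<minimal_allocator<int>>::rebind_alloc<double>,
                  minimal_allocator<double>>::value,
              "rebind_alloc is derived from the template pattern");

// Hypothetical demo: the allocator is enough to drive a std::vector.
inline std::size_t minimal_allocator_demo() {
    std::vector<int, minimal_allocator<int>> v;
    for (int i = 0; i < 5; ++i) {
        v.push_back(i);
    }
    return v.size();
}
```

Leaving out the optional members is usually the safer choice in new code, since the defaults in allocator_traits are already correct for ordinary allocators.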

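3. Writing a custom allocator: build your own "landlord"

Now let's create a custom allocator and put it in charge of our containers.

3.1 A simple counting_allocator

First, a simple counting_allocator that keeps track of how many objects' worth of memory has been allocated and released.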

#include <cstddef>
#include <iostream>
#include <limits>
#include <memory>
#include <new>
#include <utility>

template <typename T>
class counting_allocator {
public:
    using value_type = T;

    counting_allocator() noexcept : allocated_count(0), deallocated_count(0) {}

    // Converting constructor, used when a container rebinds the allocator to
    // another type. Each copy carries its own counters from this point on.
    template <typename U>
    counting_allocator(const counting_allocator<U>& other) noexcept
        : allocated_count(other.allocated_count), deallocated_count(other.deallocated_count) {}

    T* allocate(std::size_t n) {
        if (n > max_size()) {
            throw std::bad_alloc(); // n * sizeof(T) would overflow
        }
        // ::operator new throws std::bad_alloc on failure, so no null check is needed.
        T* ptr = static_cast<T*>(::operator new(n * sizeof(T)));
        allocated_count += n;
        std::cout << "Allocated " << n << " objects. Total allocated: " << allocated_count << std::endl;
        return ptr;
    }

    void deallocate(T* p, std::size_t n) noexcept {
        deallocated_count += n;
        std::cout << "Deallocated " << n << " objects. Total deallocated: " << deallocated_count << std::endl;
        ::operator delete(p);
    }

    // construct, destroy, and max_size are optional since C++11;
    // they are spelled out here to show the full interface.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }

    template <typename U>
    void destroy(U* p) {
        p->~U();
    }

    std::size_t max_size() const noexcept {
        return std::numeric_limits<std::size_t>::max() / sizeof(T);
    }

    // All instances use the global heap, so memory from one can be freed by any other.
    bool operator==(const counting_allocator&) const noexcept {
        return true;
    }

    bool operator!=(const counting_allocator& other) const noexcept {
        return !(*this == other);
    }

private:
    // Lets the converting constructor read the counters of counting_allocator<U>.
    template <typename U>
    friend class counting_allocator;

    std::size_t allocated_count;
    std::size_t deallocated_count;
};

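This counting_allocator does the following:

Defines value_type: the type of object the allocator allocates.
Implements allocate: allocates memory and increases allocated_count.
Implements deallocate: releases memory and increases deallocated_count.
Implements construct and destroy: construct and destroy objects.
Implements max_size: returns the largest number of objects that can be allocated.
Implements operator== and operator!=: compare two allocators for equality.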

Now we can use this counting_allocator to create a std::vector:

#include <vector>

int main() {
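    // Pass a named allocator: writing my_vector(counting_allocator<int>()) would
    // declare a function instead of a vector (the "most vexing parse").
    counting_allocator<int> alloc;
    std::vector<int, counting_allocator<int>> my_vector(alloc);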
    my_vector.push_back(10);
    my_vector.push_back(20);
    my_vector.push_back(30);

    for (int i : my_vector) {
        std::cout << i << " ";
    }
    std::cout << std::endl;

    return 0;
}

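Run this code and the console shows a line for every allocation and deallocation, including the reallocations the vector performs as it grows.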

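3.2 A simple fixed_size_allocator

Next, a simple fixed_size_allocator that serves allocation requests from a single block of memory allocated up front. If the block runs out, it throws an exception.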

#include <cstddef>
#include <iostream>
#include <memory>
#include <new>
#include <utility>

// State shared by all copies of the allocator: one pre-allocated buffer plus
// the number of bytes already handed out. Containers copy allocators freely,
// so sharing this state keeps a copy from duplicating or double-freeing the buffer.
struct fixed_size_arena {
    explicit fixed_size_arena(std::size_t bytes)
        : buffer(new char[bytes]), buffer_size(bytes), allocated_size(0) {}

    std::unique_ptr<char[]> buffer;
    std::size_t buffer_size;
    std::size_t allocated_size;
};

template <typename T>
class fixed_size_allocator {
public:
    using value_type = T;

    // Reserve room for `size` objects of type T.
    explicit fixed_size_allocator(std::size_t size)
        : arena(std::make_shared<fixed_size_arena>(size * sizeof(T))) {}

    // Converting constructor: a rebound copy shares the same buffer.
    template <typename U>
    fixed_size_allocator(const fixed_size_allocator<U>& other) noexcept : arena(other.arena) {}

    T* allocate(std::size_t n) {
        // Round the current offset up so the returned pointer is aligned for T.
        std::size_t offset = (arena->allocated_size + alignof(T) - 1) / alignof(T) * alignof(T);
        if (offset > arena->buffer_size || n > (arena->buffer_size - offset) / sizeof(T)) {
            throw std::bad_alloc();
        }

        T* ptr = reinterpret_cast<T*>(arena->buffer.get() + offset);
        arena->allocated_size = offset + n * sizeof(T);
        return ptr;
    }

    void deallocate(T*, std::size_t) noexcept {
        // Intentionally a no-op. Without extra metadata (such as a free list) we
        // cannot track which blocks are free, and simply subtracting
        // n * sizeof(T) from allocated_size would be wrong unless blocks are
        // released in exact reverse order. The whole buffer is released at once
        // when the last allocator copy sharing it is destroyed.
    }

    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }

    template <typename U>
    void destroy(U* p) {
        p->~U();
    }

    std::size_t max_size() const noexcept {
        return arena->buffer_size / sizeof(T);
    }

    // Two allocators are interchangeable only if they share the same buffer.
    bool operator==(const fixed_size_allocator& other) const noexcept {
        return arena == other.arena;
    }

    bool operator!=(const fixed_size_allocator& other) const noexcept {
        return !(*this == other);
    }

private:
    // Lets the converting constructor read the arena of fixed_size_allocator<U>.
    template <typename U>
    friend class fixed_size_allocator;

    std::shared_ptr<fixed_size_arena> arena;
};

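How this fixed_size_allocator works:

Pre-allocates a block of memory: the constructor allocates one fixed-size block.
Allocates objects inside that block: allocate carves each request out of the pre-allocated memory.
The deallocate method: this deserves special attention. In this simple implementation, deallocate is a no-op. The reasons:
- To implement deallocate properly, we would need a list of free blocks, or some other way of tracking which parts of the buffer are free.
- Without that metadata, we cannot safely reclaim an individual block, because nothing records which regions are still in use.
As a result, this fixed_size_allocator is only suitable when all allocated memory is released in one go (for example, when the allocator itself is destroyed). It also means that storage a container abandons when it grows is never reused.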
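For comparison, here is a sketch of the free-list idea mentioned above. It is a separate, hypothetical block_pool class (not part of the allocator shown earlier) that hands out blocks of one fixed size and chains the free ones through a pointer stored inside each free block, so deallocate can genuinely return a block for reuse. It assumes every request is for exactly one block, which is why it is shown as a standalone pool rather than as a drop-in container allocator.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical fixed-block pool with an intrusive free list.
class block_pool {
public:
    block_pool(std::size_t block_size, std::size_t block_count)
        : block_size_(round_up(block_size)),
          storage_(new unsigned char[block_size_ * block_count]),
          free_list_(nullptr) {
        // Thread every block onto the free list, last block first,
        // so that the first allocate() returns the first block.
        for (std::size_t i = block_count; i > 0; --i) {
            push(storage_.get() + (i - 1) * block_size_);
        }
    }

    // Pop one block off the free list; throw if none is left.
    void* allocate() {
        if (free_list_ == nullptr) {
            throw std::bad_alloc();
        }
        node* n = free_list_;
        free_list_ = n->next;
        return n;
    }

    // Put the block back at the head of the free list.
    void deallocate(void* p) noexcept { push(p); }

private:
    struct node {
        node* next;
    };

    // A block must be able to hold a node and stay suitably aligned.
    static std::size_t round_up(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(node)) {
            n = sizeof(node);
        }
        return (n + a - 1) / a * a;
    }

    void push(void* p) noexcept {
        free_list_ = ::new (p) node{free_list_};
    }

    std::size_t block_size_;
    std::unique_ptr<unsigned char[]> storage_;
    node* free_list_;
};

// Hypothetical demo: a freed block is handed out again by the next allocate().
inline bool block_pool_reuses_freed_block() {
    block_pool pool(sizeof(int), 2);
    void* a = pool.allocate();
    void* b = pool.allocate();
    pool.deallocate(a);
    void* c = pool.allocate();
    bool reused = (c == a);
    pool.deallocate(b);
    pool.deallocate(c);
    return reused;
}
```

The free list costs no extra memory because the link lives inside blocks that are currently unused. The trade-off is that every block has the same size, so this design fits node-based containers better than std::vector.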

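Now we can use this fixed_size_allocator to create a std::vector:

#include <vector>

int main() {
    fixed_size_allocator<int> my_allocator(10); // room for 10 integers
    std::vector<int, fixed_size_allocator<int>> my_vector(my_allocator);

    // Request all the storage once. Without this, every reallocation during
    // growth would consume a fresh piece of the buffer (deallocate is a no-op),
    // and the buffer would run out well before the tenth push_back.
    my_vector.reserve(10);

    for (int i = 1; i <= 10; ++i) {
        my_vector.push_back(i * 10);
    }

    // my_vector.push_back(110); // would throw: there is no room left to grow

    for (int i : my_vector) {
        std::cout << i << " ";
    }
    std::cout << std::endl;

    return 0;
}

Because deallocate is a no-op, the buffer only ever shrinks, which is why the example calls reserve(10) before inserting. Any request that does not fit in what remains fails: an eleventh push_back throws an exception, either std::length_error (if the vector consults max_size() first) or std::bad_alloc (from allocate), depending on the standard library implementation.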

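4. scoped_allocator_adaptor: the power of a combo

std::scoped_allocator_adaptor is a powerful tool for containers of containers. It wraps an outer allocator and, optionally, one or more inner allocators, giving you a layered allocation strategy.

Imagine a container that stores other containers. The outer container allocates its own storage from the adaptor's outer allocator, and whenever it constructs an element, the adaptor passes an allocator down to that element, provided the element's allocator type can be constructed from it. With no inner allocator specified, the adaptor passes itself down, so the nested containers share the outer allocator. With an inner allocator specified, the nested containers get a different one.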

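#include <iostream>
#include <list>
#include <memory>
#include <scoped_allocator> // std::scoped_allocator_adaptor
#include <vector>

int main() {
    using inner_list = std::list<int>; // inner container: default std::allocator<int>
    using outer_alloc = std::scoped_allocator_adaptor<counting_allocator<inner_list>>;

    // Create a counting allocator for the vector's elements
    counting_allocator<inner_list> counting_alloc;

    // Create a vector of lists whose own storage comes from the counting allocator.
    // A named adaptor object avoids the "most vexing parse" that a temporary
    // inside the parentheses would trigger.
    outer_alloc adaptor(counting_alloc);
    std::vector<inner_list, outer_alloc> my_vector(adaptor);

    // Add a list to the vector
    my_vector.emplace_back();

    // Add some elements to the list
    my_vector[0].push_back(1);
    my_vector[0].push_back(2);
    my_vector[0].push_back(3);

    // The counting allocator tracks the allocations made by the vector, but not
    // by the list: std::allocator<int> cannot be constructed from the adaptor,
    // so nothing is passed down to the list.

    return 0;
}

In this example, my_vector uses counting_allocator, while std::list<int> keeps the default std::allocator. Note that the adaptor is not what keeps counting_allocator out of the lists. The adaptor's job is the opposite: to pass an allocator down to nested containers. Nothing is passed down here only because std::list<int> uses std::allocator<int>, which cannot be constructed from the adaptor. For propagation to happen, the inner container must be declared with a compatible allocator type, such as std::list<int, counting_allocator<int>>.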
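To show propagation actually happening, here is a sketch with a hypothetical tagged_allocator that carries an integer tag. The inner list is declared with the same allocator family as the outer vector, so the adaptor hands its allocator to every list it constructs. The names tagged_allocator and propagated_tag are invented for this example.

```cpp
#include <cstddef>
#include <list>
#include <new>
#include <scoped_allocator>
#include <vector>

// Hypothetical allocator carrying a tag, so we can see which instance
// a container ended up with.
template <typename T>
struct tagged_allocator {
    using value_type = T;
    int tag;

    explicit tagged_allocator(int t = 0) noexcept : tag(t) {}

    // Rebound copies keep the tag of the allocator they were made from.
    template <typename U>
    tagged_allocator(const tagged_allocator<U>& other) noexcept : tag(other.tag) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const tagged_allocator<T>& a, const tagged_allocator<U>& b) noexcept {
    return a.tag == b.tag;
}
template <typename T, typename U>
bool operator!=(const tagged_allocator<T>& a, const tagged_allocator<U>& b) noexcept {
    return !(a == b);
}

using inner_list = std::list<int, tagged_allocator<int>>;
using outer_alloc = std::scoped_allocator_adaptor<tagged_allocator<inner_list>>;
using outer_vector = std::vector<inner_list, outer_alloc>;

// Hypothetical demo: returns the tag of the allocator the inner list received.
inline int propagated_tag() {
    outer_alloc alloc{tagged_allocator<inner_list>{42}};
    outer_vector v(alloc);

    v.emplace_back();       // the adaptor passes its allocator to the new list
    v[0].push_back(1);

    return v[0].get_allocator().tag;
}
```

The list was created with emplace_back() and no arguments, yet its allocator reports tag 42 rather than the default 0. That is the adaptor at work: it notices that inner_list can be constructed from its allocator and appends it to the constructor arguments.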

5. Caveats: the pitfalls of custom allocators

Custom allocators are powerful, but there are some pitfalls to watch for:

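State consistency: if your allocator has state (counters, a memory pool), make sure the state is handled correctly when the allocator is copied or assigned. Containers copy and rebind allocators freely, and copies that compare equal must be able to free each other's memory.
Exception safety: allocate should report failure by throwing and leave the allocator's state unchanged when it does, and deallocate should not throw, so that a failed allocation never leaks memory.
Thread safety: if your code is multithreaded, any shared state inside the allocator needs synchronization.
rebind: make sure rebinding works, so that containers can use your allocator to allocate their internal types (for example, std::list and std::map allocate node types rather than bare T or std::pair<const Key, T> objects). In practice this means providing the converting constructor template.
A correct deallocate: deallocate has to release memory correctly. This is easy when you simply forward to operator new and operator delete; more elaborate memory-management schemes need more careful design.
Avoid over-engineering: don't over-design an allocator for the sake of optimization and end up with code too complex to maintain.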
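The state-consistency and thread-safety points can be sketched together. In the hypothetical shared_counting_allocator below, the counters live in a separate alloc_stats object and every copy of the allocator holds a pointer to it, so all copies agree on the totals. The counters are std::atomic so that concurrent allocations do not race. Both names are assumptions made for this example, and the caller must keep the alloc_stats object alive longer than any container using it.

```cpp
#include <atomic>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical counters shared by every copy of the allocator.
struct alloc_stats {
    std::atomic<std::size_t> allocated{0};
    std::atomic<std::size_t> deallocated{0};
};

template <typename T>
class shared_counting_allocator {
public:
    using value_type = T;

    explicit shared_counting_allocator(alloc_stats& stats) noexcept : stats_(&stats) {}

    // Rebound copies point at the same counters.
    template <typename U>
    shared_counting_allocator(const shared_counting_allocator<U>& other) noexcept
        : stats_(other.stats_) {}

    T* allocate(std::size_t n) {
        // Allocate first: if this throws, the counters stay untouched.
        T* p = static_cast<T*>(::operator new(n * sizeof(T)));
        stats_->allocated.fetch_add(n, std::memory_order_relaxed);
        return p;
    }

    void deallocate(T* p, std::size_t n) noexcept {
        stats_->deallocated.fetch_add(n, std::memory_order_relaxed);
        ::operator delete(p);
    }

    // Allocators that report to the same counters are treated as equal.
    template <typename U>
    bool operator==(const shared_counting_allocator<U>& other) const noexcept {
        return stats_ == other.stats_;
    }
    template <typename U>
    bool operator!=(const shared_counting_allocator<U>& other) const noexcept {
        return stats_ != other.stats_;
    }

private:
    template <typename U>
    friend class shared_counting_allocator;

    alloc_stats* stats_;
};

// Hypothetical demo: after the vector is destroyed, the totals must match.
inline bool shared_counts_balance() {
    alloc_stats stats;
    {
        shared_counting_allocator<int> alloc(stats);
        std::vector<int, shared_counting_allocator<int>> v(alloc);
        for (int i = 0; i < 100; ++i) {
            v.push_back(i);
        }
    }
    return stats.allocated.load() > 0 &&
           stats.allocated.load() == stats.deallocated.load();
}
```

Keeping the state outside the allocator makes copying trivial and cheap, which matters because containers copy allocators more often than one might expect.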

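Summary: make the allocator your secret weapon

Writing a custom allocator is an advanced technique that gives you finer control over memory management and, for the right workloads, better performance. It takes some skill and experience, but once you understand the basic principles, an allocator can become your secret weapon.

I hope today's "memory magic show" has given you a deeper understanding of std::allocator. Remember that memory management is a very important part of programming. Master it, and you can write more efficient and more robust code.

Thanks for watching, and see you next time!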
