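C++ and `valgrind` in Depth: Custom Tools and Reading Error Reports

Hello everyone! Today we're talking about Valgrind, the great detective of the C++ world, and how to turn it into your own custom super-sleuth.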

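Valgrind: More Than a Memory Leak Detector

For many people, the first thing that comes to mind with Valgrind is "the memory leak detector". It is indeed excellent at that, but it does far more. Valgrind is a dynamic analysis framework on which many different analysis tools are built.

The core idea is dynamic binary instrumentation. Valgrind runs your program on a synthetic CPU: instead of letting the machine code execute directly, it takes the code one block at a time, translates it into an intermediate representation (VEX IR), lets the selected tool insert extra analysis code (to track memory state, detect errors, and so on), then compiles the result back to machine code and runs it. Because every instruction the program executes goes through this step, Valgrind can see into every corner of the program. The price is speed: under Memcheck a program often runs ten or more times slower than normal.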
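To make the "translate, instrument, run" idea concrete, here is a toy model in C++. It is only a sketch of the concept and has nothing to do with Valgrind's real interfaces: Op, Instr, instrument and run are names invented for this illustration, and the "guest program" is just a list of arithmetic steps.

```cpp
#include <cstdint>
#include <vector>

// Toy "guest" instruction set (invented for this sketch).
enum class Op { Add, Mul, CountInstr };

struct Instr {
    Op op;
    std::int64_t operand;  // ignored by CountInstr
};

// "Instrumentation pass": copy the block and insert a CountInstr in front of
// every original instruction. The original block is left untouched.
std::vector<Instr> instrument(const std::vector<Instr>& block) {
    std::vector<Instr> out;
    out.reserve(block.size() * 2);
    for (const Instr& in : block) {
        out.push_back({Op::CountInstr, 0});
        out.push_back(in);
    }
    return out;
}

struct RunResult {
    std::int64_t value;         // what the guest program computed
    std::uint64_t instr_count;  // what the inserted analysis code observed
};

// Tiny interpreter standing in for the synthetic CPU.
RunResult run(const std::vector<Instr>& block, std::int64_t start) {
    RunResult r{start, 0};
    for (const Instr& in : block) {
        switch (in.op) {
            case Op::Add:        r.value += in.operand; break;
            case Op::Mul:        r.value *= in.operand; break;
            case Op::CountInstr: ++r.instr_count;       break;
        }
    }
    return r;
}
```

Running a block through instrument() before run() leaves the computed value unchanged while the counter goes up by one per original instruction. That is the property a real tool relies on: the analysis code observes the program without changing what it computes.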

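Valgrind's Components: The Toolbox

Valgrind is not a single tool but a suite of them. Each tool focuses on a different analysis task. The most commonly used are:

Memcheck: memory error detector; finds memory leaks, invalid accesses, and similar problems. It is the default when no --tool option is given.
Cachegrind: cache profiler; helps you understand the program's cache hit rate and optimize performance.
Callgrind: call-graph profiler; records the function call graph and per-function costs to help find bottlenecks.
Helgrind: thread error detector; finds data races, lock-ordering problems that can lead to deadlock, and misuse of the POSIX threads API.
DRD (Data Race Detector): another thread error detector, similar to Helgrind but more focused on data races.
Massif: heap profiler; measures how much heap memory the program uses over time. Stack usage can also be measured, but only on request (--stacks=yes).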

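Memcheck: The Nemesis of Memory Errors

Let's focus on Memcheck first, since it is the tool C++ programmers deal with most. Memcheck detects the following common memory errors:

Use of uninitialized memory: using the value of an uninitialized variable or memory region, for example in a condition.
Reading/writing freed memory: accessing memory that has already been released with free or delete.
Reading/writing past an allocation: out-of-bounds indexing and buffer overruns on heap blocks. Overruns of stack or global arrays are generally not caught.
Memory leaks: memory that was allocated but never released.
Mismatched malloc/free/new/delete: for example, using free on memory allocated with new, or delete on memory allocated with new[].
Overlapping src and dst blocks: passing overlapping source and destination regions to memcpy, strcpy, and similar functions. memmove is designed for overlapping regions and is not reported.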
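The last item in the list is easy to get wrong, so here is a small sketch of the safe pattern. The function name drop_prefix is made up for this example. Sliding the tail of a buffer to the front is exactly the situation where source and destination overlap: with std::memcpy Memcheck would report that source and destination overlap, while std::memmove is defined for this case.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Remove the first n characters of s in place by sliding the tail left.
// The source (start + n) and destination (start) ranges overlap whenever the
// tail is longer than n, so std::memmove is required; std::memcpy would be
// undefined behavior here.
void drop_prefix(std::string& s, std::size_t n) {
    if (n >= s.size()) {
        s.clear();
        return;
    }
    const std::size_t tail = s.size() - n;
    char* start = &s[0];
    std::memmove(start, start + n, tail);
    s.resize(tail);
}
```

In everyday code s.erase(0, n) does the same job; the manual version is shown only because it mirrors the kind of C-style buffer code in which the overlap error usually appears.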

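Basic Memcheck Usage

Using Memcheck is simple: put valgrind --leak-check=full in front of your program's command line. Compile the program with -g first, so the report can show source file names and line numbers rather than bare addresses.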

valgrind --leak-check=full ./my_program

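The --leak-check=full option prints a detailed record, including the allocation stack trace, for each definitely lost or possibly lost block. To list indirectly lost and still reachable blocks as well, add --show-leak-kinds=all.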

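Reading a Memcheck Error Report

A Memcheck report can look cryptic at first, but once you know which pieces matter, locating the problem is straightforward.

A typical Memcheck error report might look like this: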

==12345== Memcheck, a memory error detector
==12345== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==12345== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==12345== Command: ./my_program
==12345==
==12345== Invalid read of size 4
==12345==    at 0x400624: foo (in /path/to/my_program)
==12345==    by 0x400678: main (in /path/to/my_program)
==12345==  Address 0x4a06040 is 0 bytes after a block of size 16 alloc'd
==12345==    at 0x483571F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12345==    by 0x4005F6: foo (in /path/to/my_program)
==12345==
==12345==
==12345== LEAK SUMMARY:
==12345==    definitely lost: 24 bytes in 1 blocks
==12345==      indirectly lost: 0 bytes in 0 blocks
==12345==        possibly lost: 0 bytes in 0 blocks
==12345==      still reachable: 0 bytes in 0 blocks
==12345==           suppressed: 0 bytes in 0 blocks
==12345==
==12345== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

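Let's go through the report line by line:

==12345== Invalid read of size 4: the error type; here, a 4-byte read from invalid memory. 12345 is the process ID.
==12345== at 0x400624: foo (in /path/to/my_program): where the error occurred: function foo, at address 0x400624. If the program is built with -g, this line shows a source file and line number instead.
==12345== by 0x400678: main (in /path/to/my_program): the function that called foo, which is main.
==12345== Address 0x4a06040 is 0 bytes after a block of size 16 alloc'd: the offending address, 0x4a06040, lies 0 bytes past the end of a 16-byte allocated block. In other words, the read starts right where the block ends, so it is outside the allocation.
==12345== at 0x483571F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so): where the block was allocated: malloc (Memcheck's replacement for it).
==12345== by 0x4005F6: foo (in /path/to/my_program): the function that called malloc, which is foo.
==12345== LEAK SUMMARY: leak totals, broken down by kind.
- definitely lost: a definite leak; no pointer to the block exists anywhere.
- indirectly lost: the block is only reachable through another block that is itself lost, for example the nodes of a list whose head was leaked.
- possibly lost: only a pointer into the middle of the block was found, so Memcheck cannot tell whether the program could still free it.
- still reachable: a pointer to the block still exists when the program exits, but the block was never freed. Usually not a problem, though worth watching in long-running programs.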
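The difference between the leak kinds is easier to see with a linked list. The sketch below is an illustration written for this article (Node, make_chain, g_kept and create_leaks are made-up names), and the classifications in the comments are what Memcheck would be expected to report for an unoptimized build, not captured output.

```cpp
#include <cstddef>

struct Node {
    int value;
    Node* next;
};

// Build a singly linked list of n heap-allocated nodes; returns the head.
Node* make_chain(std::size_t n) {
    Node* head = nullptr;
    for (std::size_t i = 0; i < n; ++i) {
        head = new Node{static_cast<int>(i), head};
    }
    return head;
}

std::size_t chain_length(const Node* head) {
    std::size_t len = 0;
    for (; head != nullptr; head = head->next) ++len;
    return len;
}

void free_chain(Node* head) {
    while (head != nullptr) {
        Node* next = head->next;
        delete head;
        head = next;
    }
}

// A global keeps its chain reachable until the process exits.
Node* g_kept = nullptr;

void create_leaks() {
    // Never freed, but g_kept still points at the head at exit:
    // expected to be counted as "still reachable".
    g_kept = make_chain(2);

    // The only pointer to the head is discarded right away. The head is
    // expected to be "definitely lost"; the two nodes behind it can only be
    // reached through that lost head, so they count as "indirectly lost".
    Node* dropped = make_chain(3);
    dropped = nullptr;
    (void)dropped;
}
```

Calling create_leaks() from main and running the program under valgrind --leak-check=full --show-leak-kinds=all should list all three kinds. Fixing the definite leak (calling free_chain on the head) makes the indirect ones disappear with it, which is why definitely lost blocks are the ones to chase first.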

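A Simple Memcheck Example

#include <iostream>

int main() {
  int* arr = new int[10];

  // Deliberate out-of-bounds write: valid indices are 0..9
  arr[10] = 42;

  // Forgot to release the memory
  // delete[] arr;

  return 0;
}

Run this program with valgrind --leak-check=full ./my_program and you will get a report similar to the following: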

==12345== Invalid write of size 4
==12345==    at 0x40066C: main (in /path/to/my_program)
==12345==  Address 0x4a060a0 is 0 bytes after a block of size 40 alloc'd
==12345==    at 0x483571F: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12345==    by 0x400647: main (in /path/to/my_program)
==12345==
==12345==
==12345== LEAK SUMMARY:
==12345==    definitely lost: 40 bytes in 1 blocks
==12345==      indirectly lost: 0 bytes in 0 blocks
==12345==        possibly lost: 0 bytes in 0 blocks
==12345==      still reachable: 0 bytes in 0 blocks
==12345==           suppressed: 0 bytes in 0 blocks
==12345==
==12345== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

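The report clearly points out two errors:

Invalid write of size 4: the out-of-bounds write, 0 bytes past the end of a 40-byte block (10 ints of 4 bytes each).
definitely lost: 40 bytes in 1 blocks: the memory leak, which is the array that was never deleted.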
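One possible way to fix both findings at once is to let std::vector own the memory and use bounds-checked access. This is a sketch rather than the only fix (adding delete[] arr and correcting the index would also work); checked_store and run_fixed_example are names invented for the example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Store value at index, returning false instead of writing out of bounds.
bool checked_store(std::vector<int>& arr, std::size_t index, int value) {
    try {
        arr.at(index) = value;  // at() throws std::out_of_range on a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Counterpart of the example's main(): same size, same index, no raw new.
bool run_fixed_example() {
    std::vector<int> arr(10);           // released automatically at scope exit
    return checked_store(arr, 10, 42);  // index 10 is rejected, not written
}
```

With this version Memcheck has nothing to report: the vector frees its buffer when it goes out of scope, and the bad index is turned into a normal error path instead of a stray write.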

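Advanced Technique: Suppressing Error Reports

Sometimes Valgrind reports errors that you already know about and accept, or that you cannot fix. A third-party library may leak memory, for example, and you cannot change its code. In such cases a suppression file tells Valgrind to ignore those errors.

A suppression file is a text file containing a series of rules that match, and silence, specific error reports.

Create a suppression file named suppressions.txt with the following content: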

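{
   <insert_a_suppression_name_here>
   Memcheck:Leak
   fun:_Znam
   ...
}

<insert_a_suppression_name_here>: the name of the rule; any string will do. This placeholder is what valgrind --gen-suppressions=all prints, which is the easiest way to get a correctly formed rule.
Memcheck:Leak: the tool and the error kind to suppress, here Memcheck leak reports.
fun:_Znam: a function in the error's call stack, innermost frame first. C++ function names must be written in mangled form; _Znam is operator new[](unsigned long). The wildcards * and ? are allowed.
...: matches any number of further stack frames. As written, this rule silences every leak allocated through operator new[], which is far too broad for real use; add more fun: or obj: lines to narrow it down to the library in question.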

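Then pass the --suppressions=suppressions.txt option to tell Valgrind to use the file:

valgrind --leak-check=full --suppressions=suppressions.txt ./my_program

Valgrind reads suppressions.txt and leaves matching errors out of the report. They are still counted in the "suppressed" fields of the summary lines.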

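Custom Valgrind Tools: Building Your Own Detective

Valgrind's real power lies in its extensibility. You can write your own analysis tool on top of the Valgrind core, to detect a specific class of error or to do deeper performance analysis.

Writing a tool requires C (not C++) and a working understanding of the core's tool interface, which is declared in the pub_tool_*.h headers of the Valgrind source tree. Tool code runs inside Valgrind itself, so it cannot use the standard C or C++ library; the core provides its own replacements, such as VG_(printf).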

Basic Structure of a Custom Tool

A tool is not a standalone plugin: it is built as part of the Valgrind source tree. There is no separate description file and no client/server split. A minimal tool consists of:

A tool directory inside the Valgrind source tree, for example instrcount/.
A C source file, conventionally <prefix>_main.c (here ic_main.c), that registers the tool with the core and supplies its callbacks.
A Makefile.am for that directory, usually adapted from none/Makefile.am.
Two small edits to the top-level build files: the directory is added to TOOLS (or EXP_TOOLS) in Makefile.am, and instrcount/Makefile is added to AC_CONFIG_FILES in configure.ac.

Every tool supplies at least four functions: pre_clo_init (registers the tool's details and callbacks before command-line options are processed), post_clo_init (runs after the options are processed), instrument (receives each block of IR and returns the instrumented version), and fini (runs when the program exits).

A Simple Custom Tool: An Instruction Counter

Let's build a simple tool that counts the guest instructions a program executes. It is modelled on two tools that ship with Valgrind: Nulgrind (none/nl_main.c), the minimal do-nothing tool, and Lackey (lackey/lk_main.c), which counts instructions among other things. The instrument signature below follows Valgrind 3.15; it has changed between releases, so compare it with none/nl_main.c in your own source tree.

Create the tool source (instrcount/ic_main.c):

#include "pub_tool_basics.h"
#include "pub_tool_tooliface.h"
#include "pub_tool_libcprint.h"   // VG_(umsg)
#include "pub_tool_machine.h"     // VG_(fnptr_to_fnentry)

// Number of guest instructions executed so far
static ULong instr_count = 0;

// Helper called from instrumented code, once per guest instruction
static void inc_instr_count(void) {
  instr_count++;
}

// Nothing to do after command-line processing
static void ic_post_clo_init(void) {
}

// Instrumentation callback: the core passes in one superblock of IR at a time
static IRSB* ic_instrument(VgCallbackClosure* closure,
                           IRSB* sbIn,
                           const VexGuestLayout* layout,
                           const VexGuestExtents* vge,
                           const VexArchInfo* archinfo_host,
                           IRType gWordTy, IRType hWordTy) {
  Int i;
  // Start from a copy of the block that has no statements yet
  IRSB* sbOut = deepCopyIRSBExceptStmts(sbIn);

  for (i = 0; i < sbIn->stmts_used; i++) {
    IRStmt* st = sbIn->stmts[i];
    if (!st) continue;

    // An IMark marks the start of one guest instruction:
    // insert a call to inc_instr_count in front of it
    if (st->tag == Ist_IMark) {
      IRDirty* di = unsafeIRDirty_0_N(
          0, "inc_instr_count",
          VG_(fnptr_to_fnentry)(&inc_instr_count),
          mkIRExprVec_0());
      addStmtToIRSB(sbOut, IRStmt_Dirty(di));
    }

    // Copy the original statement unchanged
    addStmtToIRSB(sbOut, st);
  }
  return sbOut;
}

// Called once when the program exits
static void ic_fini(Int exitcode) {
  VG_(umsg)("Total instructions executed: %llu\n", instr_count);
}

// Called before command-line processing: register the tool with the core
static void ic_pre_clo_init(void) {
  VG_(details_name)("instrcount");
  VG_(details_version)("1.0");
  VG_(details_description)("counts the number of executed instructions");
  VG_(details_copyright_author)("Your Name");
  VG_(details_bug_reports_to)(VG_BUGS_TO);
  VG_(details_avg_translation_sizeB)(275);

  VG_(basic_tool_funcs)(ic_post_clo_init, ic_instrument, ic_fini);
}

// Tool entry point
VG_DETERMINE_INTERFACE_VERSION(ic_pre_clo_init)

Create instrcount/Makefile.am:

Copy none/Makefile.am to instrcount/Makefile.am and replace every occurrence of "none" with "instrcount" and of "nl_" and "nl-" with "ic_" and "ic-".

Then register the directory in the two top-level build files as described above, and build from the top of the source tree:

./autogen.sh
./configure --prefix=`pwd`/inst
make
make install

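Running Your Custom Tool

Once the build has finished, you can use your tool to analyze programs. Make sure the valgrind you invoke is the one from the build that contains the tool, not a system-wide installation:

valgrind --tool=instrcount ./my_program

When your program finishes, the tool prints the total number of instructions executed.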

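Reading Reports: Not Just Memcheck

Memcheck is the most commonly used tool, but Valgrind's other tools also provide very valuable information.

Cachegrind: helps you understand the program's cache behavior and find cache bottlenecks. It writes its results to a cachegrind.out.<pid> file, which cg_annotate turns into a per-function and per-line summary you can use to improve the cache hit rate.
Callgrind: records the function call graph and helps find performance bottlenecks. It writes a callgrind.out.<pid> file that callgrind_annotate or KCachegrind can display, showing where the executed instructions (a proxy for time) are spent so you can optimize the key functions.
Helgrind/DRD: help you find deadlocks, race conditions, and similar problems in concurrent programs. Their reports use the same ==PID== layout as Memcheck's, with stack traces for the conflicting accesses, and they help you write more reliable multithreaded code.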
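Because the ==PID== reports are plain text, it is easy to post-process them, for example to make a CI job fail when the error count is not zero. The sketch below parses only the ERROR SUMMARY line in the form shown in the reports earlier in this article. ErrorSummary and parse_error_summary are names made up for this example, and the parser assumes plain decimal numbers without thousands separators.

```cpp
#include <cstdio>
#include <string>

struct ErrorSummary {
    long pid = 0;
    long errors = 0;
    long contexts = 0;
    long suppressed_errors = 0;
    long suppressed_contexts = 0;
};

// Parse one "==PID== ERROR SUMMARY: ..." line. Returns true and fills out
// when the line has that shape; returns false for any other line.
bool parse_error_summary(const std::string& line, ErrorSummary& out) {
    ErrorSummary s;
    const int matched = std::sscanf(
        line.c_str(),
        "==%ld== ERROR SUMMARY: %ld errors from %ld contexts "
        "(suppressed: %ld from %ld)",
        &s.pid, &s.errors, &s.contexts,
        &s.suppressed_errors, &s.suppressed_contexts);
    if (matched != 5) {
        return false;  // not an ERROR SUMMARY line
    }
    out = s;
    return true;
}
```

A typical use is to run Valgrind with --log-file=valgrind.log, read that file line by line, and feed each line to parse_error_summary until it returns true. For machine consumption Valgrind also offers --xml=yes, which is more robust than scraping text; the text parser is shown here because it matches the reports discussed above.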

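Summary

Valgrind is a powerful dynamic analysis framework that helps you find all kinds of problems in C++ programs. Whether the issue is a memory error, a performance bottleneck, or a concurrency bug, Valgrind can provide useful information. Learning to use Valgrind well, and to write your own tools for it, will make you a better C++ programmer.

I hope today's walkthrough was helpful! I'll share more about C++ with you when I get the chance.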
