C++ Sanitizers: Debugging and Pinpointing Bugs with ASan, UBSan, and TSan

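Hello everyone, and welcome to today's talk, "C++ Sanitizers: Code Guardians, Bug Slayers"!

Today we're looking at three of the most useful tools in a C++ programmer's kit: ASan, UBSan, and TSan. Don't be put off by the names. They may sound like secret weapons from a science-fiction film, but they are simply good helpers for finding the bugs hiding in our code. Think of them as a personal physician for your program: they watch it while it runs and raise the alarm the moment something looks wrong.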

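Opening: A Few Words About Bugs

What do we programmers fear most? Not changing requirements, not a looming deadline, but... BUGS! Bugs buried deep in the code are like ghosts: they appear exactly when you least want them to and drive you up the wall. Worse, some hide so well that even your full bag of debugging tricks won't flush them out.

These bugs can crash the program, corrupt data, or even open security holes. And worst of all, they often show up only in production, leaving you red-faced in front of your boss.

So we need some powerful tools to help us hunt these bugs down. This is where sanitizers come in.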

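What Are Sanitizers?

Put simply, a sanitizer is a combination of compiler instrumentation and a runtime library. The compiler inserts checks into your program at build time, and while the program runs those checks watch for memory errors, undefined behavior, and thread-concurrency problems. When a check fails, the sanitizer prints a report with the error type and a stack trace, which helps you locate the bug.

You can think of sanitizers as a health check for your code: they monitor it as it runs and sound the alarm as soon as they see any "unhealthy" behavior.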

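Enter the Stars: ASan, UBSan, TSan

Now let's give a warm welcome to today's three stars:

ASan (AddressSanitizer): the memory-error expert. It detects memory errors such as heap buffer overflows, stack buffer overflows, and use of freed memory.
UBSan (UndefinedBehaviorSanitizer): the undefined-behavior master. It detects behavior the C++ standard leaves undefined, such as signed integer overflow, null pointer dereference, and integer division by zero.
TSan (ThreadSanitizer): the concurrency detective. It detects data races in multithreaded programs, and it can also report lock-order inversions that may lead to deadlock.

We'll meet these three guardians one at a time, with code examples that show how each one works.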

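First Up: ASan, the Memory-Error Terminator

Memory errors are among the most common bugs in C++. They often crash the program or, worse, make it behave unpredictably. ASan exists to catch them.

What Can ASan Do?

Heap buffer overflow (heap-buffer-overflow): reading or writing outside the bounds of a heap allocation.
Stack buffer overflow (stack-buffer-overflow): reading or writing outside the bounds of a stack variable.
Use after free (heap-use-after-free): accessing memory that has already been freed.
Double free (double-free): freeing the same block of memory more than once.
Memory leak: allocated memory that is never freed. Leak detection is handled by LeakSanitizer, which ASan enables by default on Linux.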
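
Two of the errors in this list, double free and memory leak, usually come from unclear ownership of a raw pointer. As a minimal sketch (IntBuffer is a hypothetical class invented for this illustration, not part of ASan or the standard library), the following type frees its array exactly once: copying is disabled so two objects can never own the same allocation, and moving hands the allocation over.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical illustration type: owns a heap array and frees it exactly once.
class IntBuffer {
public:
  explicit IntBuffer(std::size_t n) : data_(new int[n]()), size_(n) {}
  ~IntBuffer() { delete[] data_; }  // delete[] on a null pointer does nothing

  // A copy would give two objects the same pointer, and both destructors
  // would free it (a double free), so copying is disabled.
  IntBuffer(const IntBuffer&) = delete;
  IntBuffer& operator=(const IntBuffer&) = delete;

  // A move transfers ownership and leaves the source empty.
  IntBuffer(IntBuffer&& other) noexcept
      : data_(std::exchange(other.data_, nullptr)),
        size_(std::exchange(other.size_, 0)) {}

  std::size_t size() const { return size_; }

  // Bounds-checked element access; the assert fires in debug builds.
  int& at(std::size_t i) {
    assert(i < size_);
    return data_[i];
  }

private:
  int* data_;
  std::size_t size_;
};
```

In everyday code, std::vector<int> or std::unique_ptr<int[]> gives the same guarantee without writing the class by hand.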

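How Do You Use ASan?

Using ASan is simple: add -fsanitize=address when compiling and linking. Adding -g as well gives the report file names and line numbers, and -fno-omit-frame-pointer gives cleaner stack traces.

g++ -g -fno-omit-frame-pointer -fsanitize=address your_code.cpp -o your_program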

ASan in Action

Let's look at a heap buffer overflow:

#include <iostream>

int main() {
  int *array = new int[10];
  for (int i = 0; i <= 10; ++i) { // Bug: the condition should be i < 10
    array[i] = i;
  }
  delete[] array;
  return 0;
}

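In this example we allocate an array of 10 ints and then write to it in a loop. But the loop condition is i <= 10, so the last iteration writes to array[10], one element past the end of the array. That is a heap buffer overflow.

If we build this program with ASan and run it, ASan catches the error at the moment of the bad write and prints a report like the following (addresses, line numbers, and the exact layout vary with compiler version and platform):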

==30776==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000040 at pc 0x000000400754 bp 0x7ffd4f681140 sp 0x7ffd4f681138
WRITE of size 4 at 0x602000000040 thread T0
    #0 0x400753 in main (/tmp/test.cpp:5)
    #1 0x7f2f7a617d0a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x22d0a)
    #2 0x4005ed in _start (/tmp/test+0x4005ed)

0x602000000040 is located 0 bytes to the right of 40-byte region [0x602000000010,0x602000000038)
allocated by thread T0 here:
    #0 0x41909d in operator new[](unsigned long) (/tmp/test+0x41909d)
    #1 0x40069e in main (/tmp/test.cpp:4)
    #2 0x7f2f7a617d0a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x22d0a)
    #3 0x4005ed in _start (/tmp/test+0x4005ed)

SUMMARY: AddressSanitizer: heap-buffer-overflow /tmp/test.cpp:5 in main
Shadow bytes around the buggy address:
  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
=>fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Invalid freed memory:    ff
  Out of memory:           fe
  ASan internal:           fc
==30776==ABORTING

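This report is very detailed. It tells us:

The error type is heap-buffer-overflow: a heap buffer was overrun.
The faulting address is 0x602000000040.
The first stack frame places the bad access in main in /tmp/test.cpp, at the statement array[i] = i; (reported here as line 5).
The access was a WRITE of size 4, that is, a 4-byte store of one int.
The access happened on thread T0, the main thread.
The second stack trace ("allocated by thread T0 here") shows where the overrun buffer was allocated: the new int[10] in main.

With this information we can quickly find the bug and fix it.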
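
One possible fix for the loop above, sketched under the assumption that the array can be replaced by a std::vector. The helper names make_sequence and set_checked are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: writes one element; the assert documents the
// precondition and fires in debug builds if the index is out of range.
void set_checked(std::vector<int>& values, std::size_t i, int value) {
  assert(i < values.size());
  values[i] = value;
}

// Hypothetical helper: returns {0, 1, ..., n-1}.
std::vector<int> make_sequence(std::size_t n) {
  std::vector<int> values(n);
  // The loop bound comes from the container, so it cannot drift out of sync
  // with the allocation size the way a hand-written 10 can.
  for (std::size_t i = 0; i < values.size(); ++i) {
    set_checked(values, i, static_cast<int>(i));
  }
  return values;
}
```

Calling make_sequence(10) yields ten elements, 0 through 9, and the vector releases its memory automatically, so the delete[] disappears as well.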

One More Example: Use After Free

#include <iostream>

int main() {
  int *ptr = new int;
  *ptr = 10;
  delete ptr;
  *ptr = 20; // Bug: write to memory that has already been freed
  return 0;
}

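In this example we allocate an int, free it, and then write through the dangling pointer. That is a use-after-free error.

ASan catches it and prints a report like this (again, the details vary by toolchain):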

==30804==ERROR: AddressSanitizer: use-after-free WRITE of size 4 at 0x602000000010 thread T0
    #0 0x400753 in main (/tmp/test.cpp:6)
    #1 0x7f2f7a617d0a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x22d0a)
    #2 0x4005ed in _start (/tmp/test+0x4005ed)

0x602000000010 is located 0 bytes inside of 4-byte region [0x602000000010,0x602000000014)
freed by thread T0 here:
    #0 0x41962d in operator delete(void*) (/tmp/test+0x41962d)
    #1 0x400716 in main (/tmp/test.cpp:5)
    #2 0x7f2f7a617d0a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x22d0a)
    #3 0x4005ed in _start (/tmp/test+0x4005ed)

SUMMARY: AddressSanitizer: use-after-free /tmp/test.cpp:6 in main
Shadow bytes around the buggy address:
  fa fa fa fa fd fd fd fd
  fd fd fd fd fd fd fd fd
=>fd fd fd fd 00 00 00 00
  fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Invalid freed memory:    ff
  Out of memory:           fe
  ASan internal:           fc
==30804==ABORTING

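This report tells us that the error type is use-after-free (current ASan versions print it as heap-use-after-free): the program accessed memory after freeing it. The "freed by thread T0 here" stack trace shows where the memory was released.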
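
One way to avoid this class of bug is to let a smart pointer own the object. The sketch below assumes the raw pointer can be replaced by std::unique_ptr. The function name store_and_release is hypothetical.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper mirroring the example: allocate, write, release, and
// never touch the object after it is gone.
int store_and_release() {
  auto ptr = std::make_unique<int>(10);
  *ptr = 20;               // every write happens while the object is alive
  int result = *ptr;       // copy out what is needed before releasing
  ptr.reset();             // frees the int and sets ptr to nullptr
  assert(ptr == nullptr);  // no dangling address is left behind
  return result;
}
```

After reset() the pointer is null rather than dangling, so a mistaken later access is a null dereference, which is much easier to spot than a silent write into freed memory.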

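Second Up: UBSan, the Undefined-Behavior Hunter

The C++ standard leaves many operations undefined. A program that executes one of them can do anything at all, and the resulting bugs are hard to debug because the symptoms may change with the compiler, the optimization level, or the platform. UBSan exists to detect these operations as they happen.

What Can UBSan Do?

The name in parentheses is the corresponding -fsanitize= check.

Signed integer overflow (signed-integer-overflow): the result of a signed arithmetic operation does not fit in its type. Unsigned arithmetic wraps around by definition, so it is not undefined behavior.
Null pointer dereference (null): reading or writing through a null pointer.
Division by zero (integer-divide-by-zero): integer division or remainder with a zero divisor.
Out-of-range shift (shift): a shift count that is negative or not smaller than the bit width of the type.
Invalid virtual call (vptr): using an object whose dynamic type is not what the code expects, for example after a bad cast.
Missing return (return): control reaches the end of a value-returning function without executing a return statement.
Unreachable code reached (unreachable): control flow arrives at a point the program has marked as unreachable, such as a __builtin_unreachable() call.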
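
Several of these operations can be guarded with a plain precondition check. The sketch below covers division and left shift. The helpers checked_div and checked_shl are hypothetical names, and returning std::optional (C++17) is just one possible way to signal "no result".

```cpp
#include <cassert>
#include <limits>
#include <optional>

// Hypothetical helper: integer division that refuses the two undefined cases.
std::optional<int> checked_div(int a, int b) {
  if (b == 0) return std::nullopt;  // division by zero
  if (a == std::numeric_limits<int>::min() && b == -1) {
    return std::nullopt;            // the quotient does not fit in an int
  }
  return a / b;
}

// Hypothetical helper: left shift that refuses an out-of-range shift count.
std::optional<unsigned> checked_shl(unsigned value, int count) {
  constexpr int kBits = std::numeric_limits<unsigned>::digits;
  if (count < 0 || count >= kBits) return std::nullopt;
  return value << count;
}

// Small self-check showing how the helpers behave.
void demo_checked_ops() {
  assert(checked_div(7, 2).value() == 3);
  assert(!checked_div(7, 0).has_value());
  assert(!checked_div(std::numeric_limits<int>::min(), -1).has_value());
  assert(checked_shl(1u, 4).value() == 16u);
  assert(!checked_shl(1u, -1).has_value());
  assert(!checked_shl(1u, std::numeric_limits<unsigned>::digits).has_value());
}
```

Note the second test in checked_div: dividing the smallest int by -1 overflows, so it is undefined even though the divisor is not zero.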

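How Do You Use UBSan?

UBSan is just as simple to use: add -fsanitize=undefined when compiling and linking. By default UBSan prints a diagnostic and lets the program continue; add -fno-sanitize-recover=all if you want the first error to abort the run.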

g++ -fsanitize=undefined your_code.cpp -o your_program

UBSan in Action

Let's look at a signed integer overflow:

#include <iostream>
#include <limits>

int main() {
  int x = std::numeric_limits<int>::max();
  x = x + 1; // Bug: signed integer overflow
  std::cout << x << std::endl;
  return 0;
}

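In this example we set x to the largest value an int can hold and then add 1. The result cannot be represented in an int, so the addition overflows, which is undefined behavior for signed types.

If we build this program with UBSan and run it, UBSan detects the overflow when the addition executes and prints a message like this:

/tmp/test.cpp:6:7: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'

This message tells us that the error is a signed integer overflow, gives the file, line, and column where it happened, and shows the operands involved.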
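
If the overflow is a case the program must handle rather than a plain bug, the addition can be checked before it is performed. This is a minimal sketch. The name checked_add is hypothetical, and returning std::optional (C++17) is an assumption about how the caller wants failure reported.

```cpp
#include <cassert>
#include <limits>
#include <optional>

// Hypothetical helper: returns a + b, or std::nullopt if the sum would not
// fit in an int. The checks themselves never overflow.
std::optional<int> checked_add(int a, int b) {
  constexpr int kMax = std::numeric_limits<int>::max();
  constexpr int kMin = std::numeric_limits<int>::min();
  if (b > 0 && a > kMax - b) return std::nullopt;  // would exceed the maximum
  if (b < 0 && a < kMin - b) return std::nullopt;  // would go below the minimum
  return a + b;
}

// Small self-check showing how the helper behaves.
void demo_checked_add() {
  assert(checked_add(1, 2).value() == 3);
  assert(!checked_add(std::numeric_limits<int>::max(), 1).has_value());
  assert(!checked_add(std::numeric_limits<int>::min(), -1).has_value());
}
```

With this helper the example's x + 1 becomes checked_add(x, 1), which returns an empty optional instead of invoking undefined behavior.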

One More Example: Null Pointer Dereference

#include <iostream>

int main() {
  int *ptr = nullptr;
  *ptr = 10; // Bug: null pointer dereference
  return 0;
}

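In this example we set ptr to null and then write through it. That is a null pointer dereference.

UBSan detects it and prints a message like this:

/tmp/test.cpp:4:3: runtime error: store to null pointer of type 'int'

This message tells us that the program stored through a null pointer, in other words a null pointer dereference. Because UBSan continues after reporting by default, the program then typically crashes with a segmentation fault when the store itself executes.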
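
When a pointer may legitimately be null, the simplest defense is to test it before use. A minimal sketch follows. The function name store_if_valid is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: writes through the pointer only if it is non-null and
// reports whether the write happened.
bool store_if_valid(int* target, int value) {
  if (target == nullptr) return false;
  *target = value;
  return true;
}

// Small self-check showing how the helper behaves.
void demo_store_if_valid() {
  int slot = 0;
  bool stored = store_if_valid(&slot, 10);
  assert(stored);
  assert(slot == 10);
  assert(!store_if_valid(nullptr, 10));
}
```

Where null is never a valid argument, taking an int& instead of an int* states that requirement in the function signature.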

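Third Up: TSan, the Concurrency Investigator

Multithreaded programming is an important part of C++, but multithreaded programs are prone to concurrency problems such as data races and deadlocks. TSan exists to detect them.

What Can TSan Do?

Data race: two or more threads access the same variable at the same time without synchronization, and at least one of the accesses is a write.
Deadlock: threads wait on each other to release resources, so none of them can make progress. TSan reports lock-order inversions, which are potential deadlocks, even on runs where the program does not actually hang.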
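
A classic lock-order inversion occurs when one thread locks A then B while another locks B then A. The sketch below shows one way to avoid it, assuming C++17 is available. Account, transfer, and demo_opposing_transfers are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical illustration type.
struct Account {
  std::mutex mtx;
  long balance = 0;
};

// Locks both accounts in a single step. std::scoped_lock (C++17) acquires
// multiple mutexes with a deadlock-avoidance algorithm, so transfer(a, b) and
// transfer(b, a) can run concurrently without deadlocking.
void transfer(Account& from, Account& to, long amount) {
  if (&from == &to) return;  // never lock the same mutex twice
  std::scoped_lock lock(from.mtx, to.mtx);
  from.balance -= amount;
  to.balance += amount;
}

// Runs opposing transfers on two threads and returns the combined balance.
long demo_opposing_transfers() {
  Account a, b;
  a.balance = 1000;
  b.balance = 1000;
  std::thread t1([&] { for (int i = 0; i < 1000; ++i) transfer(a, b, 1); });
  std::thread t2([&] { for (int i = 0; i < 1000; ++i) transfer(b, a, 1); });
  t1.join();
  t2.join();
  assert(a.balance == 1000 && b.balance == 1000);
  return a.balance + b.balance;
}
```

Had transfer locked from.mtx and then to.mtx with two separate lock_guard objects, the two threads would acquire the mutexes in opposite orders, which is exactly the pattern a lock-order-inversion report describes.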

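How Do You Use TSan?

TSan is also simple to use: add -fsanitize=thread when compiling and linking.

g++ -g -fsanitize=thread -pthread your_code.cpp -o your_program

Note: the example below uses std::thread, which needs POSIX threads support. Passing -pthread at both compile and link time is the portable way to get it and is preferable to -lpthread.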

TSan in Action

Let's look at a data race:

#include <iostream>
#include <thread>

int counter = 0;

void increment() {
  for (int i = 0; i < 100000; ++i) {
    counter++; // Bug: data race
  }
}

int main() {
  std::thread t1(increment);
  std::thread t2(increment);

  t1.join();
  t2.join();

  std::cout << "Counter: " << counter << std::endl;
  return 0;
}

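In this example two threads increment counter concurrently. counter++ is a read-modify-write, and with no synchronization the two threads' reads and writes can interleave, so this is a data race and the final count may be less than 200000.

If we build this program with TSan and run it, TSan detects the race and prints a report like the following (thread IDs, addresses, and line numbers vary from run to run and from toolchain to toolchain):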

==================
WARNING: ThreadSanitizer: data race (pid=30852, tid=30854)
  Read of size 4 at 0x000000404040 by thread T2:
    #0 increment() /tmp/test.cpp:6 (test+0x40095b)
    #1 void std::thread::_State_impl<std::thread::_Invoker<void (*)()>>::_M_run() /usr/include/c++/9/thread:239 (test+0x40118a)
    #2  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xbd632)
    #3 start_thread  (/lib/x86_64-linux-gnu/libpthread.so.0+0x9609)

  Previous write of size 4 at 0x000000404040 by thread T1:
    #0 increment() /tmp/test.cpp:6 (test+0x40095b)
    #1 void std::thread::_State_impl<std::thread::_Invoker<void (*)()>>::_M_run() /usr/include/c++/9/thread:239 (test+0x40118a)
    #2  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xbd632)
    #3 start_thread  (/lib/x86_64-linux-gnu/libpthread.so.0+0x9609)

  Location is global 'counter' of size 4 at 0x000000404040 (test+0x000000404040)

  Thread T2 (tid=30854, running) created at:
    #0 std::thread::thread<void (&)(), , void>(void (&)(), std::tuple<>, std::_Index_tuple<>) /usr/include/c++/9/thread:134 (test+0x4012a6)
    #1 main /tmp/test.cpp:13 (test+0x400a65)
    #2 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x22d0a)

  Thread T1 (tid=30853, running) created at:
    #0 std::thread::thread<void (&)(), , void>(void (&)(), std::tuple<>, std::_Index_tuple<>) /usr/include/c++/9/thread:134 (test+0x4012a6)
    #1 main /tmp/test.cpp:12 (test+0x400a3a)
    #2 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x22d0a)

SUMMARY: ThreadSanitizer: data race /tmp/test.cpp:6 in increment()
==================

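This report tells us that the error type is a data race. It shows the two conflicting accesses (a read by thread T2 and a previous write by thread T1), identifies the memory involved as the global counter, and shows where each thread was created.

How Do You Fix a Data Race?

There are several ways to fix a data race, including a mutex or an atomic variable.

Here is a version that fixes the race with a mutex: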

#include <iostream>
#include <thread>
#include <mutex>

int counter = 0;
std::mutex mtx;

void increment() {
  for (int i = 0; i < 100000; ++i) {
    std::lock_guard<std::mutex> lock(mtx); // Lock; released automatically at the end of each iteration
    counter++;
  }
}

int main() {
  std::thread t1(increment);
  std::thread t2(increment);

  t1.join();
  t2.join();

  std::cout << "Counter: " << counter << std::endl;
  return 0;
}

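In this example the mutex mtx protects counter. Each iteration acquires the lock before touching counter, and the std::lock_guard destructor releases it when the iteration ends. Only one thread at a time can therefore access counter, which eliminates the data race.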
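
The other fix mentioned above, an atomic variable, might look like the following sketch. The function name count_with_atomics is hypothetical, and the counter is made a local variable here only so that the example is self-contained.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: two threads each perform per_thread increments on a
// shared atomic counter; returns the final count.
int count_with_atomics(int per_thread) {
  std::atomic<int> counter{0};
  auto work = [&counter, per_thread] {
    for (int i = 0; i < per_thread; ++i) {
      // fetch_add is a single atomic read-modify-write. Relaxed ordering is
      // enough here because only the count itself is shared.
      counter.fetch_add(1, std::memory_order_relaxed);
    }
  };
  std::thread t1(work);
  std::thread t2(work);
  t1.join();  // join() makes each thread's increments visible to this thread
  t2.join();
  int total = counter.load();
  assert(total == 2 * per_thread);
  return total;
}
```

For a single counter an atomic is usually cheaper than a mutex. A mutex remains the better choice when several variables have to change together.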

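Pros and Cons of Sanitizers

Pros:

High accuracy: sanitizers observe what the program actually does, so what they report is usually a real bug rather than a guess.
Easy to use: a few extra compiler and linker options are all it takes.
Detailed reports: the reports include the error type, the location, and stack traces, which helps you find the bug quickly.
Runtime detection: sanitizers check the program as it executes, so they can catch errors that static analysis tools miss. The flip side is that they see only the code paths your tests actually run.

Cons:

Performance overhead: sanitizers increase both running time and memory use. As a rough guide from the Clang documentation, ASan slows a program down by about 2x and TSan by about 5x to 15x, while UBSan's overhead is small.
False positives: they are possible in some situations. TSan, for instance, can report spurious races when part of the program (such as a third-party library) was built without instrumentation.
Not a cure-all: each sanitizer detects specific classes of errors, and none finds every bug. ASan and TSan also cannot be enabled in the same build, so checking with both means building and testing twice.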

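A Few Practical Tips

Start early: the earlier you turn sanitizers on, the easier the bugs are to find.
Use continuous integration: run your test suite under sanitizers in CI so that new bugs are caught automatically.
Combine with other tools: sanitizers work well alongside debuggers such as gdb.
Read the report: reading the sanitizer's report carefully is the quickest route to the bug.
Understand false positives: know the sanitizers' limitations so that a false positive doesn't send you down the wrong path.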

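Summary: Sanitizers, You Deserve Them!

In short, ASan, UBSan, and TSan are three of the most useful tools a C++ programmer has. They help us find bugs hidden in our code and improve its quality and reliability. Sanitizers do add runtime overhead, so they are normally enabled in development and test builds rather than in production, but compared with what a bug can cost, that overhead is entirely acceptable.

So starting today, pick up your sanitizers and make your code healthier and stronger!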

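Summary Table

Tool	Purpose	Errors detected	How to enable	Performance overhead
ASan	Memory error detection	Heap buffer overflow, stack buffer overflow, use after free, double free, memory leak	`-fsanitize=address`	Moderate (about 2x slowdown)
UBSan	Undefined behavior detection	Signed integer overflow, null pointer dereference, division by zero, out-of-range shift, invalid virtual call, missing return, unreachable code reached	`-fsanitize=undefined`	Low
TSan	Thread concurrency checking	Data race, deadlock (lock-order inversion)	`-fsanitize=thread -pthread`	High (about 5x to 15x slowdown)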

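A Final Word of Advice

Remember, sanitizers are only tools. They are no substitute for thinking and testing. While you use them, keep paying attention to coding standards and readability; writing high-quality code is what really reduces the number of bugs.

That's all for today's talk. I hope you found it useful. Thanks for watching!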
