Python `asyncio` performance tuning: controlling concurrency with `asyncio.gather` and `asyncio.Semaphore` - 智猿学院


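Python `asyncio` Performance Tuning: Concurrency Control and Efficiency

Hello everyone. Today we take a close look at performance tuning in Python's asyncio framework, focusing on how to use asyncio.gather and asyncio.Semaphore to control concurrency and improve overall performance. asyncio gives you asynchronous programming, but used carelessly it can create bottlenecks of its own. Understanding how concurrency control works, in principle and in practice, is therefore essential.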

`asyncio.gather`: the workhorse for running tasks concurrently

asyncio.gather runs multiple coroutines concurrently and waits for all of them to finish. Basic usage:

import asyncio

async def my_coroutine(i):
  print(f"Coroutine {i} started")
  await asyncio.sleep(1)  # simulate a slow operation
  print(f"Coroutine {i} finished")
  return f"Result from {i}"

async def main():
  tasks = [my_coroutine(i) for i in range(5)]
  results = await asyncio.gather(*tasks)
  print(f"All results: {results}")

if __name__ == "__main__":
  asyncio.run(main())

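In this example, asyncio.gather runs five my_coroutine coroutines concurrently and, once all of them have finished, collects their return values into a list. The *tasks syntax unpacks the tasks list so that each coroutine is passed to gather as a separate positional argument. Note that the list holds plain coroutine objects; gather wraps each one in a Task and schedules it.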

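Advantages of asyncio.gather:

Concurrency: significantly reduces total run time, especially for I/O-bound work.
Result collection: returns every coroutine's return value in a single list.
Exception handling: by default, the first exception raised by any coroutine is propagated immediately to the code awaiting gather. The other coroutines are not cancelled; they keep running in the background.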
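The default error behaviour is easy to check for yourself. The following is a minimal sketch; the names failing, slow, and the finished list are invented for this illustration. It shows that gather raises the first exception right away while the sibling coroutines still run to completion.

```python
import asyncio

finished = []  # hypothetical log of coroutines that ran to completion

async def failing():
    await asyncio.sleep(0.01)
    raise ValueError("boom")

async def slow(name):
    await asyncio.sleep(0.05)
    finished.append(name)

async def demo():
    finished.clear()
    try:
        await asyncio.gather(failing(), slow("a"), slow("b"))
    except ValueError as exc:
        print(f"gather raised: {exc}")
    # gather has already raised, but the siblings were not cancelled.
    # Give them time to finish before the event loop shuts down.
    await asyncio.sleep(0.1)
    return list(finished)

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

If the siblings should stop when one of them fails, they have to be cancelled explicitly, for example by creating Task objects yourself and calling cancel() on the ones that are still pending.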

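Limitations of asyncio.gather:

Completion order: the order in which coroutines finish is not deterministic and depends on how long each one takes. The result list, however, always follows the order of the arguments.
Resource usage: gather starts everything at once, so a very large number of tasks can consume too much memory or too many connections and cause performance problems.
Error handling: with the default settings one failure surfaces immediately and the caller loses the results of the remaining coroutines. Often we would rather let every coroutine finish and inspect the failures afterwards.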
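To make the ordering point concrete, here is a small sketch (the job helper and the delays are made up for the example). The coroutines finish in reverse order, yet the list returned by gather follows the order in which they were passed in.

```python
import asyncio

completion_order = []  # records the order in which jobs actually finish

async def job(i, delay):
    await asyncio.sleep(delay)
    completion_order.append(i)
    return i

async def demo():
    completion_order.clear()
    # Job 0 is the slowest and job 2 the fastest.
    results = await asyncio.gather(job(0, 0.06), job(1, 0.04), job(2, 0.02))
    return results, list(completion_order)

if __name__ == "__main__":
    results, order = asyncio.run(demo())
    print("results:", results)          # argument order
    print("completion order:", order)   # fastest first
```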

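`asyncio.Semaphore`: an effective way to limit concurrency

To address gather's resource usage problem we need a concurrency limit, and asyncio.Semaphore is a common choice. A Semaphore maintains an internal counter that represents the number of available slots. A coroutine calls acquire() to take a slot, which decrements the counter, and release() to give it back, which increments the counter. When the counter is 0, acquire() suspends the calling coroutine until another coroutine releases a slot. Only that coroutine waits; the event loop keeps running other tasks.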

import asyncio

async def worker(semaphore, i):
  async with semaphore:
    print(f"Worker {i} acquired semaphore")
    await asyncio.sleep(1)  # simulate a slow operation
    print(f"Worker {i} released semaphore")

async def main():
  semaphore = asyncio.Semaphore(3)  # allow at most 3 workers at a time
  tasks = [worker(semaphore, i) for i in range(5)]
  await asyncio.gather(*tasks)

if __name__ == "__main__":
  asyncio.run(main())

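In this example we create a Semaphore with an initial value of 3, which means at most three worker coroutines can be inside the async with block at the same time. The others wait until a slot is released. The async with semaphore: statement guarantees that the Semaphore is released when the block exits, even if an exception is raised. It is a safer and more concise way of writing await semaphore.acquire(); try: ... finally: semaphore.release()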
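For comparison, this is what the explicit acquire/release form looks like when written out. It is only a sketch: worker_explicit, the log list, and the deliberate failure of worker 1 are invented to show that the slot is returned even when the body raises.

```python
import asyncio

async def worker_explicit(semaphore, i, log):
    await semaphore.acquire()      # waits, without blocking the loop, while the counter is 0
    try:
        log.append(f"start {i}")
        await asyncio.sleep(0.01)
        if i == 1:
            raise RuntimeError("worker 1 failed")
    finally:
        semaphore.release()        # runs on success, on error, and on cancellation

async def demo():
    semaphore = asyncio.Semaphore(1)
    log = []
    results = await asyncio.gather(
        *(worker_explicit(semaphore, i, log) for i in range(3)),
        return_exceptions=True,
    )
    # If a release had been skipped, the semaphore would still be locked here.
    return log, results, semaphore.locked()

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The async with form does the same thing in two lines, which is why it is usually preferred.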

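What asyncio.Semaphore is good for:

Limiting concurrency: prevents excessive resource usage and the bottlenecks that come with it.
Resource management: caps how many coroutines use a shared resource at once. Strict mutual exclusion needs a limit of 1 or an asyncio.Lock.
Flow control: smooths out bursts of traffic so that services are not overloaded.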

Combining `asyncio.gather` and `asyncio.Semaphore`

Used together, asyncio.gather and asyncio.Semaphore give you concurrent execution with an upper bound on how much runs at once.

import asyncio

async def fetch_url(semaphore, url):
  async with semaphore:
    print(f"Fetching URL: {url}")
    await asyncio.sleep(0.5)  # simulate a network request
    print(f"Finished fetching URL: {url}")
    return f"Content from {url}"

async def main():
  urls = [f"http://example.com/{i}" for i in range(10)]
  semaphore = asyncio.Semaphore(5)  # allow at most 5 concurrent requests
  tasks = [fetch_url(semaphore, url) for url in urls]
  results = await asyncio.gather(*tasks)
  print(f"All results: {results}")

if __name__ == "__main__":
  asyncio.run(main())

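In this example we simulate fetching several URLs concurrently. The Semaphore limits the number of in-flight requests to 5 so that we do not put too much load on the example.com server. Without it, all 10 requests would be sent at once. At this scale that is harmless, but with thousands of URLs the same pattern can slow the server down or even bring it down.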
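One way to convince yourself that the limit is actually enforced is to count how many coroutines are inside the semaphore at the same time. The sketch below is a variation of the example above; the stats dictionary and its in_flight and peak keys are assumptions added purely for measurement.

```python
import asyncio

async def fetch_url(semaphore, url, stats):
    async with semaphore:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)  # stand-in for a network request
        stats["in_flight"] -= 1
        return f"Content from {url}"

async def demo(limit=5, n=10):
    stats = {"in_flight": 0, "peak": 0}
    semaphore = asyncio.Semaphore(limit)
    urls = [f"http://example.com/{i}" for i in range(n)]
    results = await asyncio.gather(*(fetch_url(semaphore, u, stats) for u in urls))
    return stats["peak"], results

if __name__ == "__main__":
    peak, results = asyncio.run(demo())
    print(f"peak concurrency: {peak}, results: {len(results)}")
```

The counter needs no lock because there is no await between reading and updating it, so no other coroutine can run in between.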

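Advanced usage and optimization tips

Custom semaphores: you can subclass asyncio.Semaphore to implement more elaborate policies, for example a priority scheme that lets certain coroutines acquire a slot first.
Timeouts: combine with asyncio.wait_for to give each coroutine a time limit. If the coroutine has not finished within that time, it is cancelled.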

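import asyncio

async def my_coroutine(i):
  await asyncio.sleep(2)
  return f"Coroutine {i} completed"

async def run_with_timeout(i):
  # wait_for raises TimeoutError to its caller, so catch it here
  try:
    return await asyncio.wait_for(my_coroutine(i), timeout=1)
  except asyncio.TimeoutError:
    return f"Coroutine {i} timed out"

async def main():
  tasks = [run_with_timeout(i) for i in range(3)]
  results = await asyncio.gather(*tasks)
  print(f"Results: {results}")

if __name__ == "__main__":
  asyncio.run(main())

In this example each my_coroutine call gets a 1-second timeout. When the time is up, wait_for cancels the coroutine (a CancelledError is raised inside it) and raises asyncio.TimeoutError to its own caller. That is why the exception is caught in run_with_timeout, around the wait_for call, and not inside my_coroutine. Without that handler the TimeoutError would propagate out of gather.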
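The division of labour between the cancelled coroutine and the caller of wait_for can be traced with a short sketch. The slow_job coroutine and the events list are hypothetical names used only for this demonstration.

```python
import asyncio

events = []  # hypothetical trace of what happens, in order

async def slow_job():
    try:
        await asyncio.sleep(1)
        return "done"
    except asyncio.CancelledError:
        events.append("cancelled inside slow_job")
        raise  # re-raise so the cancellation can complete
    finally:
        events.append("cleanup")

async def demo():
    events.clear()
    try:
        await asyncio.wait_for(slow_job(), timeout=0.05)
    except asyncio.TimeoutError:
        events.append("TimeoutError seen by caller")
    return list(events)

if __name__ == "__main__":
    for line in asyncio.run(demo()):
        print(line)
```

The coroutine itself only ever sees CancelledError, which makes a finally block the right place for cleanup such as closing connections.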

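Exception handling: you can change gather's default behaviour so that a failure in one coroutine does not hide the results of the others. With return_exceptions=True, gather returns exceptions as items in the result list instead of raising the first one.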

import asyncio

async def my_coroutine(i):
  if i == 1:
    raise ValueError("Something went wrong")
  await asyncio.sleep(0.5)
  return f"Result from {i}"

async def main():
  tasks = [my_coroutine(i) for i in range(3)]
  results = await asyncio.gather(*tasks, return_exceptions=True)
  print(f"Results: {results}")

  for result in results:
    if isinstance(result, Exception):
      print(f"Caught exception: {result}")

if __name__ == "__main__":
  asyncio.run(main())

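In this example my_coroutine(1) raises a ValueError. Because return_exceptions=True is set, gather waits for the remaining coroutines and returns the ValueError as the second item of the result list. Checking the type of each result tells us which coroutines failed.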

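Streaming: for workloads with large amounts of data, use asyncio.Queue. Put the data on the queue in batches and let several coroutines take items from it and process them concurrently.
Choosing the right concurrency level: higher is not always better. Tune it to the task type, the hardware, and the network conditions, and use a performance-testing tool to find the best value. As a rule of thumb: coroutines run on a single thread, so CPU-bound work gains nothing from more coroutines; offload it to a process pool (for example with loop.run_in_executor) with a worker count close to the number of CPU cores. For I/O-bound work the limit can be raised, but avoid putting too much load on the server.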
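A simple way to explore the concurrency level is to run the same workload under several limits and compare the wall-clock time. The sketch below does this with a simulated I/O task; io_task, run_with_limit, sweep, and the chosen limits are all assumptions for illustration.

```python
import asyncio
import time

async def io_task(semaphore):
    async with semaphore:
        await asyncio.sleep(0.02)  # stand-in for one I/O-bound request

async def run_with_limit(limit, n_tasks=20):
    semaphore = asyncio.Semaphore(limit)
    start = time.perf_counter()
    await asyncio.gather(*(io_task(semaphore) for _ in range(n_tasks)))
    return time.perf_counter() - start

async def sweep(limits=(1, 5, 20)):
    timings = {}
    for limit in limits:
        timings[limit] = await run_with_limit(limit)
    return timings

if __name__ == "__main__":
    for limit, elapsed in asyncio.run(sweep()).items():
        print(f"limit={limit:>2}: {elapsed:.3f}s")
```

With a pure sleep, a higher limit always looks better. Against a real server the curve usually flattens or turns upward at some point, and that point is what the measurement is meant to find.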

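Case studies

Case 1: web crawler

Suppose we need a web crawler that fetches many pages concurrently. asyncio.gather and asyncio.Semaphore let us bound the concurrency so that the target site is not put under too much load.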

import asyncio
import aiohttp

async def fetch_page(semaphore, session, url):
  async with semaphore:
    try:
      async with session.get(url) as response:
        if response.status == 200:
          return await response.text()
        else:
          print(f"Error fetching {url}: {response.status}")
          return None
    except Exception as e:
      print(f"Exception fetching {url}: {e}")
      return None

async def main():
  urls = ["https://www.example.com"] * 20  # 20 copies of the same URL, for demonstration
  semaphore = asyncio.Semaphore(10)  # allow at most 10 concurrent requests
  async with aiohttp.ClientSession() as session:
    tasks = [fetch_page(semaphore, session, url) for url in urls]
    results = await asyncio.gather(*tasks)
    # Process the fetched results
    for i, result in enumerate(results):
      if result is not None:  # an empty page is still a successful fetch
        #print(f"Content from {urls[i]}: {result[:100]}...") # print the first 100 characters
        pass
      else:
        print(f"Failed to fetch {urls[i]}")

if __name__ == "__main__":
  asyncio.run(main())

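In this example we use the aiohttp library for asynchronous HTTP requests. The Semaphore limits the number of concurrent requests to 10 to avoid putting too much load on example.com. The try...except block handles exceptions that may occur during a request.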

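Case 2: data processing pipeline

Suppose we need to build a data processing pipeline that reads data from several sources, transforms it, and writes it to a database. asyncio.Queue and asyncio.Semaphore make efficient streaming possible.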

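import asyncio
import random

NUM_PROCESSORS = 5

async def data_source(queue, num_items):
  for i in range(num_items):
    data = random.randint(1, 100)  # simulate generating data
    await queue.put(data)
    print(f"Produced: {data}")
    await asyncio.sleep(0.1)  # simulate the production rate

async def data_processor(queue, semaphore, output_queue):
  while True:
    data = await queue.get()
    try:
      if data is None:  # end marker
        break
      async with semaphore:  # at most 3 items are processed at the same time
        await asyncio.sleep(0.2)  # simulate processing time
        processed_data = data * 2  # simulate data processing
      await output_queue.put(processed_data)
      print(f"Processed: {data} -> {processed_data}")
    finally:
      queue.task_done()  # count every item, including the end marker

async def data_writer(queue):
  while True:
    data = await queue.get()
    try:
      if data is None:  # end marker
        break
      await asyncio.sleep(0.3)  # simulate write latency
      print(f"Written: {data}")
    finally:
      queue.task_done()

async def main():
  queue1 = asyncio.Queue()
  queue2 = asyncio.Queue()
  semaphore = asyncio.Semaphore(3)  # limit how many items are processed at once

  # Start the producer, the processors, and the writer
  producer = asyncio.create_task(data_source(queue1, 20))
  processors = [asyncio.create_task(data_processor(queue1, semaphore, queue2)) for _ in range(NUM_PROCESSORS)]
  writer = asyncio.create_task(data_writer(queue2))

  # Wait for the producer, then send one end marker per processor
  await producer
  for _ in range(NUM_PROCESSORS):
    await queue1.put(None)
  await queue1.join()  # every item in queue1 has been handled
  await asyncio.gather(*processors)

  # All processed data is now in queue2; tell the writer to stop
  await queue2.put(None)
  await queue2.join()  # every item in queue2 has been written
  await writer

if __name__ == "__main__":
  asyncio.run(main())

In this example data_source generates data and puts it on queue1. Each data_processor takes items from queue1, processes them, and puts the results on queue2. data_writer takes items from queue2 and writes them out (here it only prints; a real pipeline would write to a database). The Semaphore is acquired per item, so at most three items are being processed at any moment even though five processors are running. The None marker tells a consumer to stop: main puts one marker per processor on queue1 after the producer finishes, and a single marker on queue2 once all processors are done. queue.join() waits until task_done() has been called for every item that was put on the queue, which is why the consumers call task_done() in a finally block, including for the marker.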
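A related flow-control knob is the queue size. asyncio.Queue accepts a maxsize argument, and put() then suspends the producer whenever the queue is full, so a fast producer cannot run arbitrarily far ahead of a slow consumer. The sketch below illustrates this; the producer and consumer helpers and the max_seen bookkeeping are invented for the example.

```python
import asyncio

async def producer(queue, n, max_seen):
    for i in range(n):
        await queue.put(i)  # suspends while the queue is full
        max_seen[0] = max(max_seen[0], queue.qsize())
    await queue.put(None)   # end marker

async def consumer(queue, out):
    while True:
        item = await queue.get()
        if item is None:
            break
        await asyncio.sleep(0.001)  # the consumer is slower than the producer
        out.append(item)

async def demo(maxsize=3, n=20):
    queue = asyncio.Queue(maxsize=maxsize)
    max_seen = [0]  # largest queue size observed by the producer
    out = []
    await asyncio.gather(producer(queue, n, max_seen), consumer(queue, out))
    return max_seen[0], out

if __name__ == "__main__":
    peak, out = asyncio.run(demo())
    print(f"largest queue size seen: {peak}, items consumed: {len(out)}")
```

With the default maxsize of 0 the queue is unbounded, and memory usage grows with the gap between production and consumption rates.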

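Performance testing and analysis

Performance testing is an essential step when tuning asyncio code. The following tools are useful:

timeit module: measures the execution time of code snippets.
cProfile module: profiles the code to locate performance bottlenecks.
Third-party load-testing tools: for example locust and wrk, which simulate high-concurrency scenarios.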

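Keep the following in mind when testing:

Choose a suitable test scenario: it should be as close as possible to the real application.
Control variables: minimise interference from other factors during the test, such as network jitter and CPU load.
Repeat the test: run it several times and take the average to get more accurate results.
Monitor resource usage: use tools such as top and htop to watch CPU, memory, and network usage.

Performance testing reveals where the program's bottlenecks are, so that you can optimize them specifically.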
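As a starting point for repeated measurements, the following sketch times an asyncio workload several times with time.perf_counter and reports the mean and the standard deviation. The workload coroutine and the benchmark helper are placeholders; substitute the code you actually want to measure.

```python
import asyncio
import statistics
import time

async def workload():
    # Placeholder workload: 50 concurrent simulated I/O waits.
    await asyncio.gather(*(asyncio.sleep(0.01) for _ in range(50)))

def benchmark(coro_factory, repeats=5):
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        asyncio.run(coro_factory())  # note: includes event loop start-up cost
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.stdev(timings)

if __name__ == "__main__":
    mean, stdev = benchmark(workload)
    print(f"mean {mean * 1000:.1f} ms, stdev {stdev * 1000:.1f} ms")
```

A large standard deviation relative to the mean is a hint that something outside the program, such as network jitter or CPU load, is disturbing the measurement.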

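Performance tuning tips

Optimization	Description
Reduce I/O	Avoid unnecessary I/O such as file reads and writes or network requests.
Use caching	Cache frequently accessed data to speed up access.
Avoid global locks	A global lock limits concurrency, so avoid it where possible. If you must lock, use finer-grained locks, for example one `asyncio.Lock` per resource. Note that asyncio does not provide an `RLock`, and `asyncio.Lock` is not reentrant.
Choose suitable data structures	The right data structure improves efficiency. For frequent lookups use a dict or a set; for frequent insertion and removal at either end use `collections.deque`.
Use faster libraries	Replace standard-library components with faster alternatives where it helps, for example `uvloop` in place of the default event loop or `orjson` in place of the `json` module.
Optimize algorithms	A better algorithm reduces the amount of computation and improves execution speed.
Code optimization	Concise, efficient code improves the program's overall performance.
Use a JIT compiler	A JIT-compiling implementation such as `PyPy` can speed up Python programs; check first that the extension modules you depend on are supported.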
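To illustrate the fine-grained locking tip, here is a minimal sketch that keeps one asyncio.Lock per account instead of a single global lock. The account names, the deposit helper, and the balances dictionary are made up for the example.

```python
import asyncio
from collections import defaultdict

async def demo():
    locks = defaultdict(asyncio.Lock)  # one lock per key, created on first use
    balances = {"alice": 0, "bob": 0}

    async def deposit(account, amount):
        async with locks[account]:
            current = balances[account]
            # An await between read and write is where races creep in;
            # the per-account lock keeps the update atomic.
            await asyncio.sleep(0.001)
            balances[account] = current + amount

    await asyncio.gather(
        *(deposit("alice", 1) for _ in range(10)),
        *(deposit("bob", 2) for _ in range(10)),
    )
    return balances

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Deposits to different accounts never wait for each other, while deposits to the same account are serialised.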

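Summary: concurrency control and performance optimization

Today we looked in depth at how asyncio.gather and asyncio.Semaphore are used for concurrency control and performance optimization. Used sensibly, they can noticeably improve the performance and stability of asynchronous programs. I hope this session helps you get a better grip on asyncio and write more efficient asynchronous applications.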
