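Conversational Search in `Search Generative Experience` (`SGE`): How It Disrupts the Traditional `SERP` - 智猿学院: Lectures on Cutting-Edge Technology in Front-End and Back-End Development, Databases, Artificial Intelligence, Cloud Computing, and More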

Conversational Search in SGE: A Technical Analysis and Its Disruption of the Traditional SERP

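Hello everyone. Today we'll talk about Google's Search Generative Experience (SGE) and its core feature, conversational search. As programmers, we should understand not only what SGE is but also the technical principles behind it and the disruptive effect it has on the traditional search engine results page (SERP).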

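1. Limitations of the Traditional SERP

Before diving into SGE, let's review how the traditional SERP works and where it falls short. A traditional SERP is essentially a list of links, plus some ads and featured snippets. Users have to browse those links, extract the information themselves, and assemble an answer.

Fragmented information: Users have to click through several links to piece together the complete information they need.
High comprehension cost: Users must analyze and understand the content of each page on their own before arriving at a final answer.
Lack of interactivity: Users can explore information only by clicking links; they cannot interact with the search engine in any deeper way.
Fierce SEO competition: To rank higher, sites often over-optimize their content, which degrades the user experience.

To make this concrete, we can reduce the traditional SERP workflow to the following Python code: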

class SERP:
    def __init__(self, query, results):
        self.query = query
        self.results = results  # A list of web page URLs

    def display_results(self):
        print(f"Search results for: {self.query}")
        for i, url in enumerate(self.results):
            print(f"{i+1}. {url}")

    def get_information(self, url):
        # Placeholder for fetching and parsing web page content
        # In reality, this would involve web scraping and text processing
        return f"Information from {url}"

# Example usage
results = [
    "https://www.example.com/article1",
    "https://www.example.com/article2",
    "https://www.example.com/article3",
]

serp = SERP("What is Python?", results)
serp.display_results()

url_to_explore = results[0]
information = serp.get_information(url_to_explore)
print(f"\nExtracting information from: {url_to_explore}")
print(information)

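This code simulates a simple SERP in which the user has to pick a link manually and extract information from it. It highlights how passive the traditional SERP is and how scattered its information is.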

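2. The Core of SGE: Conversational Search

SGE aims to address the limitations of the traditional SERP. By introducing conversational search, it offers a smarter, more personalized search experience. Instead of just receiving a list of links, users can hold a conversation with the search engine, refine their query step by step, and get more precise and more complete answers.

The core technologies behind SGE include:

Natural language processing (NLP): Used to understand the intent behind a user's query and to generate natural, fluent responses.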
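Large language models (LLMs): Used to generate summaries, answer questions, and carry on a conversation. Google has described SGE as powered by several LLMs, including PaLM 2 and an advanced version of MUM.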
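Knowledge graph: Provides structured knowledge that helps the LLM understand relationships between entities.
Retrieval-augmented generation (RAG): Supplies retrieved, relevant documents to the LLM as context, improving the accuracy and reliability of the generated answer.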
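
To make the retrieval-augmented generation idea from the list above more tangible, here is a minimal sketch. It is not how SGE is implemented: the corpus `DOCUMENTS`, the token-overlap ranking in `retrieve`, and the `fake_llm` stand-in are all assumptions made for illustration. It only shows the shape of the pipeline: retrieve passages, put them into the prompt, then generate.

```python
import re
from collections import Counter

# Hypothetical in-memory corpus; a real system would query a search index.
DOCUMENTS = {
    "doc1": "Python is a high-level programming language known for readability.",
    "doc2": "A SERP is the page a search engine returns for a query.",
    "doc3": "Retrieval augmented generation feeds retrieved documents to a language model.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents, top_k=2):
    """Rank documents by shared query tokens (a crude stand-in for real retrieval)."""
    query_tokens = Counter(tokenize(query))
    scored = []
    for doc_id, text in documents.items():
        overlap = sum((query_tokens & Counter(tokenize(text))).values())
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties are broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def build_prompt(query, doc_ids, documents):
    """Assemble the retrieved passages and the question into one prompt string."""
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in doc_ids)
    return f"Answer using only the sources below.\n{context}\nQuestion: {query}"


def fake_llm(prompt):
    """Placeholder for an LLM call; it only reports which sources it was given."""
    sources = re.findall(r"^\[(\w+)\]", prompt, flags=re.MULTILINE)
    return f"(answer grounded in {', '.join(sources)})"


query = "What is Python programming?"
doc_ids = retrieve(query, DOCUMENTS)
print(fake_llm(build_prompt(query, doc_ids, DOCUMENTS)))
```

Because the retrieved document ids travel with the prompt, the final answer can cite them, which is what makes the output easier to verify than a free-form generation.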

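The key to conversational search is that it lets users refine a query step by step to get a more precise answer. For example, a user might first ask "What is the best restaurant in New York?", then follow up with "That is affordable?", and finally ask "Does it serve vegetarian options?". SGE can understand the context of these questions and adjust its answers to the user's needs.

We can simulate a simplified conversational search flow with the following Python code: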

class ConversationalSearch:
    def __init__(self, knowledge_base):
        self.knowledge_base = knowledge_base  # A dictionary-like structure storing information
        self.context = {}  # Store the context of the conversation

    def process_query(self, query):
        # Placeholder for NLP to understand user intent
        intent = self.analyze_intent(query)

        # Update context based on the query
        self.update_context(intent)

        # Retrieve relevant information from the knowledge base
        information = self.retrieve_information(intent)

        # Generate a response using a simplified LLM
        response = self.generate_response(information)

        return response

    def analyze_intent(self, query):
        # Simplified keyword-based intent analysis (replace with a real NLP model)
        text = query.lower()
        intent = {"criteria": set()}
        if "restaurant" in text:
            intent["entity"] = "restaurant"
        for keyword in ("best", "affordable", "vegetarian"):
            if keyword in text:
                intent["criteria"].add(keyword)
        return intent

    def update_context(self, intent):
        # Accumulate criteria across turns instead of overwriting earlier ones
        if "entity" in intent:
            self.context["entity"] = intent["entity"]
        self.context.setdefault("criteria", set()).update(intent["criteria"])

    def retrieve_information(self, intent):
        # Filter the knowledge base by every criterion collected so far.
        # `intent` is kept for interface compatibility; filtering uses the context.
        entity = self.context.get("entity")
        criteria = self.context.get("criteria", set())
        results = []
        for item, details in self.knowledge_base.items():
            if entity and not item.startswith(entity):
                continue
            if "affordable" in criteria and details.get("price") != "affordable":
                continue
            if "vegetarian" in criteria and details.get("vegetarian") is not True:
                continue
            results.append((item, details))
        return results

    def generate_response(self, information):
        if not information:
            return "Sorry, I couldn't find any matching results."
        else:
            response = "The following restaurants match your criteria:\n"
            for item, details in information:
                response += f"- {item}: {details}\n"
            return response

# Example usage with a simple knowledge base
knowledge_base = {
    "restaurant_A": {"price": "affordable", "vegetarian": True, "cuisine": "Italian"},
    "restaurant_B": {"price": "expensive", "vegetarian": False, "cuisine": "French"},
    "restaurant_C": {"price": "affordable", "vegetarian": False, "cuisine": "Mexican"}
}

search_engine = ConversationalSearch(knowledge_base)

query1 = "What is the best restaurant?"
response1 = search_engine.process_query(query1)
print(f"User: {query1}\nSGE: {response1}\n")

query2 = "That is affordable?"
response2 = search_engine.process_query(query2)
print(f"User: {query2}\nSGE: {response2}\n")

query3 = "Does it serve vegetarian options?"
response3 = search_engine.process_query(query3)
print(f"User: {query3}\nSGE: {response3}\n")

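This code simulates a simple conversational search system. It stores information in knowledge_base, keeps the conversation state in context, and narrows the results step by step as the user asks follow-up questions. Note that this is only a simplified model; a real SGE system is far more complex and uses much more advanced NLP and LLM techniques.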

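3. How SGE Disrupts the Traditional SERP

Conversational search in SGE disrupts the traditional SERP in several ways:

From link list to answer engine: SGE no longer just provides a list of links; it generates an answer directly and supplies relevant context and explanation.
From passive search to active conversation: Users can converse with the search engine, refine their query step by step, and get more precise answers.
From fragmented information to integrated knowledge: SGE consolidates information from multiple sources to provide more complete, better-structured knowledge.
From SEO competition to user experience first: Sites no longer need to over-optimize their content for rankings and should instead focus on providing high-quality, valuable information.

Concretely, the differences show up in the following areas: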

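Feature	Traditional SERP	SGE conversational search
Presentation	Link list, ads, featured snippets	Generative summary, conversational interface, related links
User interaction	Clicking links, browsing pages	Conversing with the search engine, refining queries, exploring step by step
Information integration	Scattered; users integrate it themselves	Integrated; more complete, better-structured knowledge
Focus	SEO ranking, site traffic	User experience, answer accuracy and relevance
Core technology	Keyword matching, link analysis	Natural language processing, large language models, knowledge graph, retrieval-augmented generation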

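4. Technical Challenges for SGE

Despite its great potential, SGE also faces a number of technical challenges:

Answer accuracy and reliability: LLMs can generate inaccurate or fabricated answers, so effective mechanisms are needed to verify and correct them.
Source attribution and transparency: The sources of an answer need to be clearly labeled and linked so that users can verify the information further.
Bias and fairness: LLMs can inherit bias from their training data, so measures are needed to keep answers fair and objective.
Compute cost and efficiency: LLMs are expensive to run, so both algorithms and hardware need to be optimized to improve efficiency.
Context understanding and dialogue management: More advanced NLP techniques are needed to understand complex queries and manage multi-turn conversations.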
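
The first two challenges in the list above, accuracy and source transparency, can be illustrated with a small sketch. It pairs each sentence of a generated answer with the source that best supports it and flags sentences that no source supports. The function `attribute_sentences`, the 0.6 threshold, and the sample data are assumptions made for this example; token overlap is only a crude lexical proxy and not a real fact-checking method.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def attribute_sentences(answer, sources, threshold=0.6):
    """Pair each answer sentence with its best-supporting source URL, or None.

    Support is the fraction of the sentence's tokens that also appear in the
    source text. Sentences below the threshold are marked as unsupported.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    report = []
    for sentence in sentences:
        sentence_tokens = tokenize(sentence)
        best_url, best_score = None, 0.0
        for url, text in sources.items():
            score = len(sentence_tokens & tokenize(text)) / max(len(sentence_tokens), 1)
            if score > best_score:
                best_url, best_score = url, score
        report.append((sentence, best_url if best_score >= threshold else None))
    return report


# Hypothetical retrieved sources and a hypothetical generated answer.
sources = {
    "https://www.example.com/article1": "Python is an interpreted programming language.",
    "https://www.example.com/article2": "Python was first released in 1991.",
}
answer = "Python is an interpreted language. Python was invented on the Moon."

for sentence, url in attribute_sentences(answer, sources):
    label = f"[source: {url}]" if url else "[unsupported: needs review]"
    print(f"{sentence} {label}")
```

In this sample the first sentence is attributed to article1, while the second shares too few tokens with either source and is flagged for review.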

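5. What SGE Means for Developers and SEO

The arrival of SGE has far-reaching implications for developers and for SEO:

A shift in content strategy: Developers need to focus more on content quality and user experience rather than on excessive SEO optimization.
The importance of structured data: Structured data helps SGE better understand page content and use it when generating answers.
The value of domain knowledge graphs: Building a domain-specific knowledge graph can help SGE provide more precise, more authoritative answers.
New SEO strategies: Traditional SEO tactics may stop working, so developers need to explore new ones, such as focusing on high-quality content and marking it up with structured data.

6. Code Example: Using Schema.org Structured Data

To help SGE better understand page content, we can use the structured data markup defined by Schema.org. For example, for a recipe page, the following JSON-LD code marks up the recipe's name, description, ingredients, steps, and other details: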

<script type="application/ld+json">
{
  "@context": "https://schema.org/",
  "@type": "Recipe",
  "name": "Delicious Chocolate Cake",
  "image": [
    "https://example.com/photos/1x1/photo.jpg",
    "https://example.com/photos/4x3/photo.jpg",
    "https://example.com/photos/16x9/photo.jpg"
  ],
  "author": {
    "@type": "Person",
    "name": "John Doe"
  },
  "datePublished": "2023-10-26",
  "description": "A rich and decadent chocolate cake recipe that's perfect for any occasion.",
  "prepTime": "PT30M",
  "cookTime": "PT45M",
  "totalTime": "PT1H15M",
  "recipeCategory": "Dessert",
  "recipeCuisine": "American",
  "recipeYield": "12 servings",
  "keywords": ["chocolate cake", "dessert", "baking"],
  "recipeIngredient": [
    "2 cups all-purpose flour",
    "2 cups granulated sugar",
    "¾ cup unsweetened cocoa powder",
    "1 ½ teaspoons baking powder",
    "1 ½ teaspoons baking soda",
    "1 teaspoon salt",
    "1 cup buttermilk",
    "½ cup vegetable oil",
    "2 large eggs",
    "2 teaspoons vanilla extract",
    "1 cup boiling water"
  ],
  "recipeInstructions": [
    {
      "@type": "HowToStep",
      "text": "Preheat oven to 350°F (175°C). Grease and flour a 9x13 inch baking pan."
    },
    {
      "@type": "HowToStep",
      "text": "In a large bowl, whisk together the flour, sugar, cocoa powder, baking powder, baking soda, and salt."
    },
    {
      "@type": "HowToStep",
      "text": "Add the buttermilk, oil, eggs, and vanilla extract to the dry ingredients. Beat with an electric mixer on medium speed for 2 minutes."
    },
    {
      "@type": "HowToStep",
      "text": "Gradually add the boiling water to the batter, mixing until well combined. The batter will be thin."
    },
    {
      "@type": "HowToStep",
      "text": "Pour the batter into the prepared pan and bake for 30-35 minutes, or until a wooden skewer inserted into the center comes out clean."
    },
    {
      "@type": "HowToStep",
      "text": "Let the cake cool in the pan for 10 minutes before inverting it onto a wire rack to cool completely."
    }
  ],
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.5",
    "ratingCount": "100"
  }
}
</script>

With structured data, we can help SGE better understand page content and use it to generate summaries, answer questions, and provide relevant context.
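
To see why this markup is easy for a machine to consume, here is a minimal sketch of the consumer side: reading JSON-LD back out of a page with Python's standard library. The `JsonLdExtractor` class and the shortened sample page are assumptions for illustration and say nothing about how Google's crawler actually works.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed markup instead of failing the whole page


# A shortened, hypothetical page carrying a subset of the recipe markup above.
page_html = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org/", "@type": "Recipe",
 "name": "Delicious Chocolate Cake", "totalTime": "PT1H15M",
 "recipeIngredient": ["2 cups all-purpose flour", "2 cups granulated sugar"]}
</script>
</head><body><h1>Cake</h1></body></html>
"""

parser = JsonLdExtractor()
parser.feed(page_html)
for item in parser.items:
    if isinstance(item, dict) and item.get("@type") == "Recipe":
        count = len(item["recipeIngredient"])
        print(f"{item['name']}: {count} ingredients, total time {item['totalTime']}")
```

The fields arrive as typed, named values, so no scraping heuristics are needed to tell the recipe name from the ingredient list.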

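7. Looking Ahead

Conversational search in SGE points to where search engines are heading. As NLP and LLM technology keeps improving, we can expect SGE to deliver a smarter, more personalized search experience. Future versions of SGE may offer:

Deeper context understanding: Understanding more complex queries and adjusting answers based on the user's search history and personal preferences.
A more natural conversational interface: Conversing with users more naturally and offering richer modes of interaction, such as voice and images.
Broader application scenarios: Expanding into more domains, such as education, healthcare, and finance, to provide specialized knowledge and solutions.
Stronger personalized recommendations: Recommending relevant content and services based on the user's needs and interests.

Summary

Conversational search in SGE is a revolution for the traditional SERP: by bringing in NLP and LLM technology, it turns the search engine from a list of links into an answer engine. Developers need to adapt to this change by paying more attention to content quality and user experience and by using structured data to help SGE better understand page content.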

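SGE changes the way we search

Conversational search in SGE changes not only how results are presented but, more importantly, how users search. By moving from passively browsing links to actively conversing with the search engine, users can get information and solve problems more efficiently.

Embrace new technology, meet new challenges

SGE's progress depends on advances in technology. We need to keep learning and exploring new techniques to adapt to the changes SGE brings and to meet the challenges ahead.

The future belongs to smarter search

Conversational search in SGE is only the beginning. Future search engines will be smarter and more personalized, and will serve users' needs better.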
