Hand-Writing a Safe Variant of JSON.parse: Preventing Code Injection from Input Strings (the eval Risk)

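Hello, everyone.

Today we will take a close look at a topic that matters a great deal in modern software development: the safety of data parsing. In particular, we will focus on parsing JSON (JavaScript Object Notation) data and on how to keep that process safe from code injection.

JSON has become the de facto standard for data exchange on the internet. Its brevity, its readability, and its natural affinity with JavaScript have made it ubiquitous in APIs, configuration files, data storage, and many other areas. Like any powerful tool, however, JSON parsing can introduce serious security problems when done carelessly. The most notorious of these is the code injection risk associated with the eval() function.

Our goal is to hand-write a "safe variant" of a JSON parser. "Safe variant" here does not mean a replacement for JavaScript's native JSON.parse() (the native JSON.parse() is already highly optimized and safe). Rather, by building a parser ourselves we can understand its internals in depth, in particular how it rules out the eval() risk at the root, and so deepen our grasp of the principles of safe data parsing. This is a useful guide when we need to handle formats that are non-standard but JSON-like, or to implement custom parsing in a severely constrained environment.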

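Chapter 1: The Nature of JSON and the `eval()` Trap

1.1 JSON: A Strict Grammar

First, let us review the basic syntax of JSON. JSON is not a subset of JavaScript objects but an independent data format whose grammar is strictly defined by ECMA-404 and RFC 8259. It contains only the following six kinds of values:

Object: { "key": value, ... }, where every key must be a double-quoted string.
Array: [ value, value, ... ].
String: "any characters with escapes", which must be enclosed in double quotes.
Number: an integer or floating-point number; scientific notation is supported.
Boolean: true or false.
Null: null.

The key point: JSON does not allow any of the following:

Function calls (such as alert(1))
Variable declarations or assignments
Comments (// or /* */)
Unquoted key names (such as { key: value }, which is legal in JavaScript but illegal in JSON)
Single-quoted strings (such as 'string')
undefined, NaN, Infinity (they are not part of the JSON grammar, and a conforming JSON.parse rejects them with a SyntaxError)
Date objects (they must be serialized as strings)
Regular expressions

It is exactly this strict grammar that makes JSON data inherently safe when it is parsed correctly. JSON only describes data structures and contains no executable logic.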
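
As a quick check of the list above, the following sketch feeds a few of the forbidden constructs to the native JSON.parse and confirms that each one is rejected. The `forbidden` array and the `isRejected` helper are names made up for this illustration.

```javascript
// Each input uses one construct from the "not allowed" list above.
const forbidden = [
    "{ key: 1 }",           // unquoted key
    "{ 'key': 1 }",         // single-quoted string
    '{ /* c */ "a": 1 }',   // comment
    '{ "a": undefined }',   // undefined
    "NaN",
    "Infinity",
    "alert(1)",             // function call
];

// Returns true when JSON.parse throws a SyntaxError for the given text.
function isRejected(text) {
    try {
        JSON.parse(text);
        return false;
    } catch (e) {
        return e instanceof SyntaxError;
    }
}

const allRejected = forbidden.every(isRejected);
console.log(allRejected); // true
```

A well-formed document such as '{"a": [1, 2.5e3, true, null]}' passes the same check without error.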

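1.2 `eval()`: The Danger of a Double-Edged Sword

eval() is a powerful global function in JavaScript that executes a string as JavaScript code. Its power lies in its dynamism: it can interpret and run code at runtime. That same power is what makes it such a large security risk.

Consider this scenario: our back-end or front-end code receives a string and, without a second thought, uses eval() to parse it, expecting a JSON structure. What happens?

// Suppose this string was received over the network
const untrustedInput = `{ "name": "Alice", "age": 30 }`;

// The wrong way: parsing with eval()
try {
    const data = eval('(' + untrustedInput + ')'); // The outer parentheses keep the braces from being parsed as a block statement
    console.log(data.name); // Alice
} catch (e) {
    console.error("Parse failed:", e);
}

Looks convenient, right? But what if a malicious user sends a string like this?

const maliciousInput = `{ "user": "attacker", "action": alert('You have been hacked!') }`;
// Or something stealthier and more dangerous:
// const maliciousInput = `{ "user": "attacker", "payload": (function(){ /* malicious code */ return 'ok'; })() }`;
// const maliciousInput = `{ "user": "attacker", "payload": require('child_process').execSync('rm -rf /') }`; // In a Node.js environment

If we pass maliciousInput to eval():

try {
    const data = eval('(' + maliciousInput + ')'); // The malicious code is executed
    console.log(data);
} catch (e) {
    console.error("Parse failed:", e);
}

The result: the alert(...) call runs immediately, or worse, the payload causes data leaks, system damage, and other serious consequences. This is because eval() does not merely parse data; it executes arbitrary JavaScript code in the current scope.

Core conclusion: never use eval() to parse JSON data from an untrusted source, and preferably do not use it to parse JSON in any scenario. The native JSON.parse() function exists precisely to provide a safe, efficient, and specification-compliant way to parse JSON, and it does not use eval().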
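
To make the contrast concrete, here is a minimal sketch showing that JSON.parse treats an injection attempt as a syntax error and never runs it. The `wasExecuted` flag on `globalThis` is an assumption introduced only for this demonstration: it is what the payload would flip if it were ever evaluated as code.

```javascript
// A flag the payload would flip if it were ever executed as code.
globalThis.wasExecuted = false;

// Not valid JSON: the value of "action" is a JavaScript expression.
const payload = '{ "user": "attacker", "action": (globalThis.wasExecuted = true) }';

let outcome;
try {
    JSON.parse(payload);
    outcome = "parsed";
} catch (e) {
    outcome = e.name; // the parser rejects the text instead of running it
}

console.log(outcome);                // SyntaxError
console.log(globalThis.wasExecuted); // false
```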

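So, given that JSON.parse() already exists and is safe, why write one by hand?

Deeper understanding: to learn its safety mechanism, that is, how lexical analysis and syntax analysis enforce the JSON specification and thereby reject any executable code that is not JSON.
Education: when we talk about "preventing code injection", understanding how a parser works internally is the key to understanding where the security boundary lies.
Special requirements: in some cases you may need to parse a format that is "almost JSON with small differences", or work in a minimal environment without a native JSON.parse (very rare today). In those cases, knowing how to build a safe parser from scratch is essential.

Next, we will build our safe JSON parser in two steps: a lexer and a parser.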

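Chapter 2: Building the Lexer (Tokenizer)

The lexer, also called a tokenizer or scanner, is the first step of parsing. Its job is to break the input string into a sequence of meaningful lexical units, or tokens. These tokens are the basic building blocks the parser works with.

2.1 JSON Token Types

For JSON, we define the following token types:

Token type	Description	Examples
`STRING`	Double-quoted string	`"hello"`, `"key"`
`NUMBER`	Integer or floating-point number; scientific notation is supported	`123`, `3.14`, `-5e-2`
`BOOLEAN`	Boolean value	`true`, `false`
`NULL`	Null value	`null`
`LBRACE`	Left curly brace	`{`
`RBRACE`	Right curly brace	`}`
`LBRACKET`	Left square bracket	`[`
`RBRACKET`	Right square bracket	`]`
`COLON`	Colon	`:`
`COMMA`	Comma	`,`
`EOF`	End of input

2.2 How the Lexer Works

The lexer starts at the beginning of the input string, reads it character by character, and recognizes complete tokens according to predefined rules. It skips whitespace and checks whether each character sequence matches the pattern of some token.

Security core: if the lexer encounters any character sequence that matches none of the token rules above, it must raise an error immediately. This means that non-JSON syntax such as alert, function, //, and undefined is rejected during lexical analysis, which blocks code injection at the source.

2.3 Lexer Implementation

We will create a Lexer class that holds the input string and the current position, and provides a getNextToken() method to fetch the next token.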

// token.js
const TokenType = {
    STRING: 'STRING',
    NUMBER: 'NUMBER',
    BOOLEAN: 'BOOLEAN',
    NULL: 'NULL',
    LBRACE: 'LBRACE',       // {
    RBRACE: 'RBRACE',       // }
    LBRACKET: 'LBRACKET',   // [
    RBRACKET: 'RBRACKET',   // ]
    COLON: 'COLON',         // :
    COMMA: 'COMMA',         // ,
    EOF: 'EOF',             // End Of File
};

class Token {
    constructor(type, value, line, column) {
        this.type = type;
        this.value = value;
        this.line = line;
        this.column = column;
    }

    toString() {
        return `Token(type: ${this.type}, value: '${this.value}', line: ${this.line}, column: ${this.column})`;
    }
}

// lexer.js
class Lexer {
    constructor(input) {
        this.input = input;
        this.pos = 0;           // Current position in input string
        this.currentChar = input[0]; // Current character under examination
        this.line = 1;          // Current line number
        this.column = 1;        // Current column number
    }

    error(message) {
        throw new Error(`Lexer Error: ${message} at line ${this.line}, column ${this.column}`);
    }

    advance() {
        if (this.currentChar === '\n') { // A newline starts a new line; the column restarts
            this.line++;
            this.column = 0;
        }
        this.pos++;
        this.column++;
        this.currentChar = this.input[this.pos];
    }

    peek() {
        return this.input[this.pos + 1];
    }

    skipWhitespace() {
        // JSON whitespace is only space, tab, line feed, and carriage return
        while (this.currentChar && /[ \t\n\r]/.test(this.currentChar)) {
            this.advance();
        }
    }

    // --- Token Recognition Methods ---

    readString() {
        let result = '';
        this.advance(); // Consume the opening double quote
        while (this.currentChar && this.currentChar !== '"') {
            if (this.currentChar === '\\') { // Handle escape sequences
                this.advance(); // Consume the backslash
                switch (this.currentChar) {
                    case '"': result += '"'; break;
                    case '\\': result += '\\'; break;
                    case '/': result += '/'; break;
                    case 'b': result += '\b'; break;
                    case 'f': result += '\f'; break;
                    case 'n': result += '\n'; break;
                    case 'r': result += '\r'; break;
                    case 't': result += '\t'; break;
                    case 'u': // Unicode escape sequence \uXXXX
                        let hex = '';
                        for (let i = 0; i < 4; i++) {
                            this.advance();
                            // Guard against end of input, where currentChar is undefined
                            if (!this.currentChar || !/[0-9a-fA-F]/.test(this.currentChar)) {
                                this.error("Invalid unicode escape sequence");
                            }
                            hex += this.currentChar;
                        }
                        result += String.fromCharCode(parseInt(hex, 16));
                        break;
                    default:
                        this.error("Invalid escape sequence in string");
                }
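            } else if (this.currentChar < ' ') {
                // JSON requires control characters (U+0000 to U+001F) to be escaped
                this.error("Unescaped control character in string");
            } else {
                result += this.currentChar;
            }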
            this.advance();
        }
        if (this.currentChar !== '"') {
            this.error("Unterminated string literal");
        }
        this.advance(); // Consume the closing double quote
        return result;
    }

    readNumber() {
        let result = '';
        // Handle negative sign
        if (this.currentChar === '-') {
            result += this.currentChar;
            this.advance();
        }

        // Integer part
        if (this.currentChar === '0') {
            result += this.currentChar;
            this.advance();
        } else if (/[1-9]/.test(this.currentChar)) {
            while (this.currentChar && /[0-9]/.test(this.currentChar)) {
                result += this.currentChar;
                this.advance();
            }
        } else {
            this.error("Invalid number format");
        }

        // Fractional part
        if (this.currentChar === '.') {
            result += this.currentChar;
            this.advance();
            if (!/[0-9]/.test(this.currentChar)) { // Must have digits after decimal point
                this.error("Invalid number format: digit expected after decimal point");
            }
            while (this.currentChar && /[0-9]/.test(this.currentChar)) {
                result += this.currentChar;
                this.advance();
            }
        }

        // Exponent part
        if (this.currentChar === 'e' || this.currentChar === 'E') {
            result += this.currentChar;
            this.advance();
            if (this.currentChar === '+' || this.currentChar === '-') {
                result += this.currentChar;
                this.advance();
            }
            if (!/[0-9]/.test(this.currentChar)) { // Must have digits after exponent sign
                this.error("Invalid number format: digit expected after exponent");
            }
            while (this.currentChar && /[0-9]/.test(this.currentChar)) {
                result += this.currentChar;
                this.advance();
            }
        }
        return parseFloat(result); // Convert to JS Number
    }

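    readKeyword(keyword) {
        const startColumn = this.column;
        for (let i = 0; i < keyword.length; i++) {
            if (this.currentChar !== keyword[i]) {
                this.error(`Unexpected character, expected '${keyword[i]}'`);
            }
            this.advance();
        }
        // Map the literal to its token type and to its actual JavaScript value,
        // so the parser returns true/false/null rather than the strings 'true'/'false'/'null'
        if (keyword === 'null') {
            return new Token(TokenType.NULL, null, this.line, startColumn);
        }
        return new Token(TokenType.BOOLEAN, keyword === 'true', this.line, startColumn);
    }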

    getNextToken() {
        while (this.currentChar) {
            this.skipWhitespace();

            if (!this.currentChar) {
                return new Token(TokenType.EOF, null, this.line, this.column);
            }

            const startColumn = this.column;

            if (this.currentChar === '"') {
                const value = this.readString();
                return new Token(TokenType.STRING, value, this.line, startColumn);
            }

            if (/[0-9-]/.test(this.currentChar)) { // Numbers can start with 0-9 or -
                const value = this.readNumber();
                return new Token(TokenType.NUMBER, value, this.line, startColumn);
            }

            if (this.currentChar === 't') {
                return this.readKeyword('true');
            }
            if (this.currentChar === 'f') {
                return this.readKeyword('false');
            }
            if (this.currentChar === 'n') {
                return this.readKeyword('null');
            }

            switch (this.currentChar) {
                case '{': this.advance(); return new Token(TokenType.LBRACE, '{', this.line, startColumn);
                case '}': this.advance(); return new Token(TokenType.RBRACE, '}', this.line, startColumn);
                case '[': this.advance(); return new Token(TokenType.LBRACKET, '[', this.line, startColumn);
                case ']': this.advance(); return new Token(TokenType.RBRACKET, ']', this.line, startColumn);
                case ':': this.advance(); return new Token(TokenType.COLON, ':', this.line, startColumn);
                case ',': this.advance(); return new Token(TokenType.COMMA, ',', this.line, startColumn);
                default:
                    this.error(`Unexpected character: '${this.currentChar}'`);
            }
        }
        return new Token(TokenType.EOF, null, this.line, this.column);
    }
}

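Safety guarantees of the lexer:

Strict character matching: readString only accepts strings enclosed in " and strictly handles the escape sequences defined by the JSON specification. Any illegal escape (such as \x) raises an error.
Number format validation: readNumber strictly follows the JSON number rules and does not allow NaN, Infinity, or any other non-standard number notation.
Keyword matching: readKeyword ensures that true, false, and null match exactly; for example, tru and nullx are both rejected.
Character whitelist: the switch statement in getNextToken, together with the if statements before it, forms a whitelist of all permitted characters and sequences. Any character outside this whitelist (for example a letter in a-z other than t, f, or n, or a symbol such as $, #, or !) triggers this.error() in the default branch, which keeps malicious or malformed input from reaching the later parsing stage.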
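
The same whitelist idea can also be written more compactly with one sticky regular expression per token kind. The sketch below is an alternative illustration, not the Lexer class from this lecture: `TOKEN_PATTERNS` and `tokenTypes` are hypothetical names, punctuation is collapsed into a single `PUNCTUATION` type for brevity, and only token types (not values or positions) are returned.

```javascript
// One sticky (y-flag) regex per token kind; a sticky regex only matches at lastIndex.
const TOKEN_PATTERNS = [
    ["WHITESPACE", /[ \t\n\r]+/y],
    ["STRING", /"(?:[^"\\\u0000-\u001f]|\\(?:["\\\/bfnrt]|u[0-9a-fA-F]{4}))*"/y],
    ["NUMBER", /-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?/y],
    ["BOOLEAN", /true|false/y],
    ["NULL", /null/y],
    ["PUNCTUATION", /[{}\[\]:,]/y],
];

// Returns the list of token types, or throws on the first character
// that no whitelisted pattern can match.
function tokenTypes(input) {
    const types = [];
    let pos = 0;
    while (pos < input.length) {
        let matched = false;
        for (const [type, pattern] of TOKEN_PATTERNS) {
            pattern.lastIndex = pos;
            if (pattern.exec(input)) {
                if (type !== "WHITESPACE") types.push(type);
                pos = pattern.lastIndex; // exec advanced lastIndex past the match
                matched = true;
                break;
            }
        }
        if (!matched) {
            throw new SyntaxError(`Unexpected character '${input[pos]}' at offset ${pos}`);
        }
    }
    return types;
}

console.log(tokenTypes('[1, true, null]'));
// [ 'PUNCTUATION', 'NUMBER', 'PUNCTUATION', 'BOOLEAN', 'PUNCTUATION', 'NULL', 'PUNCTUATION' ]
```

Calling tokenTypes('{"a": alert(1)}') throws at the letter a of alert, because no pattern in the whitelist can start there.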

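Chapter 3: Building the Parser

The parser, or syntax analyzer, is the second step of parsing. It receives the token stream produced by the lexer, validates the order and structure of those tokens against the JSON grammar, and finally builds the corresponding JavaScript value.

3.1 JSON Grammar Rules (Simplified)

We express the JSON grammar as a set of recursive-descent functions:

parseValue(): tries to parse one JSON value (an object, array, string, number, boolean, or null).
parseObject(): parses a JSON object.
parseArray(): parses a JSON array.

Each function looks at the current token type, decides which sub-parser to call, and advances to the next token.

Security core: the parser strictly checks the order of tokens. For example, inside an object the expected order is STRING -> COLON -> VALUE -> COMMA or STRING -> COLON -> VALUE -> RBRACE. If the parser expects a COLON at some position but gets a COMMA, it raises an error immediately and refuses to continue. This strict structural validation ensures that the input fully conforms to the JSON specification and excludes any code that is not a JSON structure.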
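
The ordering rule for objects can be sketched on its own, independent of the full parser. The function below is a simplified illustration under stated assumptions: `isFlatObject` is a hypothetical name, its input is an array of token type names like those in the table from Chapter 2, and only scalar values are allowed (no nested objects or arrays).

```javascript
// Token types that may appear as a value in this simplified sketch.
const SCALAR_TYPES = new Set(["STRING", "NUMBER", "BOOLEAN", "NULL"]);

// Checks the order STRING -> COLON -> value -> (COMMA | RBRACE) for a flat object.
function isFlatObject(types) {
    let i = 0;
    const next = () => types[i++]; // undefined once the input is exhausted
    if (next() !== "LBRACE") return false;
    if (types[i] === "RBRACE") return i + 1 === types.length; // empty object
    while (true) {
        if (next() !== "STRING") return false;       // key
        if (next() !== "COLON") return false;        // separator
        if (!SCALAR_TYPES.has(next())) return false; // value
        const separator = next();
        if (separator === "RBRACE") return i === types.length; // nothing may follow
        if (separator !== "COMMA") return false;     // anything else is an error
    }
}

console.log(isFlatObject(["LBRACE", "STRING", "COLON", "NUMBER", "RBRACE"]));          // true
console.log(isFlatObject(["LBRACE", "STRING", "COMMA", "NUMBER", "RBRACE"]));          // false
console.log(isFlatObject(["LBRACE", "STRING", "COLON", "NUMBER", "COMMA", "RBRACE"])); // false
```

The last call shows why a trailing comma fails: after a COMMA the only acceptable token is another STRING key.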

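3.2 Parser Implementation

We will create a Parser class that holds a Lexer instance and provides a parse() method as its entry point.

// parser.js
// Assumes TokenType and the Token class have been defined and imported as shown earlier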

class Parser {
    constructor(lexer) {
        this.lexer = lexer;
        this.currentToken = this.lexer.getNextToken(); // Current token under examination
    }

    error(message) {
        throw new Error(`Parser Error: ${message} at line ${this.currentToken.line}, column ${this.currentToken.column}`);
    }

    eat(tokenType) {
        if (this.currentToken.type === tokenType) {
            this.currentToken = this.lexer.getNextToken();
        } else {
            this.error(`Expected token type ${tokenType}, but got ${this.currentToken.type}`);
        }
    }

    // --- Parsing Functions ---

    parseValue() {
        switch (this.currentToken.type) {
            case TokenType.STRING:
                const strValue = this.currentToken.value;
                this.eat(TokenType.STRING);
                return strValue;
            case TokenType.NUMBER:
                const numValue = this.currentToken.value;
                this.eat(TokenType.NUMBER);
                return numValue;
            case TokenType.BOOLEAN:
                const boolValue = this.currentToken.value;
                this.eat(TokenType.BOOLEAN);
                return boolValue;
            case TokenType.NULL:
                this.eat(TokenType.NULL);
                return null;
            case TokenType.LBRACE:
                return this.parseObject();
            case TokenType.LBRACKET:
                return this.parseArray();
            default:
                this.error(`Unexpected token type: ${this.currentToken.type}. Expected a JSON value.`);
        }
    }

    parseObject() {
        this.eat(TokenType.LBRACE); // Consume '{'
        const obj = {};

        // Handle empty object
        if (this.currentToken.type === TokenType.RBRACE) {
            this.eat(TokenType.RBRACE);
            return obj;
        }

        while (true) {
            // Every member must start with a double-quoted string key. Checking this on
            // each iteration also rejects a trailing comma such as { "a": 1, }.
            if (this.currentToken.type !== TokenType.STRING) {
                this.error("Expected a string key in object");
            }
            const key = this.currentToken.value;
            this.eat(TokenType.STRING); // Consume key string

            this.eat(TokenType.COLON); // Consume ':'

            obj[key] = this.parseValue(); // Parse value

            if (this.currentToken.type === TokenType.COMMA) {
                this.eat(TokenType.COMMA); // Consume ','
            } else if (this.currentToken.type === TokenType.RBRACE) {
                break; // End of object
            } else {
                this.error("Expected ',' or '}' after key-value pair in object");
            }
        }
        this.eat(TokenType.RBRACE); // Consume '}'
        return obj;
    }

    parseArray() {
        this.eat(TokenType.LBRACKET); // Consume '['
        const arr = [];

        // Handle empty array
        if (this.currentToken.type === TokenType.RBRACKET) {
            this.eat(TokenType.RBRACKET);
            return arr;
        }

        while (this.currentToken.type !== TokenType.RBRACKET) {
            arr.push(this.parseValue());

            if (this.currentToken.type === TokenType.COMMA) {
                this.eat(TokenType.COMMA); // Consume ','
                if (this.currentToken.type === TokenType.RBRACKET) { // Trailing comma not allowed in strict JSON
                    this.error("Trailing comma not allowed in JSON array");
                }
            } else if (this.currentToken.type === TokenType.RBRACKET) {
                break; // End of array
            } else {
                this.error("Expected ',' or ']' after array element");
            }
        }
        this.eat(TokenType.RBRACKET); // Consume ']'
        return arr;
    }

    parse() {
        const result = this.parseValue();
        // After parsing the main value, the next token must be EOF
        if (this.currentToken.type !== TokenType.EOF) {
            this.error("Unexpected token after root JSON value");
        }
        return result;
    }
}

// jsonParseSafe.js (Main export)
function jsonParseSafe(input) {
    const lexer = new Lexer(input);
    const parser = new Parser(lexer);
    return parser.parse();
}

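Safety guarantees of the parser:

Strict structural validation: the eat() method is the core. It ensures that the type of the current token exactly matches the expected type. On a mismatch it throws immediately and parsing stops.
Recursive-descent parsing: parseObject(), parseArray(), and parseValue() form a recursive-descent parser that descends and matches token sequences strictly according to the JSON grammar.
No code execution path: throughout parsing we only check types, extract values, and build objects and arrays. No step ever executes a string as code, which is why the eval() risk is eliminated.
Root value validation: after parsing the root value, parse() checks that EOF has been reached. The input string therefore cannot carry any extra non-JSON content after the valid JSON.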
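
One subtlety is worth noting alongside these guarantees. The plain assignment obj[key] = ... used in parseObject() goes through the `__proto__` setter when the key is "__proto__", so such a key changes the object's prototype instead of creating a normal property. The native JSON.parse creates an own property in that case. The sketch below shows one possible way to match that behavior; `setOwnProperty` is a hypothetical helper, not part of the lecture's code.

```javascript
// Hypothetical helper: define the key as an own data property instead of
// assigning through the `__proto__` setter inherited from Object.prototype.
function setOwnProperty(target, key, value) {
    Object.defineProperty(target, key, {
        value,
        writable: true,
        enumerable: true,
        configurable: true,
    });
}

const viaAssignment = {};
viaAssignment["__proto__"] = { isAdmin: true }; // replaces the prototype

const viaDefine = {};
setOwnProperty(viaDefine, "__proto__", { isAdmin: true }); // plain own property

console.log(viaAssignment.isAdmin);  // true, inherited from the injected prototype
console.log(viaDefine.isAdmin);      // undefined
console.log(Object.keys(viaDefine)); // [ '__proto__' ]
```

This is not code execution, but it lets untrusted input influence property lookups on the parsed object, so a parser meant for untrusted data would probably want the second form.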

Chapter 4: Tests and Case Analysis

Now that we have the jsonParseSafe function, let us verify its functionality and safety with some test cases.

// Example usage:

// 1. Valid JSON
const validJson1 = `{ "name": "Alice", "age": 30, "isStudent": false, "grades": [90, 85, 92], "address": null }`;
const validJson2 = `[1, "hello", true, {"key": "value"}]`;
const validJson3 = `"Just a string"`;
const validJson4 = `12345.67e-2`;
const validJson5 = `true`;

console.log("--- Valid JSON ---");
console.log("Valid JSON 1:", jsonParseSafe(validJson1));
console.log("Valid JSON 2:", jsonParseSafe(validJson2));
console.log("Valid JSON 3:", jsonParseSafe(validJson3));
console.log("Valid JSON 4:", jsonParseSafe(validJson4));
console.log("Valid JSON 5:", jsonParseSafe(validJson5));

// 2. Malicious/Invalid JSON (Code Injection Attempts)
const maliciousInput1 = `{ "name": "Bob", "action": alert('Pwned!') }`;
const maliciousInput2 = `[1, 2, 3, console.log('Evil code!')]`;
const maliciousInput3 = `{ "key": (function(){ return 'injected'; })() }`;
const maliciousInput4 = `{ "user": "attacker", "cmd": require('child_process').execSync('ls') }`; // Node.js

console.log("\n--- Malicious/Invalid JSON (Expected Errors) ---");

function testError(input, description) {
    try {
        console.log(`Testing: ${description}`);
        const result = jsonParseSafe(input);
        console.log(`[FAIL] Expected error, but parsed:`, result);
    } catch (e) {
        console.log(`[PASS] Correctly rejected: ${e.message}`);
    }
}

testError(maliciousInput1, "alert() injection");
testError(maliciousInput2, "console.log() injection");
testError(maliciousInput3, "IIFE injection");
testError(maliciousInput4, "Node.js exec injection");

// 3. Syntactically Incorrect JSON (Non-JSON, e.g., unquoted keys, comments, trailing commas)
const invalidJson1 = `{ key: "value" }`; // Unquoted key
const invalidJson2 = `{ "key": "value", }`; // Trailing comma in object (strict JSON disallows)
const invalidJson3 = `[1, 2,]`; // Trailing comma in array (strict JSON disallows)
const invalidJson4 = `{ "key": 'value' }`; // Single quotes for string
const invalidJson5 = `{ /* comment */ "key": "value" }`; // Comments
const invalidJson6 = `{ "key": undefined }`; // undefined
const invalidJson7 = `{ "key": NaN }`; // NaN
const invalidJson8 = `{ "key": Infinity }`; // Infinity
const invalidJson9 = `{"a": 1}extra content`; // Extra content after root value

console.log("\n--- Syntactically Incorrect JSON (Expected Errors) ---");
testError(invalidJson1, "Unquoted key");
testError(invalidJson2, "Trailing comma in object");
testError(invalidJson3, "Trailing comma in array");
testError(invalidJson4, "Single quotes string");
testError(invalidJson5, "Comments");
testError(invalidJson6, "undefined value");
testError(invalidJson7, "NaN value");
testError(invalidJson8, "Infinity value");
testError(invalidJson9, "Extra content after root value");

// 4. Lexer-level errors
const lexerError1 = `{"key": "value\\x"}`; // Invalid escape: the JSON text contains \x
const lexerError2 = `{"key": "value`; // Unterminated string
const lexerError3 = `{ "key": ~value }`; // Unexpected character
const lexerError4 = `{ "key": truex }`; // Malformed keyword

console.log("\n--- Lexer-level Errors (Expected Errors) ---");
testError(lexerError1, "Invalid escape sequence");
testError(lexerError2, "Unterminated string");
testError(lexerError3, "Unexpected character '~'");
testError(lexerError4, "Malformed keyword 'truex'");

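Run the test code above and you should see every valid JSON input parsed correctly, while every malicious or non-conforming input is rejected by our parser with a clear error message. That is exactly the safety goal we set out to reach.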
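
Beyond hand-picked cases, a hand-written parser can also be compared against the native one on the same inputs. The sketch below assumes a helper named `agreesWithNative` (a made-up name) that accepts any parse function, so it could be given jsonParseSafe; here it is demonstrated with JSON.parse itself and with a deliberately lenient stand-in, `lenientParse`, which is also hypothetical.

```javascript
// Runs a candidate parser and the native JSON.parse on the same input and reports
// whether they agree: both throw, or both return structurally equal values.
function agreesWithNative(candidateParse, text) {
    const run = (parse) => {
        try {
            return { ok: true, value: parse(text) };
        } catch (e) {
            return { ok: false };
        }
    };
    const expected = run(JSON.parse);
    const actual = run(candidateParse);
    if (expected.ok !== actual.ok) return false;
    // JSON.stringify is a simple structural comparison, adequate for JSON-only values.
    return !expected.ok || JSON.stringify(expected.value) === JSON.stringify(actual.value);
}

// A deliberately sloppy candidate that strips trailing commas before parsing.
const lenientParse = (text) => JSON.parse(text.replace(/,\s*([}\]])/g, "$1"));

console.log(agreesWithNative(JSON.parse, '{"a": [1, 2]}')); // true
console.log(agreesWithNative(lenientParse, '[1, 2,]'));     // false
```

Any input on which the two disagree points at either a bug or an intentional difference that should be documented.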

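Chapter 5: Comparison with the Native `JSON.parse()` and When Each Applies

5.1 Advantages of the Native `JSON.parse()`

The JSON.parse() function built into JavaScript engines is highly optimized and battle-tested.

Performance: native implementations are usually written in a lower-level language such as C++ and heavily optimized, so they are typically much faster than a hand-written JavaScript parser.
Robustness: it already handles all kinds of complex edge cases, Unicode characters, memory management, and so on, and it has passed extensive conformance testing.
Safety: most importantly, JSON.parse() itself is safe. It does not use eval(); instead it parses the JSON string through strict lexical and syntax analysis and builds a JavaScript value. It only parses data and never executes code. As long as you use the native JSON.parse(), you do not need to worry about the code injection risk that comes with eval().

5.2 When Is a Custom Parser Worth Considering?

So what is the point of going to the trouble of hand-writing a parser?

Understanding how parsing works: this is the most important purpose. By implementing it ourselves, we come to understand how lexical analysis and syntax analysis work and how together they keep data parsing safe. This knowledge is the foundation for building any compiler, interpreter, or DSL (domain-specific language) parser.
Education and learning: it is an excellent learning tool for programming language theory and for the practice of secure programming.
Implementing an extra-strict or custom JSON variant: although the native JSON.parse() is already strict, sometimes you may need:
- Stricter JSON: for example, rejecting any non-ASCII character (not recommended in general, because the JSON specification supports Unicode).
- A JSON-like format: your application may need a format that resembles JSON but allows extra features (such as comments or specific macros), or may need to reject something that standard JSON permits (for example, the \u0000 escape is valid inside a JSON string and JSON.parse accepts it, but you may want to forbid it). In such cases you need a custom parser.
Severely constrained environments: in some very rare minimal JavaScript environments where JSON.parse() is not available, you may need to implement one yourself.
Security auditing and verification: understanding how a parser works helps us audit and reason about the safety of other parsers.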
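
For the "stricter JSON" case, a full custom parser is not always necessary. A possible lighter approach, sketched below under the assumption that an ASCII-only rule is all that is needed, is to validate the raw text first and then delegate to the native parser. `parseAsciiOnlyJson` is a hypothetical name.

```javascript
// Hypothetical stricter variant: pre-validate the raw text, then delegate
// the actual parsing to the native JSON.parse.
function parseAsciiOnlyJson(text) {
    const offending = /[^\x00-\x7f]/.exec(text);
    if (offending) {
        throw new SyntaxError(`Non-ASCII character at offset ${offending.index}`);
    }
    return JSON.parse(text);
}

// Raw non-ASCII text is rejected before parsing even starts.
let rejected = false;
try {
    parseAsciiOnlyJson('{"name": "\u00e9"}'); // the text contains a literal é
} catch (e) {
    rejected = true;
}
console.log(rejected); // true

// The same character written as a JSON escape keeps the text ASCII-only.
console.log(parseAsciiOnlyJson('{"name": "\\u00e9"}').name); // é
```

Note that this rule constrains the transport text, not the decoded values: escaped characters still decode to non-ASCII strings.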

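Key reminder: in the vast majority of real production scenarios, you should prefer JavaScript's native JSON.parse(). Consider a custom implementation only when you face one of the special requirements above and have solid confidence in, and test coverage for, the performance, robustness, and safety of your parser. Our jsonParseSafe is mainly a teaching example that demonstrates how a parser achieves safety in principle.

Chapter 6: Summary: General Principles of Safe Parsing

In this lecture we looked closely at the safety of JSON parsing, in particular the code injection risk that comes with eval(). We hand-wrote a safe JSON parser and showed how strict lexical and syntax analysis eliminate that risk.

The core principles can be summarized as follows:

Never use eval() to parse untrusted data. This is the most basic security rule, and violating it almost certainly leads to a vulnerability.
Follow the whitelist principle: when parsing data, accept only structures and values that are explicitly known and allowed. Any input that does not conform should be rejected, not "repaired" or "guessed at".
Parse in stages: lexical analysis (breaking the input into tokens) and syntax analysis (validating the structure of the tokens) are the key steps in building a robust parser. Each stage is responsible for ensuring the validity of its input.
Error handling is essential: when the parser meets non-conforming input, it must throw a clear error rather than fail silently or attempt an unsafe operation.

These principles matter not only for JSON parsing; they are the foundation of all data parsing, input validation, and secure programming. I hope this lecture helps you build safer and more robust applications in your future work.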
