JS Frida (Dynamic Instrumentation Toolkit): Hooking Native/JS Functions and Modifying Memory

Alright, alright, settle down folks! Welcome, welcome! Glad to see so many eager faces ready to dive into the wild world of Frida. Today, we’re going to wrangle this beast and learn how to hook native and JS functions, and even mess around with memory, all while keeping it (relatively) legal and ethical. Think of Frida as your digital scalpel – powerful, precise, but requiring a steady hand and a healthy dose of respect.

So, what is Frida? Simply put, it’s a dynamic instrumentation toolkit. Imagine you have a program you can’t see the source code for, but you want to understand what it’s doing, or even change its behavior on the fly. That’s where Frida comes in. It lets you inject snippets of JavaScript code into a running process and interact with its internal workings.

Frida: The Basics

First, let’s get some terminology out of the way:

Target: The process you want to hook.
Hook: The point where you intercept a function call.
Interception: The act of catching a function call.
Agent: The JavaScript code you inject into the target process.
Frida Server: A background process running on the target device that allows Frida to connect and execute commands.
Frida CLI: Command-line interface for interacting with Frida.

Setting up the Stage

Before we start wielding our scalpel, we need to set up our environment.

Install Frida:
```
pip install frida frida-tools
```
This command installs both the Frida Python library and the frida command-line tool.
Frida Server on the Target (if needed):

For mobile devices (Android, iOS), the Frida server has to run on the device itself. Its version must match the version of the Frida Python library on your machine. The steps below are for Android, where the server usually needs root:
- Download the server binary for your device's architecture from Frida’s releases page: https://github.com/frida/frida/releases, and decompress it.
- Push it to your device with adb push (e.g., to /data/local/tmp/frida-server).
- Make it executable: chmod +x /data/local/tmp/frida-server.
- Run it: /data/local/tmp/frida-server &.
For desktop applications, the Frida server typically isn’t required, as Frida can directly attach to the process.
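
As a rough sketch of the Android steps above, the helper below builds the release asset name and the adb commands without running them. The asset naming pattern, the example version string, and the helper names (`server_asset_name`, `build_adb_commands`, `deploy`) are assumptions for illustration; check the releases page for the exact file name for your device.

```python
import subprocess
from typing import List

REMOTE_PATH = "/data/local/tmp/frida-server"  # path used in the steps above


def server_asset_name(version: str, os_name: str = "android", arch: str = "arm64") -> str:
    """Guess the release asset name for a Frida version (assumed naming pattern)."""
    return f"frida-server-{version}-{os_name}-{arch}.xz"


def build_adb_commands(local_binary: str, remote_path: str = REMOTE_PATH) -> List[List[str]]:
    """Return the adb invocations for push, chmod and launch, in that order."""
    return [
        ["adb", "push", local_binary, remote_path],
        ["adb", "shell", "chmod", "+x", remote_path],
        # Start the server in the background; most devices need a root shell here.
        ["adb", "shell", f"{remote_path} &"],
    ]


def deploy(local_binary: str) -> None:
    """Run each adb command, stopping at the first failure."""
    for command in build_adb_commands(local_binary):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only print what would be run; call deploy() yourself once adb is set up.
    print(server_asset_name("16.1.4"))
    for command in build_adb_commands("frida-server"):
        print(" ".join(command))
```

Keeping the command list separate from `deploy` makes it easy to inspect exactly what will touch the device before anything runs.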

Hooking JavaScript Functions (in a JS Environment – e.g., React Native, Electron)

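Let’s start with something relatively simple: hooking functions in applications built on JavaScript runtimes such as React Native or Electron. One caveat up front: Interceptor works on native addresses, so the example below hooks functions that the app's loaded modules (shared libraries) export. Functions defined purely inside a JavaScript bundle are not exported symbols and cannot be found this way.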

Here’s a basic example:

// JavaScript agent code (agent.js)
rpc.exports = {
  hookFunction: function(moduleName, functionName) {
    console.log(`Hooking ${moduleName}.${functionName}`);
    try {
      const targetFunction = Module.findExportByName(moduleName, functionName);

      if (!targetFunction) {
          console.log(`Function ${moduleName}.${functionName} not found`);
          return false;
      }

      Interceptor.attach(targetFunction, {
        onEnter: function(args) {
          console.log(`[${moduleName}.${functionName}] Called with:`);
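          // args has no length property; dump the first four slots. Slots beyond
          // the function's real argument count hold unrelated register/stack values.
          for (let i = 0; i < 4; i++) {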
            console.log(`  arg[${i}] = ${args[i]}`);
          }
          console.log(`Thread ID: ${Process.getCurrentThreadId()}`);
          this.startTime = Date.now();
        },
        onLeave: function(retval) {
          console.log(`[${moduleName}.${functionName}] Returned: ${retval}`);
          console.log(`Execution time: ${Date.now() - this.startTime}ms`);
        }
      });
      return true;
    } catch (e) {
      console.log(`Error hooking ${moduleName}.${functionName}: ${e}`);
      return false;
    }
  }
};

# Python script (hook.py)
import frida
import sys

def on_message(message, data):
    if message['type'] == 'send':
        print("[*] {0}".format(message['payload']))
    else:
        print(message)

def main(target_process, module_name, function_name):
    try:
        session = frida.attach(target_process)
        script = session.create_script(open("agent.js").read())
        script.on('message', on_message)
        script.load()
        script.exports.hook_function(module_name, function_name) #Call the export
        print(f"Hooking {module_name}.{function_name} in {target_process}")
        sys.stdin.read() # Keep the script running
    except frida.ProcessNotFoundError as e:
        print(f"Process '{target_process}' not found.  Is it running?")
    except Exception as e:
        print(f"Error: {e}")

if __name__ == '__main__':
    if len(sys.argv) != 4:
        print("Usage: python hook.py <process_name> <module_name> <function_name>")
        sys.exit(1)

    target_process = sys.argv[1]
    module_name = sys.argv[2]
    function_name = sys.argv[3]
    main(target_process, module_name, function_name)

Explanation:

agent.js: This is the JavaScript agent that gets injected into the target process.
- rpc.exports: This makes the hookFunction available for the Python script to call.
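- Module.findExportByName: This looks up the address of a symbol exported by a loaded module, that is, a shared library such as a .so, .dylib or .dll file. It does not see functions defined inside a JavaScript bundle (e.g. a React Native bundle), because those are not exported symbols. On Frida 17 and later the static form was removed; use Process.getModuleByName(moduleName).findExportByName(functionName) instead.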
- Interceptor.attach: This is the core of Frida’s hooking mechanism. It intercepts calls to the specified function.
- onEnter: This function is called before the original function is executed. We print the arguments passed to the function.
- onLeave: This function is called after the original function is executed. We print the return value.
hook.py: This is the Python script that controls Frida.
- frida.attach: This attaches Frida to a process on the local machine. For an app on a USB-connected device, call frida.get_usb_device().attach(...) instead.
- session.create_script: This loads the JavaScript agent code.
- script.load: This executes the JavaScript agent code.
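- script.exports.hook_function: This calls hookFunction in the JavaScript agent. The Python bindings map the snake_case name to the agent's camelCase export.
- The on_message function handles messages the agent sends with send(), as well as error reports. Output from console.log is delivered through the script's log handler, which prints to the terminal by default.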
- The sys.stdin.read() keeps the python script running, which keeps the Frida script active.
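
To make the message handling described above more concrete, here is a minimal sketch of a handler that distinguishes the two message types. The helper name `describe_message` and the output prefixes are my own choices, not part of Frida; the `payload` and `description` keys are the fields Frida places in `send` and `error` messages.

```python
def describe_message(message: dict) -> str:
    """Turn a Frida message dict into a single printable line."""
    kind = message.get("type")
    if kind == "send":
        # Value the agent passed to send().
        return "[*] {0}".format(message.get("payload"))
    if kind == "error":
        # Uncaught exception in the agent; 'description' holds the summary.
        return "[!] {0}".format(message.get("description", "unknown error"))
    return "[?] {0}".format(message)


def on_message(message, data):
    """Callback with the signature expected by script.on('message', ...)."""
    print(describe_message(message))


if __name__ == "__main__":
    # Simulated messages, so the sketch runs without a target process.
    on_message({"type": "send", "payload": "hello"}, None)
    on_message({"type": "error", "description": "ReferenceError: x is not defined"}, None)
```

Separating formatting from printing keeps the handler easy to exercise with fake messages before attaching to a real process.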

How to Run:

Make sure the target process (e.g., a React Native app) is running.
Run the Python script:
```
python hook.py <process_name> <module_name> <function_name>
```
Replace <process_name> with the name of the process (e.g., com.example.myapp), <module_name> with the name of the loaded module that exports the function (a shared library, not a JavaScript source module), and <function_name> with the name of the exported function you want to hook.

Example:

Let’s say you have a React Native app called com.example.myapp that loads a library called libmymodule.so (a made-up name for this example), which exports a function called myFunction. You would run:

python hook.py com.example.myapp libmymodule.so myFunction

This will hook the myFunction function, and you’ll see the arguments and return value printed in the console when it’s called.

Hooking Native Functions (C/C++)

Now let’s move on to hooking native functions, which are written in C or C++. This is where Frida’s power truly shines. Native function hooking is essential for analyzing and modifying the behavior of compiled code, which is common in mobile apps and other applications.

// JavaScript agent code (native_agent.js)
rpc.exports = {
    hookNativeFunction: function(moduleName, functionName, argsCount) {
        console.log(`Hooking native function ${moduleName}!${functionName}`);
        try {
            const functionAddress = Module.findExportByName(moduleName, functionName);

            if (!functionAddress) {
                console.log(`Function ${moduleName}!${functionName} not found`);
                return false;
            }

            Interceptor.attach(functionAddress, {
                onEnter: function(args) {
                    console.log(`[${moduleName}!${functionName}] Called!`);
                    for (let i = 0; i < argsCount; i++) {
                        console.log(`  arg[${i}] = ${args[i].toString()}`); // Convert arguments to strings for display
                    }
                    this.startTime = Date.now();
                },
                onLeave: function(retval) {
                    console.log(`[${moduleName}!${functionName}] Returned: ${retval.toString()}`); // Convert return value to string for display
                    console.log(`Execution time: ${Date.now() - this.startTime}ms`);
                }
            });
            return true;
        } catch (e) {
            console.log(`Error hooking ${moduleName}!${functionName}: ${e}`);
            return false;
        }
    },

    replaceNativeFunction: function(moduleName, functionName, replacementCode) {
        console.log(`Replacing native function ${moduleName}!${functionName}`);
        try {
            const functionAddress = Module.findExportByName(moduleName, functionName);

            if (!functionAddress) {
                console.log(`Function ${moduleName}!${functionName} not found`);
                return false;
            }

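            // patchCode takes (address, size, apply); apply receives a writable pointer
            const bytes = Array.from(new Uint8Array(replacementCode));
            Memory.patchCode(functionAddress, bytes.length, function (code) {
                code.writeByteArray(bytes);
            });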

            console.log(`Function ${moduleName}!${functionName} replaced successfully!`);
            return true;
        } catch (e) {
            console.log(`Error replacing ${moduleName}!${functionName}: ${e}`);
            return false;
        }
    }

};

# Python script (native_hook.py)
import frida
import sys
import base64

def on_message(message, data):
    if message['type'] == 'send':
        print("[*] {0}".format(message['payload']))
    else:
        print(message)

def main(target_process, module_name, function_name, args_count, replace=False, replacement_code=None):
    try:
        session = frida.attach(target_process)
        script = session.create_script(open("native_agent.js").read())
        script.on('message', on_message)
        script.load()

        if replace:
            print(f"Replacing {module_name}!{function_name} in {target_process}")
            # Send the decoded bytes as a list of ints so they survive JSON serialization
            script.exports.replace_native_function(module_name, function_name, list(base64.b64decode(replacement_code)))
        else:
            print(f"Hooking {module_name}!{function_name} in {target_process}")
            script.exports.hook_native_function(module_name, function_name, int(args_count)) #Calling the export.

        sys.stdin.read() # Keep the script running
    except frida.ProcessNotFoundError as e:
        print(f"Process '{target_process}' not found.  Is it running?")
    except Exception as e:
        print(f"Error: {e}")

if __name__ == '__main__':
    if len(sys.argv) < 5:
        print("Usage: python native_hook.py <process_name> <module_name> <function_name> <args_count> [--replace <replacement_code_base64>]")
        print("Example: python native_hook.py myapp libnative.so my_function 2")
        print("Example: python native_hook.py myapp libnative.so my_function 2 --replace $(echo 'B801000000C3' | xxd -r -p | base64)") # mov eax, 1; ret
        sys.exit(1)

    target_process = sys.argv[1]
    module_name = sys.argv[2]
    function_name = sys.argv[3]
    args_count = sys.argv[4]
    replace = False
    replacement_code = None

    if len(sys.argv) > 6 and sys.argv[5] == "--replace":
        replace = True
        replacement_code = sys.argv[6]

    main(target_process, module_name, function_name, args_count, replace, replacement_code)

Explanation:

native_agent.js:
- Module.findExportByName: Finds the address of the native function in memory. It takes the module name and the function name as separate arguments; the ! in our log messages is just the common module!function notation from native debugging tools.
- Interceptor.attach: Attaches to the native function.
- The onEnter and onLeave functions work similarly to the JavaScript hooking example, but you need to be more careful with how you interpret the arguments and return values. They arrive as NativePointer values, so call methods such as toInt32() or readUtf8String() to get at the data they represent.
- Memory.patchCode: This makes a region of code writable and hands a writable pointer to a callback, in which you write the new bytes. This is how you can overwrite the start of a function’s implementation.
native_hook.py:
- The Python script is similar to the previous one, but it takes one additional argument: the number of arguments the native function accepts. Frida cannot discover this count on its own, so onEnter relies on it to know how many argument slots to print.
- It supports function replacement by patching the code directly with Memory.patchCode. The replacement code is provided as a base64 encoded string.
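
The command-line arguments described above can also be parsed with argparse instead of indexing sys.argv by hand. This is only a sketch of an alternative, not how native_hook.py is written; the function name `parse_args` and the help strings are my own.

```python
import argparse


def parse_args(argv=None):
    """Parse the same arguments native_hook.py expects."""
    parser = argparse.ArgumentParser(description="Hook or replace a native function")
    parser.add_argument("process_name")
    parser.add_argument("module_name")
    parser.add_argument("function_name")
    parser.add_argument("args_count", type=int, help="number of arguments to print")
    parser.add_argument("--replace", metavar="BASE64", default=None,
                        help="base64-encoded machine code to patch in")
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Hard-coded demo arguments; pass sys.argv[1:] in a real script.
    options = parse_args(["myapp", "libnative.so", "my_function", "2"])
    print(options)
```

With this approach a missing value after --replace produces a usage error instead of an IndexError, and args_count arrives as an int.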

Important Considerations for Native Hooking:

Address Space Layout Randomization (ASLR): ASLR randomizes the base address of modules in memory each time the process is started. Frida accounts for this when it resolves exports with Module.findExportByName, so you only need to calculate offsets by hand for functions that are not exported.
Calling Conventions: Native functions follow calling conventions that vary by architecture and platform (e.g., cdecl and stdcall on x86, System V and Microsoft conventions on x64, AAPCS on ARM and ARM64). These determine how arguments are passed to the function. Frida’s args array handles the standard convention for the platform, but for anything unusual you need to understand the convention to correctly interpret the arguments in your onEnter function. This can be tricky and often requires reverse engineering.
Data Types: You need to know the data types of the arguments and return value to correctly interpret them. For example, a pointer might be represented as a number, but you need to know that it’s a pointer to a string or an object to access its data.
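
Two of the considerations above come down to simple arithmetic, sketched here in plain Python. The helper names `rebase` and `to_int32` are hypothetical and the addresses are made up; `to_int32` mirrors what reading an argument as a signed 32-bit integer means.

```python
def rebase(static_address: int, static_base: int, runtime_base: int) -> int:
    """Translate an address seen in a disassembler into the running process.

    The offset inside the module stays constant; only the base moves under ASLR.
    """
    return runtime_base + (static_address - static_base)


def to_int32(raw: str) -> int:
    """Interpret the low 32 bits of a pointer-sized hex string as a signed int."""
    value = int(raw, 16) & 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


if __name__ == "__main__":
    # A function at offset 0x1234 in a module loaded at a made-up base address.
    print(hex(rebase(0x1234, 0x0, 0x7A00000000)))
    # An int argument of -1 shows up as 0xffffffff when printed as a pointer.
    print(to_int32("0xffffffff"))
```

The same raw value can mean very different things depending on the declared type, which is why knowing the function's signature matters before trusting the printed arguments.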

Example (Native):

Let’s say you have a native library called libnative.so in an Android app, and it has a function called calculateSum that takes two integers as arguments.

// Example C code (compiled into libnative.so)
int calculateSum(int a, int b) {
  return a + b;
}

To hook this function, you would run:

python native_hook.py com.example.myapp libnative.so calculateSum 2

This will hook the calculateSum function, and you’ll see the arguments and return value printed in the console when it’s called.

To replace the function so that it always returns 1, you first need the machine code for mov eax, 1; ret. On x86 this assembles to B8 01 00 00 00 C3. The bytes must match the target’s architecture: on an ARM64 Android device, the equivalent mov w0, #1; ret is 20 00 80 52 C0 03 5F D6. Then base64-encode the raw bytes, not the hex text:

echo 'B801000000C3' | xxd -r -p | base64

This outputs uAEAAADD. Now run:

python native_hook.py com.example.myapp libnative.so calculateSum 2 --replace uAEAAADD

Now, whenever calculateSum is called, it will always return 1.
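
The encoding step is easy to get wrong, because base64-encoding the hex text is not the same as encoding the six raw bytes it stands for. The following sketch does the conversion in Python; the helper names `encode_machine_code` and `decode_machine_code` are my own.

```python
import base64


def encode_machine_code(hex_bytes: str) -> str:
    """Convert a hex dump such as 'B8 01 00 00 00 C3' into base64 of the raw bytes."""
    raw = bytes.fromhex(hex_bytes)  # spaces between byte pairs are ignored
    return base64.b64encode(raw).decode("ascii")


def decode_machine_code(encoded: str) -> list:
    """Decode base64 back into a list of byte values, e.g. to send over RPC."""
    return list(base64.b64decode(encoded))


if __name__ == "__main__":
    encoded = encode_machine_code("B8 01 00 00 00 C3")  # mov eax, 1; ret (x86)
    print(encoded)
    print([hex(b) for b in decode_machine_code(encoded)])
```

Decoding the result and checking that the bytes match the disassembly is a cheap sanity check before patching a live process.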

Memory Modification

Frida also allows you to directly read and write to memory. This is useful for changing variables, bypassing security checks, or modifying data structures.

// JavaScript agent code (memory_agent.js)
rpc.exports = {
    readMemory: function(address, length) {
        const buf = ptr(address).readByteArray(length); // NativePointer method
        return Array.from(new Uint8Array(buf)); // Convert to array for JSON serialization
    },

    writeMemory: function(address, data) {
        const buf = new Uint8Array(data);
        ptr(address).writeByteArray(buf.buffer); // NativePointer method
        return true;
    },

    findMemory: function(moduleName, pattern) {
        const module = Process.findModuleByName(moduleName); // returns null if not loaded
        if (!module) {
            console.log(`Module ${moduleName} not found`);
            return null;
        }

        const results = Memory.scanSync(module.base, module.size, pattern);
        if (results.length > 0) {
            return results[0].address.toString(); // Return first match
        }
        return null;
    }
};

# Python script (memory_hook.py)
import frida
import sys
import base64

def on_message(message, data):
    if message['type'] == 'send':
        print("[*] {0}".format(message['payload']))
    else:
        print(message)

def main(target_process):
    try:
        session = frida.attach(target_process)
        script = session.create_script(open("memory_agent.js").read())
        script.on('message', on_message)
        script.load()

        # Example Usage:
        module_name = "libnative.so"
        pattern = "41 42 43 44"  # ABCD as space-separated hex bytes

        address = script.exports.find_memory(module_name, pattern)

        if address:
            print(f"Found pattern at address: {address}")
            # Read 16 bytes from that address
            data = script.exports.read_memory(address, 16)
            print(f"Read data: {data}")

            # Write new data (example: replace with EFGHIJKL)
            new_data = [0x45, 0x46, 0x47, 0x48, 0x49, 0x4A, 0x4B, 0x4C]  # EFGHIJKL in hex
            script.exports.write_memory(address, new_data)
            print("Memory written successfully!")

            # Verify the change
            data = script.exports.read_memory(address, 16)
            print(f"New data: {data}")
        else:
            print("Pattern not found.")

        sys.stdin.read() # Keep the script running
    except frida.ProcessNotFoundError as e:
        print(f"Process '{target_process}' not found.  Is it running?")
    except Exception as e:
        print(f"Error: {e}")

if __name__ == '__main__':
    if len(sys.argv) != 2:
        print("Usage: python memory_hook.py <process_name>")
        sys.exit(1)

    target_process = sys.argv[1]
    main(target_process)

Explanation:

memory_agent.js:
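- readByteArray: Reads a block of memory at a given address. Recent Frida releases prefer calling it on a NativePointer, as in ptr(address).readByteArray(length), over the older static Memory.readByteArray form.
- writeByteArray: Writes a block of memory at a given address, again as a NativePointer method. The target page must be writable, otherwise the write raises an access violation.
- Memory.scanSync: Scans a memory region for a specific byte pattern, given as space-separated hex bytes where ?? matches any byte. This is useful for finding variables or data structures in memory.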
memory_hook.py:
- The Python script demonstrates how to use the JavaScript agent to read and write memory.
- It first finds a memory address using findMemory with a specified byte pattern.
- Then, it reads the contents of that memory address using readMemory.
- Finally, it writes new data to that memory address using writeMemory, and verifies the change by reading the memory again.
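
Frida documents scan patterns as space-separated hex bytes, with ?? as a wildcard for a byte you don't know. As a small sketch, a pattern string can be built on the Python side from known bytes; the helper name `to_scan_pattern` is hypothetical.

```python
def to_scan_pattern(values) -> str:
    """Build a scan pattern string; None marks a wildcard byte."""
    return " ".join("??" if v is None else f"{v:02x}" for v in values)


if __name__ == "__main__":
    # Known ASCII text.
    print(to_scan_pattern(b"ABCD"))
    # Two known bytes, one unknown, one known.
    print(to_scan_pattern([0x13, 0x37, None, 0xFF]))
```

The resulting string can be passed as the pattern argument of the agent's findMemory export.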

Ethical Considerations and Legality

Now for the serious part. Frida is a powerful tool, and with great power comes great responsibility. Using Frida without proper authorization can have serious consequences, both legal and ethical.

Reverse Engineering: Reverse engineering software is often legal for personal use and research, but distributing modified versions or using Frida to bypass licensing restrictions is generally illegal.
Security Testing: Using Frida for security testing is acceptable if you have explicit permission from the owner of the software or system.
Privacy: Be mindful of privacy regulations (e.g., GDPR, CCPA) when analyzing applications that handle personal data.
Terms of Service: Always review the terms of service of the application you’re analyzing. Some terms of service explicitly prohibit reverse engineering or modification of the software.

Best Practices:

Start Small: Begin with simple hooks and gradually increase complexity.
Use Debugging Tools: Frida can expose your agent to a Node.js-compatible debugger (for example through the frida CLI’s --debug option), which can help you troubleshoot your scripts.
Test Thoroughly: Always test your hooks in a safe environment before deploying them to a production system.
Document Your Code: Clearly document your hooks and the reasons for making them.

Troubleshooting

Failed to attach: unable to find process with name...: The process name is incorrect or the process isn’t running.
TypeError: Module.findExportByName(...) is null: The function doesn’t exist or the module name is incorrect.
Segmentation fault: Your hook is causing the target process to crash. This can happen if you’re accessing memory incorrectly or modifying code in a way that causes the program to become unstable.
Stale RPC request: The agent script and the python script are out of sync. Try restarting Frida and the process.
TypeError: cannot call method 'toString' of undefined: The argument or return value is undefined. Make sure you’re correctly handling null values.

Advanced Techniques

Stalker: Frida’s Stalker API allows you to trace the execution of code dynamically. This is helpful for understanding the control flow of a program.
Memory Access Monitoring: Frida can be used to monitor memory access patterns, which can help you identify vulnerabilities or understand how data is being used.
Frida Gadget: When injection is not an option (for example on a non-rooted or non-jailbroken device), you can embed Frida’s Gadget, a shared library, into the target program so that the program loads Frida itself.
Code Generation: Frida can be used to generate code dynamically, which can be useful for creating custom exploit payloads.

Conclusion

Frida is a powerful and versatile tool that can be used for a wide range of tasks, from reverse engineering and security testing to debugging and performance analysis. By understanding the basics of Frida and following best practices, you can harness its power to gain valuable insights into the inner workings of software and systems. Remember to use your newfound skills responsibly and ethically.

And that’s a wrap, folks! I hope you found this informative and, dare I say, a little bit fun. Now go forth and hook… responsibly!
